fix(sdk): resize Anthropic many-image inputs #2552
Zheng-Lu wants to merge 11 commits into OpenHands:main from
Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands pls merge from main, resolve all conflicts. Then do /codereview-roasted /github-pr-review

I'm on it! xingyaoww can track my progress at all-hands.dev
Co-authored-by: openhands <openhands@all-hands.dev>
xingyaoww
left a comment
Taste Rating: 🟡 Acceptable — Works, but the structure needs improvement
Linus's Three Questions:
- Is this solving a real problem? — Yes. Anthropic's many-image limit is a real production failure.
- Is there a simpler way? — Yes. This is ~80 lines of image manipulation code jammed into a 1500-line god-class. Extract it.
- What will this break? — Adding pillow as a hard runtime dependency to the core SDK is the biggest concern. Every user now pays for PIL whether they use images or not.
VERDICT:
❌ Needs rework — The fix is directionally correct, but the dependency strategy and code placement need redesign before merging.
KEY INSIGHT:
The core problem is treating PIL as a hard SDK dependency and stuffing image-processing plumbing into the LLM class, when this should be a lazy-loaded utility module.
```python
def _apply_outgoing_image_resize(
    self, messages: list[Message], *, vision_enabled: bool
) -> None:
    max_dimension = self._get_outgoing_image_max_dimension(
        messages=messages, vision_enabled=vision_enabled
    )
    if max_dimension is None:
        return

    for message in messages:
        for content_item in message.content:
            if isinstance(content_item, ImageContent):
                content_item.image_urls = [
                    self._resize_base64_data_image_url(
                        url, max_dimension=max_dimension
                    )
                    for url in content_item.image_urls
                ]

def _get_outgoing_image_max_dimension(
    self, messages: list[Message], *, vision_enabled: bool
) -> int | None:
    if not vision_enabled or self._infer_litellm_provider() != "anthropic":
        return None

    total_images = sum(
        len(content_item.image_urls)
        for message in messages
        for content_item in message.content
        if isinstance(content_item, ImageContent)
    )
    if total_images <= ANTHROPIC_MANY_IMAGE_THRESHOLD:
        return None

    return ANTHROPIC_MANY_IMAGE_MAX_DIMENSION

@staticmethod
def _resize_base64_data_image_url(url: str, *, max_dimension: int) -> str:
    if not url.startswith("data:image/"):
        return url

    header, sep, encoded = url.partition(";base64,")
    if not sep:
        return url

    mime_type = header.removeprefix("data:")

    try:
        raw_bytes = base64.b64decode(encoded)
        with Image.open(io.BytesIO(raw_bytes)) as image:
            if max(image.size) <= max_dimension:
                return url

            resized_image = image.copy()
            resized_image.thumbnail(
                (max_dimension, max_dimension), Image.Resampling.LANCZOS
            )
            image_format = image.format or mime_type.split("/", 1)[1].upper()

            if image_format == "JPG":
                image_format = "JPEG"

            if image_format == "JPEG" and resized_image.mode not in ("RGB", "L"):
                resized_image = resized_image.convert("RGB")

            buffer = io.BytesIO()
            resized_image.save(buffer, format=image_format)
    except Exception:
        logger.warning(
            "Failed to resize base64 data image for outgoing LLM request",
            exc_info=True,
        )
        return url

    resized_encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return f"data:{mime_type};base64,{resized_encoded}"
```
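For reference, the data-URL handling in `_resize_base64_data_image_url` needs only `str.partition` and `str.removeprefix`; a stdlib-only illustration of that split (the sample URL and payload are made up):

```python
# How the helper splits a base64 data URL into MIME type and payload.
# A URL without the ";base64," separator yields an empty `sep`, which is
# the early-return signal in the PR's code.
url = "data:image/png;base64,iVBORw0KGgo="
header, sep, encoded = url.partition(";base64,")
mime_type = header.removeprefix("data:")

print(mime_type)  # image/png
print(encoded)    # iVBORw0KGgo=
print(bool("not-a-data-url".partition(";base64,")[1]))  # False
```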
🟠 Important — 80 lines of image manipulation don't belong in LLM
llm.py is already 1500+ lines. These three methods (_apply_outgoing_image_resize, _get_outgoing_image_max_dimension, _resize_base64_data_image_url) are pure image-processing utilities with zero dependency on self state (one is already a @staticmethod, the other two only call _infer_litellm_provider()).
Extract to a standalone module, e.g. openhands/sdk/llm/utils/image_resize.py:

```python
def resize_base64_data_url(url: str, *, max_dimension: int) -> str: ...
def maybe_resize_images(messages, provider, vision_enabled): ...
```

Then the LLM method becomes a one-liner call. Keep the god-class from getting godlier.
HUMAN: ^agree with the judgement here
Yes, fixed in last commit.
I have moved the image-resize logic out of llm.py into openhands/sdk/llm/utils/image_resize.py, including the dimension selection and base64 resize helper.
@Zheng-Lu did you push it up to this PR? I didn't see the changes here..
@xingyaoww Sorry, I thought I did but I actually didn't; the change is committed now.
OpenHands encountered an error: Request timeout after 30 seconds to https://ypvqnvwqmvbumzax.prod-runtime.all-hands.dev/api/conversations/9650ebc3-b326-46ce-88ee-b683ce63e259/ask_agent See the conversation for more information.
Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands do /codereview-roasted /github-pr-review

I'm on it! xingyaoww can track my progress at all-hands.dev
xingyaoww
left a comment
🟡 Acceptable — Core logic is correct and the tests are solid, but there are design issues worth addressing before merge.
Linus-Style Analysis:
The fundamental idea is sound: intercept oversized base64 images before they hit Anthropic's API limits. The Anthropic docs research is correct (20-image threshold, 2000px vs 8000px caps), and the resize logic itself is clean.
But the plumbing around it — threading a PIL module as Any through three functions, silent in-place mutation — adds unnecessary complexity that a simpler design would eliminate entirely.
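A hedged sketch of the mutation-free alternative: a pure function that returns fresh content objects instead of editing the caller's messages in place. `FakeImageContent` below is a stand-in for the SDK's ImageContent, purely for illustration:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeImageContent:
    """Stand-in for the SDK's ImageContent, just for this sketch."""
    image_urls: tuple[str, ...]


def with_resized_images(items, resize):
    """Return a new list with resized URLs; callers' objects stay untouched.

    `resize` is any str -> str transform (here, the base64 resize helper).
    Non-image items pass through unchanged.
    """
    return [
        replace(item, image_urls=tuple(resize(u) for u in item.image_urls))
        if isinstance(item, FakeImageContent)
        else item
        for item in items
    ]
```

Because the inputs are never mutated, a failed or skipped resize cannot leave messages half-modified, which removes the class of bug the review is worried about.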
This review was generated by an AI agent (OpenHands).
Summary

The request was to run /codereview-roasted /github-pr-review on this PR.

Checklist

- Key Findings Posted

No extraneous changes were made — this was purely a review action with no code modifications.
[Automatic Post]: This PR seems to be currently waiting for review. @xingyaoww @Zheng-Lu @openhands-ai[bot], could you please take a look when you have a chance?
Co-authored-by: openhands <openhands@all-hands.dev>
#2467
Summary
Reproduces and fixes the Anthropic many-image failure by resizing oversized base64 images during LLM message formatting.
What Changed
- LLM.format_messages_for_llm now resizes oversized base64 images before sending
- Adds pillow as a runtime dependency for in-memory image resizing

Validation
| Command | Result |
| --- | --- |
| pytest tests/sdk/llm/test_llm_image_resizing.py | Passed |
| pytest tests/sdk/llm/test_llm_image_resizing.py tests/sdk/llm/test_vision_support.py | Passed |
| ruff check openhands-sdk/openhands/sdk/llm/llm.py tests/sdk/llm/test_llm_image_resizing.py | Passed |
| pyright openhands-sdk/openhands/sdk/llm/llm.py tests/sdk/llm/test_llm_image_resizing.py | Passed |

Proof
Now a multiple-images request with at least one image larger than 2000px no longer throws litellm.BadRequestError.
