fix(streaming): defer Pydantic parsing until terminal status is known#3284
fix(streaming): defer Pydantic parsing until terminal status is known#3284vanhci wants to merge 1 commit into
Conversation
…tus is known Previously, when using client.responses.stream() with a Pydantic text_format, the SDK parsed output text on response.output_text.done event BEFORE the terminal response.incomplete status was known. If the API later emitted response.incomplete (e.g., due to max_output_tokens truncation), the SDK raised a Pydantic JSON validation error that masked the real upstream failure. Changes: - src/openai/_exceptions.py: Add IncompleteResponseError exception that provides clear error message with response_id and incomplete_details.reason - src/openai/__init__.py: Export IncompleteResponseError - src/openai/lib/streaming/responses/_responses.py: - Don't parse text on response.output_text.done (set parsed=None instead) - Handle response.incomplete by raising IncompleteResponseError - Parsing is now deferred until response.completed via parse_response() Fixes openai#3263
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 54ded80baa
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| elif event.type == "response.incomplete": | ||
| # Raise an error for incomplete responses instead of letting | ||
| # Pydantic JSON validation errors bubble up later | ||
| raise IncompleteResponseError( | ||
| response_id=event.response.id, | ||
| incomplete_details_reason=event.response.incomplete_details.reason if event.response.incomplete_details else None, | ||
| ) |
There was a problem hiding this comment.
Preserve
response.incomplete events for non-parsed streams
Only text_format=PydanticModel needed special handling, but this change now raises IncompleteResponseError for every response.incomplete event, including plain client.responses.stream() usage with no parsing. That is a behavioral regression: callers that previously consumed partial output and handled response.incomplete as a normal terminal event now get an exception and lose the event path entirely. Please gate this exception to parse-mode streams (or still emit the event) so non-structured streaming remains backward-compatible.
Useful? React with 👍 / 👎.
What
When using
client.responses.stream()withtext_format=PydanticModel, the SDK eagerly parses output text onresponse.output_text.doneBEFORE the terminalresponse.incompletestatus is known. If the response is truncated (e.g. bymax_output_tokens), this raises apydantic.ValidationErroron the incomplete JSON, masking the real upstream truncation failure.This change defers Pydantic parsing in the streaming responses helper until a terminal status (
completedorincomplete) is reached. Onincomplete, anIncompleteResponseErroris raised with clear context instead of a confusing ValidationError.Changes
IncompleteResponseErrorprovidingresponse_idandincomplete_details.reasonIncompleteResponseErrorresponse.output_text.done(defer until terminal status)response.incomplete: raiseIncompleteResponseErrorparse_response()at terminal statusFixes #3263