fix: filter Gemini thinking parts from user-facing message chain #7196
Open
he-yufeng wants to merge 1 commit into AstrBotDevs:master from
Conversation
Gemini 3 models return thinking parts (`part.thought=True`) alongside the actual response text. `_process_content_parts` was including these thinking parts in the message chain sent to the user, effectively leaking internal reasoning into the output. On platforms that split long messages (e.g. aiocqhttp with realtime segmenting), this caused duplicate or triple replies, since the thinking text often mirrors the actual response.

The streaming path already handled this correctly via `chunk.text`, which skips thinking parts, but the non-streaming path and the final-chunk processing in streaming both went through `_process_content_parts`.

Also switch the Gemini 3 model name matching from an exhaustive list to prefix matching (`gemini-3-` / `gemini-3.`) so new variants like `gemini-3.1` get proper `thinkingLevel` config without code changes.

Fixes AstrBotDevs#7183
Contributor
Hey - I've left some high level feedback:
- When checking `any(model_name.startswith(p) for p in (...))`, consider normalizing `model_name` (e.g. `model_name.lower()`) so the thinking-level logic remains robust to case or formatting differences in model IDs coming from configuration or upstream changes.
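The reviewer's normalization suggestion could be sketched as follows. This is an illustrative helper, not AstrBot's actual code; the function and constant names are assumptions:

```python
# Prefixes from the PR's new matching scheme.
GEMINI_3_PREFIXES = ("gemini-3-", "gemini-3.")


def is_gemini_3(model_name: str) -> bool:
    """Return True if the model ID looks like a Gemini 3 variant.

    Normalizing with strip()/lower() keeps the check robust to casing
    or stray whitespace in model IDs from configuration.
    """
    normalized = model_name.strip().lower()
    return any(normalized.startswith(p) for p in GEMINI_3_PREFIXES)
```

With normalization, IDs like `Gemini-3-Pro` or a padded ` gemini-3.1-pro-preview ` still match, while `gemini-2.5-flash` does not.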
Contributor
Code Review
This pull request updates the Gemini provider to use prefix matching for Gemini 3 models, improving maintainability for future versions. It also modifies content processing to exclude "thinking" parts from the final message output to prevent reasoning leakage and duplicate replies. I have no feedback to provide.
Problem
When using Gemini 3 series models (e.g. gemini-3-pro, gemini-3-flash), the bot sends duplicate or triple replies for a single user message. Only one API request is made (confirmed via Google AI Studio), but the response text appears multiple times in chat.
Root Cause
Gemini 3 models with thinking enabled return response parts that include both thinking parts (`part.thought=True`) and actual text parts. In `_process_content_parts`, the loop that builds the user-facing message chain didn't filter out thinking parts. This leaked the model's internal reasoning into the message chain. Since thinking text often mirrors the actual response, the user sees what appears to be the same content repeated. On platforms that split long messages by sentence (e.g. aiocqhttp with realtime segmenting), this turns into multiple separate replies.
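The fixed loop can be sketched roughly like this. It is a simplified illustration, not AstrBot's actual implementation; `build_message_chain` and the use of plain objects for parts are assumptions:

```python
from types import SimpleNamespace


def build_message_chain(parts):
    """Collect only user-facing text parts from a Gemini response.

    Parts with thought=True carry the model's internal reasoning and
    must not reach the user-facing message chain.
    """
    chain = []
    for part in parts:
        # Skip empty parts and thinking parts; keep only response text.
        if getattr(part, "text", None) and not getattr(part, "thought", False):
            chain.append(part.text)
    return chain


# Example: a response with one thinking part and one text part.
parts = [
    SimpleNamespace(text="The user asked X, so I should...", thought=True),
    SimpleNamespace(text="Here is the answer.", thought=False),
]
```

Without the `not part.thought` check, both strings would land in the chain, producing the duplicated replies described above.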
The streaming path was unaffected because `chunk.text` (the SDK's convenience property) already skips thinking parts internally. But non-streaming requests and the final-chunk processing in streaming both go through `_process_content_parts`.

Fix

Filter thinking parts from chain: Add an `and not part.thought` check when building the chain. Reasoning text is already captured separately via `_extract_reasoning_content`.

Use prefix matching for Gemini 3 model names: The hardcoded model name list (`gemini-3-pro`, `gemini-3-flash`, etc.) missed variants like `gemini-3.1-pro-preview`. Switched to prefix matching (`gemini-3-` / `gemini-3.`) so new model variants get proper `thinkingLevel` config automatically.

Fixes #7183
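The gap between the two matching schemes can be shown with a small sketch; the names here are illustrative, not the actual code:

```python
# Old approach: an exhaustive list of known Gemini 3 model IDs.
OLD_LIST = {"gemini-3-pro", "gemini-3-flash"}

# New approach: prefixes that cover current and future variants.
PREFIXES = ("gemini-3-", "gemini-3.")


def matches_old(name: str) -> bool:
    return name in OLD_LIST


def matches_new(name: str) -> bool:
    return any(name.startswith(p) for p in PREFIXES)
```

A new variant like `gemini-3.1-pro-preview` fails the exhaustive-list check, so it would silently miss the `thinkingLevel` config, while the prefix match picks it up with no code change.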
Summary by Sourcery
Prevent Gemini 3 model reasoning content from leaking into user-facing messages and ensure thinking configuration applies consistently across current and future Gemini 3 variants.
Bug Fixes:
Enhancements: