
fix: filter Gemini thinking parts from user-facing message chain#7196

Open
he-yufeng wants to merge 1 commit into AstrBotDevs:master from he-yufeng:fix/gemini-thinking-text-leak

Conversation

@he-yufeng
Contributor

@he-yufeng he-yufeng commented Mar 30, 2026

Problem

When using Gemini 3 series models (e.g. gemini-3-pro, gemini-3-flash), the bot sends duplicate or triple replies for a single user message. Only one API request is made (confirmed via Google AI Studio), but the response text appears multiple times in chat.

Root Cause

Gemini 3 models with thinking enabled return response parts that include both thinking (part.thought=True) and actual text parts. In _process_content_parts, the loop that builds the user-facing message chain didn't filter out thinking parts:

for part in result_parts:
    if part.text:  # includes thinking parts!
        chain.append(Comp.Plain(part.text))

This leaked the model's internal reasoning into the message chain. Since thinking text often mirrors the actual response, the user sees what appears to be the same content repeated. On platforms that split long messages by sentence (e.g. aiocqhttp with realtime segmenting), this turns into multiple separate replies.

The streaming path was unaffected because chunk.text (the SDK's convenience property) already skips thinking parts internally. But non-streaming requests and the final-chunk processing in streaming both go through _process_content_parts.
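As a rough illustration of that difference (a toy model, not the SDK's actual implementation), a convenience property like `chunk.text` can filter thinking parts internally, while a raw loop over the parts leaks them:

```python
from dataclasses import dataclass

@dataclass
class Part:
    text: str
    thought: bool = False  # True for internal reasoning parts

@dataclass
class Chunk:
    parts: list

    @property
    def text(self) -> str:
        # Toy model of the SDK convenience property:
        # thinking parts are excluded from the joined text.
        return "".join(p.text for p in self.parts if not p.thought)

chunk = Chunk([Part("internal plan...", thought=True), Part("Hello!")])
assert chunk.text == "Hello!"                 # streaming path: no leak
naive = "".join(p.text for p in chunk.parts)  # unfiltered loop, as in the bug
assert naive == "internal plan...Hello!"      # thinking text leaks through
```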

Fix

  1. Filter thinking parts from chain: Add an `and not part.thought` check when building the chain. Reasoning text is already captured separately via `_extract_reasoning_content`.

  2. Use prefix matching for Gemini 3 model names: The hardcoded model name list (gemini-3-pro, gemini-3-flash, etc.) missed variants like gemini-3.1-pro-preview. Switched to prefix matching (gemini-3- / gemini-3.) so new model variants get proper thinkingLevel config automatically.
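Both changes can be sketched in a minimal self-contained form. `Part` here is a stand-in dataclass, and `build_chain` / `uses_thinking_level` are illustrative names, not the actual AstrBot functions:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the SDK Part type; only the two fields
# relevant to this PR are modeled.
@dataclass
class Part:
    text: str
    thought: bool = False

def build_chain(result_parts):
    """Fix 1: skip thinking parts when building the user-facing chain."""
    chain = []
    for part in result_parts:
        if part.text and not part.thought:  # the added check
            chain.append(part.text)
    return chain

def uses_thinking_level(model_name: str) -> bool:
    """Fix 2: prefix matching instead of a hardcoded model-name list."""
    return model_name.startswith(("gemini-3-", "gemini-3."))

parts = [Part("Let me think...", thought=True), Part("Here is the answer.")]
assert build_chain(parts) == ["Here is the answer."]
assert uses_thinking_level("gemini-3.1-pro-preview")  # new variants match
assert not uses_thinking_level("gemini-2.5-pro")
```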

Fixes #7183

Summary by Sourcery

Prevent Gemini 3 model reasoning content from leaking into user-facing messages and ensure thinking configuration applies consistently across current and future Gemini 3 variants.

Bug Fixes:

  • Exclude Gemini 3 thinking parts from the constructed user-facing message chain to stop duplicate or repeated replies.

Enhancements:

  • Apply Gemini 3 thinking configuration based on model name prefixes rather than a hardcoded list so new model variants are automatically supported.

Gemini 3 models return thinking parts (part.thought=True) alongside the
actual response text.  _process_content_parts was including these thinking
parts in the message chain sent to the user, effectively leaking internal
reasoning into the output.  On platforms that split long messages (e.g.
aiocqhttp with realtime segmenting), this caused duplicate or triple
replies since the thinking text often mirrors the actual response.

The streaming path already handled this correctly via chunk.text which
skips thinking parts, but the non-streaming path and the final-chunk
processing in streaming both went through _process_content_parts.

Also switch the Gemini 3 model name matching from an exhaustive list to
prefix matching (gemini-3- / gemini-3.) so new variants like gemini-3.1
get proper thinkingLevel config without code changes.

Fixes AstrBotDevs#7183
@auto-assign auto-assign bot requested review from Soulter and anka-afk March 30, 2026 13:36
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Mar 30, 2026
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • When checking any(model_name.startswith(p) for p in (...)), consider normalizing model_name (e.g. model_name.lower()) so the thinking-level logic remains robust to case or formatting differences in model IDs coming from configuration or upstream changes.
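The suggested normalization could look like the following sketch (not committed code; `GEMINI_3_PREFIXES` is an illustrative name):

```python
GEMINI_3_PREFIXES = ("gemini-3-", "gemini-3.")

def is_gemini_3(model_name: str) -> bool:
    # Lowercase and trim so casing or whitespace differences in configured
    # model IDs don't bypass the thinking-level logic.
    return model_name.strip().lower().startswith(GEMINI_3_PREFIXES)

assert is_gemini_3("Gemini-3-Pro")            # case-insensitive match
assert is_gemini_3(" gemini-3.1-pro-preview") # leading whitespace tolerated
assert not is_gemini_3("gemini-2.5-flash")
```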


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the Gemini provider to use prefix matching for Gemini 3 models, improving maintainability for future versions. It also modifies content processing to exclude "thinking" parts from the final message output to prevent reasoning leakage and duplicate replies. I have no feedback to provide.

@dosubot dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Mar 30, 2026

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] When using the Gemini provider, the AI sends three responses to the same content

1 participant