
fix: filter Gemini thinking parts from user-facing message chain#7196

Open
he-yufeng wants to merge 1 commit into AstrBotDevs:master from he-yufeng:fix/gemini-thinking-text-leak

Conversation

@he-yufeng
Contributor

@he-yufeng he-yufeng commented Mar 30, 2026

Problem

When using Gemini 3 series models (e.g. gemini-3-pro, gemini-3-flash), the bot sends duplicate or triple replies for a single user message. Only one API request is made (confirmed via Google AI Studio), but the response text appears multiple times in chat.

Root Cause

Gemini 3 models with thinking enabled return response parts that include both thinking (part.thought=True) and actual text parts. In _process_content_parts, the loop that builds the user-facing message chain didn't filter out thinking parts:

for part in result_parts:
    if part.text:  # includes thinking parts!
        chain.append(Comp.Plain(part.text))

This leaked the model's internal reasoning into the message chain. Since thinking text often mirrors the actual response, the user sees what appears to be the same content repeated. On platforms that split long messages by sentence (e.g. aiocqhttp with realtime segmenting), this turns into multiple separate replies.

The streaming path was unaffected because chunk.text (the SDK's convenience property) already skips thinking parts internally. But non-streaming requests and the final-chunk processing in streaming both go through _process_content_parts.
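As a rough illustration of that difference (a toy model, not the SDK's actual implementation), a convenience property like `chunk.text` can filter thinking parts internally, while a raw loop over the parts leaks them:

```python
from dataclasses import dataclass

@dataclass
class Part:
    text: str
    thought: bool = False  # True for internal reasoning parts

@dataclass
class Chunk:
    parts: list

    @property
    def text(self) -> str:
        # Toy model of the SDK convenience property:
        # thinking parts are excluded from the joined text.
        return "".join(p.text for p in self.parts if not p.thought)

chunk = Chunk([Part("internal plan...", thought=True), Part("Hello!")])
assert chunk.text == "Hello!"                 # streaming path: no leak
naive = "".join(p.text for p in chunk.parts)  # unfiltered loop, as in the bug
assert naive == "internal plan...Hello!"      # thinking text leaks through
```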

Fix

  1. Filter thinking parts from chain: Add an `and not part.thought` check when building the chain. Reasoning text is already captured separately via `_extract_reasoning_content`.

  2. Use prefix matching for Gemini 3 model names: The hardcoded model name list (gemini-3-pro, gemini-3-flash, etc.) missed variants like gemini-3.1-pro-preview. Switched to prefix matching (gemini-3- / gemini-3.) so new model variants get proper thinkingLevel config automatically.
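Both changes can be sketched in a minimal self-contained form. `Part` here is a stand-in dataclass, and `build_chain` / `uses_thinking_level` are illustrative names, not the actual AstrBot functions:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the SDK Part type; only the two fields
# relevant to this PR are modeled.
@dataclass
class Part:
    text: str
    thought: bool = False

def build_chain(result_parts):
    """Fix 1: skip thinking parts when building the user-facing chain."""
    chain = []
    for part in result_parts:
        if part.text and not part.thought:  # the added check
            chain.append(part.text)
    return chain

def uses_thinking_level(model_name: str) -> bool:
    """Fix 2: prefix matching instead of a hardcoded model-name list."""
    return model_name.startswith(("gemini-3-", "gemini-3."))

parts = [Part("Let me think...", thought=True), Part("Here is the answer.")]
assert build_chain(parts) == ["Here is the answer."]
assert uses_thinking_level("gemini-3.1-pro-preview")  # new variants match
assert not uses_thinking_level("gemini-2.5-pro")
```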

Fixes #7183

Summary by Sourcery

Prevent Gemini 3 model reasoning content from leaking into user-facing messages and ensure thinking configuration applies consistently across current and future Gemini 3 variants.

Bug Fixes:

  • Exclude Gemini 3 thinking parts from the constructed user-facing message chain to stop duplicate or repeated replies.

Enhancements:

  • Apply Gemini 3 thinking configuration based on model name prefixes rather than a hardcoded list so new model variants are automatically supported.

Gemini 3 models return thinking parts (part.thought=True) alongside the
actual response text.  _process_content_parts was including these thinking
parts in the message chain sent to the user, effectively leaking internal
reasoning into the output.  On platforms that split long messages (e.g.
aiocqhttp with realtime segmenting), this caused duplicate or triple
replies since the thinking text often mirrors the actual response.

The streaming path already handled this correctly via chunk.text which
skips thinking parts, but the non-streaming path and the final-chunk
processing in streaming both went through _process_content_parts.

Also switch the Gemini 3 model name matching from an exhaustive list to
prefix matching (gemini-3- / gemini-3.) so new variants like gemini-3.1
get proper thinkingLevel config without code changes.

Fixes AstrBotDevs#7183
@auto-assign auto-assign bot requested review from Soulter and anka-afk March 30, 2026 13:36
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Mar 30, 2026
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • When checking any(model_name.startswith(p) for p in (...)), consider normalizing model_name (e.g. model_name.lower()) so the thinking-level logic remains robust to case or formatting differences in model IDs coming from configuration or upstream changes.
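The suggested normalization could look like the following sketch (not committed code; `GEMINI_3_PREFIXES` is an illustrative name):

```python
GEMINI_3_PREFIXES = ("gemini-3-", "gemini-3.")

def is_gemini_3(model_name: str) -> bool:
    # Lowercase and trim so casing or whitespace differences in configured
    # model IDs don't bypass the thinking-level logic.
    return model_name.strip().lower().startswith(GEMINI_3_PREFIXES)

assert is_gemini_3("Gemini-3-Pro")            # case-insensitive match
assert is_gemini_3(" gemini-3.1-pro-preview") # leading whitespace tolerated
assert not is_gemini_3("gemini-2.5-flash")
```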


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the Gemini provider to use prefix matching for Gemini 3 models, improving maintainability for future versions. It also modifies content processing to exclude "thinking" parts from the final message output to prevent reasoning leakage and duplicate replies. I have no feedback to provide.

@dosubot dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Mar 30, 2026

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] When using the Gemini provider, the AI sends three responses to the same content

1 participant