<snip>
======================================================================
[Run 7/20] PATH 2: Native Google GenAI SDK — streaming + tools + thinking + MALFORMED retry
======================================================================
model : gemini-2.5-flash
thinking_budget: 8192
query : 'give me a line chart of vulnerabilities over the last 4 weeks'
Both GOOGLE_API_KEY and GEMINI_API_KEY are set. Using GOOGLE_API_KEY.
Attempt 1 — dynamic thinking (no explicit budget, matches Torana):
--------------------------------------------------
[chunk 1] finish_reason = 'MALFORMED_FUNCTION_CALL'
--------------------------------------------------
chunks received : 1
finish_reason : 'MALFORMED_FUNCTION_CALL'
tool calls : 0
⚠️ MALFORMED_FUNCTION_CALL detected on attempt 1
→ Retrying with thinking_budget=0 (CachingGemini workaround)
Attempt 2 — retry with thinking_budget=0:
--------------------------------------------------
[chunk 1] finish_reason = 'STOP'
[chunk 1] tool_call = 'ui_line_chart'
--------------------------------------------------
chunks received : 1
finish_reason : 'STOP'
tool calls : 1
✅ RETRY SUCCEEDED: 'ui_line_chart' called with 5 top-level args
labels : ['Week 1', 'Week 2', 'Week 3', 'Week 4']
datasets : 4 series
- 'Critical': [120, 110, 90, 80]
- 'High': [250, 220, 180, 150]
- 'Medium': [400, 380, 350, 320]
- 'Low': [600, 550, 500, 450]
title : 'Vulnerabilities Over Last 4 Weeks'
[fail-fast] Bug reproduced — stopping. Use --no-fail-fast to continue.
======================================================================
STATISTICS (0 runs)
======================================================================
PATH 2 — Native SDK (with retry):
Attempt 1 MALFORMED : 1 / 7 (14%)
→ retry succeeded : 1 / 1 (100%)
Success on attempt 1 : 6 / 7 (86%)
Overall success : 7 / 7 (100%)
Environment details
Steps to reproduce
Repro logs:
Note: I have also raised a bug with litellm to handle this scenario better: BerriAI/litellm#21744
gemini_malformed_reproducer.py
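
For readers without the attachment, the retry behaviour visible in the log (attempt 1 with dynamic thinking, then a single retry with thinking_budget=0 when finish_reason is MALFORMED_FUNCTION_CALL) can be sketched roughly as follows. This is not the code from gemini_malformed_reproducer.py. `StreamResult`, `call_with_malformed_retry`, and the `generate` callable are hypothetical names introduced for illustration; the real SDK streaming call would live inside `generate`, which is assumed to take the thinking budget (or `None` for dynamic thinking) and return the final finish reason plus any tool calls collected from the stream.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Finish reason reported in the log when the model emits an unparsable tool call.
MALFORMED = "MALFORMED_FUNCTION_CALL"


@dataclass
class StreamResult:
    """Hypothetical summary of one streamed response (not an SDK type)."""
    finish_reason: str
    tool_calls: List[str] = field(default_factory=list)


def call_with_malformed_retry(
    generate: Callable[[Optional[int]], StreamResult],
) -> Tuple[StreamResult, int]:
    """Run `generate` with dynamic thinking; retry once with budget 0 if malformed.

    `generate(None)` is assumed to mean "no explicit thinking budget";
    `generate(0)` is assumed to disable thinking. Returns the final result
    and the number of attempts made (1 or 2).
    """
    first = generate(None)
    if first.finish_reason != MALFORMED:
        return first, 1
    # Same request, thinking disabled, mirroring "Attempt 2" in the log.
    second = generate(0)
    return second, 2


if __name__ == "__main__":
    # Stand-in for the real model call: fails with dynamic thinking, succeeds with 0.
    def fake_generate(budget: Optional[int]) -> StreamResult:
        if budget is None:
            return StreamResult(MALFORMED)
        return StreamResult("STOP", ["ui_line_chart"])

    result, attempts = call_with_malformed_retry(fake_generate)
    print(attempts, result.finish_reason, result.tool_calls)
```

The sketch retries only once and only on this specific finish reason, so unrelated failures still surface to the caller unchanged.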