Fix 5 bugs: PDF crash, unused env var, missing state field, hardcoded model, typo #260

Open
sanjibani wants to merge 1 commit into The-OpenROAD-Project:master from
sanjibani:fix/misc-bugs-found-in-review

@sanjibani

Summary

Fixes #259 — five bugs found during code review, all small and isolated.

  • PDF crash: process_pdf_docs() hits UnboundLocalError when a corrupted PDF raises PdfStreamError — the except block didn't return, so documents was used unassigned. Now returns [].
  • Unused env var: FAISS_DB_PATH was defined in .env.example but get_db_path() always used a hardcoded path. Now reads the env var first, falls back to the computed path.
  • Missing state field: context_list was returned by retriever tool nodes but wasn't in the AgentState TypedDict, so LangGraph silently dropped it. Added the field.
  • Hardcoded model: helpers.py hardcoded gemini-2.0-flash instead of reading GOOGLE_GEMINI env var like the main endpoint does. Now uses the same model mapping.
  • Prompt typo: "avaiable" → "available" (plus grammar fix) in the main RAG prompt.
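
The PDF crash in the first bullet follows a common Python pattern, sketched below for reviewers who want to see it in isolation. This is not the repository's code: `load_pdf`, `process_pdf_buggy`, `process_pdf_fixed`, and the local `PdfStreamError` class are hypothetical stand-ins, and the real `process_pdf_docs()` may be structured differently.

```python
import logging


class PdfStreamError(Exception):
    """Local stand-in for the parser error named in this PR (hypothetical)."""


def load_pdf(path: str) -> list[str]:
    # Hypothetical loader: treats any path containing "corrupt" as unreadable.
    if "corrupt" in path:
        raise PdfStreamError(f"cannot read {path}")
    return [f"page text from {path}"]


def process_pdf_buggy(path: str) -> list[str]:
    try:
        documents = load_pdf(path)
    except PdfStreamError:
        logging.error("Skipping corrupted PDF: %s", path)
        # No return here, so execution falls through to the line below.
    return documents  # UnboundLocalError when load_pdf raised


def process_pdf_fixed(path: str) -> list[str]:
    try:
        documents = load_pdf(path)
    except PdfStreamError:
        logging.error("Skipping corrupted PDF: %s", path)
        return []  # Early return: the caller gets an empty result and continues.
    return documents
```

Returning `[]` rather than re-raising matches the behavior described above (log the error and keep processing the remaining files), at the cost of the caller not being able to tell an empty PDF from a corrupted one without checking the log.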

Also updates test_faiss_vectorstore.py to test both the default path and the env var path.
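
A minimal sketch of the env-var-first lookup and the two cases the updated test covers. The standalone `get_db_path` function and the `DEFAULT_DB_PATH` value are assumptions for illustration; in the repository this logic lives on `FAISSVectorDatabase` and the computed fallback path may differ.

```python
import os
from pathlib import Path

# Hypothetical fallback; stands in for the path the class computes today.
DEFAULT_DB_PATH = Path("data") / "faiss_db"


def get_db_path() -> Path:
    # Prefer FAISS_DB_PATH when it is set to a non-empty value.
    override = os.environ.get("FAISS_DB_PATH")
    if override:
        return Path(override)
    # Otherwise fall back to the computed default.
    return DEFAULT_DB_PATH
```

Treating an empty `FAISS_DB_PATH` the same as an unset one is a design choice in this sketch, so that a blank entry copied from `.env.example` does not resolve to the current directory.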

Test plan

  • All 18 unit tests pass
  • All 81 related tests pass (retriever, graph, helpers, prompt, PDF, FAISS)
  • Manual: process a corrupted PDF — should log error and continue instead of crashing
  • Manual: set FAISS_DB_PATH=/custom/path — vectorstore should use that path
  • Manual: set GOOGLE_GEMINI=2.5_flash — helpers endpoint should use gemini-2.5-flash
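
The last manual check could also be automated along these lines. The sketch assumes a simple dictionary lookup; `GEMINI_MODELS`, `resolve_gemini_model`, the `2.0_flash` key, and the default are hypothetical, and only the `2.5_flash` to `gemini-2.5-flash` pair comes from the test plan above.

```python
import os

# Hypothetical mapping; the repository's actual table may hold other entries.
GEMINI_MODELS = {
    "2.0_flash": "gemini-2.0-flash",
    "2.5_flash": "gemini-2.5-flash",
}


def resolve_gemini_model(default_key: str = "2.0_flash") -> str:
    # Read the short key from the environment, falling back to an assumed default.
    key = os.environ.get("GOOGLE_GEMINI", default_key)
    try:
        return GEMINI_MODELS[key]
    except KeyError:
        # Fail loudly on an unknown key instead of silently picking a model.
        raise ValueError(f"Unsupported GOOGLE_GEMINI value: {key!r}") from None
```

Sharing one resolver between the helpers endpoint and the main endpoint would keep the two from drifting apart again.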

🤖 Generated with Claude Code

- Fix UnboundLocalError crash in process_pdf_docs when PDF is
  corrupted: return empty list on PdfStreamError instead of
  referencing unassigned `documents` variable.
- Honor FAISS_DB_PATH env var in FAISSVectorDatabase.get_db_path()
  instead of always using a hardcoded relative path.
- Add missing `context_list` field to AgentState TypedDict so
  retriever tool output is no longer silently dropped.
- Read GOOGLE_GEMINI env var in helpers.py instead of hardcoding
  gemini-2.0-flash, keeping model choice consistent with the
  main conversations endpoint.
- Fix typo "avaiable" -> "available" in the summarise prompt
  template.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

5 bugs found during code review: crash, ignored env var, missing state field, hardcoded model, typo