This project provides a minimal backend for generating a Detailed Diagnostic Report (DDR). It extracts text and images from two PDF documents (an Inspection Report and a Thermal Report) and uses an LLM to structure the summary.
- Python (>=3.8)
- FastAPI
- PyMuPDF (fitz)
- OpenAI (GPT), Gemini, or another LLM provider, configured through an API key in the environment
ddr-ai-system/
│
├── main.py
├── pdf_parser.py
├── image_extractor.py
├── ai_processor.py
├── report_generator.py
├── requirements.txt
├── README.md
└── uploads/ # stores uploaded PDFs and extracted images
- Clone or copy this repository to your local machine.
- Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
- Install dependencies:
pip install -r requirements.txt
- Set your LLM API key.
  - Option A (recommended): create a `.env` file in the project root. Example `.env`:
GEMINI_API_KEY=your_key_here
AI_PROVIDER=gemini
  - Option B (environment variables):
For OpenAI:
export OPENAI_API_KEY="your_key_here" # macOS/Linux
setx OPENAI_API_KEY "your_key_here" # Windows (requires new shell)
For Gemini:
export AI_PROVIDER=gemini
export GEMINI_API_KEY="your_key_here"
Note: the server must be started after setting env vars so they are visible to the running process.
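To illustrate how these variables might be consumed, here is a minimal sketch of a lookup helper. It is not taken from the project's code: `resolve_llm_config` and `KEY_VARS` are hypothetical names, and the default of `openai` when `AI_PROVIDER` is unset is an assumption based on the OpenAI example above not setting it. A `.env` file would need to be loaded into the environment first, which this sketch does not do.

```python
import os

# Hypothetical mapping from provider name to the variable holding its key;
# the variable names are the ones used in the setup examples.
KEY_VARS = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def resolve_llm_config(env=None):
    """Return (provider, api_key) read from environment-style variables."""
    env = os.environ if env is None else env
    # Assumption: OpenAI is the default when AI_PROVIDER is not set.
    provider = env.get("AI_PROVIDER", "openai").strip().lower()
    if provider not in KEY_VARS:
        raise ValueError(f"Unsupported AI_PROVIDER: {provider!r}")
    key_var = KEY_VARS[provider]
    api_key = env.get(key_var, "").strip()
    if not api_key:
        # Fail early with the name of the missing variable.
        raise RuntimeError(f"{key_var} is not set; set it before starting the server")
    return provider, api_key
```

Failing at startup with the name of the missing variable makes a forgotten `export` or a stale shell easier to spot than an authentication error on the first request.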
- Run the server:
uvicorn main:app --reload
- Use the endpoint:
  - POST `/generate-ddr` with form-data fields `inspection_report` and `thermal_report`, both as file uploads.
  - Returns JSON containing extracted text, image paths, and the generated DDR report.
curl -X POST "http://127.0.0.1:8000/generate-ddr" \
-H "accept: application/json" \
-H "Content-Type: multipart/form-data" \
-F "inspection_report=@inspection.pdf;type=application/pdf" \
-F "thermal_report=@thermal.pdf;type=application/pdf"

You can also launch a basic upload form and display results with Streamlit.

streamlit run ui.py --server.maxUploadSize=200

The interface allows you to choose two PDFs and shows the DDR output inline. (Note: Streamlit upload size is controlled via command line/config, not in Python code.)
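For scripted use, the `curl` request to `/generate-ddr` can also be assembled in Python. The sketch below uses only the standard library and builds the multipart request without sending it. `build_ddr_request` is a hypothetical helper, and the URL and file names simply mirror the `curl` example.

```python
import urllib.request
import uuid


def build_ddr_request(inspection_pdf: bytes, thermal_pdf: bytes,
                      url: str = "http://127.0.0.1:8000/generate-ddr"):
    """Build (without sending) a multipart POST carrying both PDFs."""
    boundary = uuid.uuid4().hex  # random separator between form parts
    parts = []
    for field, filename, data in (
        ("inspection_report", "inspection.pdf", inspection_pdf),
        ("thermal_report", "thermal.pdf", thermal_pdf),
    ):
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    # The closing delimiter marks the end of the form.
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

With the server running, passing the returned request to `urllib.request.urlopen` and decoding the response body with `json.load` should yield the same JSON as the `curl` call.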
- Image extraction uses `PyMuPDF`; if the PDFs have no images, the `images` list will be empty.
- The LLM prompt enforces structure and rules; if the model returns non-JSON, the raw output is returned under the `raw` key.
- Uploaded files and extracted images are stored in the `uploads/` directory.
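The non-JSON fallback described in the notes above can be sketched as follows. This is an illustration rather than the project's actual code: `parse_llm_output` is a hypothetical name, and treating anything other than a JSON object as unstructured output is an assumption.

```python
import json


def parse_llm_output(text: str) -> dict:
    """Return the model's JSON object, or {"raw": text} if parsing fails."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # The model replied with prose or malformed JSON: keep it verbatim.
        return {"raw": text}
    # Assumption: only a JSON object counts as a structured report.
    if not isinstance(parsed, dict):
        return {"raw": text}
    return parsed
```

A caller can then check for the `raw` key to decide whether to display the structured report or show the unparsed model output.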
Feel free to extend functionality, add authentication, or swap in a different PDF parser or LLM provider.