feat: scaffold Week 3 assignment — validated ingestion pipeline#2
Merged
Remove the task-1/ and task-2/ folder split. All pipeline files now live
at the repo root, matching the Deliverables layout in the assignment chapter
exactly. Students no longer see "task-1/" and wonder if that maps to
"Task 1" in the assignment instructions.
- Moved task-1/{models,ingest_api,ingest_files,validate,database,pipeline}.py → root
- Moved task-1/data/ → data/
- Moved task-1/output/ → output/
- Moved task-1/.env.example and requirements.txt → root
- Moved task-2/AI_DEBUG.md → root
- Updated .gitignore, devcontainer.json, .hyf/test.sh, and README accordingly
…o all Python files
- README opens with a 'Why no task folders?' explanation and a step-by-step table (Step 1 = models.py through Step 6 = pipeline.py) so students know where to start without numbered folders to lean on
- Every Python file now has a 2-3 line header comment naming the step, the task, and the role that file plays in the pipeline
- Scoring ladder rewritten as a table for scannability
- Student-redirect callout moved to the bottom (instructors read the top)
Pull request overview
This PR scaffolds the Week 3 validated ingestion pipeline assignment as a flat root-level project and updates the grader/docs to match that structure.
Changes:
- Adds starter Python modules for API ingestion, CSV ingestion, validation, SQLite storage, and orchestration.
- Adds assignment templates/data for Azure comparison and AI debugging.
- Updates the autograder, README paths, devcontainer install path, and ignore rules for the root-level layout.
Reviewed changes
Copilot reviewed 9 out of 15 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `models.py` | Adds the starter WeatherReading Pydantic model. |
| `ingest_api.py` | Adds API fetch/retry starter functions. |
| `ingest_files.py` | Adds CSV ingestion starter function. |
| `validate.py` | Adds batch validation starter function. |
| `database.py` | Adds SQLite helper starter functions. |
| `pipeline.py` | Adds the orchestration scaffold and expected pipeline steps. |
| `data/weather_stations.csv` | Adds the messy input dataset. |
| `output/azure_compare.md` | Adds the Azure comparison response template. |
| `AI_DEBUG.md` | Adds the AI debugging report template. |
| `.hyf/test.sh` | Updates grading paths and scoring checks for the flat structure. |
| `.gitignore` | Updates generated artifact ignore paths for the flat structure. |
| `README.md` | Updates assignment layout and local grader instructions. |
| `requirements.txt` | Adds Python dependencies. |
| `.env.example` | Adds environment variable guidance. |
| `.devcontainer/devcontainer.json` | Updates devcontainer dependency installation path. |
| `AZURE_LOGIN.md` | Provides Azure login guidance. |
| `.github/workflows/grade-assignment.yml` | Provides the grading workflow entry point. |
Comments suppressed due to low confidence (2)
models.py:13
- With Pydantic v2, a `@field_validator` without `mode="before"` runs after the `min_length=1` constraint. If students implement the TODO as “strip and title-case,” a whitespace-only station like `" "` can pass the length check first and then be stored as an empty string after stripping.
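The ordering issue described in this comment can be reproduced with a minimal sketch. It assumes Pydantic v2 is installed; `AfterModel`, `BeforeModel`, `clean_station`, and `accepts_blank_station` are hypothetical names for illustration and are not the starter `WeatherReading` model, whose code is not shown here.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class AfterModel(BaseModel):
    """Default mode ("after"): the validator runs after min_length=1 is checked."""

    station: str = Field(min_length=1)

    @field_validator("station")
    @classmethod
    def clean_station(cls, value: str) -> str:
        # The length check already passed on the raw value, so an empty
        # result here is stored without being re-checked.
        return value.strip().title()


class BeforeModel(BaseModel):
    """mode="before": the validator cleans the raw input before min_length=1 is checked."""

    station: str = Field(min_length=1)

    @field_validator("station", mode="before")
    @classmethod
    def clean_station(cls, value: object) -> object:
        # Raw input may not be a string yet, so only clean strings.
        return value.strip().title() if isinstance(value, str) else value


def accepts_blank_station(model_cls: type[BaseModel]) -> bool:
    """Return True if a whitespace-only station passes validation."""
    try:
        model_cls(station="   ")
    except ValidationError:
        return False
    return True
```

In this sketch `accepts_blank_station(AfterModel)` is true and the stored value is an empty string, while `BeforeModel` rejects the same input because the stripped value reaches the length check.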
pipeline.py:24
- This step tells students to validate “all records” together, but `validate_records` accepts a single `source` value for every error it returns. If API and CSV rows are combined before validation, the error report cannot accurately identify whether each failed row came from `api` or `csv`; the instructions should make it explicit to validate each source separately or attach source per record.
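One way to follow the first suggestion (validate each source separately) is sketched below. The `validate_records` body here is a hypothetical stand-in with an assumed signature and return shape, since the starter function is not shown; the sample rows are made up for illustration.

```python
from typing import Any


def validate_records(records: list[dict[str, Any]], source: str):
    """Hypothetical stand-in: a record is valid if it has a non-blank station."""
    valid, errors = [], []
    for index, record in enumerate(records):
        if str(record.get("station", "")).strip():
            valid.append(record)
        else:
            # Every error from this call gets the same source label.
            errors.append({"source": source, "row": index, "error": "station is blank"})
    return valid, errors


# Made-up sample rows, one bad record per source.
api_rows = [{"station": "De Bilt"}, {"station": ""}]
csv_rows = [{"station": "  "}, {"station": "Eelde"}]

# Validate each source on its own so every error carries the right label,
# then merge the results for storage and reporting.
valid_api, api_errors = validate_records(api_rows, source="api")
valid_csv, csv_errors = validate_records(csv_rows, source="csv")

valid_rows = valid_api + valid_csv
error_report = api_errors + csv_errors
```

Calling `validate_records(api_rows + csv_rows, source="api")` instead would label the CSV failure as `api`, which is the mislabeling the comment warns about.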
…ach run
- Parameterized query check now passes for both inline (execute('...?...')) and
multi-line/variable-assignment SQL forms: checks for '?' anywhere in database.py
AND an .execute call, rather than requiring both on the same physical line
- Remove weather.db and output/error_report.json at grader start so local reruns
cannot inflate the score with stale artifacts from a prior successful run
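The difference between the same-line check and the relaxed whole-file check can be illustrated with a short Python sketch. The real grader is a shell script whose code is not shown here, so this only mirrors the logic described in the commit message; the `readings` table, `insert_reading`, and both check functions are hypothetical.

```python
import re

# Inline form: the SQL string with '?' sits inside the execute(...) call.
INLINE_SQL = '''
def insert_reading(conn, station, temp):
    conn.execute("INSERT INTO readings (station, temp) VALUES (?, ?)", (station, temp))
'''

# Variable-assignment form: the '?' placeholders and .execute are on different lines.
MULTILINE_SQL = '''
def insert_reading(conn, station, temp):
    sql = """
        INSERT INTO readings (station, temp)
        VALUES (?, ?)
    """
    conn.execute(sql, (station, temp))
'''


def same_line_check(source: str) -> bool:
    """Stricter form: '?' must follow .execute( on the same physical line."""
    return any(re.search(r"\.execute\(.*\?", line) for line in source.splitlines())


def whole_file_check(source: str) -> bool:
    """Relaxed form: '?' anywhere in the file AND an .execute call anywhere."""
    return "?" in source and ".execute(" in source
```

Here `same_line_check` rejects the multi-line form even though it is parameterized, while `whole_file_check` accepts both forms. The relaxed check is a heuristic: a `?` in a comment or unrelated string would also satisfy it.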
Summary
- … `task-1/`/`task-2/` confusion)
- … `test.sh`) covers the full scoring ladder: 10 → 20 → 40 → 50 → 70 pts for the pipeline + 15 pts Azure + 15 pts AI debug

What's included
- `models.py`, `ingest_api.py`, `ingest_files.py`, `validate.py`, `database.py`, `pipeline.py`: student starters with `raise NotImplementedError` TODOs
- `data/weather_stations.csv`: messy dataset with 6 intentional validation failures and 4 valid rows (including a duplicate that exercises the upsert path)
- `output/azure_compare.md`: blank template (students must write >1200 chars to earn 15 pts)
- `AI_DEBUG.md`: four-section template at the root (students must write >1800 chars for full marks)
- `.hyf/test.sh`: autograder with incremental scoring, idempotency gate, and code-pattern introspection
- `.devcontainer/devcontainer.json`: Python 3.11 + Azure CLI pre-installed
- `AZURE_LOGIN.md`: Codespaces login guide for Task 7
- `.github/workflows/grade-assignment.yml`: CI that runs the grader on every PR

Test plan
- … `az login`, correctly scores 0 without credentials

🤖 Generated with Claude Code