feat: scaffold Week 3 assignment — validated ingestion pipeline#1

Merged
lassebenni merged 2 commits into main from feat/scaffold-week3-assignment on May 18, 2026

@lassebenni
Collaborator

Summary

  • Adds all starter template files for the Week 3 assignment (Build a Validated Ingestion Pipeline)
  • Replaces the placeholder task 1 files / task 2 files stubs with real skeleton code, dataset, autograder, and devcontainer
  • Follows the Week 2 template pattern (README with scoring ladder, incremental autograder, Azure CLI support)

What's included

| File | Purpose |
| --- | --- |
| README.md | Student redirect, task table, repo layout, scoring ladder, grader instructions |
| task-1/models.py | Pydantic WeatherReading skeleton with @field_validator TODO |
| task-1/ingest_api.py | fetch_with_retry + fetch_api_records skeletons |
| task-1/ingest_files.py | CSV ingestion skeleton |
| task-1/validate.py | Batch validation skeleton |
| task-1/database.py | SQLite create/upsert/query skeleton |
| task-1/pipeline.py | Orchestrator skeleton with step-by-step TODOs |
| task-1/data/weather_stations.csv | Messy 10-row dataset with intentional errors (empty station, N/A temp, humidity >100, duplicate, out-of-range temp, missing timestamp) |
| task-1/output/azure_compare.md | Template for Task 7 comparison write-up |
| task-1/requirements.txt | pydantic>=2.0, requests>=2.28 |
| task-1/.env.example | No secrets needed; documents where to add keys if students extend the project |
| task-2/AI_DEBUG.md | Template with 4 required sections for Task 8 |
| .hyf/test.sh | Full autograder: 100 pts, passing=60, incremental ladder (0→10→20→40→50→70 for pipeline, +15 Azure, +15 AI debug) |
| .devcontainer/devcontainer.json | Python 3.11 + Azure CLI Codespace |
| AZURE_LOGIN.md | Device-code login instructions for Codespaces and local dev |
| .gitignore | Python patterns + generated pipeline outputs excluded |
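To illustrate the kinds of rows the messy dataset is meant to exercise, here is a minimal, dependency-free sketch of batch validation. It is a stand-in for the Pydantic WeatherReading model rather than the scaffold's actual code: the column names, the temperature bounds, and the function names `validate_row` and `validate_batch` are all assumptions made for illustration, and the sample rows are invented, not taken from weather_stations.csv.

```python
from datetime import datetime

# Assumed bounds: the PR only says humidity > 100 and "out-of-range"
# temperatures count as errors, so these limits are illustrative.
TEMP_RANGE_C = (-90.0, 60.0)
HUMIDITY_RANGE = (0.0, 100.0)


def validate_row(row: dict) -> list:
    """Return the problems found in one CSV row (empty list means valid)."""
    problems = []
    if not (row.get("station") or "").strip():
        problems.append("empty station")
    timestamp = (row.get("timestamp") or "").strip()
    if not timestamp:
        problems.append("missing timestamp")
    else:
        try:
            datetime.fromisoformat(timestamp)
        except ValueError:
            problems.append("unparseable timestamp")
    for field, (low, high) in (("temperature", TEMP_RANGE_C),
                               ("humidity", HUMIDITY_RANGE)):
        try:
            value = float(row.get(field, ""))
        except (TypeError, ValueError):
            # Covers placeholders such as "N/A" and missing cells.
            problems.append(f"{field} is not a number")
            continue
        if not low <= value <= high:
            problems.append(f"{field} out of range")
    return problems


def validate_batch(rows):
    """Split rows into (valid, rejected), keeping the reasons for each rejection."""
    valid, rejected = [], []
    for row in rows:
        problems = validate_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            valid.append(row)
    return valid, rejected


# Hypothetical sample rows, one per error type.
good = {"station": "Copenhagen", "timestamp": "2026-05-18T12:00:00",
        "temperature": "14.5", "humidity": "60"}
sample_rows = [
    good,
    {**good, "station": ""},
    {**good, "temperature": "N/A"},
    {**good, "humidity": "140"},
    {**good, "timestamp": ""},
]
valid, rejected = validate_batch(sample_rows)
```

Keeping the rejection reasons next to each row makes it easy to log a per-row summary; a Pydantic model would report the same information through its `ValidationError`.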

Test plan

  • Run bash .hyf/test.sh locally against the template — should score 0 (all TODOs raise NotImplementedError)
  • Implement the pipeline end-to-end and verify the grader reaches 70/70 for Tasks 1-6
  • Verify Task 7 checks (azure_resource_groups.json + azure_compare.md)
  • Verify Task 8 check (AI_DEBUG.md word count threshold)
  • Confirm devcontainer.json installs dependencies correctly in a Codespace

🤖 Generated with Claude Code

Adds all template files for the Week 3 assignment:

- README.md with task table, repo layout, scoring ladder, local grader instructions
- task-1/: 6 skeleton Python modules with TODO comments (models, ingest_api,
  ingest_files, validate, database, pipeline), messy weather_stations.csv,
  requirements.txt, .env.example, and azure_compare.md template in output/
- task-2/: AI_DEBUG.md template with 4 required sections
- .hyf/test.sh: full autograder (100 pts, passing=60) with incremental
  scoring ladder for the pipeline (0→10→20→40→50→70), Task 7 Azure output
  checks (15 pts), and Task 8 AI debug report checks (15 pts)
- .devcontainer/devcontainer.json: Python 3.11 + Azure CLI Codespace
- AZURE_LOGIN.md: device-code login instructions for Codespaces and local
- .gitignore: Python patterns + generated pipeline outputs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
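The fetch_with_retry skeleton in task-1/ingest_api.py is left for students to implement. One possible shape is sketched below under stated assumptions: the signature, the injectable `sleep` parameter, and the doubling delay are illustrative choices, not the scaffold's actual interface, and the broad `except Exception` would normally be narrowed to network errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, sleeping with exponential backoff between tries.

    Hypothetical signature: the real skeleton may take a URL instead of a callable.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets a test record the delays instead of actually waiting, while the default still calls `time.sleep` directly.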

This comment was marked as outdated.

- Nest code-pattern introspection inside idempotency gate so scores
  can't jump from 40 to 70 without passing the 50-point upsert check
- Replace grep patterns that matched scaffold docstrings with patterns
  that only match actual code: execute(.*? for parameterised queries,
  time\.sleep for backoff (avoids the function name fetch_with_retry),
  ON CONFLICT check unchanged but removed from docstrings
- Remove "ON CONFLICT" and "? placeholders" from database.py docstrings
  so they no longer satisfy the code-pattern greps on the unimplemented scaffold
- Remove time.sleep hint comment from ingest_api.py for the same reason
- Initialize score.json to 0/false at script start so a set -e crash
  before the final write does not leave a stale passing score
- Raise azure_compare.md fill-in threshold from 600 to 1200 chars and
  shorten the committed template to 233 bytes so students must write
  actual prose to earn the 15-point tier
- Fix pipeline.py example summary: invalid records 8 -> 6 (the duplicate
  Copenhagen row is valid and exercises upsert, not validation); add note
  that API count varies by time of day (up to 168 hourly records)
- Move logging.basicConfig into the __main__ guard to avoid root-logger
  side effects on import
- Add missing import os to .env.example code snippet
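The idempotency gate and the ON CONFLICT and parameterised-query checks described above can be illustrated with a short sqlite3 sketch. The table name, the column names, and the choice of (station, timestamp) as the conflict key are assumptions for illustration; the actual schema in task-1/database.py is left to students. The upsert syntax requires SQLite 3.24 or newer.

```python
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: a composite primary key gives ON CONFLICT a target.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS readings (
            station     TEXT NOT NULL,
            timestamp   TEXT NOT NULL,
            temperature REAL,
            humidity    REAL,
            PRIMARY KEY (station, timestamp)
        )
        """
    )


def upsert_reading(conn: sqlite3.Connection, reading: dict) -> None:
    # "?" placeholders keep values out of the SQL string (parameterised query).
    conn.execute(
        """
        INSERT INTO readings (station, timestamp, temperature, humidity)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (station, timestamp) DO UPDATE SET
            temperature = excluded.temperature,
            humidity = excluded.humidity
        """,
        (reading["station"], reading["timestamp"],
         reading["temperature"], reading["humidity"]),
    )


conn = sqlite3.connect(":memory:")
create_table(conn)
row = {"station": "Copenhagen", "timestamp": "2026-05-18T12:00:00",
       "temperature": 14.5, "humidity": 60.0}
upsert_reading(conn, row)
# A duplicate key updates the existing row instead of adding a second one.
upsert_reading(conn, {**row, "temperature": 15.0})
count, temperature = conn.execute(
    "SELECT COUNT(*), MAX(temperature) FROM readings"
).fetchone()
print(count, temperature)  # prints: 1 15.0
```

Running the pipeline twice against such a table leaves the row count unchanged, which is the behaviour an idempotency check of this kind would look for.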

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 18 changed files in this pull request and generated no new comments.

@lassebenni lassebenni merged commit 4994270 into main May 18, 2026
4 checks passed
@lassebenni lassebenni deleted the feat/scaffold-week3-assignment branch May 18, 2026 14:51