This repo contains a small data pipeline for the Oil & Gas Asset Monitoring (OAM) system we discussed in the design exercise.
The pipeline reads satellite observations of floating-roof oil tanks, computes volumes from fill-level measurements, and produces a gap-filled daily timeseries for a client dashboard.
Time budget: ~20 minutes total.
    python -m venv .venv && source .venv/bin/activate
    pip install -r requirements.txt

Run the pipeline and the tests to check everything works:

    python pipeline/ingest.py
    pytest tests/ -v

    data/
      tanks.csv           5 monitored tanks with dimensions
      observations.csv    50 satellite observations (fill-level measurements)
    pipeline/
      ingest.py           Pipeline logic ← focus here
      models.py           Dataclasses (Tank, Observation), reference only
    tests/
      test_ingest.py      Test suite

Read through pipeline/ingest.py. This was written as a quick prototype — it works on the current data, but we'd like to harden it before deploying to production.
As you read, think about:
- Robustness — what assumptions does the code make that might break?
- Scale — we track 5 tanks today, but want to grow to 300 sites. Any concerns?
- Observability — would you be comfortable operating this in production?
Talk through your thinking as you go.
Add input validation to the pipeline. Write a function:

    def validate_observations(df: pd.DataFrame) -> pd.DataFrame:

It should:
- Check required columns: raise ValueError if any of these are missing: site_id, tank_id, observed_at, confidence_score, image_id
- Drop bad rows: remove (and log) any rows where image_id is null or empty, since those observations can't be traced back to their source image.
- Return the cleaned DataFrame.

Call it at the top of run_pipeline(), right after loading the data.
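
For orientation, here is a minimal sketch of one shape the function could take. It is not a reference solution. The logger name, the `REQUIRED_COLUMNS` constant, the warning-level log message, and the choice to treat whitespace-only `image_id` values as empty are all assumptions made for illustration; none of them come from the existing pipeline code.

```python
import logging

import pandas as pd

# Hypothetical module-level logger; the real pipeline may configure logging differently.
logger = logging.getLogger(__name__)

# Columns the spec above lists as required.
REQUIRED_COLUMNS = ["site_id", "tank_id", "observed_at", "confidence_score", "image_id"]


def validate_observations(df: pd.DataFrame) -> pd.DataFrame:
    """Check required columns and drop rows that have no usable image_id."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"observations are missing required columns: {missing}")

    # Null values are caught by isna(); empty strings by the comparison.
    # Assumption: whitespace-only values are treated as empty too.
    image_id = df["image_id"]
    bad = image_id.isna() | (image_id.astype(str).str.strip() == "")

    if bad.any():
        # Log how many rows were dropped and which ones, so they can be traced.
        logger.warning(
            "Dropping %d of %d observations with null or empty image_id (index: %s)",
            int(bad.sum()),
            len(df),
            df.index[bad].tolist(),
        )

    # Return a copy so later mutations do not touch the caller's frame.
    return df.loc[~bad].copy()
```

Inside run_pipeline(), the call would then be a single reassignment such as `df = validate_observations(df)`, placed immediately after the observations are loaded (the variable name `df` is a placeholder here).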
This is collaborative — your interviewer may suggest additional checks based on the review discussion. Feel free to add tests if time allows.
- Use any reference material you normally would (docs, search, etc.)
- There's no single right answer — working, readable, defensive code is the goal
- If you get stuck, talking through your approach is just as valuable