|
Split this work into two draft PRs so the general calibration-contract changes can be reviewed independently from the provisional OACT source package. The stacked follow-up is #670. |
|
Follow-up from the late-tail investigation:
Substantively, the new diagnostics clarify the late-year problem:
So the current read is:
I have not pushed the experimental dense approximate entropy fallback yet. The first prototype failed numerically on |
|
Late-tail update from the microsim-only investigation:
Important caveat: this is not the whole tail fix by itself. The LP approximate fallback is still overly sparse, and the deeper ESS/concentration problem remains. But this change improves the late-year feasible set with a defensible repeated-cross-section adjustment rather than another hidden tolerance bump. I have not included the standalone support-profiling script in this commit yet; it is still local-only while I decide whether it belongs in the repo. |
|
Follow-up pushed as This extends the calibration contract beyond target error / negative weights and adds explicit late-tail support-quality gates:
Those thresholds are applied in both classification and validation. On the sampled years:
That is intentional: the runner should now fail fast on a microsim support collapse instead of saving a misleading late-year artifact. Focused verification still passes: I also started a one-year |
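A minimal sketch of what a fail-fast support-quality gate could look like. It assumes the gates are based on Kish effective sample size (ESS) and top-weight concentration, which matches the ESS/concentration concern raised earlier in this thread. `SupportGate`, `SupportCollapseError`, the function names, and every threshold value are hypothetical, not the ones used in the branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupportGate:
    # Hypothetical thresholds for illustration only; not the PR's values.
    min_ess: float = 100.0
    max_top_share: float = 0.25
    top_n: int = 10


class SupportCollapseError(RuntimeError):
    """Raised when calibrated weights fail the support-quality gate."""


def effective_sample_size(weights):
    # Kish effective sample size: (sum w)^2 / sum(w^2).
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    return total * total / sum_sq if sum_sq > 0 else 0.0


def top_weight_share(weights, top_n):
    # Share of total weight carried by the top_n heaviest records.
    total = sum(weights)
    if total <= 0:
        return 0.0
    return sum(sorted(weights, reverse=True)[:top_n]) / total


def support_failures(weights, gate=SupportGate()):
    # Return human-readable failures; an empty list means the gate passes.
    failures = []
    ess = effective_sample_size(weights)
    share = top_weight_share(weights, gate.top_n)
    if ess < gate.min_ess:
        failures.append(f"ESS {ess:.1f} below minimum {gate.min_ess:.1f}")
    if share > gate.max_top_share:
        failures.append(
            f"top-{gate.top_n} weight share {share:.3f} above {gate.max_top_share:.3f}"
        )
    return failures


def require_support(weights, gate=SupportGate()):
    # Fail fast instead of saving an artifact with collapsed support.
    failures = support_failures(weights, gate)
    if failures:
        raise SupportCollapseError("; ".join(failures))
```

Using one `support_failures` function for both classification and validation keeps the two code paths from drifting apart on thresholds.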
|
Added a support-augmentation diagnostic pass in What landed:
Key result at
Interpretation:
|
|
Follow-up support-expansion result in I added a composite-household diagnostic path in
I also added a focused composite-augmentation test in What the
Interpretation:
So the next support-expansion step has to be more structural than whole-household cloning or payroll grafting. The late frontier does not appear to be missing only “older payroll intensity” in a way that can be fixed by splicing current-household components together. |
|
Added appended synthetic-sample diagnostics in What changed:
Key result at
Interpretation:
|
|
Pushed This change moves long-run TOB out of the hard calibration target bundle and into post-calibration benchmarking:
Validation:
|
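To illustrate the difference between a hard calibration target and a post-calibration benchmark, here is a rough sketch. It assumes a benchmark is just a weighted total compared against a reference value after the weights are fixed; `weighted_total`, `pct_error`, `benchmark_report`, and the `"tob"` column name are hypothetical and do not come from the branch.

```python
def weighted_total(values, weights):
    # Weighted aggregate of one per-record variable.
    return sum(v * w for v, w in zip(values, weights))


def pct_error(actual, target):
    # Signed percent deviation from the reference value.
    return 100.0 * (actual - target) / target


def benchmark_report(weights, columns, benchmarks):
    # Weights are already calibrated; benchmarks are only reported,
    # never fed back into the solver as constraints.
    return {
        name: pct_error(weighted_total(columns[name], weights), target)
        for name, target in benchmarks.items()
    }


if __name__ == "__main__":
    report = benchmark_report(
        weights=[2.0, 3.0],
        columns={"tob": [10.0, 20.0]},
        benchmarks={"tob": 100.0},
    )
    print(report)
```

The design point is that a benchmark miss shows up in the report without shrinking the feasible set of weights.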
|
Pushed
Focused verification still passes:
Empirical result: this is better than raw LP, but still not enough for publishable late-tail microsim support. Current diagnostics under the no-TOB-hard-target profile:
So this removes the pathological |
|
Pushed What changed:
Verification:
I also started a one-year late-tail smoke run with |
|
Pushed Included in this commit:
Verification:
The branch is now in a good state for another external review if we want one; the remaining risk looks primarily methodological rather than hidden harness bugs. |
|
Pushed I used it to check the publishable cutoff under the current Boundary result:
Milestone diagnostics from the same tool:
So the current evidence points to a publishable microsim horizon of through 2074, with |
|
Pushed This profile appends synthetic mixed-age households by taking an older beneficiary household and adding a younger payroll-rich donor person as a separate subunit in the same household. That changes the household age/payroll direction, unlike the earlier age-shift and payroll-graft rules. I ran:
uv run python policyengine_us_data/datasets/cps/long_term/evaluate_support_augmentation.py 2091 --profile ss-payroll --target-source trustees_2025_current_law --support-augmentation late-mixed-household-v1
Result at
So even a genuinely mixed-age household augmentation barely moves the frontier. That makes the current conclusion stronger: the late-tail issue is not just “missing older workers” or “missing older + younger co-resident households” in a simple sense. If 2100 microsim is a hard requirement, we likely need a much more radical synthetic-support generation path than support grafting onto the 2024 CPS donor geometry. |
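For readers who have not looked at the profile, the mixed-household rule described above can be sketched roughly as follows. This is an illustration under a simplified, hypothetical data model (`Person`, `Household`, `append_mixed_age_household`), not the code behind `late-mixed-household-v1`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    # Hypothetical, simplified person record.
    age: int
    payroll: float
    social_security: float
    subunit: int


@dataclass(frozen=True)
class Household:
    household_id: int
    people: tuple
    synthetic: bool = False


def append_mixed_age_household(older, donor_person, new_id):
    # Copy the donor into a fresh subunit so the original household's
    # subunit structure is left untouched.
    next_subunit = max(p.subunit for p in older.people) + 1
    grafted = replace(donor_person, subunit=next_subunit)
    return Household(new_id, older.people + (grafted,), synthetic=True)


if __name__ == "__main__":
    beneficiary = Household(1, (Person(78, 0.0, 24_000.0, 0),))
    donor = Person(34, 90_000.0, 0.0, 0)
    mixed = append_mixed_age_household(beneficiary, donor, new_id=1_000_001)
    print(len(mixed.people), mixed.people[-1].subunit, mixed.synthetic)
```

The synthetic household is appended as a new record rather than replacing the older household, so the original support stays available to the solver.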
|
Pushed
Key result at
The synthetic composition it wants is informative:
Compared with the actual 2024 support count mix, the largest gaps are:
Notably, once we drop TOB from the hard target set and just target age + SS + payroll, this minimal synthetic solution uses zero pension/dividend income and still wants an average taxable-benefits proxy share of So this doesn’t solve 2100 microsim, but it does sharpen what the synthetic support generator would need to add. |
Summary
What changed
- CalibrationProfile contracts for long-run age/SS/payroll/TOB calibration, including year-bounded approximate windows
- calibration_quality, max_constraint_pct_error, and target-source metadata
- assess_calibration_frontier.py for checking where exact nonnegative calibration remains feasible
- rebuild_calibration_manifest.py to backfill manifests/sidecars with the new contract data
- trustees_2025_current_law long-run target-source package instead of relying on an implicit legacy file path
Why
The old long-run workflow depended on implicit flag combinations, silent fallback behavior, and ambiguous target-source provenance. This PR makes the calibration contract explicit and inspectable so downstream consumers can reject mismatched artifacts instead of trusting them implicitly.
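As a sketch of how a downstream consumer might reject a mismatched artifact, consider the check below. The `calibration_quality` and `max_constraint_pct_error` fields are named in this PR; the `target_source` key, the quality labels `"exact"` and `"approximate"`, the tolerance value, and the function name `manifest_problems` are assumptions made for illustration.

```python
def manifest_problems(manifest, expected_source, allowed_qualities, max_pct_error):
    # Return a list of reasons to reject the artifact; empty means accept.
    problems = []

    source = manifest.get("target_source")  # hypothetical key name
    if source != expected_source:
        problems.append(f"target source {source!r} != {expected_source!r}")

    quality = manifest.get("calibration_quality")
    if quality not in allowed_qualities:
        problems.append(f"calibration_quality {quality!r} not allowed")

    error = manifest.get("max_constraint_pct_error")
    if error is None or error > max_pct_error:
        problems.append(f"max_constraint_pct_error {error!r} exceeds {max_pct_error}")

    return problems


if __name__ == "__main__":
    manifest = {
        "target_source": "trustees_2025_current_law",
        "calibration_quality": "approximate",  # hypothetical label
        "max_constraint_pct_error": 0.4,
    }
    print(manifest_problems(manifest, "trustees_2025_current_law", {"exact"}, 0.1))
```

A missing field is treated as a rejection rather than a pass, which is the opposite of the silent-fallback behavior described above.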
Validation
uv run pytest policyengine_us_data/tests/test_long_term_calibration_contract.py -q
python3 -m py_compile policyengine_us_data/datasets/cps/long_term/calibration.py policyengine_us_data/datasets/cps/long_term/calibration_profiles.py policyengine_us_data/datasets/cps/long_term/calibration_artifacts.py policyengine_us_data/datasets/cps/long_term/run_household_projection.py policyengine_us_data/datasets/cps/long_term/ssa_data.py policyengine_us_data/datasets/cps/long_term/rebuild_calibration_manifest.py policyengine_us_data/datasets/cps/long_term/assess_calibration_frontier.py
Follow-up
A stacked follow-up PR will add the provisional OACT target-source package and builder script on top of this contract work.