Summary
state_taxable_income in policyengine-us does not look safe as a generic cross-state public variable.
At the moment it is:
- effectively unused inside core policyengine-us
- incomplete relative to actual modeled state tax bases
- coupled to policyengine-taxsim, which maps TAXSIM v36 to state_taxable_income
- misleading for downstream benchmarking/analysis because it silently returns 0 in some states that clearly have modeled state tax base logic
This looks less like a bug in state tax calculations themselves and more like an interface/ownership problem around the umbrella variable.
PE-US findings
1. state_taxable_income is just a hand-maintained umbrella list
The variable itself is only:
policyengine_us/variables/gov/states/tax/income/state_taxable_income.py
class state_taxable_income(Variable):
    ...
    adds = "gov.states.household.state_taxable_incomes"
The real behavior comes from:
policyengine_us/parameters/gov/states/household/state_taxable_incomes.yaml
2. The umbrella list is incomplete relative to actual state variables
The clearest concrete bug is New Hampshire:
- policyengine_us/variables/gov/states/nh/tax/income/nh_taxable_income.py exists
- policyengine_us/variables/gov/states/nh/tax/income/nh_income_tax_before_refundable_credits.py directly uses nh_taxable_income
- but nh_taxable_income is omitted from state_taxable_incomes.yaml
So the generic state_taxable_income umbrella is definitely missing at least one real modeled taxable-income variable.
Pennsylvania also looks wrong/inconsistent:
- policyengine_us/variables/gov/states/pa/tax/income/taxable_income/pa_total_taxable_income.py exists
- policyengine_us/variables/gov/states/pa/tax/income/taxable_income/pa_adjusted_taxable_income.py exists
- policyengine_us/variables/gov/states/pa/tax/income/forgiveness/pa_income_tax_before_forgiveness.py uses pa_adjusted_taxable_income
- but neither PA taxable-income variable is included in state_taxable_incomes.yaml
That means the generic umbrella currently reports 0 for PA even though PE-US clearly has state taxable-income concepts and uses them in the tax path.
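A rough way to catch omissions like NH and PA is to compare taxable-income-looking variable names against the umbrella list. The sketch below is only an illustration: the regex, the function name find_unlisted, and the hard-coded inputs are assumptions, and the ca_taxable_income entry is a placeholder rather than a claim about what the YAML currently contains. In a real audit the two lists would be read from the PE-US variable tree and from state_taxable_incomes.yaml, and hits for states like MA would still need manual review.

```python
import re

# Assumed naming pattern: two-letter state prefix, optional qualifier,
# then "taxable_income". This is a heuristic, not a PE-US convention.
STATE_TAXABLE_PATTERN = re.compile(r"^[a-z]{2}_(\w+_)?taxable_income$")


def find_unlisted(all_variables, umbrella_entries):
    """Return taxable-income-like variable names absent from the umbrella list."""
    listed = set(umbrella_entries)
    return sorted(
        name
        for name in all_variables
        if STATE_TAXABLE_PATTERN.match(name) and name not in listed
    )


# Hard-coded illustrative inputs (not read from the real repository).
variables = [
    "nh_taxable_income",
    "pa_total_taxable_income",
    "pa_adjusted_taxable_income",
    "ca_taxable_income",  # placeholder for "some state that is listed"
    "wa_capital_gains_tax",
    "state_taxable_income",
]
umbrella = ["ca_taxable_income"]

unlisted = find_unlisted(variables, umbrella)
print(unlisted)
```

A check of this kind only flags candidates; whether each one belongs in the umbrella is the semantic question raised in the rest of this issue.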
3. Some omissions are probably legitimate
Massachusetts looks like an intentional omission:
- state_taxable_incomes.yaml explicitly comments that MA has multiple taxable income variables
- policyengine_us/variables/gov/states/ma/tax/income/ma_income_tax_before_credits.py taxes multiple bases (ma_part_a_taxable_dividend_income, ma_part_a_taxable_capital_gains_income, ma_part_b_taxable_income, ma_part_c_taxable_income)
Washington also seems legitimate:
- policyengine_us/variables/gov/states/wa/tax/income/wa_income_tax_before_refundable_credits.py is just wa_capital_gains_tax
- there may not be a coherent generic WA "taxable income" concept to expose
So this is not just "add every omitted state".
4. The core state tax path does not use state_taxable_income
I could not find any internal PE-US use of state_taxable_income besides its own definition.
By contrast, the actual cross-state tax aggregators are:
policyengine_us/parameters/gov/states/household/state_income_tax_before_refundable_credits.yaml
policyengine_us/parameters/gov/states/household/state_refundable_credits.yaml
and final state tax is built in:
policyengine_us/variables/household/income/household/household_state_income_tax.py
So state_taxable_income does not appear to be part of core PE-US tax computation.
policyengine-taxsim coupling
policyengine-taxsim currently depends on this variable for TAXSIM output v36:
policyengine_taxsim/config/variable_mappings.yaml
v36 -> state_taxable_income
It is also surfaced in docs/UI:
- dashboard/src/constants/index.js
- README.md
There is also an older policyengine-taxsim emulator that assumes a "{state}_taxable_income" naming convention:
policyengine-taxsim/taxsim_emulator.py
So the current situation is:
- PE-US does not depend on state_taxable_income
- TAXSIM compatibility code does
Downstream / benchmarking impact
This came up while investigating state-tax targets in PolicyBench.
Empirically, the variable is misleading in current outputs:
- in one 100-household sample, there were 23 households where state_taxable_income == 0 while state_income_tax_before_refundable_credits != 0
- every sampled PA household had state_taxable_income = 0 while pre-credit state tax was nonzero
- every sampled MA household had state_taxable_income = 0 while pre-credit state tax was nonzero
That makes state_taxable_income hard to interpret as a cross-state benchmark target.
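The check behind those counts can be sketched as follows. This is a minimal illustration, not the PolicyBench code: the row layout (one dict per household with a state_code key) and the function name are assumptions, and the sample rows are made up rather than taken from the 100-household sample.

```python
from collections import Counter


def count_zero_base_with_tax(rows):
    """Count, per state, households with a zero taxable-income umbrella
    value but nonzero pre-credit state income tax."""
    counts = Counter()
    for row in rows:
        zero_base = row["state_taxable_income"] == 0
        has_tax = row["state_income_tax_before_refundable_credits"] != 0
        if zero_base and has_tax:
            counts[row["state_code"]] += 1
    return counts


# Made-up rows for illustration only.
sample = [
    {"state_code": "PA", "state_taxable_income": 0.0,
     "state_income_tax_before_refundable_credits": 1535.0},
    {"state_code": "MA", "state_taxable_income": 0.0,
     "state_income_tax_before_refundable_credits": 2500.0},
    {"state_code": "TX", "state_taxable_income": 0.0,
     "state_income_tax_before_refundable_credits": 0.0},
    {"state_code": "NY", "state_taxable_income": 40000.0,
     "state_income_tax_before_refundable_credits": 2100.0},
]

mismatches = count_zero_base_with_tax(sample)
print(dict(mismatches))
```

Grouping by state makes it easier to separate states where the zero is an omission (NH, PA) from states where it reflects an undefined concept (MA, WA).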
Suggested decision / audit
I think this needs an explicit decision rather than a small patch.
Questions to resolve:
- Should state_taxable_income remain a public generic PE-US variable at all?
- If yes, what is its intended semantics for states with:
  - multiple taxable-income bases (MA)
  - special tax bases (WA capital gains)
  - state-specific adjusted taxable income paths (PA)
- If no, should TAXSIM v36 logic live in policyengine-taxsim instead as a dedicated compatibility adapter, e.g. taxsim_v36 / taxsim_state_taxable_income?
Strategy options
Option A: Keep state_taxable_income, but audit and define it properly
- Fix clear omissions like NH
- Decide what PA should map to
- Document which states intentionally return 0 / are undefined
- Clarify semantics for states with multiple or nonstandard tax bases
Option B: Deprecate/remove state_taxable_income from PE-US as a universal concept
- Treat it as not a real cross-state PE-US variable
- Move TAXSIM v36 semantics into policyengine-taxsim
- Keep PE-US focused on real policy concepts, not TAXSIM compatibility abstractions
Option C: Split concepts
- Keep PE-US state-specific variables only
- Add a separate adapter-level variable for TAXSIM / comparison workflows
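To make Option C more concrete, here is one possible shape for an adapter-level mapping inside policyengine-taxsim. Everything in it is hypothetical: the V36_SOURCES table, the taxsim_v36 function, and in particular the choice of pa_adjusted_taxable_income for PA, which is exactly the kind of decision this issue asks maintainers to make. The one design point it tries to show is returning None for states without a single coherent base instead of a silent 0.

```python
# Hypothetical adapter table: state code -> PE-US variables summed for v36.
# None marks states where no single taxable-income base is defined.
V36_SOURCES = {
    "NH": ["nh_taxable_income"],
    "PA": ["pa_adjusted_taxable_income"],  # assumption: PA mapping is undecided
    "MA": None,  # multiple taxable-income bases
    "WA": None,  # only a capital gains tax is modeled
}


def taxsim_v36(state, values):
    """Return the v36 value for one household, or None when the state has
    no defined base (or is not in the table at all)."""
    sources = V36_SOURCES.get(state)
    if sources is None:
        return None
    return sum(values.get(name, 0.0) for name in sources)


nh_value = taxsim_v36("NH", {"nh_taxable_income": 1000.0})
ma_value = taxsim_v36("MA", {"ma_part_b_taxable_income": 50000.0})
print(nh_value, ma_value)
```

With an explicit None, downstream comparison code can skip or flag undefined states rather than treating them as a zero tax base.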
My current lean
I lean toward B or C, not A.
Reason:
- there is already at least one objective omission bug (NH)
- but some omissions are legitimate because the concept itself is not universal
- PE-US core logic does not need this variable
- policyengine-taxsim appears to be the only meaningful consumer
So the cleanest architecture may be:
- deprecate state_taxable_income as a generic PE-US variable
- implement TAXSIM v36 semantics explicitly in policyengine-taxsim
- only keep/add PE-US public variables that correspond to actual cross-state policy concepts
If maintainers prefer, this can be split into:
- one PE-US issue for state_taxable_income
- one policyengine-taxsim issue for v36 ownership/mapping