Skip to content

Add plot_reward_shapes.py#47

Merged
rbx merged 2 commits intomasterfrom
plot-reward-shapes
Apr 7, 2026
Merged

Add plot_reward_shapes.py#47
rbx merged 2 commits intomasterfrom
plot-reward-shapes

Conversation

@rbx
Copy link
Copy Markdown
Member

@rbx rbx commented Apr 7, 2026

No description provided.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fd70d0d0-6d15-4bd9-8673-c531c3866c7a

📥 Commits

Reviewing files that changed from the base of the PR and between 08c6e1f and 8f4bc0d.

📒 Files selected for processing (1)
  • src/reward_calculation.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/reward_calculation.py

📝 Walkthrough

Walkthrough

Adds a new standalone plotting script that visualizes reward/penalty component shapes and refactors the drop-penalty computation in RewardCalculator into a bounded tanh-based helper method, updating internal calculation to use it.

Changes

Cohort / File(s) Summary
New Visualization Script
plot_reward_shapes.py
Added an executable script that builds a RewardCalculator instance (using a _MockPrices shim) and renders a 3×2 grid of plots. Six panel helpers sweep inputs (utilization, price, idle nodes, job age, dropped jobs, drop-penalty) and support CLI args -o/--output and --context-avg.
Reward Calculation Refactor
src/reward_calculation.py
Introduced RewardCalculator.DROP_PENALTY_TAU and a new RewardCalculator._penalty_drop(self, num_dropped: int) -> float implementing -tanh(num_dropped / DROP_PENALTY_TAU). Replaced inline drop-penalty logic in calculate() to call the new method; removed PENALTY_DROPPED_JOB usage/import and updated comments/docstring formatting.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • RewardCalculator #6 — Modifies and uses the same RewardCalculator class; likely related changes to penalty calculation and usages.
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the primary change: adding a new executable script for visualizing reward component shapes.
Description check ✅ Passed No description was provided by the author, but this is acceptable for straightforward changes like adding a visualization script.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
plot_reward_shapes.py (2)

165-165: Minor: Ambiguous Unicode character in docstring.

The × (multiplication sign U+00D7) may cause linting warnings. Consider using ASCII x.

Proposed fix
 def plot_reward_shapes(output: str | None = None, context_avg: float = 80.0) -> None:
-    """Generate 3×2 grid of reward component shape plots."""
+    """Generate 3x2 grid of reward component shape plots."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plot_reward_shapes.py` at line 165, The docstring containing "Generate 3×2
grid of reward component shape plots." uses the Unicode multiplication sign (×)
which can trigger lint warnings; update the docstring in the function or module
that describes the 3x2 grid (the string "Generate 3×2 grid of reward component
shape plots.") to use the ASCII character 'x' instead ("Generate 3x2 grid of
reward component shape plots.").

64-83: Consider adding strict=True to zip().

Per static analysis, zip() without strict= can silently truncate if iterables differ in length.

Proposed fix
-    for eq_nodes, color in zip(eq_nodes_list, colors):
+    for eq_nodes, color in zip(eq_nodes_list, colors, strict=True):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plot_reward_shapes.py` around lines 64 - 83, The loop in _plot_price uses
zip(eq_nodes_list, colors) which can silently truncate if lengths differ; change
the iteration to zip(eq_nodes_list, colors, strict=True) so Python raises an
error on length mismatch and prevents a hidden bug—update the call in
_plot_price where eq_nodes_list and colors are zipped (and ensure running on
Python 3.10+ or add a comment if backporting is needed).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plot_reward_shapes.py`:
- Around line 137-157: The plot references a non-existent attribute
calc.DROP_PENALTY_TAU in _plot_drop causing an AttributeError; either remove the
ax.axvline call and its label (delete the reference to calc.DROP_PENALTY_TAU) so
the plot uses the existing linear _penalty_drop implementation, or if saturation
is intended, add DROP_PENALTY_TAU to RewardCalculator and update the
_penalty_drop implementation to use a tanh-style curve that depends on
DROP_PENALTY_TAU; locate symbols _plot_drop, RewardCalculator, and _penalty_drop
to apply the chosen fix.

In `@src/reward_calculation.py`:
- Around line 318-321: The docstring describes a tanh saturation but
_penalty_drop currently returns a linear cap; implement the tanh version to
match documentation and other code (e.g., plot_reward_shapes.py) by computing a
negative-saturated tanh: return -abs(PENALTY_DROPPED_JOB) *
math.tanh(num_dropped / DROP_PENALTY_TAU) (import math if needed), ensuring the
output is bounded in [-abs(PENALTY_DROPPED_JOB), 0] (or adjust to exactly [-1,0]
if PENALTY_DROPPED_JOB is a scaling factor), and keep the docstring stating the
tanh saturation bounded in [-1, 0]; reference symbols: _penalty_drop,
PENALTY_DROPPED_JOB, DROP_PENALTY_TAU.

---

Nitpick comments:
In `@plot_reward_shapes.py`:
- Line 165: The docstring containing "Generate 3×2 grid of reward component
shape plots." uses the Unicode multiplication sign (×) which can trigger lint
warnings; update the docstring in the function or module that describes the 3x2
grid (the string "Generate 3×2 grid of reward component shape plots.") to use
the ASCII character 'x' instead ("Generate 3x2 grid of reward component shape
plots.").
- Around line 64-83: The loop in _plot_price uses zip(eq_nodes_list, colors)
which can silently truncate if lengths differ; change the iteration to
zip(eq_nodes_list, colors, strict=True) so Python raises an error on length
mismatch and prevents a hidden bug—update the call in _plot_price where
eq_nodes_list and colors are zipped (and ensure running on Python 3.10+ or add a
comment if backporting is needed).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 83f0b9d4-e93a-4c7a-a217-de23f87e481c

📥 Commits

Reviewing files that changed from the base of the PR and between 3a5fd54 and 08c6e1f.

📒 Files selected for processing (2)
  • plot_reward_shapes.py
  • src/reward_calculation.py

@rbx rbx merged commit 24dd71a into master Apr 7, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant