
feat: Add VeriHop multimodal multi-hop environment #1049

Open
nevasini1 wants to merge 1 commit into PrimeIntellect-ai:main from nevasini1:feature/verihop-multimodal-env

Conversation


nevasini1 commented Mar 21, 2026

Why this PR

Verifiers already had solid text RLVR patterns and a basic multimodal path (e.g. single-turn MMMU-style prompts), but there was no first-class example of multi-hop visual reasoning: same image, dependent questions, optional tool use, and rewards that can reflect process (per-hop answers and grounding) as well as a final verifiable outcome.

VeriHop is that reference environment. It is meant to be easy to extend (more hop types, real image sources, curriculum) and to train the kinds of habits papers like HopChain emphasize—re-grounding, dependency across steps, long CoT stability—without tying the repo to a single benchmark or a brittle port.

What you get

  • A small core helper (add_image) so environment authors can append OpenAI-style image blocks to user messages without duplicating MMMU-style base64 boilerplate.
  • A packaged environment under environments/verihop: procedural scenes, a fixed 3-hop task family (count → count → combine), two rollout modes (plain multi-turn vs tool-augmented), and a rubric that can weight final boxed answer vs per-hop behavior (including optional grounding tags).
  • Docs, tests, and pytest wiring so the env is discoverable and CI can import it like other first-party envs.
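To make the first bullet concrete, here is a minimal standalone sketch of an `add_image`-style helper (hypothetical code, not the packaged implementation; the real helper lives in `verifiers.messages` and its signature may differ):

```python
# Minimal sketch of the add_image pattern (hypothetical; the packaged
# helper in verifiers.messages may differ in signature and validation).
import base64


def add_image(messages, image_bytes, mime="image/png"):
    """Append an OpenAI-style image_url block to the last user message."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    msg = messages[-1]
    # Promote plain string content to the list-of-parts form first.
    if isinstance(msg["content"], str):
        msg["content"] = [{"type": "text", "text": msg["content"]}]
    msg["content"].append({
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{b64}"},
    })
    return messages


messages = [{"role": "user", "content": "How many red circles are in the scene?"}]
add_image(messages, b"\x89PNG...", mime="image/png")
```

The point of centralizing this is exactly the PR's stated motivation: every multimodal env otherwise re-implements the same base64 data-URL boilerplate.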

Intent for reviewers

This is intentionally a vertical slice: enough to run real rollouts and iterate, not a claim of completeness. Follow-ups could add hub listing, reference docs, more image sources, variable hop counts, or splitting the core add_image change into its own PR if you prefer a minimal merge.

Notes

Happy to adjust scope, naming, or documentation to match how you want multimodal RLVR positioned in the project.


cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 3 potential issues.


(or publish to the hub). Use `name = "verihop"` / your env id and pass `use_tools` via the
environment args your runner supports.

See `examples/train_with_prime_rl.py` for a commented template.
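The doc fragment above maps onto a runner config along these lines (a hypothetical sketch; the actual key names depend on your runner's schema, e.g. the commented template in `examples/train_with_prime_rl.py`):

```toml
# Hypothetical config sketch; key names depend on your runner's schema.
[environment]
name = "verihop"     # env id as packaged under environments/verihop

[environment.args]
use_tools = true     # select the tool-augmented rollout mode
```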

New environment not added to environments README

Medium Severity

This PR adds a new verihop environment to environments/ but does not update environments/README.md to list it. The rule requires that any PR adding or removing an environment from the environments/ folder must update environments/README.md to reflect the change, including listing it under the appropriate category/pattern section.


Triggered by project rule: BugBot Instructions

Comment thread: verifiers/__init__.py

from .types import DatasetBuilder  # noqa # isort: skip
from .parsers.parser import Parser  # noqa # isort: skip
from .rubrics.rubric import Rubric  # noqa # isort: skip
from .messages import add_image

Core add_image API missing from reference docs

Low Severity

add_image is added as a new core user-facing function exported from verifiers/__init__.py and __all__, but docs/reference.md is not updated to document it. The rule requires that PRs adding core user-facing functionality update the relevant documentation, including docs/reference.md.


Triggered by project rule: BugBot Instructions


import re

def _norm_num(s: str) -> str:
    m = re.search(r"-?\d+", s)
    return m.group(0) if m else s.strip().lower()

_norm_num string comparison fails on leading zeros

Low Severity

_norm_num extracts the first digit sequence via regex and compares as a raw string, so numerically equivalent values like "07" and "7" are treated as unequal. If a model produces \boxed{07} or <hop_answer>07</hop_answer>, both outcome_reward and process_reward would incorrectly score as 0.0. Converting through int() (e.g., str(int(m.group(0)))) would fix this.


- Add verifiers.messages.add_image for OpenAI-style image_url parts
- New environments/verihop: procedural scene synthesis, VeriHopEnv,
  VeriHopToolEnv (StatefulToolEnv + hop advancement on text turns),
  VeriHopRubric (outcome + per-hop process), PIL tool helpers
- Docs and pytest coverage; pytest pythonpath for local verihop imports

Made-with: Cursor
nevasini1 force-pushed the feature/verihop-multimodal-env branch from af6fe29 to 6e48b1b on March 21, 2026 at 15:23
