fix: flamegraph regression#14
Closed
not-matthias wants to merge 4 commits into
Closed
Conversation
Member
not-matthias
commented
May 29, 2026
- test(callgrind): C reproducer for cascading underflow obj-skip leak
- fix(callgrind): aggregate (cxt==0) and underflow leaks under a sentinel cxt
- fix(callgrind): check obj-skip on every BB entry, not only jk_Call
- fix(callgrind): drop BBCCs whose top context fn is skip-flagged
Triggers the call-stack-underflow leak channel observed in the Python case (28 underflow events / run, almost all libpython interpreter frames). Mechanism: - Lib runs recursive skipme_recurse(N) with instrumentation OFF, so callgrind never sees the calls and its csp stays at 0. - At the deepest frame (n==0), CALLGRIND_START_INSTRUMENTATION fires. - Each RET on the way back hits csp == 0, triggers handleUnderflow, resets cxt to 0, and force-pushes the fn we're returning into. - Because that fn is in the skipped lib, it leaks as a top-level fn= block in the dump — N times for an N-deep recursion. With depth=5 the diagnostic logs show 1 (cxt==0) push + 6 underflow resets (5x skipme_recurse + 1x skipme_run), and the .out has fn=skipme_run and fn=skipme_recurse as top-level blocks.
…el cxt When setup_bbcc's (cxt==0) clause or handleUnderflow would force-push a skipped fn into the current context, push a synthetic (skipped) fn instead. The skipped fn keeps its costs (routed normally through the sentinel cxt) but never surfaces as its own fn= block — the dump shows a single ob=??? fl=(callgrind-internal) fn=(skipped) block aggregating all leaked frames. The sentinel itself has skip=False so the (cxt==0 && skip) substitution doesn't recurse on it. Created lazily on first need via a singleton in fn.c, attached to the anonymous '???' obj. Verified against both C reproducers (runtime_obj_skip_c and runtime_obj_skip_underflow): no skipme_* fn= blocks appear, totals are preserved. Verified against a non-skipped-attribution test that main / do_main_work still emit normally; the sentinel only engages on the leak paths.
The obj-skip check was gated on jmpkind == jk_Call. When a function in a skipped object was entered via jk_Jump or fall-through (interpreter dispatch, tail calls, perf trampoline, JIT), the skip flag never latched and the function leaked into the dump as its own fn= block. Also instrument the cxt==0 forced push_cxt path with a diagnostic line so we can measure the residual leak when a skipped fn is forced into a top-level context after an instrumentation start or call-stack underflow.
When the (cxt == 0) clause in setup_bbcc force-pushes a skipped fn — e.g. the first BB after CALLGRIND_START_INSTRUMENTATION lives in a skipped object — the BBCC ends up with cxt->fn[0]->skip == True. Without filtering, those BBCCs emit a top-level fn= block and the skipped fn leaks into the dump. Filter them out at dump time in print_bbccs_of_thread, right before print_fn_pos would emit the ob=/fl=/fn= header. The call edges from non-skipped callers into skipped fns (cfn=) are unaffected because they're emitted from the caller's BBCC, whose context is not skipped. Also broadens the runtime_obj_skip_c post-check to grep for any fn=skipme*, since the actual leaked fn in the repro is skipme_run (the one calling START_INSTRUMENTATION), not skipme_func.
Merging this PR will improve performance by 13.8%
Performance Changes
Tip Curious why this is faster? Comment Comparing Footnotes
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.