
Added 2:4 sparsity to skip softmax method #1019

Draft

rohansjoshi wants to merge 1 commit into main from rohjoshi/sparse24-plus-skipsoftmax

Conversation

@rohansjoshi
Contributor

Summary

Adds an apply_sparse24: bool config option to the existing flash_skip_softmax method. When enabled, a 2:4 structured sparsity mask (top-2 of every 4 elements along seq_k) is AND-ed with the skip-softmax block mask in
both prefill and decode phases.

This is a pure PyTorch-level feature for research and analysis — not a performance optimization. It allows studying the interaction between block-level and 2:4 structured sparsity patterns.
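The masking described above can be sketched in plain PyTorch. This is an illustrative sketch only, not code from this PR: the function name `sparse24_mask` and the shape conventions are assumptions, but the mechanics (keep the top-2 of every group of 4 along `seq_k`, then AND with a block mask) follow the summary.

```python
import torch

def sparse24_mask(scores: torch.Tensor) -> torch.Tensor:
    """Boolean 2:4 mask over the last dim: keep the top-2 (by magnitude)
    of every group of 4 elements along seq_k. Assumes seq_k % 4 == 0."""
    *lead, seq_k = scores.shape
    groups = scores.reshape(*lead, seq_k // 4, 4)
    # Indices of the top-2 magnitudes inside each group of 4
    idx = groups.abs().topk(2, dim=-1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(-1, idx, True)
    return mask.reshape(*lead, seq_k)

# Combining with a block-level skip-softmax mask (both boolean, same shape):
#     combined = block_mask & sparse24_mask(attn_scores)
```

Exactly half the positions survive the 2:4 mask, so AND-ing it with the block mask can only tighten the overall sparsity pattern, which is what makes the interaction between the two patterns worth studying.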

Changes

  • config.py — New apply_sparse24 field on SparseAttentionAttributeConfig; new SPARSE24_SKIP_SOFTMAX and SPARSE24_SKIP_SOFTMAX_CALIB preset configs.
  • methods/flash_skip_softmax.py — Reads the flag and applies the 2:4 mask inside calc_correction_factor_and_p.
  • hf_sa.py — Exposes --sparse_attn sparse24_skip_softmax and --sparse_attn sparse24_skip_softmax_calib as CLI choices.

Signed-off-by: Rohan Joshi <rohjoshi@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Mar 11, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1ec983e1-1088-4af5-861d-c1574a8bdfe1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 34.78261% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.22%. Comparing base (fe83270) to head (6afa360).

Files with missing lines                                | Patch %  | Missing
...y/attention_sparsity/methods/flash_skip_softmax.py   | 21.05%   | 15 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1019      +/-   ##
==========================================
- Coverage   70.25%   70.22%   -0.04%     
==========================================
  Files         220      220              
  Lines       25368    25391      +23     
==========================================
+ Hits        17822    17830       +8     
- Misses       7546     7561      +15     

