Add Triton unified attention kernel with HuggingFace integration#1034
Codecov Report

❌ Patch coverage is

```diff
@@            Coverage Diff             @@
##             main    #1034      +/-   ##
==========================================
- Coverage   70.11%   70.08%   -0.03%
==========================================
  Files         221      221
  Lines       25459    25471      +12
==========================================
+ Hits        17851    17852       +1
- Misses       7608     7619      +11
==========================================
```

☔ View full report in Codecov by Sentry.
Add a Triton Flash Attention kernel that supports variable-length batching, GQA, causal/non-causal masking, and autograd-compatible forward/backward. Register it as attn_implementation="modelopt_triton" for HuggingFace models. Signed-off-by: Kai Xu <kaix@nvidia.com>
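The commit message says the kernel supports GQA and causal/non-causal masking. As a point of reference for those semantics (this is a plain PyTorch sketch, not the Triton kernel from this PR; the function name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def gqa_attention_reference(q, k, v, causal=True):
    """Reference semantics for grouped-query attention (GQA).

    q:    [batch, n_q_heads,  seq, head_dim]
    k, v: [batch, n_kv_heads, seq, head_dim], with n_q_heads % n_kv_heads == 0
    """
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_q_heads // n_kv_heads
    # Each KV head serves a contiguous group of query heads.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale
    if causal:
        seq = q.shape[-2]
        # Mask out future positions (strict upper triangle).
        mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    return torch.matmul(F.softmax(scores, dim=-1), v)
```

A fused Triton kernel computes the same result without materializing the full `scores` matrix, which is where the memory and speed benefit comes from.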
## What does this PR do?

Add a Triton Flash Attention kernel that supports variable-length batching, GQA, causal/non-causal masking, and autograd-compatible forward/backward. Register it as `attn_implementation="modelopt_triton"` for HuggingFace models.

**Type of change:** ?

## Usage

# Add a code snippet demonstrating how to use this

## Testing
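The registration described above could look roughly like the following untested sketch, assuming the `AttentionInterface.register` path available in recent `transformers` releases; the function body and model checkpoint are placeholders, and only `attn_implementation="modelopt_triton"` comes from the PR description:

```python
from transformers import AttentionInterface, AutoModelForCausalLM

def modelopt_triton_attention(module, query, key, value, attention_mask, **kwargs):
    # Placeholder: would dispatch to the Triton unified attention kernel.
    ...

# Make the kernel selectable by name on any supported HF model.
AttentionInterface.register("modelopt_triton", modelopt_triton_attention)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # illustrative checkpoint
    attn_implementation="modelopt_triton",
)
```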
## Before your PR is "Ready for review"

- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoid hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

## Additional Information