Repository Issues

xlite-dev/SageAttention

Quantized attention that achieves speedups of 2.1-3.1x over FlashAttention2 and 2.7-5.1x over xformers, without losing end-to-end metrics across various models.

Stars: 0
Forks: 0
Indexed issues: 0
Open beginner issues: 0
Latest indexed: Not indexed yet
Last GitHub push: Jan 17, 2026
License: No license data
Contributing guide: No contributing guide
Code of conduct: No code of conduct
Dominant language: Cuda
PR merge metrics: PR metrics pending
Beginner labels: No beginner labels indexed

Issues

0 open indexed issues

No open indexed issues found for this repository.