# sglang-flash-attn3

Pre-built Flash Attention 3 (forward-only) CUDA kernels from `sgl-flash-attn`.

Kernel source: kernels-community/sgl-flash-attn3
## Usage

```bash
pip install kernels
```

```python
from kernels import get_kernel

fa3 = get_kernel("sgl-project/sgl-flash-attn3", version=1)

# Variable-length (packed) attention over concatenated sequences
fa3.flash_attn_varlen_func(q, k, v, cu_seqlens_q, cu_seqlens_k, causal=True)

# Attention against a pre-populated KV cache (decoding)
fa3.flash_attn_with_kvcache(q, k_cache, v_cache, cache_seqlens=cache_seqlens, causal=True)

fa3.is_fa3_supported()  # True on H100/H200
```
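In the varlen API above, `cu_seqlens_q` and `cu_seqlens_k` are cumulative sequence-length offsets: sequences of different lengths are concatenated into one packed buffer, and each entry marks where a sequence starts. The helper below is a minimal illustrative sketch of that layout using plain Python lists, not part of this kernel's API; in practice the offsets would be `int32` CUDA tensors.

```python
def cu_seqlens(seq_lens):
    """Cumulative offsets [0, l0, l0+l1, ...] for a packed varlen batch."""
    offsets = [0]
    for length in seq_lens:
        offsets.append(offsets[-1] + length)
    return offsets

# Three sequences of lengths 3, 5, and 2 packed into one 10-token buffer:
print(cu_seqlens([3, 5, 2]))  # → [0, 3, 8, 10]
```

The resulting list has `batch_size + 1` entries, so consecutive pairs delimit each sequence's slice of the packed `q`/`k`/`v` tensors.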
## Credits

- Tri Dao - Flash Attention 3
- SGLang - `sgl_kernel` FA3 implementation
- Hugging Face - kernel-builder infrastructure
License: bsd-3-clause
## Supported hardware

- CUDA compute capability: 8.0, 9.0a
- OS: linux
- Arch: x86_64



