FlashKDA: Moonshot's CUTLASS Kernel for Kimi Linear (2026)
May 2, 2026
Moonshot has open-sourced FlashKDA, a CUTLASS-based CUDA kernel for Kimi Delta Attention (KDA). It works as a drop-in replacement for the flash-linear-attention implementation and delivers up to a 2.22x prefill speedup on NVIDIA H20 GPUs.