FlashKDA: Moonshot's CUTLASS Kernel for Kimi Linear (2026)
May 2, 2026
Moonshot has open-sourced FlashKDA, a CUTLASS-based CUDA kernel for Kimi Delta Attention (KDA). It works as a drop-in replacement for the flash-linear-attention implementation and delivers up to a 2.22x prefill speedup on NVIDIA H20 GPUs.