Flash Gated Delta Rule kernels in Pallas #1292
Draft
This draft PR adds "flash" implementations of the recurrent and chunked gated delta rules, based on the Triton kernels from https://github.com/fla-org/flash-linear-attention.
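For context, the gated delta rule these kernels implement is, per head, a rank-1 recurrent update of a `[d_k, d_v]` state with a scalar decay gate. Below is a minimal, sequential (non-flash) reference in plain JAX; the function name and the exact gate/beta conventions are illustrative assumptions, not the PR's actual code:

```python
import jax
import jax.numpy as jnp

def gated_delta_rule_reference(q, k, v, beta, g):
    """Sequential reference for a single head (illustrative only).

    q, k: [T, d_k]; v: [T, d_v]; beta: [T] update strengths; g: [T] log decay gates.
    Returns o: [T, d_v].
    """
    d_k, d_v = k.shape[-1], v.shape[-1]

    def step(S, inputs):
        q_t, k_t, v_t, b_t, g_t = inputs
        S = jnp.exp(g_t) * S                    # scalar decay gate on the [d_k, d_v] state
        v_err = v_t - k_t @ S                   # delta-rule prediction error
        S = S + jnp.outer(k_t, b_t * v_err)     # rank-1 state correction
        o_t = q_t @ S                           # read out with the query
        return S, o_t

    S0 = jnp.zeros((d_k, d_v), dtype=v.dtype)
    _, o = jax.lax.scan(step, S0, (q, k, v, beta, g))
    return o
```

The flash kernels compute the same recurrence, but tiled (recurrent variant) or chunked (chunk variant) so it maps well onto parallel hardware.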
Notable improvements are:

- a `(num_heads, num_V_tiles)` grid, with the K dim tiled internally in a loop (a rough sketch of this layout follows the list)
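A sketch of that grid layout is below. It uses a placeholder read-out computation (`o = q @ S` per head) rather than the actual gated delta rule kernel; the tile sizes and function names are assumptions, and `BlockSpec` argument order varies across JAX versions:

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

BLOCK_V = 128  # V tile owned by one program (hypothetical size)
BLOCK_K = 128  # K tile consumed per loop iteration (hypothetical size)

def _readout_kernel(q_ref, s_ref, o_ref):
    # One program handles one (head, V-tile). K is not a grid axis;
    # instead its tiles are walked with a fori_loop and accumulated.
    num_k_tiles = q_ref.shape[-1] // BLOCK_K

    def body(i, acc):
        q_blk = q_ref[:, pl.ds(i * BLOCK_K, BLOCK_K)]   # [T, BLOCK_K]
        s_blk = s_ref[pl.ds(i * BLOCK_K, BLOCK_K), :]   # [BLOCK_K, BLOCK_V]
        return acc + jnp.dot(q_blk, s_blk, preferred_element_type=jnp.float32)

    acc = jax.lax.fori_loop(0, num_k_tiles, body,
                            jnp.zeros(o_ref.shape, jnp.float32))
    o_ref[...] = acc.astype(o_ref.dtype)

def readout(q, s):
    # q: [H, T, K] queries, s: [H, K, V] per-head state -> o: [H, T, V].
    H, T, K = q.shape
    V = s.shape[-1]
    assert V % BLOCK_V == 0 and K % BLOCK_K == 0
    return pl.pallas_call(
        _readout_kernel,
        grid=(H, V // BLOCK_V),  # (num_heads, num_V_tiles)
        in_specs=[
            pl.BlockSpec((None, T, K), lambda h, v: (h, 0, 0)),        # full K per program
            pl.BlockSpec((None, K, BLOCK_V), lambda h, v: (h, 0, v)),  # one V tile
        ],
        out_specs=pl.BlockSpec((None, T, BLOCK_V), lambda h, v: (h, 0, v)),
        out_shape=jax.ShapeDtypeStruct((H, T, V), q.dtype),
        interpret=True,  # run on CPU; drop for compiled execution
    )(q, s)
```

Keeping K out of the grid avoids cross-program reductions: each program owns its output tile and accumulates over K tiles locally.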
Existing tests were parameterized with `@pytest.mark.parametrize("use_flash", [True, False])` and all pass on CPU via `pl.pallas_call(..., interpret=True)`; a sketch of that test setup follows. Additional work is in progress to 1. make the kernels work correctly on TPUs (https://docs.jax.dev/en/latest/pallas/tpu/index.html) and 2. refactor and clean up, in particular moving the flash code to a separate file.
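A rough sketch of how such a parameterized test could look; the import, function name, and signature of the entry point are hypothetical and may differ from this PR's actual code:

```python
import jax
import jax.numpy as jnp
import pytest

# Hypothetical entry point that dispatches between the flash (Pallas) path
# and the reference path; the real module path and signature may differ.
from gated_delta_rule import gated_delta_rule

@pytest.mark.parametrize("use_flash", [True, False])
def test_gated_delta_rule(use_flash):
    kq, kk, kv, kb = jax.random.split(jax.random.PRNGKey(0), 4)
    B, H, T, K, V = 2, 4, 64, 32, 32
    q = jax.random.normal(kq, (B, H, T, K))
    k = jax.random.normal(kk, (B, H, T, K))
    v = jax.random.normal(kv, (B, H, T, V))
    beta = jax.nn.sigmoid(jax.random.normal(kb, (B, H, T)))  # update strengths in (0, 1)
    g = jnp.full((B, H, T), -0.05)                            # mild log-decay gate

    out = gated_delta_rule(q, k, v, beta, g, use_flash=use_flash)
    assert out.shape == (B, H, T, V)
    assert jnp.all(jnp.isfinite(out))
```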