v0.2.13

@yongwww released this 20 Aug 23:23
· 99 commits to main since this release
0973054

What's Changed

  • test: add top_k_sampling_with_variable_k test by @JasonJ2021 in #1505
  • benchmark: add moe to benchmark by @nv-yunzheq in #1497
  • update allreduce to match trtllm by @nvjullin in #1507
  • Support cuda<12.8 built for trtllm_allreduce_fusion. by @strgrb in #1508
  • gpt-oss: Add MXFP8 x MXFP4 CUTLASS MOE for SM100 and BF16 x MXFP4 CUTLASS for SM90 + SwigluBias Activation by @djmmoss in #1396
  • tuner: Trtllm-gen Fp4 MoE Autotuner by @IwakuraRein in #1475
  • refactor fp4 masked gemm cute-dsl implementation and add manual cache by @yzh119 in #1521
  • fix: add missing 'requests' when building the package with AOT by @EmilienM in #1517
  • Fix cuda-python v13.0 import compatibility by @yongwww in #1455
  • misc: add license of spdlog for packaging by @yzh119 in #1522
  • Fix linking errors with CUDA 13 by @yongwww in #1523
  • release: bump version to v0.2.13 by @yongwww in #1524

New Contributors

Full Changelog: v0.2.12...v0.2.13