    Repositories list

    • llmq

      Public
      Quantized LLM training in pure CUDA/C++.
      C++
      1422400 Updated Dec 19, 2025
    • This repo allows you to run Platinum Bench evals via vLLM.
      Python
      0001 Updated Dec 19, 2025
    • MoE-Quant

      Public
      Code for data-aware compression of DeepSeek models
      Python
      106631 Updated Dec 11, 2025
    • QuEST

      Public
      Work in progress.
      Jupyter Notebook
      77620 Updated Nov 25, 2025
    • EvoPress

      Public
      Python
      43810 Updated Nov 22, 2025
    • Quartet

      Public
      Jupyter Notebook
      1111320 Updated Nov 18, 2025
    • FP-Quant

      Public
      Python
      138973 Updated Nov 16, 2025
    • GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.
      Python
      0300 Updated Nov 14, 2025
    • nanochat

      Public
      The best ChatGPT that $100 can buy.
      Python
      5k000 Updated Nov 12, 2025
    • CAGE

      Public
      Python
      0000 Updated Nov 11, 2025
    • qutlass

      Public
      QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
      C++
      1414920 Updated Nov 11, 2025
    • A PyTorch native platform for training generative AI models
      Python
      653000 Updated Nov 11, 2025
    • The best ChatGPT that $100 can buy.
      Python
      5k101 Updated Oct 31, 2025
    • CAGE-ao

      Public
      PyTorch native quantization and sparsity for training and inference
      Python
      389000 Updated Oct 23, 2025
    • This repository contains the code for the "Unified Scaling Laws for Compressed Representations" study.
      Python
      0000 Updated Oct 23, 2025
    • Python
      01100 Updated Oct 8, 2025
    • Efficient non-uniform quantization with GPTQ for GGUF
      Python
      45701 Updated Sep 17, 2025
    • Example of YOLOv8 pose detection (estimation) in the browser. It shows implementations powered by ONNX and TFJS, served through plain JavaScript without any frameworks, and demonstrates pose detection on images as well as on a live web camera.
      HTML
      4000 Updated Jun 13, 2025
    • Official implementation of Influence Distillation: https://www.arxiv.org/abs/2505.19051
      Python
      0310 Updated May 29, 2025
    • PanzaMail

      Public
      Python
      1929846 Updated Apr 8, 2025
    • HALO-anon

      Public
      0000 Updated Apr 1, 2025
    • torch_cgx

      Public
      PyTorch distributed backend extension with compression support
      C++
      01640 Updated Mar 24, 2025
    • gemm-int8

      Public
      High Performance Int8 GEMM Kernels for SM80 and later GPUs.
      Python
      01800 Updated Mar 11, 2025
    • DarwinLM

      Public
      Official PyTorch implementation of the paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models"
      Python
      32000 Updated Feb 21, 2025
    • Official Repository for "Scalable Mechanistic Neural Networks" (ICLR 2025)
      Python
      0200 Updated Feb 19, 2025
    • SPADE

      Public
      Code for SPADE: Sparsity Guided Debugging for Deep Neural Networks
      Jupyter Notebook
      3110 Updated Feb 18, 2025
    • HALO

      Public
      HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arxiv.org/abs/2501.02625
      Python
      02910 Updated Feb 17, 2025
    • gemm-fp8

      Public
      High Performance FP8 GEMM Kernels for SM89 and later GPUs.
      Cuda
      12000 Updated Jan 24, 2025
    • MicroAdam

      Public
      This repository contains code for the MicroAdam paper.
      Python
      42110 Updated Dec 14, 2024
    • LLM training code for Databricks foundation models
      Python
      580001 Updated Nov 27, 2024