    Repositories list

    • Python
      0003 Updated Sep 17, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      10k001 Updated Sep 17, 2025
    • LoRAFusion: Efficient LoRA Fine-Tuning for LLMs
      Python
      0100 Updated Sep 17, 2025
    • Python
      0293 Updated Sep 16, 2025
    • codex

      Public
      A comprehensive collection of integration examples for CentML. This repository serves as a resource hub for developers looking to seamlessly incorporate CentML's capabilities into their applications. Explore a variety of use cases and implementations to accelerate your integration process.
      Python
      1606 Updated Sep 13, 2025
    • A modular, extensible LLM inference benchmarking framework that supports multiple benchmarking frameworks and paradigms.
      Python
      21190 Updated Aug 27, 2025
    • Mist

      Public
      [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
      Python
      51602 Updated Aug 6, 2025
    • MDX
      0120 Updated Jul 24, 2025
    • Pull from private ECR repos... anywhere
      Go
      1111 Updated Jul 18, 2025
    • An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
      Python
      1.9k000 Updated Jun 5, 2025
    • A Kubernetes Operator to create and manage Cloudflare Tunnels and DNS records for (HTTP/TCP/UDP*) Service Resources
      Go
      52000 Updated May 3, 2025
    • aisuite

      Public
      Simple, unified interface to multiple Generative AI providers
      Python
      1.2k008 Updated Apr 29, 2025
    • Sylva

      Public
      Boost fine-tuning performance with sparse embedded adapters and hierarchical approximate second-order information.
      Python
      0202 Updated Apr 29, 2025
    • Go
      1000 Updated Apr 17, 2025
    • A simple, configurable Kubernetes sidecar injector.
      Go
      1000 Updated Apr 17, 2025
    • An Agent that reviews the papers published on a given day and picks the one most aligned with our mission.
      TypeScript
      1000 Updated Mar 14, 2025
    • Benchmarking suite for popular AI APIs
      Python
      16003 Updated Feb 12, 2025
    • 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks.
      Python
      106400 Updated Jan 21, 2025
    • Composable building blocks to build Llama Apps
      Python
      1.2k000 Updated Jan 2, 2025
    • 🔮 Execution time predictions for deep neural network training iterations across different GPUs.
      Python
      31430 Updated Dec 16, 2024
    • 🛠 VSCode plugin that provides a visual interface for CentML Tools
      TypeScript
      21520 Updated Dec 5, 2024
    • platform_docs

      Public archive
      CentML Platform Documentation
      MDX
      13000 Updated Nov 5, 2024
    • FastChat

      Public
      An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
      Python
      4.8k000 Updated Sep 17, 2024
    • Distributed ML Training and Fine-Tuning on Kubernetes
      Go
      815001 Updated Aug 22, 2024
    • Lightweight and extensible LLM Inference serving benchmark tool written in Rust.
      Rust
      0400 Updated Apr 4, 2024
    • examples

      Public
      A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
      Python
      9.7k000 Updated Feb 28, 2024
    • Jupyter Notebook
      0000 Updated Jun 14, 2023
    • Repository containing the necessary files for TMLS 2023 demo
      Python
      0000 Updated Jun 6, 2023
    • Python script to estimate GPU utilization using NVIDIA Nsight Systems
      Python
      0500 Updated Apr 27, 2023
    • cortex

      Public
      Production infrastructure for machine learning at scale
      Go
      605000 Updated Mar 31, 2023