    Repositories list

    • cccl (Public)
      CUDA Core Compute Libraries
      C++ · Updated Dec 19, 2025
    • NeMo-Agent-Toolkit (Public)
      The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
      Python · Updated Dec 19, 2025
    • The CUDA target for Numba (see the kernel sketch after this list)
      Python · Updated Dec 19, 2025
    • spark-rapids-tools (Public)
      User tools for Spark RAPIDS
      Scala · Updated Dec 19, 2025
    • cuda-python (Public)
      CUDA Python: Performance meets Productivity (see the driver-API sketch after this list)
      Cython · Updated Dec 19, 2025
    • Ongoing research training transformer models at scale
      Python · Updated Dec 19, 2025
    • nv-ingest (Public)
      NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. It uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts, and images for use in downstream generative applications.
      Python · Updated Dec 19, 2025
    • Fuser (Public)
      A fusion code generator for NVIDIA GPUs (commonly known as "nvFuser")
      C++ · Updated Dec 19, 2025
    • OSMO (Public)
      The developer-first platform for scaling complex Physical AI workloads across heterogeneous compute, unifying training GPUs, simulation clusters, and edge devices in a simple YAML spec.
      Python · Updated Dec 19, 2025
    • cuda-quantum (Public)
      C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
      C++ · Updated Dec 19, 2025
    • A unified library of SOTA model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment frameworks such as TensorRT-LLM, TensorRT, and vLLM to optimize inference speed. (See the quantization sketch after this list.)
      Python · Updated Dec 19, 2025
    • TensorRT-LLM (Public)
      TensorRT LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way. (See the LLM API sketch after this list.)
      Python · Updated Dec 19, 2025
    • BioNeMo Framework: for building and adapting AI models in drug discovery at scale
      Jupyter Notebook · Updated Dec 18, 2025
    • MatX (Public)
      An efficient C++20 GPU numerical computing library with Python-like syntax
      C++ · Updated Dec 18, 2025
    • TileGym (Public)
      Helpful kernel tutorials and examples for tile-based GPU programming
      Python · Updated Dec 18, 2025
    • nvrc (Public)
      The NVRC project provides a Rust binary that implements a simple init system for microVMs.
      Rust · Updated Dec 18, 2025
    • open-gpu-kernel-modules (Public)
      NVIDIA Linux open GPU kernel module source
      C · Updated Dec 18, 2025
    • Experimental projects related to TensorRT
      MLIR · Updated Dec 18, 2025
    • RAPIDS Accelerator JNI For Apache Spark
      Cuda · Updated Dec 18, 2025
    • aistore (Public)
      AIStore: scalable storage for AI applications
      Go · Updated Dec 18, 2025
    • cuopt (Public)
      GPU-accelerated decision optimization
      Cuda · Updated Dec 18, 2025
    • skyhook (Public)
      A Kubernetes Operator to manage Node OS customizations.
      Go · Updated Dec 18, 2025
    • NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves effective training time by minimizing downtime due to failures and interruptions.
      Python · Updated Dec 18, 2025
    • Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods
      Python · Updated Dec 18, 2025
    • NVFlare (Public)
      NVIDIA Federated Learning Application Runtime Environment
      Python · Updated Dec 18, 2025
    • A library for accelerating Transformer models on NVIDIA GPUs, including 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference. (See the FP8 sketch after this list.)
      Python · Updated Dec 18, 2025
    • linux (Public)
      OpenBMC Linux kernel source tree
      C · Updated Dec 18, 2025
    • The NVIDIA NeMo Agent Toolkit UI streamlines interacting with NeMo Agent Toolkit workflows in an easy-to-use web application.
      TypeScript · Updated Dec 18, 2025
    • DALI (Public)
      A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing, used to accelerate deep learning training and inference applications. (See the pipeline sketch after this list.)
      C++ · Updated Dec 18, 2025
    • jaxpp (Public)
      JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training
      Python · Updated Dec 18, 2025
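
Example sketches for selected repositories above

A minimal sketch of what "The CUDA target for Numba" enables: compiling a Python function into a CUDA kernel with the @cuda.jit decorator. The kernel body, array sizes, and launch configuration are illustrative, not taken from the repository.

    # Elementwise vector addition compiled as a CUDA kernel by Numba.
    import numpy as np
    from numba import cuda

    @cuda.jit
    def vector_add(a, b, out):
        i = cuda.grid(1)           # global thread index
        if i < out.size:           # guard threads past the end of the array
            out[i] = a[i] + b[i]

    n = 1 << 20
    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(a)

    threads = 256
    blocks = (n + threads - 1) // threads
    vector_add[blocks, threads](a, b, out)   # NumPy arrays are staged to/from the GPU automatically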
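
A sketch of cuda-python's low-level driver bindings: initializing the driver and counting visible devices. It assumes the legacy-style "from cuda import cuda" module path; newer releases also expose cuda.bindings, so check the repository docs for the current layout.

    # Query the device count through cuda-python's CUDA driver bindings.
    from cuda import cuda as drv   # assumption: legacy "from cuda import cuda" binding path

    (err,) = drv.cuInit(0)
    assert err == drv.CUresult.CUDA_SUCCESS

    err, count = drv.cuDeviceGetCount()
    assert err == drv.CUresult.CUDA_SUCCESS
    print(f"CUDA devices visible: {count}")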
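
A sketch of the model-optimization library's post-training quantization flow, assuming the modelopt.torch.quantization module and a config constant named INT8_DEFAULT_CFG; both names are assumptions based on the library's documented style, so treat this as illustrative.

    # Post-training INT8 quantization sketch (module and config names are assumptions).
    import torch
    import modelopt.torch.quantization as mtq

    model = torch.nn.Sequential(
        torch.nn.Linear(128, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
    )

    def forward_loop(m):
        # Calibration pass over representative inputs (random tensors here, for illustration only).
        for _ in range(8):
            m(torch.randn(4, 128))

    model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)
    # The quantized model can then be handed to a deployment framework such as TensorRT-LLM or TensorRT.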
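
A sketch of the TensorRT LLM Python API described above, assuming the high-level LLM / SamplingParams entry points; the model name is a placeholder and the exact output fields may differ between releases.

    # Offline generation sketch with the high-level TensorRT LLM Python API.
    from tensorrt_llm import LLM, SamplingParams

    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")   # placeholder model
    params = SamplingParams(max_tokens=64, temperature=0.8)

    for output in llm.generate(["What does TensorRT LLM optimize?"], params):
        print(output.outputs[0].text)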
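
A sketch of FP8 execution with the Transformer-acceleration library above, assuming the transformer_engine.pytorch module with te.Linear and fp8_autocast; the recipe settings are illustrative, and FP8 requires a supported GPU (Hopper, Ada, or Blackwell).

    # FP8 forward/backward through a Transformer Engine linear layer.
    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe

    layer = te.Linear(1024, 1024, bias=True).cuda()
    fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

    x = torch.randn(16, 1024, device="cuda")
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)            # the underlying GEMM runs in FP8 on supported GPUs
    y.sum().backward()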
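
A sketch of a DALI pipeline assembled from the library's building blocks (file reader, GPU decode, resize, normalize); the dataset directory is a placeholder.

    # Image-loading pipeline built from DALI operators.
    from nvidia.dali import pipeline_def, fn, types

    @pipeline_def(batch_size=32, num_threads=4, device_id=0)
    def image_pipeline(data_dir):
        jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True)
        images = fn.decoders.image(jpegs, device="mixed")            # decode on the GPU
        images = fn.resize(images, resize_x=224, resize_y=224)
        images = fn.crop_mirror_normalize(images, dtype=types.FLOAT, output_layout="CHW")
        return images, labels

    pipe = image_pipeline("/path/to/images")   # placeholder dataset location
    pipe.build()
    images, labels = pipe.run()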