Repositories list
642 repositories
- CUDA Core Compute Libraries
- numba-cuda (Public)
- Megatron-LM (Public): Ongoing research training transformer models at scale
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
- Fuser (Public)
- C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
- Model-Optimizer (Public): A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
- bionemo-framework (Public): BioNeMo Framework: For building and adapting AI models in drug discovery at scale
- TileGym (Public)
- TensorRT-Incubator (Public)
- spark-rapids-jni (Public)
- AIStore: scalable storage for AI applications
- cuopt (Public): GPU accelerated decision optimization
- nvidia-resiliency-ext (Public)
- physicsnemo (Public): Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods
- NVIDIA Federated Learning Application Runtime Environment
- TransformerEngine (Public): A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
- NeMo-Agent-Toolkit-UI (Public)
- DALI (Public): A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
- jaxpp (Public)