A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Repositories of milesial
Personal experimentation with Deep-Q-Networks for reinforcement learning
Graphormer is a deep learning package that allows researchers and developers to train custom models for molecule modeling tasks. It aims to accelerate research on and application of AI in molecular science, such as materials design and drug discovery.
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
AI Toolkit for Healthcare Imaging
A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
NeMo: a toolkit for conversational AI
Participation in the 2018 PLAsTiCC Astronomical Classification from Kaggle
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Offline optimization of your disaggregated Dynamo graph
AIPerf is a comprehensive benchmarking tool that measures the performance of generative AI models served by your preferred inference solution.
Cosmos-Reason2 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.
Dockerized Jupyter kernels.
A Datacenter Scale Distributed Inference Serving Framework
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
Build and train PyTorch models and connect them to the ML lifecycle using Lightning App templates, without handling DIY infrastructure, cost management, scaling, and other headaches.
Machine learning metrics for distributed, scalable PyTorch applications.