Repositories
feifeibear repositories
ADMM-NeuralNetwork was implemented by a potato
Dive into Big Model Training
Fast CUDA kernels for ResNet inference, using the Winograd algorithm to speed up convolution.
An RNN-based Chinese Poem Generator
Quantized Attention on GPU
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Colossal-AI: A Unified Deep Learning System for Big Model Era
Examples of training models with hybrid parallelism using ColossalAI
[WIP] An all-in-one inference optimization solution for ComfyUI: universal, flexible, and fast.
Test for PyTorch Async Collective Communication
A benchmark library for Colossal-AI.