58 posts in total
2024
TVM Python/C++ Interaction
Efficient Large-Scale Language Model Training on GPU Clusters
Megatron-LM
PipeFusion-Displaced Patch Pipeline Parallelism for Inference of DiT Models
xDiT Principle
Ring Attention Principle
Wafer-scale Computing Advancements, Challenges, and Future Perspectives
Outcourse Learning Materials
Open-Sora 1.2
PMPP Learning-Chapter 15 Graph traversal