The Ultra-Scale Playbook • The ultimate guide to training LLMs on large GPU clusters
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition • Paper 2305.05084 • Published May 8, 2023
Article • Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Published 14 days ago