Hey everyone! Just wanted to share this awesome dataset that features over 1 million tokens specifically for the Egyptian dialect. Check it out: HeshamHaroon/1milion_token_EGY_songs
Fav dataset
HuggingFaceFW/fineweb • Viewer • Updated Jul 11, 2025 • 52.5B • 202k • 2.66k
HeshamHaroon/ArzEn-MultiGenre • Viewer • Updated Dec 31, 2023 • 26k • 1.18k • 12
gretelai/synthetic_text_to_sql • Viewer • Updated Dec 16, 2025 • 106k • 2.27k • 628
Anthropic/persuasion • Viewer • Updated Apr 9, 2024 • 3.94k • 210 • 202
My favorite models
meta-llama/Meta-Llama-3-8B • Text Generation • 8B • Updated Sep 27, 2024 • 1.89M • 6.46k
meta-llama/Meta-Llama-3-8B-Instruct • Text Generation • 8B • Updated Jun 18, 2025 • 1.46M • 4.38k
ai21labs/Jamba-v0.1 • Text Generation • Updated Sep 11, 2024 • 915 • 1.19k
CohereLabs/c4ai-command-r-plus • Text Generation • Updated Apr 16, 2025 • 2.33k • 1.77k