EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24, 2025 • 42
Investigating Regularization of Self-Play Language Models Paper • 2404.04291 • Published Apr 4, 2024 • 1
Diaries of Open Source. Part 15 🤗

🕵️♀️ Idefics 2 is out, a multimodal open-source model with very nice capabilities
Models, demo, and datasets: HuggingFaceM4/idefics2-661d1971b7c50831dd3ce0fe
Blog: https://hf.co/blog/idefics2

💾 Snowflake released snowflake-arctic-embed, a family of powerful small embedding models
Model: Snowflake/snowflake-arctic-embed-m
Blog: https://www.snowflake.com/blog/introducing-snowflake-arctic-embed-snowflakes-state-of-the-art-text-embedding-family-of-models/

✨ Pile-T5, EleutherAI's T5 model trained on 2T tokens
Blog: https://blog.eleuther.ai/pile-t5/
Models: EleutherAI/pile-t5-65a76a0d0022dd270b385a66
GitHub: https://github.com/EleutherAI/improved-t5

🤖 CodeQwen1.5-7B base and chat models, trained on 3T tokens, with strong benchmark results for code generation, editing, and SQL
Blog post: https://qwenlm.github.io/blog/codeqwen1.5/
Demo: https://hf.co/spaces/Qwen/CodeQwen1.5-7b-Chat-demo
Models: Qwen/CodeQwen1.5-7B and Qwen/CodeQwen1.5-7B-Chat

Misc
🦉 DocOwl1.5: Unified Structure Learning for OCR-free Document Understanding mPLUG/DocOwl
👀 Cerule - a tiny Vision LM model Tensoic/Cerule-v0.1
ChemLLM - an LLM for chemistry and molecule science ⚗️ https://hf.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO
Distil Whisper Large
📝 New pdf/OCR datasets with 19 samples pixparse/pdf-document-ocr-datasets-660701430b0346f97c4bc628
🔥 Gretel AI high quality text-to-sql synthetic dataset gretelai/synthetic_text_to_sql

Diaries of Open Source. Part 14 🤗

🔥 CohereForAI releases Command R+, an open 104B model with:
- Tool usage capabilities
- Specialization in RAG
- Multilingual support
It's one of the first models to surpass GPT-4 in the lmsys arena, check it out!
Model: https://hf.co/CohereForAI/c4ai-command-r-plus
Official demo: https://hf.co/spaces/CohereForAI/c4ai-command-r-plus
Quantized: https://hf.co/CohereForAI/c4ai-command-r-plus-4bit

🎉 Google releases a new version of their Gemma instruct models, with improved quality, a nicer conversational feel, and a fancier RL algorithm. The model is similar to Llama 2 70B in the Chat Arena!
Models: google/gemma-release-65d5efbccdbb8c4202ec078b
Try it out in HuggingChat: https://hf.co/chat/models/google/gemma-1.1-7b-it

🪄 VoiceCraft, a speech editing and TTS SOTA open model
Paper: VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild (2403.16973)
Model: pyp1/VoiceCraft

💻 Google released CodeGemma, a family of code generation, completion, and chat models
Blog post: https://hf.co/blog/codegemma
Models: google/codegemma-release-66152ac7b683e2667abdee11
Report: https://storage.googleapis.com/deepmind-media/gemma/codegemma_report.pdf

Misc models:
🦖 T-Rex2, a very powerful object detection model for many applications https://github.com/IDEA-Research/T-Rex
👀 CT-RATE: A 3D dataset paired with text reports ibrahimhamamci/CT-RATE
🐙 Octopus v2: a Gemma-based model trained for Android API - extremely fast, better than Llama+RAG, great results https://hf.co/NexaAIDev/Octopus-v2

Diaries of Open Source. Part 13 🤗

🤏 Two different bitnet 1.5 open-source replications
Original paper: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (2402.17764)
1bitllm experiment: https://hf.co/blog/joey00072/experiments-with-bitnet-1-5
NousResearch experiment: NousResearch/OLMo-Bitnet-1B

🥳 Tiny and large multimodal models great for embeddings
GitHub: https://github.com/unum-cloud/uform
Encoders: https://hf.co/collections/unum-cloud/multimodal-encoders-660553903617c5297eb16838
ONNX weights: https://hf.co/collections/unum-cloud/uform-vl-english-large-onnx-66055a57c182d846f3bc1949

📜 SMPLer-X: Expressive Human Pose and Shape Estimation
Project website: https://caizhongang.com/projects/SMPLer-X/
Demo: caizhongang/SMPLer-X
Paper: SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation (2309.17448)

🧙 GeoWizard: 3D Geometry Estimation
Project website: https://fuxiao0719.github.io/projects/geowizard/
Demo: lemonaddie/geowizard

Misc models and datasets
- Dolphin-2.8-mistral-7b-v0.2 https://hf.co/cognitivecomputations/dolphin-2.8-mistral-7b-v02
- Hermes-2-Pro-11B, a self-frankenmerge 11B variant mattshumer/Hermes-2-Pro-11B
- Large conversational dataset based on Usenet data in the Italian language mii-community/UsenetArchiveIT-conversations