Open Coding Agents Specialization Collection Ai2 Open Coding Agents - Django, Sphinx, Sympy Data • 6 items • Updated Feb 11 • 4
NeMo Gym Collection Collection of RL verifiable data for NeMo Gym • 22 items • Updated 3 days ago • 49
OpenEnv Environment Hub Collection All OpenEnv-tagged environments on Hugging Face Hub • 173 items • Updated 16 days ago • 3
How2Everything Collection Artifacts related to "How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs" • 2 items • Updated Feb 9 • 1
DR Tulu Collection Models and data associated with DR Tulu, http://allenai-web/papers/drtulu • 6 items • Updated 19 days ago • 35
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 109
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report Paper • 2601.21051 • Published Jan 28 • 14
Scaling Law Discovery Collection Dataset and results for SLD (https://arxiv.org/abs/2507.21184) • 2 items • Updated Jan 8 • 2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published Jan 21 • 72
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published Dec 19, 2024 • 39
CIMemories: A Compositional Benchmark for Contextual Integrity of Persistent Memory in LLMs Paper • 2511.14937 • Published Nov 18, 2025 • 1
Foundation-Sec-8B Collection Foundation-Sec-8B models and quantizations. • 8 items • Updated Jan 28 • 6