Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
updated a model about 5 hours ago
mehuldamani/bug_fixing_sft-latent-v1
published a model about 5 hours ago
mehuldamani/bug_fixing_sft-latent-v1
updated a dataset about 7 hours ago
mehuldamani/bug-fixing-latent-demos-v1
Organizations
None yet