OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper • 2602.05400 • Published • 287
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models