Human preference data, reward model training sets, and generative reward modeling data for training Nemotron reward models.