Codebase for ICML 2025 Paper

"Reviving The Classics: Active Reward Modeling in Large Language Model Alignment"

Authors: Yunyi Shen, Hao Sun, Jean-Francois Ton. The first two authors contributed equally.

This repo is part of a series of works on reward models in RLHF:

Part I. Reward Model Foundation (ICLR 2025 Oral, Code Repo)
Part II. Active Reward Modeling (ICML 2025, this paper/repo)
Part III. Accelerating Reward Model Research with our Infra (Preprint, Code Repo)

Structure of the repo

The algorithms we tested are implemented in model. Two algorithms come from other authors: coreset (Huggins et al. 2016) in lrcoresets and BatchBALD (Kirsch et al. 2019) in batchbald_redux. We made minimal modifications to both so that they are compatible with our computation environment.

The experiment code will be released soon, once we have removed the parts that are specific to our computation environment.


