Active reward modeling with last layer Fisher Information (ICML'25)

YunyiShen/ARM-FI

Codebase for ICML 2025 Paper

"Reviving The Classics: Active Reward Modeling in Large Language Model Alignment"

Authors: Yunyi Shen*, Hao Sun*, Jean-Francois Ton. The first two authors contribute equally.

[Preprint] | [Embeddings]

We have a series of works focusing on reward models in RLHF:

Structure of the repo

The algorithms we tested are implemented in model. Two algorithms come from other authors: coreset (Huggins et al. 2016) in lrcoresets and batchBALD (Kirsch et al. 2019) in batchbald_redux. We made minimal modifications so that they are compatible with our computation environment.
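For readers unfamiliar with the idea named in the repository title, the following is a minimal sketch of last-layer Fisher information used as a selection score. It is not the code in model. It assumes a Bradley-Terry preference model whose last layer is linear in fixed embeddings, and it picks comparison pairs greedily to increase the log-determinant of the accumulated Fisher information (a D-optimal style criterion). The function names, the ridge term, and the greedy loop are illustrative assumptions.

```python
import numpy as np


def last_layer_fisher(diffs, w):
    """Fisher information of a Bradley-Terry model with a linear last layer.

    diffs: (n, d) array of embedding differences, one row per comparison pair.
    w:     (d,) current last-layer weights (hypothetical, for illustration).
    """
    p = 1.0 / (1.0 + np.exp(-(diffs @ w)))   # predicted preference probability
    s = p * (1.0 - p)                         # Bernoulli variance per pair
    return (diffs * s[:, None]).T @ diffs     # sum_i s_i * v_i v_i^T


def greedy_d_optimal(diffs, w, k, ridge=1e-3):
    """Greedily pick k pairs that increase log det of the Fisher information.

    The ridge term is an assumption that keeps the matrix invertible.
    """
    n, d = diffs.shape
    p = 1.0 / (1.0 + np.exp(-(diffs @ w)))
    s = p * (1.0 - p)
    info = ridge * np.eye(d)
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(k):
        inv = np.linalg.inv(info)
        # Matrix determinant lemma:
        # log det(A + s v v^T) - log det(A) = log(1 + s v^T A^{-1} v)
        quad = np.einsum("nd,de,ne->n", diffs, inv, diffs)
        gains = np.log1p(s * quad)
        gains[~available] = -np.inf           # never pick the same pair twice
        best = int(np.argmax(gains))
        chosen.append(best)
        available[best] = False
        info += s[best] * np.outer(diffs[best], diffs[best])
    return chosen, info
```

Calling greedy_d_optimal(diffs, w, k) returns the indices of the selected pairs together with the accumulated information matrix. Inverting the matrix at every step is acceptable for small embedding dimensions; a rank-one update of the inverse would be the natural refinement.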

The experiment code will be released soon, once we have removed the parts that are specific to our computation environment.
