The DPO debate: Do we need RL for RLHF? (Episode 2)

17:27

Direct vs. RL methods for preferences, more RLHF models, and hard truths in open RLHF work. We have more questions than answers.

Listen to Interconnects Audio using one of many popular podcasting apps or directories.

Apple Podcasts Spotify Overcast Pocket Casts YouTube