Episode 20

Interviewing Louis Castricato of Synth Labs and Eleuther AI on RLHF, Gemini Drama, DPO, founding Carper AI, preference data, reward models, and everything in between

March 4, 2024 · 01:26:28

Louis is currently founding Synth Labs, a new startup focused on synthetic data for alignment, and is a researcher at Eleuther AI. This interview should speak for itself, and it’ll need re-listens, even for me. The topics we cover touch on pretty much every major and minor issue facing model fine-tuning. Please reach out or comment if we mention a paper that I haven’t linked, and I’ll happily dig it up for you. This post is very technical. If you’re having a hard time with it, I suggest you listen to my RLHF 201 post on Latent Space first.

Full transcript available here: https://www.interconnects.ai/p/rlhf-interview-1-louis

00:00:00: Introduction
00:01:24: Gemini News and RLHF’s Part in it
00:09:05: Long Context, In-Context, and Multimodal RLHF
00:21:20: What are people missing about RLHF these days?
00:30:30: OpenAI's Influence and the Need for Alternatives
00:39:20: Synth Labs and the Future of Alignment
00:55:00: Evaluation Talk p2: Open-ended Evaluation and Data Diversity
00:59:20: Algorithm Roundup: PPO, DPO, KTO, IPO
01:18:38: CarperAI, Early Days of RLHF, Reflecting on ChatGPT


