Interviewing Louis Castricato of Synth Labs and Eleuther AI on RLHF, Gemini Drama, DPO, founding Carper AI, preference data, reward models, and everything in between (Episode 20)

Duration: 01:26:28

Louis has recently been founding Synth Labs, a new startup focused on synthetic data for alignment, and is a researcher at Eleuther AI. This interview should speak for itself, and it’ll need re-listens, even for me. The topics we cover touch on pretty much every major and minor issue facing model fine-tuning. Please reach out or comment if there’s a paper we mention that I didn’t link; I’m happy to dig it up for you. This post is very technical. If you’re having a hard time with it, I suggest you listen to my RLHF 201 post on Latent Space first.

Full transcript available here: https://www.interconnects.ai/p/rlhf-interview-1-louis
  • 00:00:00: Introduction
  • 00:01:24: Gemini News and RLHF’s Part in it
  • 00:09:05: Long Context, In-Context, and Multimodal RLHF
  • 00:21:20: What are people missing about RLHF these days?
  • 00:30:30: OpenAI's Influence and the Need for Alternatives
  • 00:39:20: Synth Labs and the Future of Alignment
  • 00:55:00: Evaluation Talk p2: Open-ended Evaluation and Data Diversity
  • 00:59:20: Algorithm Roundup: PPO, DPO, KTO, IPO
  • 01:18:38: CarperAI, Early Days of RLHF, Reflecting on ChatGPT
