I had the pleasure of talking with Ross Taylor (https://x.com/rosstaylor90), who has a wide spectrum of unique experiences in the language modeling space — evaluation experience, Galactica lead author, Llama post-training, etc. This is a really great conversation on the frontier of language model (LM) reasoning, LM deployments and demos, LMs for science, RLHF, and other topics. I've been trying to get Ross to come on for a while. He's one of those people in the LM space who doesn't speak much, but when he does, you listen.
Ross Taylor was previously an LLM lead at Meta AI, heading up the reasoning team. There he also led early work on LLM agents and was the research lead on the Galactica project. Before Meta, he was a co-founder of Papers with Code, which was acquired by Meta in 2019. Before that, he worked as a quant in sports betting and finance, and earlier still as a policy advisor for the UK Government. He is currently working on a new startup.
More details: https://www.interconnects.ai/p/interviewing-ross-taylor-on-llm-reasoning
00:00:00 Introduction of Ross Taylor and his background
00:02:12 Papers with Code
00:09:58 Galactica, goals, controversy, legacy
00:18:12 Technical details of the Galactica model
00:23:18 Potential for language models to make scientific discoveries
00:25:21 Defining and improving reasoning in language models
00:32:38 Process-based reward models and their potential applications
00:35:00 Generating synthetic data for SFT
00:40:23 Evaluating the effectiveness of language models as judges for human preference data
00:42:43 Considerations for creating base models that are easy to fine-tune
00:46:45 Balancing SFT and RLHF
00:54:13 Characteristics of successful post-training teams
00:58:26 Future directions for language model development