London, England, United Kingdom
I want to make good AI.
- Assessing candidate technical work in ML, DL, and AI safety.
- Helped participants build the skills, tools, and environment needed to upskill in ML engineering and contribute directly to AI alignment in technical roles.
- Provided hands-on support in understanding, implementing, and debugging DL code during an intensive 5-week program covering DL fundamentals, mechanistic interpretability, circuit discovery, and reinforcement learning.
Managing MATS research projects with a focus on evaluations and control research tracks (UK AISI & GDM).
Investigating machine unlearning, behaviour modelling, capability separability, and applying mechanistic interpretability methods such as SAEs for training and fine-tuning to improve safety in LLMs.
- Investigated "Attention Head Superposition" in LLMs with Chris Mathwin and Lee Sharkey.
- Proposed and implemented the gated attention block to resolve attention head superposition, with the aim of making it easier for researchers to study individual attention heads.
- Facilitated an Alignment 201 reading group for 5 MATS scholars.