Anirudhan Badrinath

Research Scientist @ Google DeepMind, prev. OpenAI

United States

Experience

Research Scientist at Google DeepMind
Jun 2026 - Present · 1 mo
Member of Technical Staff at OpenAI
May 2025 - May 2026 · 1 yr 1 mo
Worked in foundations research for search with a focus on language model post-training and architecture, with some work in data, pre-training, evaluation, and retrieval.
Machine Learning Engineer at Pinterest
Jan 2024 - May 2025 · 1 yr 5 mos
Worked on alignment of language models using reinforcement learning for downstream and auxiliary tasks (e.g., recommendations), fun exploration within generative retrieval (reinforcement and meta-learning, slate-level ranking, outcome-conditioning), and heterogeneous graphs for unified representation learning.
– Developed a unified LLM alignment approach combining reinforcement learning and direct preference optimization (DPO), with improved stability and efficiency over RL and a 10% improvement in performance over DPO [1] (accepted at TMLR)
– Implemented PinRec, an efficient outcome-conditioned, multi-token generative retrieval technique, lifting online sitewide time spent by over 0.5% and sitewide fulfilled sessions by 0.3% [2] (accepted at KDD ADS 2026)
– Trained and integrated OmniSage, multi-task node representations of heterogeneous graphs with 60B edges, leveraging improved feature-level retrieval and user sequence objectives, lifting online sitewide repins by 2.5-3% [3] (accepted at KDD ADS 2025)
[1] Anirudhan Badrinath, Prabhat Agarwal, and Jiajing Xu. Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier. arXiv preprint arXiv:2405.17956, 2024.
[2] Anirudhan Badrinath, Prabhat Agarwal, Laksh Bhasin, Jaewon Yang, Jiajing Xu, and Charles Rosenberg. PinRec: Outcome-Conditioned, Multi-Token Generative Retrieval for Industry-Scale Recommendation Systems. arXiv preprint arXiv:2504.10507, 2025.
[3] Anirudhan Badrinath, Alex Yang, Kousik Rajesh, Prabhat Agarwal, Jaewon Yang, Haoyu Chen, Jiajing Xu, and Charles Rosenberg. OmniSage: Large Scale, Multi-Entity Heterogeneous Graph Representation Learning. arXiv preprint arXiv:2504.17811, 2025.
Graduate Research Assistant at Stanford Artificial Intelligence Laboratory (SAIL)
Oct 2022 - Dec 2023 · 1 yr 3 mos
Conducted research under Profs. Emma Brunskill and Chris Piech at the Stanford Artificial Intelligence Laboratory, at the intersection of offline reinforcement learning (RL) and its applications to education.
– Developed a variant of the decision transformer using intermediate waypoint generation, outperforming state-of-the-art RL methods (IQL, CQL), often by 30-80%; presented at NeurIPS 2023 [1] and the Goal-Conditioned RL Workshop (NeurIPS 2023)
– Evaluated reinforcement learning tasks offline using an ensemble-based off-policy evaluation method, OPERA, accepted to NeurIPS 2024 [2]
– Developed an RL technique with Prof. Chris Piech for chess puzzle recommendation on chess.com, trained on 1 billion interactions; outperforms production in off-policy and LLM evaluation, accepted at RLC 2026 and the Reinforcement Learning Journal [3]
– Developed a preliminary system leveraging assessments of cognitive mastery via knowledge tracing, educational indicators, and LLMs to assist Carnegie Learning tutors in prioritizing student support
[1] Anirudhan Badrinath, Yannis Flet-Berliac, Allen Nie, and Emma Brunskill. Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NeurIPS ’23, Red Hook, NY, USA, 2024. Curran Associates Inc.
[2] Allen Nie, Yash Chandak, Christina Yuan, Anirudhan Badrinath, Yannis Flet-Berliac, and Emma Brunskill. OPERA: Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators. In Proceedings of the 38th International Conference on Neural Information Processing Systems, NeurIPS ’24, 2024.
[3] Anirudhan Badrinath*, Allen Nie*, Nicholas Tomlin, Timothy Dai, Carissa Yip, Rose E Wang, Emma Brunskill, and Christopher J Piech. Discovering High-Quality Chess Puzzles Through One Billion Plays with Offline Reinforcement Learning. In Reinforcement Learning Conference, RLC ’26, 2026. To be published in Reinforcement Learning Journal.
Software Engineer Intern at Amazon Web Services (AWS)
May 2022 - Aug 2022 · 4 mos
Implemented an end-to-end SSL/TLS certificate tracking framework for EC2 instances, with fully integrated support in the AWS Console; migrated AWS OpsInsights to an Athena backend, reducing cost by more than 50x.