Shivanshu Verma

Software Engineer at Google | IIT Delhi

San Francisco Bay Area

About

Passionate about AI and machine learning, I specialize in developing and optimizing large language models (LLMs) and reinforcement learning (RL) systems. My work includes developing the Triple Preferences Optimization (TPO) method, which outperforms state-of-the-art techniques for aligning LLMs with human feedback (DPO by 4% and SFT by 4.7%). I have extensive experience training, fine-tuning, and evaluating LLMs and other deep learning models, and I have improved their logical reasoning capabilities using reinforcement learning. I have also developed novel RL approaches that improve sample efficiency by 20% over existing state-of-the-art methods. My research spans comprehensive evaluations of alignment methods and has significantly boosted LLM performance across multiple benchmarks. I am committed to pushing the boundaries of AI, making it more efficient, scalable, and interpretable.

Experience

  • Software Engineer at Google
    Nov 2024 - Present · 1 yr 8 mos

  • Generative AI Engineer II at American Express
    Aug 2024 - Nov 2024 · 4 mos

  • Arizona State University (Tempe, Arizona, United States · On-site)
    • Visiting Researcher
      May 2024 - Aug 2024 · 4 mos

    • Graduate Services Assistant
      Aug 2023 - May 2024 · 10 mos

      - Innovated Triple Preferences Optimization (TPO), surpassing SOTA methods in aligning LLMs with human feedback (DPO by 4% and SFT by 4.7%).
      - Extensive experience in training, fine-tuning, and evaluating LLMs and other deep learning models, improving logical reasoning with reinforcement learning.
      - Developed novel RL approaches, enhancing sample efficiency by 20% over state-of-the-art methods.

  • Software Developer - Technology & Innovation at Standard Chartered Bank
    Aug 2020 - Jul 2022 · 2 yrs

    ● Spearheaded end-to-end design & development of a trade processing application, leveraging Spring Cloud architecture with Apache Camel and Kafka integration for data processing & inter-microservice communication.
    ● Optimized data storage by using HBase for dynamic data and Oracle DB for static data.
    ● Created user-friendly interfaces using ReactJS. Ensured high code quality through comprehensive unit testing, used JIRA and Confluence to track project progress, and used Git for version control.
    ● Scaled the application to achieve a 10-fold increase in the number of trades processed per day, reaching around 2 million trades.

  • Teaching Assistant at Indian Institute of Technology, Delhi
    Jul 2019 - Jun 2020 · 1 yr