Siddharth Sharma

Machine Learning at Meta

Sunnyvale, California, United States

About

Machine Learning Engineer working on Natural Language Understanding, with 18 months of internship experience in Machine Learning and Data Science. My MSE-CS at JHU concentrated on Machine Learning.

Experience

  • Machine Learning Engineer at Meta
    May 2024 - Present · 2 yrs 2 mos

    Ranking Models and AI-based Prevalence Measurement

  • Software Engineer Machine Learning at Google
    Dec 2021 - Mar 2024 · 2 yrs 4 mos

    YouTube Ads ML - Retrieval models and measurement

  • Machine Learning Engineer at Ushur, inc
    Sep 2018 - Nov 2021 · 3 yrs 3 mos

    Worked on Natural Language Understanding.

  • Graduate Research Assistant at The Johns Hopkins University
    Feb 2018 - Aug 2018 · 7 mos

    Simulated High-Performance Computing (HPC) systems with natural and artificially injected faults using the Structural Simulation Toolkit (SST). Created a Python framework for node-based and task-based reliability analysis of logs generated by the simulated HPC systems; the analysis is independent of network structure. Built a Support Vector Machine (SVM) classifier to identify artificially injected faults. Used Weibull and log-normal lifetime models to parameterize the reliability curves.

  • Applied Scientist Intern at Amazon Lab126
    Sep 2017 - Jan 2018 · 5 mos

    Simulated human annotators using Bayesian modeling to create synthetic annotated data for a Speaker Identification (SID) system. Used Unsupervised Label Refinement (ULR) methods (such as Dawid-Skene) and showed that they work better than Majority Voting for SID annotation. Evaluated human annotators' False Acceptance Rate (FAR) and False Recognition Rate (FRR) for speaker identification by creating ground truth data of varying difficulty. Showed that the current annotation process was unacceptable even when ULR was applied to labels from multiple annotators to reduce errors. Showed that the metadata of the test and enrolled utterances did not carry enough signal to judge annotation difficulty. Created training and testing data to evaluate a domain classifier.