San Francisco Bay Area
I am a software engineer with experience leading teams in building ML platforms and infrastructure.
Enabling accelerated data ingestion and processing globally
• Lead for the Ads ML team, overseeing several projects, including development of the deep learning conversion model, delay modeling, and bidder service decomposition.
• Started new processes to regularly review performance metrics and foster cross-pollination among team members.
• Consulted across teams to ensure the ML platform architecture met the needs of Ads ML.
• Drove initiatives to level up operational excellence through improved testing and service resilience.
• As the ML Training Platform Lead, owned several core projects, including Auto-Retraining Platform, Model Pipeline Runner, Text Processing, Rescoring Pipeline, and Golden Labels.
• Defined objectives and roadmaps, and regularly interfaced across teams and orgs.
• Drove Auto-Retraining Platform to regularly retrain and deploy core models.
• Architected and built the Offline Label Generation Pipeline, enabling a new paradigm for gathering training and evaluation data.
• Led the company-wide migration of offline data and model training pipelines to Python 3.8.
• Working with Engineering Effectiveness, drove improvements in engineering excellence through code ownership, testing quality, and static type checking.
• Technologies: Python, Airflow, TensorFlow, scikit-learn, Databricks, Spark, Redis, AWS
• Technical lead for the ML Pipelines team on Cortex Platform.
• Defined quarterly objectives and roadmaps, and interfaced across teams and orgs to ensure alignment.
• Led the evaluation of industry pipeline solutions and the eventual adoption of TFX for defining end-to-end ML pipelines.
• Addressed platform-wide systemic issues, such as user experience across products.
• Member of various working groups on ML Ops, company culture, communications and process, formalizing technical leadership, data formats, and data lineage.
• Technologies: Python, Scala, TensorFlow Extended (TFX), Airflow, Google Cloud Platform (GCP), BigQuery, Dataflow
· Tech lead for the ML Pipelines team
· Led design and implementation of the hyperparameter tuning platform
· Led the model tuning effort: hyperparameter tuning with Bayesian optimization for both ad-hoc experimentation and production pipelines.
· Led the integration between our machine learning workflows system and deep learning platform.
· Built integrations with continuous integration, Aurora, and state management.
· Constructed prototypes of machine learning workflows for ads-targeting and timelines.
· Supported ads, timelines, health, and recommendations teams with automation of machine learning pipelines and hyperparameter tuning.
· Taught courses on developing machine learning workflows.
· Realized 4x iteration speed improvement through optimization efforts.
· Owner, modeler, and builder of scalable production machine learning systems for data integrity and named entity disambiguation, simultaneously improving recall, precision, and system throughput. (Kafka Streams, Scala, Weka, Elasticsearch)
· Engineered features using web scraping, NLP, and graph analysis. (Scala, Ruby, Postgres)
· Tuned Elasticsearch, balancing string similarity and graph metrics, for front-facing autocomplete search and candidate generation, leading to better user experience and higher precision/recall in ML models. (Elasticsearch, Scala)
· Designed the framework for scalable node and edge validations (schema and business logic) of graph data structures. (REST API, Kafka, Scala, Postgres, Swagger)
· Built pipelines for ingesting, disambiguating, and integrating external data that was often noisy, incomplete, and served by inconsistent APIs. (Python, Kafka, Elasticsearch, Postgres)