South San Francisco, California, United States
• Served as a Senior Software Engineer on the Search Replication and Routing team at MongoDB.
• Developed Service Level Objectives (SLOs) to monitor the health of the search system.
• Automated system processes to manage resource usage and failures, ensuring high availability for customers.
• Synchronized search operations to prevent race conditions and automated freezing of high IO operations.
Scaled Slicer, Google’s sharding system, to support multi-location isolation for 40+ internal customers, improving system reliability and fault tolerance. Redesigned testing infrastructure to simulate multi-location scaling scenarios, improving deployment confidence for Slicer’s new multi-location isolation feature. Automated deployment of Slicer’s admin controller, saving about an hour per week for deployment captains and reducing manual deployment errors.
Designed automated integration tests for the production deployment process. Standardized data processing logic, improving data consistency across 4+ customer pipelines.
Managed 5 backend engineers, leading successful scaling of customer prediction pipelines.
Scaled data processing pipeline, allowing Afresh to aggregate 1 GB of new data on top of historical data used for customer prediction pipelines.
Optimized AMBROSIA, a fault-tolerance system for failure-oblivious code, to improve performance for distributed failure cases. Co-designed PRISM interface to add 4 RDMA primitives, enabling higher throughput and lower latency in RDMA distributed systems.
Developed Ripple, a serverless parallelization framework that improved application performance by up to 80x compared to IaaS/PaaS clouds at similar cost.