Boston, Massachusetts, United States
I’m a Senior Data Engineer with 7+ years of experience building scalable, reliable data systems that drive measurable business impact. At Wayfair, I built and scaled production-grade pipelines supporting homepage, search, and product page experiences, contributing to $100M+ in annual revenue impact. I led logging architecture redesigns that reduced infrastructure costs by $300K annually, improved SLA adherence from 60% to 100%, and developed analytics frameworks used widely across engineering, product, and data science.

Currently at Yahoo, I’m focused on modernizing data infrastructure, leading the on-prem to GCP migration and GA4 adoption initiatives. My work centers on transitioning legacy systems to cloud-native architectures while maintaining data integrity, reliability, and analytics continuity.

Core areas of focus:
• Data architecture & scalable ETL design
• Cloud migration & GCP modernization
• BigQuery & distributed data systems
• Analytics engineering (LookML, KPI standardization, GA4)
• Data quality, observability & cost optimization
• Experimentation & metrics infrastructure

I enjoy operating at the intersection of engineering and analytics, building platforms that empower teams to move faster with trusted data. I’m also open to select project-based advisory engagements around cloud migration, data architecture, pipeline optimization, and analytics enablement.
• Partnered with business stakeholders to identify top AI opportunities and transformed large volumes of data into AI-driven solutions using natural language processing (NLP) techniques, delivering a project that drove significant value for Fidelity
• Developed a supervised deep learning model that outperformed state-of-the-art methods, using Keras (Python) on GPU
• Extracted 50 million rows of data from Hive, then preprocessed and analyzed the data with PySpark
• Identified business insights by analyzing text data with keyword extraction (TextRank and RAKE)
• Communicated effectively with my manager and teammates and collaborated via GitLab
• Collaborated with Xi Liu, Yalei Peng, and Congyang Wang on applying machine learning to the building Internet of Things (IoT).
• Predicted unreservable space usage status (occupied or unoccupied) with classification models such as logistic regression, decision tree, random forest, and support vector machine.
• Improved the accuracy of predictive models by around 20% using ensemble models.
• Forecasted unreservable space usage duration using a long short-term memory (LSTM) network, which improved accuracy by 35% over time series models.
• Worked with Prof. Renata Konrad, Prof. Andrew C. Trapp, and Kayse Lee Maass on using data science to fight human trafficking.
• Built a prioritization framework to categorize states based on the prevalence of human trafficking, the legislative environment regarding human trafficking, and the current number of dedicated human trafficking shelters per million residents within the state.
• Visualized human trafficking statistics with Python (Plotly and Matplotlib) and R.