Los Angeles, California, United States
PhD candidate in CS @ USC focused on scalable machine learning and graph representation learning, with experience shipping data-driven systems using Python, Spark, and Docker. First-author work at SIGSPATIAL, ICDE, PAKDD, and IEEE BigData (Best Student Paper Runner-Up). Incoming SWE intern at Google.
Software Engineering PhD Intern within the Platforms Infrastructure Engineering (PIE) AXIS group. Working with the Server RAS (Reliability, Availability, and Serviceability) team on data-driven solutions to optimize fault management and ensure fleet-wide server infrastructure reliability at scale.
Currently a PhD candidate in Computer Science at the University of Southern California, advised by Prof. Cyrus Shahabi. I build scalable machine learning systems for spatiotemporal and time-series data, with a focus on graph representation learning and deployment-ready pipelines.

Project highlights:

Wearables for Health (W4H) Toolkit (https://github.com/USC-InfoLab/w4h-integrated-toolkit):
- Led an open-source framework for real-time and offline wearable analytics (Garmin, Apple Watch, Fitbit).
- Built a modular architecture separating data engineering, analysis, and visualization.
- Implemented streaming and batch pipelines with Python, Flask, Spark, Kafka, and Streamlit.
- Shipped via Docker and deployed on USC clusters for reproducible research workflows.

WaveGNN (https://github.com/USC-InfoLab/WaveGNN):
- Developed a Transformer+GNN model for irregularly sampled clinical time series, modeling decay-aware dependencies and missingness. Implemented in PyTorch.
- Achieved SOTA on P19, P12, PAM, and MIMIC-III with up to 12% relative F1 improvement.
- Best Paper Runner-Up, IEEE BigData 2025.

DeepStateGNN for Scalable Traffic Forecasting:
- Proposed a scalable traffic forecasting model that replaces large sensor graphs with a compact latent-state graph, enabling inference for arbitrary and unseen sensor configurations and improving scalability and generalization.

NeuroGNN (https://github.com/USC-InfoLab/NeuroGNN):
- Built a GNN-based EEG seizure detection/classification model in PyTorch Geometric with dynamic brain connectivity modeling.
- Open-sourced (45+ GitHub stars).
- Explored pretrained LLMs to enhance representations of brain dependencies.

BysGNN (https://github.com/USC-InfoLab/busyness-graph-neural-network):
- Modeled POI visit prediction as multivariate forecasting with multi-context correlations (spatial, temporal, semantic, taxonomic).
- Used pretrained LLMs to enrich semantic signals and improve accuracy over strong baselines.
- Designed and deployed a machine learning–based domain scoring system that optimized website selection for digital ad campaigns in privacy-constrained environments, boosting ad-conversion performance by up to 30%.
- Built contextual and behavioral embeddings using Spark, Scala, ALS, XGBoost, and neural networks, enabling accurate predictions for new or low-data domains at scale.
- Developed an adaptive meta-modeling framework that combined historical performance with model-predicted signals, improving robustness and reducing manual tuning efforts.
Responsibilities include:
• Engineered automated business processes using Python, Flask, and Selenium, enhancing efficiency.
• Conducted unit and integration testing using pytest and unittest libraries in Python.
• Developed interactive web dashboards using HTML, CSS, and JavaScript for enhanced user experience.