Arash Hajisafi

Google SWE Intern | CS PhD Candidate @ USC | ML, Time-Series & Graph Neural Networks

Los Angeles, California, United States

About

PhD candidate in CS @ USC focused on scalable machine learning and graph representation learning, with experience shipping data-driven systems using Python, Spark, and Docker. First-author work at SIGSPATIAL, ICDE, PAKDD, and IEEE BigData (Best Student Paper Runner-Up). Currently a SWE intern at Google.

Experience

  • Software Engineering PhD Intern at Google
    May 2026 - Present · 2 mos

    Software Engineering PhD Intern within the Platforms Infrastructure Engineering (PIE) AXIS group. Working with the Server RAS (Reliability, Availability, and Serviceability) team on data-driven solutions to optimize fault management and ensure fleet-wide server infrastructure reliability at scale.

  • PhD Research Assistant at University of Southern California
    Jan 2022 - Present · 4 yrs 6 mos

    Currently a PhD candidate in Computer Science at the University of Southern California, advised by Prof. Cyrus Shahabi. I build scalable machine learning systems for spatiotemporal and time-series data, with a focus on graph representation learning and deployment-ready pipelines.

    Project highlights:

    Wearables for Health (W4H) Toolkit (https://github.com/USC-InfoLab/w4h-integrated-toolkit):
    - Led an open-source framework for real-time and offline wearable analytics (Garmin, Apple Watch, Fitbit).
    - Built a modular architecture separating data engineering, analysis, and visualization.
    - Implemented streaming and batch pipelines with Python, Flask, Spark, Kafka, and Streamlit.
    - Shipped via Docker and deployed on USC clusters for reproducible research workflows.

    WaveGNN (https://github.com/USC-InfoLab/WaveGNN):
    - Developed a Transformer+GNN model for irregularly sampled clinical time series, modeling decay-aware dependencies and missingness. Implemented in PyTorch.
    - Achieved SOTA on P19, P12, PAM, and MIMIC-III with up to 12% relative F1 improvement.
    - Best Paper Runner-Up, IEEE BigData 2025.

    DeepStateGNN for Scalable Traffic Forecasting:
    - Proposed a scalable traffic forecasting model that replaces large sensor graphs with a compact latent-state graph, enabling inference for arbitrary and unseen sensor configurations and improving scalability and generalization.

    NeuroGNN (https://github.com/USC-InfoLab/NeuroGNN):
    - Built a GNN-based EEG seizure detection/classification model in PyTorch Geometric with dynamic brain connectivity modeling.
    - Open-sourced (45+ GitHub stars).
    - Explored pretrained LLMs to enhance representations of brain dependencies.

    BysGNN (https://github.com/USC-InfoLab/busyness-graph-neural-network):
    - Modeled POI visit prediction as multivariate forecasting with multi-context correlations (spatial, temporal, semantic, taxonomic).
    - Used pretrained LLMs to enrich semantic signals and improve accuracy over strong baselines.

  • PhD Data Science Intern, Decision Sciences at Epsilon
    May 2025 - Aug 2025 · 4 mos

    - Designed and deployed a machine learning–based domain scoring system that optimized website selection for digital ad campaigns in privacy-constrained environments, boosting ad-conversion performance by up to 30%.
    - Built contextual and behavioral embeddings using Spark, Scala, ALS, XGBoost, and neural networks, enabling accurate predictions for new or low-data domains at scale.
    - Developed an adaptive meta-modeling framework that combined historical performance with model-predicted signals, improving robustness and reducing manual tuning efforts.

  • Software R&D Intern at Gam Electronics Co.
    Jul 2020 - Sep 2020 · 3 mos

    Responsibilities included:
    • Engineered automated business processes using Python, Flask, and Selenium, enhancing efficiency.
    • Conducted unit and integration testing using the pytest and unittest libraries in Python.
    • Developed interactive web dashboards using HTML, CSS, and JavaScript for enhanced user experience.