Haoran (Matt) Wan

Data Scientist | Incoming 5th-year Grad Student | Python, R, SQL | Machine Learning & Predictive Modeling

St. Louis, Missouri, United States

About

As a data scientist with over five years of hands-on experience, I translate complex data into actionable strategy. I specialize in building predictive models and robust experimentation frameworks to understand user behavior and drive measurable business outcomes. I apply advanced statistical methods, including Bayesian modeling, mixed-effects analysis, and causal inference, to solve challenging problems in forecasting, customer segmentation, and risk assessment. I lead end-to-end data projects, from initial design and data wrangling to model deployment and clear communication of results to executive stakeholders. My work ranges from developing forecasting pipelines for clinical trials to building segmentation models that identify new revenue opportunities. My technical toolkit includes Python (pandas, scikit-learn), R (tidyverse, Stan, Shiny), and SQL. I enjoy partnering with cross-functional teams to build impactful data products, test new ideas, and inform key strategic decisions.

Experience

Data Scientist at Swipesum
May 2025 - Present · 1 yr 2 mos
Washington University in St. Louis (St. Louis, Missouri, United States · On-site)
- Graduate Student Researcher
  Aug 2021 - Present · 4 yrs 11 mos
  Led end-to-end research projects to model and predict human decision-making, managing the full project lifecycle from experimental design (A/B testing) to the collection and analysis of large-scale behavioral data. Developed and implemented advanced predictive models in R and Stan, using Bayesian and mixed-effects methods to quantify behavioral patterns. Engineered robust data pipelines for data wrangling, validation, and quality control, ensuring high data integrity and reproducibility across all projects. Translated complex statistical findings into actionable insights for technical and non-technical audiences through reports, presentations, and dashboards.
- Lecturer & Teaching Assistant, Quantitative Methods
  Sep 2022 - May 2025 · 2 yrs 9 mos
  Taught and mentored graduate students in advanced statistical modeling using R, covering topics such as predictive modeling, mixed-effects models, and causal inference. Developed comprehensive training materials and code examples focused on best practices for reproducible research, including data wrangling, model diagnostics, and effective data visualization.
Quantitative Medicine Analyst at Critical Path Institute (C-Path)
May 2024 - Aug 2024 · 4 mos
Developed and validated Bayesian mixed-effects models in R and Stan to forecast individual-level trajectories from complex longitudinal data. Engineered and automated an end-to-end modeling pipeline using GitLab CI/CD, which streamlined model updates, ensured reproducibility, and significantly reduced project turnaround time. Translated complex model outputs and validation results (e.g., posterior predictive checks, LOO-CV) into actionable insights to support data-driven decisions for cross-functional teams.
Reed College (Portland, Oregon, United States)
- Research Assistant at Learning & Adaptive Behavior Laboratory
  Oct 2017 - Aug 2021 · 3 yrs 11 mos
  Developed predictive models using Bayesian inference to analyze complex behavioral data, translating findings into reports and peer-reviewed publications. Engineered and optimized data wrangling and quality-control pipelines in R to ensure reproducible results for time-series and repeated-measures data.
- Teaching Assistant
  Jan 2018 - May 2021 · 3 yrs 5 mos
  Provided training and mentorship to students in data analysis and experimental design, delivering practical instruction in Excel and statistical programming.
- Research Assistant at Economics Department
  Jan 2019 - Aug 2019 · 8 mos
  Applied advanced causal inference and econometric methods (e.g., difference-in-differences, instrumental variables) in Stata to estimate the impact of policy interventions on labor-market outcomes. Managed and cleaned large-scale longitudinal panel datasets, performing rigorous data validation and feature engineering to prepare data for modeling.
Quantitative Analyst at Shenzhen Stock Exchange
May 2019 - Aug 2019 · 4 mos
Conducted quantitative analysis on large-scale fixed-income datasets to model trends in pricing, volume, and credit risk across government and corporate bond markets. Developed automated reporting workflows that translated raw market data into key summary reports for senior analysts, significantly increasing the efficiency and accuracy of policy evaluation.