Keith Dowd 🤖

Principal Data Scientist | Senior ML & AI Engineer | Python, LLMs, RAG, Agents, Production ML | Shipping Models That Matter

Charlotte, North Carolina, United States

About

Data scientist and ML engineer with over a decade of experience building, shipping, and evaluating production ML and AI systems. I've worked across the full stack of applied data science, from training classification models and designing evaluation frameworks to shipping LLM-powered products used by real customers.

The last two years have focused on applied LLM work: RAG, prompt optimization, evaluation frameworks, conversational AI, and propensity modeling. The decade before that was spent on the traditional ML stack, including predictive modeling, deep learning, NLP, time series, anomaly detection, and production deployment. My work has spanned education, telecom, healthcare, government, and political research.

A few things I've worked on and/or shipped recently:
→ Evaluation framework for a conversational engagement tool powered by LLMs (OpenAI APIs + RAG on AWS Bedrock), scoring factuality, tone, and relevance across 100 test cases. Cut RAG retrieval errors by 40%.
→ Propensity scoring model + LLM outreach system now used by 100+ admissions counselors across 20 institutions, where students in the top decile apply at 13x the rate of the average student.
→ Modernized a production scoring pipeline serving 32 institutions and tens of thousands of students annually, reducing training and prediction cycle time by 25%.
→ Full ownership at TrueLearn of score prediction models for medical licensure exams, from training through the monitoring and evaluation processes that kept them calibrated in production.
→ Production ML at Verizon (regression, clustering, time series, anomaly detection, neural networks) to solve network engineering problems impacting reliability and profitability.

I've led teams both as a Principal IC and as a Director, and I'm grateful for both chapters. Leading made me a much better data scientist. But the work that energizes me most is still the technical work itself: training models, writing Python, building pipelines, and figuring out whether the thing I built actually works. My next chapter is full-time IC.

Core areas: Python · SQL · Machine Learning · Deep Learning · LLMs · RAG · NLP · Prompt Engineering · Predictive Modeling · Model Evaluation · Data Pipelines · Production ML · AWS · Snowflake

Let's connect. I'm open to Senior, Staff, and Principal Data Scientist roles, and AI/ML Engineer roles at companies building products where model quality actually matters.

Experience

Principal Data Scientist / Director of Data Science & AI at Capture Higher Ed
Jun 2025 - Apr 2026 · 11 mos
→ Led data science development of Enrollment Scout, an LLM-powered conversational engagement tool built on OpenAI’s model APIs: optimized prompts, built a testing and evaluation framework on AWS Bedrock spanning 3 dimensions (factuality, tone, relevance) and 100 test cases, and reduced RAG content retrieval errors by 40%.
→ Led data science development of Counselor Copilot, a propensity scoring model and LLM-based outreach tool adopted by 100+ admissions counselors across 20 institutions, where students in the top decile apply at 13x the rate of the average student. Drove prompt optimization, RAG and knowledge base implementation, evaluation metric design, and integration of model outputs into email, text, and call workflows.
→ Led modernization of the production system behind Apply and Enroll propensity scoring models serving 32 institutions and scoring tens of thousands of prospective students annually, reducing training data ingestion and prediction delivery cycle time by 25% through automation and reporting improvements. Trained classification models, refined evaluation metrics, and embedded predictions into product flows and dashboards.
→ Built, tested, and maintained data pipelines, backend systems, and model serving infrastructure to support end-to-end deployment of models into production.
→ Adopted AI-assisted software development workflows (Claude Code) to accelerate my own modeling, pipeline, and evaluation work, and led their adoption across the team while continuing as a hands-on individual contributor alongside team leadership responsibilities.
Principal Data Scientist at TrueLearn
Mar 2021 - Jun 2025 · 4 yrs 4 mos
→ Led end-to-end training, development, and production deployment of all score prediction models for student performance on high-stakes medical licensure exams, based on longitudinal test simulation and behavioral data.
→ Prototyped LLM-powered product features (flashcard generation, conversational exam banks) and partnered with engineering to prototype semantic question search, contributing to evaluation and prompt design.
→ Developed monitoring and evaluation processes to track model performance, calibration, and data quality in production.
→ Applied predictive and prescriptive modeling to optimize customer experience, revenue, and marketing outcomes across the product suite.
→ Administered the company analytics stack (Snowflake, Fivetran, dbt, Looker) to support data science, product, and business reporting.
→ Built the Data Science & Analytics function from scratch. Hired and mentored data scientists, engineers, and analysts while continuing to own and deliver hands-on modeling, evaluation, and production deployment work across Product, Marketing, Engineering, and Sales.
Lead Data Scientist at Verizon
Aug 2019 - Mar 2021 · 1 yr 8 mos
→ Designed and deployed production ML and statistical models (regression, clustering, time series, anomaly detection, neural networks) in a cloud-based Big Data environment to solve network engineering and operations problems impacting profitability, reliability, and efficiency.
→ Advised senior leadership on advanced analytics and ML strategy, translating model outputs into decisions that influenced network operations.
→ Coached and mentored junior data scientists and analysts through hands-on project work, reviews, and workshops.
Data Scientist at U.Group, An Agile Defense Company
Dec 2018 - Aug 2019 · 9 mos
→ Delivered client-facing data science solutions using machine learning, NLP, Python, Spark, Hadoop, and SQL/NoSQL databases to accelerate decision-making on complex business problems.
→ Built an entity resolution pipeline with an evaluation framework and benchmarking metrics to reduce matching error across data sources lacking a common primary key.
→ Developed a risk score model to assess the threat that foreign-led mergers and acquisitions pose to U.S. interests, and drove improvements to the model's data coverage, feature engineering, and prediction accuracy.
Data Scientist at Association of American Medical Colleges (AAMC)
Jan 2018 - Dec 2018 · 1 yr
→ Built data-driven tools for medical school admissions and residency selection, spanning model development, validation, and psychometric evaluation.
→ Led data science for the Residency Exploration Tool, a web-based resource helping residency applicants compare programs and make informed application decisions. Owned data modeling, visualization, and evaluation of similarity measures for program comparison.
→ Served as data lead for the AAMC Standardized Video Interview, building computer scoring models from audio and video data, automating the score report and distribution pipeline in Python, and communicating findings to stakeholders through reports and presentations.