Greater Indianapolis
Seasoned Senior Data Engineer with 7+ years of experience leveraging GenAI, LLMs, and scalable experimentation to drive $5M in annual savings and significant user engagement gains. Expert in bridging the gap between complex data patterns and business goals to deliver high-impact, innovative products at industry scale. Passionate collaborator dedicated to building advanced AI-driven solutions and creating tangible organizational value through data-driven leadership.
Fractal AI, Senior AI Engineer, Feb 2025-Current
• Led development of a comprehensive prompt evaluation framework to assess LLM outputs for correctness, safety, and hallucinations, leveraging retrieval-based grounding and generation-quality metrics such as Contextual Precision, Contextual Recall, and Hallucination Check.
• Generalized a nudge framework using prompt engineering and RAG to support a code database for various diagnoses, leading to 30% faster release cycles.
• Led end-to-end experimentation frameworks (A/B, multivariate) to evaluate product features, improving retention and engagement by 15%.
• Designed causal inference studies (t-tests, regression, observational analysis) to derive insights where randomized trials were infeasible.
• Partnered with prompt engineers to optimize structured prompting, reducing latency by 30% via caching and SageMaker parallelization.
• Mentored junior data scientists and set best practices for experiment design, accelerating cycle times by 20%.
• Developed an hourly energy load forecasting model using NeuralProphet, achieving 75% accuracy by leveraging historical time series data to generate demand predictions. Created reusable data assets by engineering features such as weather and aggregating historical trends, and optimized hyperparameters to improve forecasting accuracy for energy management.
• Automated the extraction and processing of census data through APIs, creating structured data assets for demographic analysis. Implemented robust data pipelines to clean, validate, and store census data, ensuring seamless integration into downstream analytics and machine learning workflows.
• Designed and deployed machine learning models on Azure ML, leveraging cloud-based compute resources for scalable processing. Built and managed data assets for model training and inference, optimizing feature stores, experiment tracking, and deployment workflows to improve model performance and operational efficiency by 40%.
• Spearheaded an inbound call prediction machine learning model using XGBoost in Python, eliminating manual number crunching and saving $5 million yearly in manual effort. The model predicts inbound calls at daily and 15-minute intervals.
• Collaborated with the call center forecasting and loan servicing teams on this project, presenting the model’s performance. Improved accuracy to 80% by tuning model parameters with AWS SageMaker and cross-validation.
• Led development of an LLM-based document search model for fast searching and sentiment analysis, saving $50M worth of work hours.
• Created a dashboard for an internal chatbot using SQL Database, Tableau, and Power BI, enabling sentiment analysis.
• Designed and built a Python API for SQL Server and a client application, reducing manual work by 80% and enabling faster API updates for the business application.
• Verified the performance of 100+ models for CMS (Central Medical Services), used for fraud protection and insurance billing.
• Maintained model evaluation metrics and tracked programming efforts using MLflow and H2O AutoML, increasing programming efficiency.
• Collaborated on FPS dashboard processing, performed quality control on ML model outputs, and handled model governance.
• Led strategic planning sessions to identify key business metrics, prioritize high-impact initiatives, and ensure stakeholder alignment, resulting in a 20% improvement in project delivery timelines.
• Managed fraud protection inputs and KPIs through a dashboard and UX interface, resulting in 40% savings in transactions.
• Created and tested an Azure CI/CD pipeline and data curation and preparation workflows.
Client: Home Depot: Oct 2020-July 2021
• Analyzed and investigated complex data sets, summarizing them using Z-score, IQR, and standard deviation to spot anomalies.
• Handled the Statistical Insights engine, applying research to maintain the profitability of each store’s sales.
• Generated test cases and validations for the automation of Metrics Recasting, improving clickthrough rate by 38%.
• Performed data profiling of store data using GCP to analyze the viability of metrics, with 89% accuracy.
Client: Cummins: April 2020-September 2020
• Developed a detailed problem statement for the machine failure hypothesis and its effect on target customers, defining the model’s scope, objective, outcome statements, and metrics.
• Detected and interpreted emerging product quality issues faster by training and tuning a moving average algorithm, saving $50M on warranty.
• Initiated text analytics and NLP techniques for sorting customer complaints, improving satisfaction rate by 40%.
• Lowered the cost of repair by 25% by generating various metrics and signals based on them.
Client: PNC Bank: January 2019-March 2020
• Led development and implementation of a data lake (via ingestion) in Hadoop, resulting in a 50% increase in data accessibility.
• Automated credit card data processing for the loan servicing department using programming macros and SAS analytics, increasing efficiency by 80%.
• Developed a deep learning neural network in Python and TensorFlow for fraudulent signature identification on deposit checks, saving $60M in transactions.
• Masked personally identifiable information in the Hadoop and Python ecosystem, securing transaction processing.
• Implemented performance tuning and embedded SQL queries using Hive and Spark, improving information retrieval efficiency by 70%.