Yuvraj Singh Bhadoria

ML Engineer @ BofA | LLM & RAG Systems | 52%→88% Retrieval Accuracy | LangChain · LangGraph · PyTorch | Ex-Policybazaar

Greater Delhi Area

About

I build ML systems that work when the stakes are real. Not prototypes. Not demos. Systems serving real users, at scale, under latency constraints, with guardrails that catch what can go wrong.

In 3+ years across Bank of America, Policybazaar, and Infosys, I've shipped work that moved real numbers:
→ Retrieval accuracy: 52% → 88% (hybrid dense-sparse RAG pipeline, p95 < 300ms)
→ LLM inference latency: 800ms → under 300ms (KV-cache tuning + async batching)
→ Inference cost: cut 25% through automated model evaluation and selection
→ Pricing error: reduced 18% by replacing a rule-based engine with gradient-boosted regression on SageMaker
→ Customer retention: +12% via XGBoost churn model with SMOTE on policyholder data
→ ETL runtime: cut 40% via PySpark partition pruning and broadcast joins
→ Forecast accuracy: +15% for a Fortune 500 retail client using XGBoost + LightGBM ensembles
→ Model retraining cycle: 2 weeks → 3 days through automated feature pipelines

What I build with: LLMs · RAG · Fine-tuning (DPO, LoRA, FSDP) · LangChain · LangGraph · PyTorch · HuggingFace · XGBoost · LightGBM · SageMaker · MLflow · FastAPI · Docker · PySpark · Pinecone · FAISS · Qdrant

I came from Computer Science but I think like a systems engineer: latency budgets, failure modes, and cost efficiency matter to me as much as model accuracy. That's why I added safety guardrails in production, tuned KV-cache before asking for bigger hardware, and rewrote ETL jobs before scaling the cluster.

Currently at Bank of America building production LLM infrastructure. Open to Senior ML Engineer roles at product-first companies where reliability and scale are non-negotiable.

📌 Portfolio: https://portfolio-yuvraj-singh-bhadoria.vercel.app
📩 [email protected]

Experience

Machine Learning Engineer at Bank of America
Jun 2025 - Present · 1 yr 1 mo
Machine Learning Engineer at Policybazaar.com
Feb 2025 - Jun 2025 · 5 mos
-> Improved customer retention by 12% by shipping an XGBoost churn model on policyholder data with SMOTE for class imbalance; model outputs directly powered targeted retention campaigns.
-> Cut pricing error by 18% by replacing a legacy rule-based system with gradient-boosted regression, deployed as a REST API on SageMaker with p95 latency under 50ms.
-> Reduced PySpark ETL runtime by 40% and cloud compute cost by 18% through partition pruning and broadcast join optimization, enabling faster ML experimentation cycles.
Data Scientist at Infosys
Dec 2022 - Feb 2025 · 2 yrs 3 mos
-> Lifted forecast accuracy by 15% for a Fortune 500 retail client using an XGBoost + LightGBM ensemble, with measurable downstream impact on inventory planning and stock allocation.
-> Cut model retraining cycle from 2 weeks to 3 days by building automated PySpark feature pipelines for lag features, rolling statistics, and categorical encodings on time-series data.
-> Owned end-to-end MLOps on SageMaker (MLflow experiment tracking, automated data-drift detection triggers, and production retraining pipelines), reducing manual monitoring overhead across multiple live models.