Mihir Nilesh Holmukhe

Software Engineer I @Prodapt | MS in Applied Data Science (Syracuse) | Big Data: Spark, Kafka, Airflow, BigQuery | GenAI & LangChain, FastAPI | Open to work

Dallas-Fort Worth Metroplex

About

Software Engineer with 4+ years of experience building scalable backend systems, distributed microservices, and real-time data platforms across telecom, banking, and SaaS. Expert in Python, Java, Spark, Kafka, FastAPI, AWS, GCP, and Kubernetes, specializing in GenAI/RAG pipelines and streaming systems that process millions of records daily. Key impact: reduced ETL runtime by 40%, cut network issue detection latency by 35%, built RAG microservices processing 12M+ daily events while reducing hallucination rate by 30%, and decreased subscription churn by 18% through ML models. Core skills: backend architecture, distributed systems, ETL/data pipelines, real-time streaming, GenAI/RAG, cloud-native deployments (AWS/GCP), Kubernetes, CI/CD, and secure API design. MS in Applied Data Science from Syracuse University (May 2025). Passionate about scaling production systems, mentoring engineers, and bridging data engineering with AI/ML.

Experience

  • Software Engineer at Prodapt
    Jun 2025 - Jun 2026 · 1 yr 1 mo

    Owned end-to-end development of RAG-based GenAI microservices using LangChain, FastAPI, and pgvector on AWS EC2 and Kubernetes (EKS), processing 12M+ daily telecom events. Mentored 2 junior engineers on RAG architecture and SOC 2-compliant data handling practices; enforced code-review standards across Agile sprint cycles. Engineered ETL pipelines using Apache NiFi, Spark, Kafka, and Airflow (Cloud Composer) into BigQuery, reducing ETL runtime by 40% for Verizon operations. Built production FastAPI REST services integrating LLM inference, AWS S3, Lambda, and RAGAS evaluation, reducing hallucination rate by 30% across customer-facing pipelines. Implemented Kafka + Spark Structured Streaming pipelines on Kubernetes and GCP Dataproc, cutting network issue detection latency by 35% for Verizon’s monitoring teams. Integrated CI/CD pipelines using GitHub Actions with automated testing and deployment; reduced release cycle time by 25% within an Agile/Scrum delivery model.

  • Software Engineer (Backend, AI/ML) at Morgan Stanley
    Jan 2025 - Jun 2025 · 6 mos

    Designed and delivered Java Spring Boot microservices integrating Python-based ML inference APIs for real-time risk scoring across 3 banking workflows. Collaborated with senior engineers and data scientists to build RAG pipelines using LangChain, pgvector, and FastAPI, reducing analyst document lookup time by 35%. Applied RAGAS-based LLM evaluation and SEC/FINRA-compliant data handling across agentic AI workflows built with LangGraph for automated document classification and risk tagging. Built async FastAPI backend services on AWS (EC2, S3, Lambda) with vector search and Redis caching, improving internal AI application response time by 40%.

  • Associate Software Engineer at Space Infolab
    Feb 2021 - Jul 2023 · 2 yrs 6 mos

    Owned Django REST API and scikit-learn ML pipeline delivery; reduced churn prediction latency by 30% across production SaaS environments. Mentored 1 junior developer on backend best practices; enforced OWASP-compliant code standards and secure API design across all sprint deliverables. Built and maintained Django REST Framework APIs serving 500K+ monthly requests; collaborated with product and QA teams across Agile sprints to deliver 4 production releases on schedule. Developed Python churn prediction and recommendation models using scikit-learn and Pandas on 5M+ user records, reducing subscription churn by 18%. Containerized Django and ML services with Docker and automated builds with GitHub Actions CI/CD; deployed to AWS EC2 with zero-downtime releases across staging and production. Automated support ticket classification using Python NLP (Transformers) and Kafka, reducing manual triage by 70% and cutting response time from hours to minutes.