Shivam Taneja

AI @ JPMC | Competitions, Dataset & Notebooks Expert @ Kaggle | ICPC Regionals Finalist 2023 | Samsung Prism ’22

Delhi, India

About

AI/ML Engineer with 26+ months of experience building scalable systems and Applied AI solutions. Proficient in GenAI workflows, RAG, and cloud infrastructure (AWS). Experienced in full-stack development, large-scale data pipelines, and model optimization. Recognized on Kaggle with top 10% global rankings for work in LLM retrieval and PII detection.

Experience

  • JPMorganChase (Hybrid)
    • Applied AI/ML Analyst
      Jul 2025 - Present · 1 yr

      • Automation of KYC process for 75+ documents

    • Software Engineer, Software Engineering Programme
      Jan 2024 - Jun 2025 · 1 yr 6 mos

      • Achieved 76.3% precision and 81.2% recall by building a document parser using PaddleOCR, DBSCAN, and UMAP.
      • Delivered 91.4% intra-cluster cohesion and 90.8% inter-cluster separation by building an error categorization pipeline using TF-IDF, metadata features, and HDBSCAN.
      • Reduced on-premise storage costs by over $1M annually by porting 1.2 PB of data to AWS S3 and optimizing retrieval pipelines.
      • Built a reasoning pipeline for outlier alerting using LangChain, web scraping, and a pretrained LLM API, achieving 95% retrieval coverage and 88% reasoning accuracy, significantly reducing false positives.

    • Software Development Engineer Intern
      Jan 2024 - Jul 2024 · 7 mos

      • Designed an alerting system that identifies outliers using scheduled SQL queries, flagging anomalies in production data and notifying relevant teams.
      • Mapped detected outliers to the correct stakeholders using ownership metadata and triggered alerts through internal notification systems.

  • Software Development Intern at JPMorganChase
    Jun 2023 - Jul 2023 · 2 mos

  • Research Intern at Samsung R&D Institute India
    Nov 2022 - Jun 2023 · 8 mos

    • Achieved over 84% accuracy in detecting Regions of Interest using PaddleOCR and DistilBERT.
    • Led a team of 6 to research Region of Interest detection in sports footage.
    • Curated a publicly available dataset for model training by extracting data through web scraping, released under the Apache 2.0 license.

  • Software Development Intern at Ribango
    Jan 2023 - Mar 2023 · 3 mos