Khushboo Jain

Data Engineer | Publicis Sapient | NITJ | 5+ YoE | 3.9k+ @LinkedIn

Gurugram, Haryana, India

About

As a seasoned Big Data Engineer with 5.2 years of experience, I have honed my skills in designing and implementing large-scale data solutions using a diverse range of Big Data technologies. My expertise spans Hadoop, Spark, Hive, and PySpark, along with cloud platforms such as AWS (S3, Redshift, Glue) and Azure (Data Factory, Synapse). I excel in building robust data pipelines that handle billions of rows, enabling businesses to extract actionable insights and make data-driven decisions.

At Annalect India, I led the creation of a fully automated data reporting system for Mitsubishi, leveraging my deep knowledge of AWS and Python. This initiative reduced manual effort by 85% and streamlined data processing, showcasing my ability to optimize complex workflows. My work in predictive modeling, including developing a model that enhanced forecast accuracy by 25%, further underscores my ability to apply Big Data technologies to solve real-world business challenges.

I am passionate about pushing the boundaries of Big Data, constantly exploring new tools and methodologies to drive innovation. Whether it’s automating data workflows, optimizing ETL processes, or building predictive models, I thrive on transforming data into a strategic asset for organizations.

Experience

  • Senior Data Engineer at Publicis Sapient
    Oct 2024 - Present · 1 yr 10 mos

    Currently working for client NAB.

    Previous experience: Kotak Mahindra Bank
    1. Developed a Snowflake data model and curated SQL queries to migrate data from Sybase to Redshift.
    2. Developed a data model that applies fuzzy logic to the database to detect money-laundering cases.
    3. Created a Python function that standardizes party, counterparty, and bank names across the database using fuzzy token ratio and regex, enabling seamless joins.
    4. Created a SQL procedure that uses a bucketing technique to handle the 1 billion records produced by a cross join.
    5. Used Linux commands to run the production environment and Azure DevOps for version control.

  • Big Data Engineer at Annalect India
    Apr 2023 - Oct 2024 · 1 yr 7 mos

  • Data Engineer at BYJU'S
    May 2022 - Jan 2023 · 9 mos

    • Led a team of 4 members.
    • Utilized MySQL, data warehousing programs, Power BI, and other dashboard/visualization toolsets for data intelligence and analysis.
    • Worked cross-functionally to define problem statements, collect data, build analytical models, and make content recommendations for the K-12 segment.
    • Cleansed and manipulated user-journey data from BYJU'S, the learning app, using Python.

  • Larsen & Toubro (Full-time · 1 yr 9 mos)
    • Data Engineer
      Nov 2020 - Apr 2022 · 1 yr 6 mos

      • Created the project's work breakdown structure (WBS) using Advanced Excel and macros ahead of the 3-month deadline.
      • Achieved a gross margin of 12% by using statistics to track subcontractors' bidding data from other projects.
      • Created a base map of Patiala city by mapping the locations of 65,000 consumers using Python.
      • Ensured 0% theft by tracking material utilization on operational dashboards using Power BI.
      • Cleaned and explored data received from the execution team using MySQL.

    • Cluster Coordinator
      Aug 2020 - Oct 2020 · 3 mos

      • Introduced tracking of the relative performance of 4 projects on operational dashboards using Power BI.

  • Intern at TechnipFMC India Limited
    Jun 2018 - Jul 2018 · 2 mos

    Learned about closed-loop plant operation, Piping and Instrumentation Diagrams (P&ID), field instruments, and the instrumentation index.