Vijaya Kumari Chagam

Data Engineer

Singapore

About

• AWS Data Engineer with 4+ years of experience in software development, proficient in designing and developing Hadoop and Spark applications within the SDLC process.
• Extensive experience building efficient data pipelines using Big Data Hadoop frameworks (HDFS, Hive, and Oozie), Spark ecosystem tools (Spark Core, Spark SQL), PySpark, and Python.
• Strong experience with AWS cloud services: EMR, S3, Glue, Athena, and RDS.
• Worked in product-based companies, with solid experience in product development.
• Worked in an Agile software development model.
• Collaborated with other technology teams to ingest, transform, and load structured data from multiple data sources.
• Built and maintained robust automated data pipelines to support data solutions across BI and analytics use cases.
• Implemented trustworthy and efficient data transformations via ETL and ELT.
• Built and improved data pipelines and services through CI/CD.
• Worked closely with Product Owners during requirements analysis and the UAT phase, and received appreciation from them.
• Carried out production deployments smoothly by coordinating with multiple teams.
• Able to work with business users to explain concepts and understand requirements.
• Worked closely with testers to verify the functional and non-functional behavior of the pipelines.

Experience

  • Carelon Global Solutions India (3 yrs 4 mos)
    • Data Engineer
      Aug 2020 - Nov 2023 · 3 yrs 4 mos

      · Tech Stack: Spark Core, Spark SQL, PySpark, Python, ETL, Cloudera Cluster, Hadoop, Apache Hive, Spark with Scala
      · Cloud: AWS Glue, EMR, S3, RDS, Athena
      · Migrated a project from Ab Initio code to Spark with Scala
      · Built and optimized batch data pipelines using PySpark and Glue
      · Migrated Spark jobs to EMR and improved their performance
      · Developed standardized data pipelines and utilities
      · Documented processes and executed unit tests
      · Provided production support

    • Data Engineer
      Aug 2020 - Oct 2023 · 3 yrs 3 mos

      · Tech Stack: Hadoop, Spark Core, Spark SQL, PySpark, Python, Cloudera Cluster, Apache Hive
      · Cloud: AWS Glue, EMR, S3, RDS, Athena
      · Actively involved in analyzing technical specifications.
      · Built efficient batch data pipelines using PySpark.
      · Expertise in performance optimization of Spark jobs.
      · Built data pipelines using Glue.
      · Stored the transformed data in RDS (Postgres).
      · Migrated Spark jobs to EMR, using S3 as storage.
      · Tuned Spark jobs to run efficiently on EMR.
      · Built common utilities and standardized data pipelines to ensure consistency across the organization.
      · Built and maintained operational runbooks, streamlining procedures and ETL jobs.
      · Developed documentation to assist users.
      · Executed unit test cases.
      · Actively participated in production support.

  • Data Engineer at Unisys
    Nov 2019 - Jun 2020 · 8 mos

    · Tech Stack: Hadoop, Spark Core, Spark SQL, PySpark, Python, Cloudera Cluster, Apache Hive, Oozie
    · Involved in analyzing technical specifications.
    · Processed customers' banking transactions.
    · Used Spark SQL to process large volumes of structured data stored in Hive tables.
    · Expertise in performance optimization of Spark jobs.
    · Automated jobs using Oozie.

  • Data Engineer at Unisys India
    Nov 2019 - Jun 2020 · 8 mos

    · Tech Stack: Hadoop, Spark Core, Spark SQL, PySpark, Python, Cloudera Cluster, Apache Hive, Oozie, ETL
    · Processed banking transactions and optimized Spark jobs
    · Automated jobs using Oozie

  • Data Engineer at Apricot Consulting Private Limited
    Sep 2018 - Sep 2019 · 1 yr 1 mo

    Tech Stack: Ab Initio, Spark (Scala), Hadoop, SQL, Shell Scripting, Airflow, Cloudera, Jenkins, GitHub
    • Worked on Ab Initio to Spark (Scala) migration, converting legacy ETL graphs into Spark-based batch processing jobs.
    • Developed and optimized Spark (Scala) applications on Hadoop (Cloudera) for large-scale data processing.
    • Wrote complex SQL queries for data transformation, validation, and reconciliation during migration.
    • Scheduled and monitored workflows using Apache Airflow, and automated deployments through Jenkins CI/CD pipelines.
    • Used Shell scripting and GitHub version control, and provided production support to ensure smooth job execution.