Yogesh G

Director, Data Engineering | Building data platforms that actually scale | Mentoring aspiring data engineers | AWS · Azure · GCP

Bengaluru, Karnataka, India

About

I build data platforms that process terabytes daily across AWS, Azure, and GCP, and I build the teams that run them.

11+ years in data engineering. Started writing Python scripts. Now I direct engineering for Fortune 500 financial institutions, managing multi-cloud budgets and leading 15+ engineers across 3 squads.

What I actually do:
→ Architect enterprise data lakes and real-time pipelines
→ Grow engineering teams from scratch (built a team from 3 to 8, then to 15+)
→ Make the hard calls on cloud architecture that save millions
→ Mentor aspiring data engineers breaking into the field

My career path: developer → engineering manager in 5 years at the same company, then strategic moves that doubled my scope each time.

25+ data projects delivered. A 5M+ record matching engine built from scratch. Governance frameworks that actually get followed.

I also mentor people transitioning into data engineering, because someone did that for me, and it changed everything.

If you're building data teams, transitioning into data engineering, or just want to talk about why most data lakes become data swamps, let's connect.

Experience

  • Director, Data Engineering at ITI Data
    Aug 2024 - Present · 1 yr 11 mos

    • Direct data engineering strategy for Fortune 500 financial institutions, managing a $2M+ annual multi-cloud infrastructure budget across AWS, Azure, and GCP
    • Lead a team of 15+ engineers across 3 squads delivering cloud-native data solutions
    • Architected enterprise data lake on S3/Athena/Glue with cross-cloud integration to Azure Data Factory and BigQuery for hybrid analytics
    • Reduced data pipeline processing time by 60% through architecture redesign and Spark optimization
    • Drove Elasticsearch → OpenSearch migration across 50+ indices with zero downtime, saving ~₹40L/year in licensing
    • Implemented Azure Synapse Analytics for client-facing reporting, enabling self-service BI across business units
    • Established data governance framework (RBAC, data cataloging, and audit logging) for 500+ datasets across AWS and Azure
    • Cut cloud infrastructure costs by 35% through right-sizing, reserved instances, and automated scaling across all three cloud providers
    • Designed and built full-stack data management applications using React, Node.js, and Python REST APIs
    • Developed microservices architecture for data ingestion APIs handling 10K+ requests/minute with auto-scaling on ECS

  • Senior Manager, Data Engineering at Innoitus
    Dec 2022 - Aug 2024 · 1 yr 9 mos

    • Built and led a team of 8 data engineers, growing the practice from 3 to 8 in under a year
    • Designed multi-cloud data lake architecture processing 2TB+ daily with AWS as primary, plus Azure Data Lake and GCP BigQuery for specific client workloads
    • Built real-time pipelines using ECS, Lambda, SQS, and Azure Event Hubs, achieving sub-500ms end-to-end latency
    • Optimized Redshift warehouse by redesigning star schemas, reducing average query time from 45s to 8s
    • Led GCP migration for analytics workloads implementing Dataflow and BigQuery, reducing analytics costs by 25%
    • Delivered 12+ data products across finance, HR, and operations domains on time and under budget
    • Implemented automated data quality checks on Azure Databricks catching 99.5%+ of data anomalies
    • Built RESTful and GraphQL APIs for data platform self-service
    • Developed full-stack internal tools using React and Python/Django for pipeline monitoring and data quality dashboards

  • Data Engineer at E-Mech Solutions Private Limited
    Feb 2015 - Dec 2022 · 7 yrs 11 mos

    • Progressed from engineer to engineering manager in 5 years, eventually leading a team of 10 across 4 active projects
    • Built a matching engine processing 5M+ records daily with 99.9% accuracy using Python, Spark, and AWS
    • Managed end-to-end delivery of 25+ data projects across finance, procurement, staffing, and knowledge management
    • Reduced deployment time from 2 days to 30 minutes by implementing CI/CD with Terraform and GitHub Actions
    • Drove adoption of IaC (Terraform) across all projects, standardizing 100+ AWS resources under version control
    • Mentored 15+ junior engineers, with 4 growing into senior/lead roles under guidance
    • Designed and developed RESTful web services and full-stack applications for data processing
    • Built API-driven matching engine with Python Flask backend, serving 5M+ daily lookups

  • Software Engineer at Relyon Softech Ltd.
    Jun 2015 - Oct 2016 · 1 yr 5 mos

    Data engineer with experience designing, developing, testing, and deploying data pipelines and workflows for ETL, data processing, and analytics. I have strong expertise in AWS services such as EC2, S3, Redshift, Lambda, Glue, and Step Functions, and extensive experience with Infrastructure-as-Code (IaC), using Terraform and CloudFormation to provision and manage AWS resources.

    I am proficient in Python and SQL for data processing and analytics, and in HTML, CSS, JavaScript, and jQuery for web front ends. I have worked closely with DevOps teams to deploy and manage data infrastructure in production, and with Agile teams to develop and maintain pipelines and workflows for continuous delivery. I have built dashboards and reports in BI tools such as Tableau and Power BI, and applied machine learning frameworks such as Scikit-Learn and TensorFlow.

    I communicate technical concepts clearly to both technical and non-technical stakeholders, and I have mentored and trained new team members on data engineering best practices.