Pune Division, Maharashtra, India
I'm a Data Engineer with 3.8 years of experience building cloud-based ETL pipelines. I currently work with Databricks, PySpark, and Azure Data Lake to move, clean, and structure large datasets that business teams actually depend on. I started in traditional ETL with Informatica PowerCenter, which gave me strong fundamentals before I moved into the cloud-native stack. That combination, knowing both worlds, is what separates me from engineers who only know one. I automated client onboarding pre-checks that cut manual effort by 75% and accelerated onboarding cycles by 50%. I've built real-time monitoring dashboards in Sumo Logic, implemented Medallion Architecture using Delta Lake, and developed transformation pipelines that handle multi-tier business aggregations at scale. I'm now looking for a senior Data Engineering role where the data problems are genuinely hard, on a team that cares about pipeline reliability, data quality, and building things that last. Let's connect: 📩 [email protected]
1. Developed Python scripts to automate full-load pipeline triggers for client onboarding, data issue resolution, and selective table reloads.
2. Automated client onboarding pre-checks including schema validation, CDC status verification, and integrity checks, reducing manual effort and accelerating onboarding cycles.
3. Built real-time monitoring dashboards in Sumo Logic with automated anomaly detection to improve pipeline reliability and SLA tracking.
4. Engineered end-to-end ETL pipelines on Azure Databricks using PySpark for large-scale data ingestion and processing.
5. Built PySpark transformation logic for data cleansing, schema standardization, and multi-tier business aggregations.