United States
I am a senior data engineer with over 10 years of experience building cloud-native data platforms, large-scale analytics pipelines, and scientific data ecosystems across biotechnology and pharmaceutical organizations. I have a strong background in designing lakehouse architectures, distributed data processing systems, and research data platforms supporting genomics analytics, precision medicine, and AI-driven scientific workflows. I have extensive hands-on experience with Python, SQL, Spark, Databricks, Snowflake, Airflow, and AWS.
Lilly TuneLab
- Led development of Lilly TuneLab’s enterprise scientific data platform by designing a Databricks Lakehouse architecture on AWS that unified research, manufacturing, and clinical datasets across multiple R&D organizations.
- Built large-scale ingestion and transformation pipelines using PySpark, Airflow, and Delta Lake to process more than 15 TB of laboratory, manufacturing, and clinical data daily, improving data availability for research teams by 70%.
RGC Genomic Lakehouse Platform (Databricks & Project Glow)
Compass Precision Medicine Platform