Minneapolis, Minnesota, United States
Senior Data & AI Engineer with 8+ years of experience delivering scalable, cloud-native data solutions across AWS, Azure, and GCP. I specialize in building real-time and batch pipelines, integrating GenAI/LLMs (Bedrock, SageMaker), and automating ML workflows with MLOps best practices.
My expertise includes:
• End-to-end data architecture using Snowflake, Databricks, AWS Glue, ADF, and Kafka
• ML & GenAI solutions with SageMaker Pipelines, Bedrock, LangChain, and Pinecone
• CI/CD & DevOps using Terraform, ArgoCD, Kubernetes, and GitHub Actions
• Multi-cloud strategy and cost optimization for enterprise-scale environments
I’ve modernized data ecosystems across finance, logistics, and SaaS industries, driving data accessibility, real-time insights, and ML/AI readiness. Known for translating complex challenges into actionable engineering, I bridge technical depth with business impact. Let’s connect if you’re building data platforms that need to scale with intelligence and precision.
Industry: Financial Services / Investment Management
Tech Stack: AWS, SageMaker, Bedrock, LLMs, Transformers, Glue, DynamoDB, ArgoCD, GitHub Actions, Terraform, CloudFormation, Python, PySpark, SQL, LangChain, Model Monitor
Built a cloud-native data platform for a global investment firm to enable scalable analytics and integrate GenAI capabilities. Delivered real-time access to insights through ML-powered automation and NLP-based querying.
• Designed and deployed an LLM-powered data lake using SageMaker and Bedrock with zero-shot NER and transformers for natural language querying.
• Operationalized ML workflows with SageMaker Pipelines, LangChain, and Model Registry, ensuring version control and governance.
• Automated CI/CD pipelines using ArgoCD, GitHub Actions, and modular Terraform/CloudFormation, accelerating deployment velocity.
• Re-architected EMR workloads, reducing idle time by 35% and optimizing compute resource cost via autoscaling S3 policies.
• Partnered with data scientists to build embedding-based search for complex financial documents using Bedrock and LangChain.
• Strengthened observability using CloudWatch, CloudTrail, and Step Functions for orchestrated ML training workflows.
Industry: Enterprise Software / ERP Systems
Tech Stack: AWS Glue, EMR, Redshift, Lambda, Kinesis, Pinecone, Step Functions, QuickSight, Hive, Python, IAM, Fine-Grained Access Controls
Modernized a global ERP firm's data ecosystem by migrating legacy workloads to AWS, enabling real-time customer insights and GenAI-based semantic search integrations.
• Migrated petabyte-scale Hadoop workloads to AWS using EMR, Glue, and Step Functions for batch + event-based transformations.
• Developed hierarchical clustering pipelines and integrated outputs with Pinecone vector DB to deliver semantic search in SAP applications.
• Streamed real-time events via Kinesis into persistent EMR clusters, powering alerting and fraud detection.
• Secured multi-tenant infrastructure using fine-grained IAM policies, Lambda triggers, and VPC isolation.
• Built role-specific BI dashboards using QuickSight, reducing executive reporting time by 60%.
• Deployed versioned artifacts using S3, Hive, and cross-region replication for disaster recovery.
Industry: Environmental Services / Logistics
Tech Stack: Azure Data Factory (ADF), Azure Databricks, Azure SQL, Kafka, Snowflake, Spark Streaming, Azure Monitor, Power BI, Azure DevOps, Kubernetes, Azure AD, Azure Log Analytics
Enabled intelligent route optimization by integrating IoT data, real-time pipelines, and analytics across Azure-based systems. Reduced service delays and improved environmental compliance.
• Built streaming + batch ETL pipelines with ADF, Databricks, and Blob Storage, processing sensor data in near real time.
• Leveraged Kafka with Spark Streaming for continuous ingestion from route telemetry systems.
• Orchestrated deployments via Azure DevOps with Dockerized microservices running on Kubernetes (AKS).
• Implemented Azure Monitor and Log Analytics with custom metrics, thresholds, and automated alerts for pipeline failures.
• Applied Azure Active Directory (Azure AD) to manage fine-grained data access across teams and BI consumers.
• Built Power BI dashboards tracking pickup efficiency, reducing route overruns by 18%.
Industry: Supply Chain / Logistics Automation
Tech Stack: GKE, BigQuery, Cloud Build, Cloud Functions, Pub/Sub, Cloud Monitoring, Terraform, GitLab CI, Helm, IAM, Cloud Armor, Stackdriver
Migrated and modernized legacy logistics infrastructure using GCP-native services, reducing downtime and driving proactive observability across microservices.
• Deployed 50+ microservices on GKE across multi-zone regions with CI/CD via Cloud Build and GitLab CI.
• Orchestrated rollouts with Helm, and implemented chaos engineering simulations to validate failover logic.
• Designed active-passive failover architecture using HTTP(S) load balancers, Cloud Armor, and custom routing.
• Aggregated logs and events with Stackdriver and visualized service health in BigQuery dashboards.
• Authored Terraform-based infrastructure-as-code templates for provisioning GCP networking, security, and CI environments.
• Enforced policy-based access via IAM roles, VPC Service Controls, and audit logging.
Industry: Financial Products / Insurance Analytics
Tech Stack: SQL, SAS, Python, Tableau, Power BI, Snowflake, Data Warehousing, BIC, Web Scraping, JSON, Git
Contributed to policy workflow optimization by engineering data pipelines, developing cohort models, and building executive dashboards for better retention analytics.
• Built ETL flows for policy lifecycle events into the Business Intelligence Center (BIC) using SQL and Python.
• Performed root-cause analysis for underwriting failures using SAS, reducing policy lapse rate by 12%.
• Developed interactive dashboards in Tableau and Power BI for real-time policy tracking and cohort modeling.
• Built Python scripts for web scraping, data wrangling, and parsing JSON from policy databases.
• Used Snowflake and Redshift for data storage and query acceleration in BI reports.