Berlin Metropolitan Area
I am a Principal DevOps / Site Reliability Engineer with hands-on AWS and Google Cloud experience. I am certified in AWS and Kubernetes (CKA and CKS). I always work with a security-first mindset. I have worked in PCI DSS and SOC 2 Type II certified environments, including while certification was in progress. I am especially interested in observability and monitoring, and I love to automate everything.
As Principal Cloud DevOps Engineer and Platform Tech Lead at Entrust, my primary responsibilities are:
- Architect and champion the platform vision, shaping scalable, secure, and reliable cloud infrastructure across enterprise environments.
- Lead the development and execution of DevOps strategies, harnessing automation, CI/CD pipelines, and infrastructure-as-code to drive accelerated delivery and operational excellence.
- Collaborate with cross-functional teams to deliver robust cloud-native solutions aligned with strategic business goals, fostering innovation and best practices in architecture, security, and reliability.
Following Entrust’s acquisition of Onfido, I transitioned into the role of Staff Cloud DevOps Engineer at Entrust, where I also serve as Platform Tech Lead.
- Lead platform architecture and DevOps strategy to ensure scalability, security, and reliability across cloud environments.
- Drive automation, CI/CD pipelines, and infrastructure-as-code practices to accelerate delivery and improve operational efficiency.
- Collaborate with cross-functional teams to design and implement cloud-native solutions aligned with business objectives.
- Participated in the regular on-call rotation
- Designed and implemented various parts of the platform repave initiative
- Upgraded very old Postgres databases to the latest version and split shared databases, which together enabled yearly savings of up to 500k and increased the reliability of services
- Retired many legacy infrastructure components and tools and moved to current open-source technologies
- Implemented GitOps with Flux and migrated all infrastructure tools to GitOps in all EKS clusters
- Received the Onfido Value award for "find a better way" in recognition of my work on infrastructure
- Member of the Core SRE Team
- Participated in the regular on-call rotation
- Migrated Dolap (Trendyol's C2C second-hand marketplace) from AWS ECS to EKS, orchestrated end to end with IaC (Terraform), and implemented GitOps via Flux v2
- AWS cost optimization
- Cost and capacity management: created a product for calculating on-premise data center costs and managing capacity (running across 7 DCs with ~7K physical hosts and ~30K VMs) for tribes, teams, and workloads
- Service level management (SLI/SLO/SLA): enabled teams to define, visualize, and alert on their service levels and error budgets using latency, traffic, error, and saturation metrics
- Implemented, maintained, and automated the ArgoCD and Flux v2 GitOps tools running across 350+ Kubernetes clusters
- Participated in the regular on-call rotation
- Managed GitOps tools and repositories (Flux v1 and v2)
- AWS services used: EC2, ECS, EKS, RDS, Aurora, Lambda, DynamoDB, Kinesis, SNS, SQS, Systems Manager, Secrets Manager
- Migrated 150+ services from AWS ECS to EKS
- IaC with Terraform
- Developed automation for tenant provisioning: a Step Function running Lambdas and containers to deploy, create, and remove AWS infrastructure for tenants
- Developed a Kubernetes-native tenant health-checking tool
- Worked with 100+ AWS accounts using SSO, cross-account permissions, and VPC peering
- Supported the migration of service-to-service communication from RPC over RabbitMQ to gRPC
- Implemented Prometheus, Grafana, and Alertmanager as a service for tenants
- Implemented a service mesh with Istio on Kubernetes
- Implemented Thanos for multi-tenant long-term metrics storage
- Worked on fully automated, end-to-end, multi-region disaster recovery
- Worked on SOC 2 compliance (e.g. secrets rotation, migration from static tokens/keys to IAM roles, repository scanning for secrets, base image scanning, dependency scanning)