London, England, United Kingdom
Hello, and welcome to my LinkedIn profile!

Current Position:
Currently, I'm proudly serving as a Senior Software Engineer at Elastic. My areas of expertise and interest span a range of modern technologies, including:
- Renovate and dependency management
- Cloud Native technologies
- Observability principles and tools
- Kubernetes and container orchestration
- DevOps methodologies
- Programming, with a particular focus on Golang

Additional Roles:
- Writer at CloudNativeEngineer for 2 years, running a newsletter focused on cloud native technologies
- Technical mentor for 2 years, helping engineers with system design, Kubernetes, and Elasticsearch
- Technical reviewer for "Elastic Stack 8.x Cookbook"

Past Experience:
Before Elastic, I accumulated diverse skills through experiences such as:
- Big Data Engineer at PlayStation, turning data into insights and action.
- Hadoop Trainer, empowering a new generation of data engineers with Hadoop knowledge.

Personal Interests:
When the laptop is closed, you'll find me:
- Staying active with squash and hitting the hiking trails.
- Delving into the world of PlayStation games.

Certifications:
I'm a firm believer in continuous learning and have some important certifications under my belt:
- Certified Kubernetes Administrator (CKA), proving my expertise in Kubernetes.
- Holder of various Hadoop certifications, affirming my proficiency in big data technologies.

Feel free to explore my profile and reach out for any discussions about tech, opportunities, or potential collaborations. Thanks for dropping by!

Part of a 4-person platform team supporting ~100 engineers on GKE at scale: 13,000+ containers, 350 nodes, 13+ TiB storage, 500 ArgoCD applications. Ramped on-call fast - shadowing on-call within 2 weeks, first independent shift within 50 days. Handled 300+ support and alert investigations across 60+ on-call/support days, keeping the GitOps backbone behind 100,000+ commits and 34,000+ merged PRs per year running reliably.

AI-Augmented Alert Investigation
Built an autonomous Slack bot that cuts triage time to minutes. Engineers trigger it via emoji; the bot investigates by querying Kubernetes, Grafana, Loki, Prometheus, and Tempo and returns a root cause plus a remediation recommendation. Repurposed the same architecture at a company hackathon in 2 days: swapped the observability MCP for a custom Python Looker MCP, enabling non-technical stakeholders to query daily booking data in natural language.

Infrastructure Automation
Delivered automated Terraform drift detection across two monorepos via Atlantis post-workflow hooks and GitHub Actions.

AI Adoption
Built 10+ Claude Skills automating daily work: PR automation, compliance evidence, infrastructure research. Reached 85% AI adoption against a 61% company baseline.

FinOps: £160k/year savings
Two initiatives: network traffic analysis via a custom Streamlit app, which surfaced a network compression opportunity (£100k/year), and a Spot instance migration (£60k/year).

Data & Analytics
Built an escalations data product joining incident.io and PagerDuty data in BigQuery via DBT, with Looker dashboards that surfaced escalation trends invisible in native tooling.

Open Source
Published a custom Terraform Sentry provider fork (250+ Terraform Registry downloads) and contributed upstream fixes merged into the official terraform-provider-sentry.

Platform Engineer with proven experience in developing and maintaining critical infrastructure solutions.
- Single-handedly designed and implemented a way to integrate with Elastic Synthetic monitoring, achieving widespread adoption across 8+ engineering teams for monitoring business-critical internal infrastructure
- Supported company-wide dependency management through Renovate to ensure consistent and up-to-date software dependencies across the organization's codebase
- Member of the production incident response team, providing on-call support during business hours and weekend rotations. Demonstrated strong problem-solving abilities through incident investigation and documentation of root cause analyses

- Developed and implemented a custom Elastic integration for Istio logs and metrics as part of the Cloud Native Observability team
- Created and maintained Elastic integration dashboards for Kubernetes monitoring and observability
- Engineered a remote debugging solution for Elastic Beats using Go Delve Debugger, Tilt, Docker, and Kubernetes, streamlining troubleshooting processes
- Contributed bug fixes and new features to Elastic Agent and Beats codebases
- Automated support ticket investigation processes and resolved complex Kubernetes-related support issues
- Served as a member of the interview committee, participating in technical evaluations across multiple positions

- Wrote a standalone application to compress, deduplicate and re-partition historical and daily game analytics data. This made it possible to query months' worth of data in tens of seconds. Previously, querying a single month took hours, and queries over longer periods ran out of memory (Golang, AWS S3, AWS Athena, AWS Glue).
- Wrote a Lambda to modify game analytics events in real time. This Lambda replaced IDs in the events according to custom rules, per game and per event type, stored in AWS S3. Part of this project included writing an in-memory cache for the S3 configs to reduce processing time (Python, AWS Kinesis, AWS Lambda).
- Wrote a Lambda to offload data from the real-time pipeline to the data warehouse (Golang, AWS Kinesis, AWS Kinesis Firehose, AWS S3).
- Contributed to the successful launch and ongoing support of 10+ PlayStation first-party games from World Wide Studios by providing capacity estimations, optimizing costs, modifying configs, fixing bugs and being on call (AWS EMR, AWS Lambda, AWS Kinesis, Golang, Python, Terraform, Jenkins).
- Contributed to a major refactoring of the data pipeline (Golang, Python, AWS Kinesis, EMR, AWS Lambda, Terraform).
- Reduced the time Engineers spent manually creating Grafana users by writing an automation script that synchronized Okta users with Grafana users (Python).
- Wrote a script that allowed Engineers from a Studio to search, download and transcode videos according to different criteria (Python).
- Maintained and improved 10+ internal microservices, 20+ EMR jobs and 5+ AWS Lambdas to support the data pipeline (Python, Golang, Terraform, Ansible).

- Reduced the running time of the daily pipeline by ~4x (from 8 to 2 hours) by optimizing some Hive queries and replacing the legacy code with a Google Dataflow job written in Java for the ingestion of 120 million documents into Elasticsearch.
- Achieved 30% cost savings in my team's infrastructure expenses by replacing the legacy Hadoop cluster with a new setup with the same processing power plus Kerberos security.
- Wrote a Google Dataflow job in Java as part of the migration of the data pipeline from on-premise Hive to Google BigQuery.
- Automated the scheduling of the data pipeline by creating, configuring and supporting an Airflow cluster running on Kubernetes on Google Cloud and by writing various Airflow jobs in Python.
- Improved various parts of the data pipeline (SQL for Hive queries, Java for Hive UDFs and Hive SerDes).
- Developed scripts to collect data pipeline statistics in InfluxDB and display them in Grafana (Python).

- Reduced the ingestion time of 40 million scientific publications into Elasticsearch by ~8x (from 24 to 3 hours) by rewriting one of the steps of the data pipeline (Spark, Hadoop cluster on AWS).
- Deployed, configured and maintained a variety of Hadoop, Elasticsearch and Cassandra clusters, either bare-metal on-premise or in the cloud (Python, Ansible, AWS).
- Contributed to the migration of the batch data pipeline from a bare-metal Hadoop cluster to an Amazon EMR cluster by rewriting a MapReduce job written in Java as a Spark job written in Scala.
- Automated the scheduling of the data pipeline by writing multiple Airflow jobs (Python).
- Improved the data pipeline alerting capabilities (Python and Bash scripting).
- Contributed to and maintained various apps running on Mesos and Marathon as Docker containers.