United Kingdom
I like building systems and applications, from the simple to the complex, always with a focus on automation. I'm adept at learning new languages and technologies and have dabbled in, or even written production code in, many languages, from the popular (Java, Python, Go) to the obscure (ABAP, Groovy). I take a pragmatic approach to coding and building systems but also have well-formulated views on the best approach to a given problem. I don't come from a formal CS background but have many years of varied experience to draw on.
- Led development of Hosted Runners to Limited Availability, which secured multiple multi-year customer commitments of $500k+ in ARR
- Developed blueprints for Zero Downtime Deployment and Disaster Recovery
- Served as part of the technical escalation path for critical GitLab Dedicated incidents
- Key expert in a working group to overhaul SRE hiring across the company
- Led analysis and implemented fixes for the largest GitLab Dedicated tenant's critical scaling issues
Delivering the GitLab Dedicated product (single-tenant SaaS) on the Environment Automation team. Responsible for automating tenant infrastructure and the tooling used to manage tenants.
- Coded and deployed a custom Prometheus exporter to report on AWS Instance Health events
- Collaborated inside and outside the team to deploy the GitLab Geo product to provide Disaster Recovery for tenant instances
- Technical lead for delivering Hosted Runners as a product for GitLab Dedicated customers
Building the next phase of Magic Leap! Helping lead the team to build the next platform for ML's infrastructure and software. Heavily leveraging Terraform, ArgoCD and GCP to deploy and manage Kubernetes workloads.
Major projects:
- Implemented a POC Kubernetes-based machine learning platform (Kubeflow)
- Coded (Go) a custom Kubernetes operator to create databases and set database permissions for users in Google Cloud SQL
- Created a cloud function and associated pipelines to aggregate NAT IPs for allowlisting across our GCP projects
- Finished out a migration of our data warehouse from AWS Redshift to GCP Cloud Composer
- Collaboratively planned go-live for services/onboarding portal for ML2
- Advised on high-level purchases of software and services
- Contributed to and created Terraform modules and module code for our custom Kubernetes and ArgoCD platform
Fully remote SRE team member supporting Long Term Data teams and other groups within the Battle.net Platform.
Major projects:
- Set up Atlantis (runatlantis.io) in GKE for applying our Terraform configuration in GCP
- Wrote a Go-based tool to automate parts of our Argo manifests to help keep things "gitops"
I worked remotely for a geographically distributed team, fulfilling all types of SRE responsibilities:
- Monitoring & Alerting
- CI/CD (Concourse/Bitbucket Pipelines)
- Infrastructure Automation (Terraform)
- Scripting (primarily in Go but also Python)
Major projects:
- One of the primary architects of a Kubernetes-based Platform as a Service internal tool (automatic scaling to 0 with Knative and Istio, plus provisioning of SQL databases for services running on the platform)
- Built a pipeline for all our GCP infrastructure using Infrastructure as Code with Terraform
- Built a pipeline for our GCP Shared VPC that easily provisioned additional subnets in the shared VPC for on-premises connectivity
Part of an SRE team supporting Apple Maps. I can't say much about what I do because of company policy, but this is part of the job description and all part of my daily work:
- You will be responsible for the application and all aspects of it in production including the user experience
- Work reciprocally with developers in supporting new features, services, releases, and become an authority in our services
- Monitor site reliability and performance
- Scale infrastructure to meet demand
- Fix site down issues
- Continuously monitor/improve the quality of our infrastructure
- Develop automation tools
- Document system design and procedures
- Participate in on-call rotation