Islam Gomaa

Cloud and Hybrid-Edge Engineer | SRE Lead | Enterprise Architect | IT Operations Manager | Former Field & Support Engineer | Mentor to the Next Generation of Cloud Talent | Security Cleared | Bilingual

Ottawa, Ontario, Canada

Experience

  • Microsoft (Canada · Remote)
    • Senior Customer Engineer and SRE Lead | Azure Reliability
      Oct 2024 - Present · 1 yr 9 mos

      As a key contributor to the Azure Reliability Engineering team, I am responsible for ensuring service resilience, operational excellence, and performance at scale across Azure’s enterprise cloud platform. My work directly supports Azure’s global uptime commitments and contributes to the engineering systems and processes that uphold reliability across thousands of customer workloads.

      • I specialize in automated remediation, incident response, and proactive engineering practices that reduce operational risk and improve service health across regions. I operate at the intersection of systems engineering and software development, building solutions that scale and endure under demanding production conditions.
      • I design and contribute to automation tooling, operational scripts, and self-healing mechanisms that eliminate toil, reduce human error, and accelerate recovery during incidents. These automations enable consistent, repeatable operational tasks while improving time-to-mitigation across critical services.
      • I lead efforts to develop and enhance observability frameworks, including telemetry pipelines, advanced dashboards, alerting strategies, and actionable insights. This work ensures we can detect, diagnose, and resolve issues proactively, often before they impact customers.
      • I collaborate closely with Site Reliability Engineers (SREs), product groups, and partner engineering teams to identify recurring failure patterns and engineer durable, systemic solutions. This includes contributing to post-incident analysis, risk assessments, and implementing architectural fixes that improve the long-term reliability posture of Azure services.
      • My contributions support Azure’s reliability roadmap, helping ensure the platform meets customer SLAs, reduces downtime, and continues to deliver enterprise-grade performance and availability at global scale.

    • Global Lead Engineer | Azure Hybrid & Edge | FastTrack
      Oct 2021 - Oct 2024 · 3 yrs 1 mo

      • Led hybrid cloud and migration engagements across North America, EMEA, Asia, and Australia, supporting federal and enterprise clients in regulated industries.
      • Designed and implemented Azure landing zones with secure Virtual Networks, Subnets, NSGs, and Private Endpoints to host mission-critical workloads.
      • Directed multiple application and infrastructure migrations using Azure Migrate, rehosting VMware/Hyper-V workloads into Azure and modernizing legacy systems with cloud-native services.
      • Deployed and configured Application Gateways (WAF), App Services, and SQL Managed Instances using Infrastructure as Code (Terraform, PowerShell, AVM).
      • Created and enforced custom Azure Policies and governance guardrails to align with compliance frameworks such as Government of Canada Cloud Guardrails and ITSG-33.
      • Built and operated the North America Hybrid Lab, used globally to validate complex migration scenarios, zero-touch deployments, and compliance models.
      • Delivered 50+ technical sessions and migration accelerators to train engineering teams and partners on secure deployment practices.
      • Partnered with Azure product groups to refine migration tooling and improve automation capabilities, directly contributing to GA readiness of Azure Stack HCI, vSphere Resource Bridge, and Azure Migrate.

    • Senior Customer Service Engineer
      Apr 2017 - Sep 2021 · 4 yrs 6 mos

      Served as a trusted advisor and technical lead for Microsoft’s top enterprise clients during a pivotal period of cloud adoption (2015–2019), specializing in Microsoft Azure solutions. Delivered strategic, hands-on guidance to accelerate cloud transformation, modernize infrastructure, and enhance operational resilience.

      • Led the design and implementation of Azure IaaS and PaaS architectures, focusing on virtual networks, Azure Site Recovery, Azure Backup, and hybrid cloud integrations using Azure ExpressRoute and VPN gateways.
      • Conducted Azure Readiness and Migration Assessments to prepare customer environments for large-scale cloud adoption, aligned with the Microsoft Cloud Adoption Framework.
      • Delivered Azure Health and Risk Assessments (RaaS), providing customers with comprehensive reviews of their cloud environments and actionable remediation plans.
      • Facilitated on-site workshops and deep-dive technical sessions on topics such as Azure governance, cost optimization, identity and access management (AAD), and security best practices.
      • Partnered with engineering teams to pilot and improve Azure features based on customer feedback, contributing to product evolution during early iterations of Azure Monitor, Azure Security Center, and ARM (Azure Resource Manager).
      • Played a key role in supporting hybrid cloud strategies, integrating on-premises infrastructure with Azure using tools like Azure Migrate, Azure Automation, and Log Analytics.
      • Collaborated with Microsoft Sales and Account teams to support pre-sales architecture design, perform technical assessments, and author Statements of Work for cloud engagements.

  • Part-Time Professor & Program Coordinator at Algonquin College of Applied Arts and Technology
    Mar 2018 - Present · 8 yrs 4 mos

    As Program Coordinator for the Cloud Development and Operations (CDO) postgraduate program, my primary focus is on empowering students to become confident, capable, and career-ready cloud professionals. I work closely with learners from diverse backgrounds, helping them build the technical, professional, and problem-solving skills needed to thrive in today’s cloud-driven IT landscape.

    My passion lies in helping students not only master cloud technologies (such as Azure, AWS, DevOps practices, and infrastructure automation) but also believe in their potential to make a meaningful impact in the tech industry. Whether it’s earning their first certification, solving a challenging lab, or landing a full-time role, supporting students on their journey from the classroom to the cloud workforce is the most rewarding part of what I do.

  • Principal Engineer at Pluralbyte Inc
    Apr 2010 - Present · 16 yrs 3 mos

  • Enterprise Architect at Kivuto Solutions Inc.
    Apr 2010 - Mar 2014 · 4 yrs

    As an Enterprise Architect, I led the design, implementation, and operational support of resilient datacenter environments, supporting production systems that powered key lines of business including finance, operations, HR, and customer services. I ensured infrastructure scalability, high availability, and compliance with business continuity standards.

    • Designed and maintained high-availability enterprise datacenter architecture aligned with the needs of multiple business units.
    • Defined and implemented comprehensive site recovery and backup strategies, using System Center Data Protection Manager (DPM) and SAN replication to ensure compliance with RPO/RTO requirements and maintain operational continuity.
    • Supported production systems powering critical lines of business, collaborating with stakeholders to align infrastructure planning with financial, HR, and operational priorities.
    • Led server consolidation and virtualization projects using Hyper-V and SCVMM, reducing physical footprint and optimizing resource allocation.
    • Deployed and managed the Microsoft System Center Suite (SCCM, SCOM, Orchestrator, VMM) to streamline monitoring, automation, and configuration management.
    • Administered Active Directory, Group Policies, and core services (DNS, DHCP), ensuring secure and efficient enterprise access.
    • Delivered infrastructure support for enterprise workloads including Exchange, SQL Server, and SharePoint, ensuring high reliability and performance across departments.

  • IT Operations Manager (Hands-On) at E-academy Inc.
    Feb 2005 - Jan 2010 · 5 yrs

    As a hands-on IT Operations Manager, I was responsible for the reliability, performance, and security of enterprise IT infrastructure and production environments. I led a team of 10 IT professionals, ensuring 24/7 operational support, proactive system monitoring, and effective incident response to meet business-critical demands.

    • Managed end-to-end IT operations, including servers, networking, backups, patching, monitoring, and user support across multiple locations.
    • Supervised a team of 10 system administrators and support engineers, handling task delegation, skills development, and performance evaluations.
    • Oversaw a rotating on-call schedule, providing 24/7 support coverage for production environments and acting as the primary escalation point for critical incidents.
    • Implemented and maintained monitoring and alerting systems using tools such as SCOM, PRTG, and Splunk, ensuring proactive detection and resolution of system issues.
    • Led incident and problem management processes, including root cause analysis, documentation, and follow-up corrective actions.
    • Administered core infrastructure including Windows Server, Active Directory, Group Policy, DNS, DHCP, and Exchange.
    • Executed virtualization and storage optimization projects using Hyper-V and SAN/NAS platforms.
    • Defined and enforced disaster recovery (DR) and backup strategies using DPM, Acronis, and offsite storage solutions.
    • Handled vendor management, procurement, and lifecycle support for hardware and software assets.