Rahul Jain

Director / Sr. Engineering Manager | Distributed Systems | Cloud Security | Observability | FinOps | AI-Ready Platforms

Santa Clara, California, United States

About

I’m an engineering leader focused on building and scaling distributed, cloud-native platforms that operate at high scale and deliver real business impact. Over the years, I’ve led global teams across backend, QA, and release engineering to design and run systems processing hundreds of millions of events per hour, supporting enterprise customers with demanding reliability, security, and performance requirements.

My core strengths lie at the intersection of platform engineering, cloud cost optimization, observability, and security:
- Scaled engineering organizations and delivered 200+ AWS production deployments
- Led FinOps initiatives driving meaningful cloud cost reduction through governance and visibility
- Delivered customer-facing observability capabilities (logs, events, telemetry) to improve transparency and reduce MTTR
- Drove platform modernization and toolchain migrations (e.g., CI/CD, observability stacks)
- Led identity platform transitions improving security posture and reducing dependency on third-party providers

I enjoy working at the boundary of engineering execution and business outcomes, partnering with Product, SRE, DevOps, and Customer Success teams to translate customer needs into scalable platform capabilities.

More recently, I’ve been exploring how AI is reshaping software development and platform engineering, from AI-assisted developer productivity to intelligent observability and agent-driven workflows, and how to integrate these capabilities into modern engineering stacks.

I’m particularly interested in roles where I can:
- Lead platform and infrastructure engineering at scale
- Drive AI-enabled engineering and operational workflows
- Build systems that combine performance, cost efficiency, and strong security foundations

Experience

  • Director/Senior Software Engineering Manager at Zscaler
    Jul 2022 - Present · 4 yrs

    Led engineering for a distributed cloud security platform, driving scale, cost optimization, platform modernization, and customer-facing capabilities across global teams.

    Key Responsibilities & Accomplishments:
    - Scaled and led a global engineering organization across backend, QA, and release engineering, growing team size from ~10 → ~30 and delivering high-impact platform capabilities
    - Architected and operated high-scale distributed systems processing ~200M events/hour, supporting 1000+ enterprise customers and 40K+ on-prem agents
    - Drove AWS cost optimization strategy, implementing standardized tagging, governance, and cost attribution frameworks, resulting in ~40% reduction in cloud spend
    - Led strategic platform migrations including Bitbucket → GitLab and Sumo Logic → Kloudfuse, improving developer productivity and observability
    - Delivered customer-facing observability features, enabling real-time access to logs and event telemetry, reducing MTTR and improving system transparency for customers
    - Led identity platform transformation by migrating users from commercial IdPs to Zscaler IDP with a seamless user experience, reducing third-party costs and strengthening security posture
    - Partnered cross-functionally with Product, SRE, DevOps, and Customer Success to define the roadmap and deliver 150+ AWS production deployments
    - Improved engineering efficiency and quality by adopting AI-driven tools and building internal AI agents for debugging and troubleshooting
    - Drove a customer-centric engineering culture, including a developer-led customer triage model, contributing to ~50% improvement in CSAT

    Tech stack used:
    - AI productivity tools: Windsurf; AI agent building tools: SageMaker, LangChain, MCP, LangGraph
    - Java, Spring, RESTful APIs, Kafka Streams, OpenSearch, Redis, Druid
    - AWS, ECS, Docker
    - Jenkins, Ansible, Terraform, Kubeflow for data and release pipelines
    - Datadog, PagerDuty

  • Senior Software Development Manager at FIS
    2019 - 2022 · 3 yrs

    Managed 2 full-stack teams for Merchant Payment Processing at FIS (formerly Worldpay/Vantiv/Litle).

    Technologies used:
    - Tomcat, Java, Spring MVC, Cassandra, RESTful web application, JUnit, Kafka
    - AWS as the deployment platform, using Jenkins, Ansible, and Terraform for CI/CD
    - Dynatrace, Sysdig, and Splunk for the observability stack
    - Scrum methodology
    - Migration of a monolithic J2EE application to a microservices architecture using Docker, Jenkins, and Terraform

    Responsibilities:
    - Deeply involved in technical, architectural, and security design and reviews for the project and services
    - Worked with cross-functional teams in Operations, Support, Product Management, and Architecture to define the product roadmap, provide vision, and deliver it
    - Hired and built teams from the ground up
    - Represented the team and product to higher management for reporting, status, and planning
    - Owned the product and deliverables from concept to delivery, including production support

  • Director Of Software Development at Bottomline Technologies
    Nov 2018 - Mar 2019 · 5 mos

    Managed 2 full-stack Scrum teams at GlobalLogic Ukraine and 1 Scrum team in Portsmouth, NH. As part of the Digital Banking BU, the web application provides various payment capabilities (ACH, wire, etc.) to business customers of banks as both on-prem and hosted solutions. Tech stack used: Java, Tomcat, Oracle, and JavaScript.

    Responsibilities:
    - Managed and led the GlobalLogic team to deliver customer go-live releases by defining release content, ensuring quality controls, and delivering on time
    - Created the vision and built the team to execute Blue/Green deployment for zero-downtime upgrades
    - Participated in representing teams to higher management for reporting, status, etc.

    Note: Due to restructuring (resulting from missed revenue numbers), the position was eliminated. Received a strong LinkedIn recommendation from my direct manager to address this.

  • Senior Software Development Manager at Sophos
    Mar 2017 - Nov 2018 · 1 yr 9 mos

    Managed multiple full-stack teams (US and an offshore team based in Ukraine) for the Enterprise Dashboard product for Sophos Cloud, a SaaS product for management of enterprise customers.

    Technologies used:
    - Tomcat, Java, Spring MVC, MongoDB, RESTful web application, Angular, JUnit, Jasmine
    - AWS as the deployment platform, using Bamboo plans and Jenkins for CI/CD
    - Integration with Salesforce for backend order, licensing, and account processing
    - Logz.io for monitoring logs, Gradle for builds
    - Scrum methodology
    - Migration of a J2EE application to a microservices architecture using Docker, Jenkins, and Terraform

    Responsibilities:
    - Deeply involved in technical, architectural, and security design and reviews for the project
    - Worked with cross-functional teams in Support, Product Management, and Architecture to define the product roadmap, provide vision, and deliver it
    - Hired and built both local and remote teams from the ground up
    - Represented the team and product to higher management for reporting, status, and planning
    - Owned the product and deliverables from concept to delivery, including production support
    - Part of the team managing releases to production every 3 weeks

  • Software Development Manager at HPE (earlier HP Autonomy)
    Nov 2013 - Mar 2017 · 3 yrs 5 mos

    1. Worked on building the Identity Management (IDM) microservice, part of the HPE Verity Platform. This service implements authentication and authorization for various consuming applications.

       Technology stack used and managed:
       - Docker containers for faster delivery
       - Mesos/Marathon for managing containers in the AWS environment
       - Swagger for API documentation
       - RESTful API implementation using Java, Tomcat, and Jersey
       - Liquibase for DB upgrades, with Postgres as the database

       Responsibilities:
       - Worked with RE and DevOps on release management, branch management, and deployment
       - Owned and managed the IDM product and release roadmap
       - Practiced the full Scrum methodology as scrum master, including sprint planning, daily stand-ups, planning poker, burndown tracking, and backlog maintenance
       - Ran security scans using Fortify and pen tests
       - Reviewed design and architecture documents for the service

    2. Managed and built the Backup Optimizer product, which optimizes storage by moving assets to cheaper storage and leaving a stub behind. Backing up the stub instead of the full asset makes backups faster and cheaper. Technology stack used: Java, Tomcat, Postgres, JavaScript, etc. Worked as scrum master and product owner, building the product from scratch.

    3. Managed and delivered releases on the 8.x and 9.x versions of the Data Protector product. Worked as scrum master to deliver the Adaptive Backup and Recovery functions of Data Protector. Tech stack used: C, Java, TCP/IP sockets, and Postgres.