Singapore, Singapore
Strong interest in Entrepreneurship, Computing and Business. Looking to embark on exciting projects and always willing to learn more. Experienced in Cloud Computing and Data Analytics in Python.
• Ensured high availability and reliability of mission-critical platforms supporting multi-environment, air-gapped, and multi-network deployments, aligning with global service stability requirements.
• Designed and operated API Gateway and access platforms using Kong Enterprise, Envoy, Nginx, and Open Policy Agent, enabling secure and resilient request routing at scale.
• Built and evolved backend infrastructure components and APIs supporting authentication, authorization, and service-to-service communication using Keycloak (OIDC, OAuth2, SAML, Kerberos) and OpenFGA.
• Owned Kubernetes-based production environments, managing containerized microservices with Helm and Operators to improve resilience, rollout safety, and deployment velocity.
• Led infrastructure provisioning and configuration automation using Terraform, Helm, and GitOps workflows, minimising manual intervention and configuration drift across environments.
• Established end-to-end observability and reliability practices using OpenTelemetry, Prometheus, and ELK; defined SLIs/SLOs and leveraged metrics to identify reliability risks and bottlenecks.
• Performed root cause analysis (RCA) for production issues and drove systemic improvements to reduce repeat incidents.
• Hardened and operated Linux systems in high-security environments, applying OS-level tuning and security best practices to improve system stability.
• Developed automation and operational tooling (Shell / Python) to reduce repetitive tasks and improve operational efficiency.
• Handled production incident response and service recovery for critical systems.
• Analysed reliability risks and capacity bottlenecks using metrics and operational data.
- Maintained and enhanced a proprietary fork of the Apache Flink codebase.
- Extended Flink’s Kubernetes integration by contributing patches, such as adding customized init containers and extra volumes for upstream use cases.
- Developed and optimized containerized deployment scripts and Helm charts to streamline Flink cluster provisioning on Kubernetes.
- Collaborated closely with data engineering teams to translate job requirements into Flink-aware Kubernetes configurations and utility scripts.
- Used Elasticsearch to build Application Performance Monitoring (APM) tools for applications hosted on the Student Development Platform running on Amazon Web Services (http://icode4nus.sg/).
- Built metrics to measure the performance of hosted applications (web, mobile, and customized applications such as Telegram bots).
- Created automation code (Terraform and Ansible) to deploy a website on Amazon Web Services (AWS): https://icode4nus.sg/
- Created key performance indicators (KPIs) for platform performance and reached out to student developers to onboard them onto the platform.
- Gained exposure to agile development methodology.
- Gained exposure to operating systems (Ubuntu and CentOS).