Description
Why PayNet / Why Now
- Build the infrastructure behind Malaysia’s national payments network, powering rails Malaysians use daily, including DuitNow, FPX, JomPAY, and MyDebit.
- Operate at true national scale (PayNet reports 8.44B transactions processed in 2025), where architecture decisions materially affect uptime, speed, and trust.
- Join a public-good operator that reinvests surplus into making the country’s payment infrastructure more resilient, competitive, and accessible.
- Step in at an inflection point: PayNet has been shifting from legacy/private connectivity toward internet-based API connectivity and a cloud-based computing model, raising the bar for security and platform engineering.
TL;DR
- Own day‑to‑day reliability of UNIX, virtualization, and backup platforms supporting national payments
- Keep mission‑critical systems secure, performant, and recoverable
- Drive automation and modernisation across enterprise infrastructure
- Act as L3 technical escalation for complex infrastructure incidents
Why This Role Matters
- Payment infrastructure must remain available, recoverable, and secure at all times
- Infrastructure failures directly affect transaction processing and national trust
- Automation and modernisation reduce operational risk and human error
- Senior engineering judgment is critical during incidents and platform changes
What You Will Actually Do
- Own administration and optimisation of enterprise UNIX/Linux systems, including patching and performance tuning
- Operate and improve hypervisor platforms such as VMware or KVM, covering lifecycle, capacity, and upgrades
- Ensure backup and disaster recovery readiness through hands‑on restore validation and DR exercises
- Build automation using Bash, Python, and Ansible to improve provisioning, compliance, and efficiency
- Act as L3 escalation for incidents, driving root‑cause analysis and permanent remediation
- Ensure systems meet security, compliance, and documentation standards, and mentor team members
Examples of This Role in Practice
- Restoring a critical system during an incident and validating data integrity under pressure
- Improving VM performance by fixing capacity or configuration bottlenecks
- Automating patching workflows to reduce downtime and manual risk
- Leading root‑cause analysis for a recurring platform issue and preventing recurrence
- Supporting a DR exercise and closing gaps before production risk materialises
Required
What Will Help You Succeed
- Strong hands‑on UNIX/Linux administration experience (RHEL, AIX) in production environments
- Practical expertise operating hypervisor platforms such as VMware or KVM
- Hands‑on experience with enterprise backup and recovery solutions (NetBackup, Veeam, or Commvault)
- Ability to automate infrastructure tasks using Bash, Python, or Ansible
- Comfort acting as L3 escalation for complex system issues
Helpful
- Experience implementing or supporting HA and DR architectures
- Exposure to enterprise storage and networking (SAN, Fibre Channel, iSCSI, TCP/IP, VLANs)
- Experience with monitoring and observability tools such as Nagios, Grafana, Prometheus, or ELK
- Familiarity with vulnerability management, system hardening, and audit environments
- Relevant certifications such as RHCE, VMware VCP, or backup platform certifications