
Senior Platform Engineer (DevOps)
- Kuala Lumpur
- Permanent
- Full-time
- Participate in platform software engineering, writing code to continue reducing human intervention in operational tasks and automating processes.
- Lead in-depth technical and data analysis to gauge service trends and drive improvements.
- Contribute to prioritizing reliability features and the design, development, and delivery of effective tooling, alerts, and automated responses to identify and address reliability risks.
- Contribute to proactive technical communication of reliability, stability, and efficiency results (based on Service Level Objectives), service health (via dashboards), and key reliability risks and issues to senior business and technology stakeholders, so that they can prioritize activity (based on trend analysis) and direct investment and action.
- Automate the installation and maintenance of the test/development server, release build, and deployment of existing tools and dependent solutions.
- Design and take ownership of innovations that improve software engineering velocity, infrastructure resiliency, and security.
- Evaluate new application packages and tools and perform research on best practices.
- Ability to debug and find the root cause of infrastructure-related errors in ongoing operations.
- Technical skills to review, verify, and validate the software code developed in the DevOps project.
- 6+ years of experience in software development and DevSecOps/SRE functions, with at least two years in a senior technical capacity.
- You are either a software engineer with a real interest in systems, networking, monitoring, and automation, or an experienced sysadmin or systems engineer with professional Linux skills, development experience, experience managing distributed systems at scale, and demonstrable interest and experience in using software engineering to solve operational problems.
- Comfortable writing software to automate API-driven tasks at scale. Tooling engineers primarily use Java, C/C++, NodeJS, Python, and Go.
- Experience automating the build and deployment of software products and understanding the related challenges in distributed systems.
- Ability to quickly and clearly communicate incident status via email in business-friendly language.
- Experience with and advanced understanding of observability tools (e.g., ELK, Grafana/Prometheus, Zabbix, Nagios).
- Experience designing and implementing CI/CD and release management solutions.
- Well-rounded, broad knowledge of OS platforms (Linux/UNIX), networking, web systems, and DevSecOps.
- Experience working with large-scale distributed systems with an understanding of microservices architecture concepts.
- Strong organizational skills and the ability to effectively manage multiple tasks.
- Experience with containers and CD tools (e.g., Pulumi, Docker, Ansible, Puppet).
- Experience with integration and build tools (e.g., Jenkins, Groovy, Maven, Atlassian Suite, GitLab CI).
- A permanent role with opportunities for career growth and skill development.
- Work in a large organization recognized for its innovation in the Transport & Distribution industry.
- Be part of a collaborative and technology-driven team in Kuala Lumpur.