Kuala Lumpur, W.P. Kuala Lumpur | Hybrid | Direct hire
Kuala Lumpur, Malaysia
About Horizontal: Established in the US in 2003, Horizontal solves complex challenges across two distinct businesses: Horizontal Digital and Horizontal Talent. We are consistently recognized as a top workplace and one of the fastest-growing private companies. Horizontal Talent specializes in staffing for the IT, Digital & Creative, and Business & Strategy markets. We have offices in the US, the UAE, India, and Malaysia.
Job Summary:
We are seeking a Lead Site Reliability Engineer (SRE) to drive the design, build, and evolution of a Managed Patching Service (MPS) for enterprise-scale on-premises environments in a regulated financial services context.
This is a hands-on technical leadership role focused on building a scalable, automated, and compliant patching capability. The service will evolve over time from automated patch execution into a more mature platform aligned with modern engineering practices, including infrastructure as code and self-service capabilities. You will combine deep automation expertise with strong SRE principles to build a reliable, observable, and scalable service, while establishing the operational model required to run it effectively.
Key Responsibilities
1. Own the Managed Patching Service End-to-End
• Lead the design, build, and rollout of an automated patching service at enterprise scale
• Own service lifecycle: reliability, scalability, performance, and compliance
• Translate regulatory and enterprise requirements into engineering solutions with audit-ready outcomes
• Drive the evolution of the service towards a platform-based model over time
2. Build Automation Architecture and Orchestrated Workflows
• Design and implement patching workflows using Ansible Automation Platform
• Integrate with CI/CD orchestration tools such as CloudBees or equivalent
• Define robust automation patterns: idempotency, versioning, rollback, safe concurrency, and failure isolation
• Extend automation using Python where needed for orchestration logic and integrations
• Implement Git-based workflows for version control, testing, and release governance
3. Define Service Model and Platform Integration
• Design subscription and onboarding model via ServiceNow
• Define scheduling, maintenance windows, and deployment strategies
• Build integration patterns across ServiceNow, automation platforms, and inventory systems
• Contribute towards evolving the service into a self-service platform
4. Observability, Reporting and Compliance
• Design telemetry for patch outcomes, compliance posture, and drift detection
• Build reporting capabilities for operational and regulatory visibility
• Ensure all activities are traceable, auditable, and evidence-ready
5. Reliability Engineering and Continuous Improvement
• Define SLIs, SLOs, and operational standards for the service
• Lead incident response, root cause analysis, and corrective actions
• Drive continuous improvement to reduce toil and improve resiliency
6. Cross-Functional Technical Leadership
• Collaborate with Infrastructure, Security, Architecture, and Platform teams
• Align patching strategies with enterprise standards and dependencies
• Drive adoption of the service across teams
• Influence without authority across a distributed organization
Required Skills & Qualifications
Must-Have
• Strong expertise in Ansible Automation Platform
• Experience with CI/CD orchestration tools (CloudBees or equivalent)
• Deep Linux/RHEL knowledge including patching and system internals
• Experience managing large-scale infrastructure environments
• Strong understanding of SRE principles and operational practices
• Experience integrating with ServiceNow or similar platforms
• Strong stakeholder communication and cross-team collaboration
Nice-to-Have
• Experience with platform engineering and self-service models
• Familiarity with infrastructure as code practices
• Observability and monitoring integration experience
• Experience with reporting tools such as Power BI
• Experience in regulated or financial environments
The above description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.
Horizontal is committed to taking affirmative action to employ and advance in employment qualified individuals with disabilities and protected veterans. If you are an individual with a disability and require a reasonable accommodation to complete any part of the application process or participate in the interview process, please request accommodation assistance.
All applicants must be legally authorized to work in the country of employment.