Lead, Service Assurance

7-9 Years

Job Description

About You

You're a go-getter who can juggle many tasks (or wear multiple hats) and thrive in a fast-paced, agile environment
You enjoy doing purpose-led and meaningful work
You have a strong thirst for knowledge and are driven to find solutions that don't exist yet
You are comfortable with ambiguity and extremely resourceful (in your past life, you could've been a detective)
You always find a way to get things done without sacrificing the quality of your work, integrity, and values
No task is off limits for you
You are humble, prioritize the team's success over your own, and are eager to help those around you
You don't shy away from challenges and can bounce back from setbacks

What you'll do and what success looks like in this role:

Manage and oversee the handling of critical incidents in a timely and structured manner, ensuring effective root cause analysis (RCA) and implementation of preventive actions.
Drive operational automation initiatives, including auto-remediation, automated health checks, and improvement of monitoring alerts to enhance efficiency and reduce recurring issues.
Monitor service performance against internal and external Service Level Agreements (SLAs), including collaboration with vendors and regulators.
Ensure all system changes are conducted based on proper risk analysis and adhere to change management procedures.
Maintain compliance with internal policies, security standards, and applicable regulatory requirements.
Promote a data-driven culture focused on preventive measures and continuous improvement across IT operations.

What Is Required and What We're Looking For

Bachelor's degree in Information Technology, Information Systems, Computer Science, or a related field.
Minimum 7 years of experience in IT Operations, Service Assurance, Incident/Problem Management, SRE, DevOps, or other technology operations functions within banking, fintech, or digital platforms.
Proven experience in implementing modern operational approaches, including observability, automation, reliability metrics, post-incident reviews, and continuous improvement.
Strong understanding of the ITIL framework and IT operational governance.
Hands-on experience with monitoring and observability tools.
Solid understanding of cloud-native architecture, APIs/microservices, and large-scale digital services.
Ability to design or lead the implementation of operational automation.
Familiarity with OJK/BI regulations and ISO 27001 standards related to incident management, change management, and DR/BCP.
Strong leadership in handling major incidents and coordinating cross-functional teams.
Excellent communication skills, with the ability to bridge technical and non-technical stakeholders.
Strong analytical thinking and ability to make quick decisions under pressure.
Results-oriented, adaptable, and equipped with structured problem-solving capabilities.