We're looking for a DevOps Engineer (GCP) who thrives at the intersection of automation, reliability, and cloud architecture. You'll play a critical role in shaping a secure, scalable, and highly available Google Cloud ecosystem, enabling engineering teams to ship faster and more safely.
This is a hands-on role with real ownership, from infrastructure design to production reliability, where your decisions directly affect system performance, cost efficiency, and developer velocity.
What You'll Be Responsible For
- Design, build, and maintain secure, scalable, and fault-tolerant GCP infrastructure using Terraform as the primary IaC tool.
- Ensure consistent Dev, Staging, and Production environments through robust Terraform state management and best practices.
- Operate and optimise Google Kubernetes Engine (GKE) clusters and core compute services including Compute Engine and Cloud Run.
- Architect and manage end-to-end CI/CD pipelines (Cloud Build, Cloud Deploy, Jenkins, GitLab CI) to automate testing, artefact creation, and deployments.
- Embed security scanning, compliance checks, and quality gates into pipelines to support a strong DevSecOps culture.
- Implement and maintain observability frameworks across GCP using Cloud Monitoring, Cloud Logging, Prometheus, and Grafana.
- Define, track, and report SLIs and SLOs to balance reliability with delivery speed.
- Participate in on-call rotations, lead in-depth root cause analysis (RCA), and drive preventative improvements after incidents.
- Design and manage IAM policies following the principle of least privilege.
- Implement network security controls including VPC design, firewall rules, Cloud Armor, and encryption standards.
- Manage application secrets and credentials using Google Secret Manager or HashiCorp Vault.
- Proactively monitor cloud spend and identify cost-optimisation opportunities without compromising performance.
- Perform capacity planning and performance tuning across GCP services.
- Mentor development teams on DevOps best practices and GCP architecture.
- Maintain high-quality technical documentation, including architecture diagrams, runbooks, and operational procedures.
What We're Looking For
- 2+ years of experience as a DevOps Engineer or Site Reliability Engineer, with demonstrable hands-on expertise in Google Cloud Platform.
- Advanced proficiency in Terraform for managing complex, production-scale GCP infrastructure.
- Strong experience running Kubernetes in production, preferably GKE.
- Solid scripting skills in Python, Bash, or Go for automation and tooling.
- Deep understanding of GCP networking: VPCs, load balancing, peering, DNS, CDN, VPN/Interconnect.
- Experience operating GCP managed databases such as Cloud SQL, BigQuery, or Spanner.
- GCP certifications such as Professional Cloud DevOps Engineer, Professional Cloud Architect, or Professional Cloud Security Engineer.
- Experience with advanced observability stacks (Prometheus, Grafana, ELK, Splunk).
- Hands-on knowledge of configuration management (Ansible, Chef, Puppet).
- Strong understanding of DevSecOps practices (SAST, DAST, container vulnerability management).
- Exceptional problem-solving and analytical thinking in high-pressure environments.
- Clear, confident communication skills, with the ability to translate complex technical topics for diverse stakeholders.
- A collaborative mindset with experience mentoring engineers and leading small technical initiatives.
Nice to Have
- Proven experience with cloud migration projects (on-prem or multi-cloud to GCP).
- Strong exposure to serverless architectures (Cloud Run, Cloud Functions, App Engine).
- Hands-on FinOps experience on GCP, using Billing Reports, Recommenders, or Cost Management APIs.
- Advanced database experience with PostgreSQL on Cloud SQL, Cloud Spanner, or BigQuery.
- Experience implementing advanced security controls using Security Command Center (SCC) or Cloud Armor (WAF/DDoS protection).