Lead SRE (Azure)

Permanent / Full Time
07.07.2025
1361134
Lead SRE role in Azure-focused SaaS environment. Drive reliability, automation, and DevOps best practices in a high-impact, cloud-native platform.
Why You’d Like It
  • Take technical ownership in a lead-level Site Reliability role with real impact
  • Work on large-scale Azure infrastructure that supports critical SaaS platforms
  • Join a high-trust, autonomous team that values innovation, automation, and engineering excellence
  • Contribute to an evolving platform treated as a product on a continuous journey of improvement
  • Enjoy a flexible, agile working environment with a focus on collaboration and learning
The Story
As this organisation grows, so does the scale and complexity of its infrastructure. The engineering team is committed to delivering reliable, high-performing cloud platforms that support a 24/7 global SaaS product. They’re now seeking a Lead Site Reliability Engineer to join their Azure Platform Team: someone who can balance hands-on technical leadership with big-picture architectural thinking. This role is key to enabling engineering teams to innovate safely, scale securely, and maintain peak system performance.

Company Profile
Our client is a forward-thinking technology business with a strong focus on smart, sustainable digital infrastructure. Their product suite is built to operate at scale and in real time, making reliability, performance, and developer enablement top priorities. With a collaborative team culture and a clear roadmap for platform maturity, they’re building out their cloud capabilities while staying grounded in DevOps values and modern engineering practices.

Your Role
  • Lead efforts to improve the reliability, performance, and automation of cloud environments in Azure
  • Work closely with other SREs to evolve infrastructure, unlock capabilities, and reduce complexity for engineering teams
  • Contribute to architectural planning and new solution design, supporting both legacy and greenfield systems
  • Manage and troubleshoot Kubernetes-based container orchestration platforms such as AKS
  • Refine deployment tooling and CI/CD pipelines using Terraform and Azure DevOps
  • Perform root cause analysis of incidents and drive long-term fixes across distributed systems
  • Build dashboards, metrics, and visualisation tools to support observability, capacity planning, and pre-emptive issue resolution
  • Automate routine tasks and champion DevOps practices to increase team autonomy
  • Conduct performance and reliability testing to remove bottlenecks and identify single points of failure
  • Support releases and deployments across production and non-production environments
Your Fit
  • Proven experience as a technical leader in SRE or DevOps roles
  • Deep expertise in Azure production environments 
  • Experience operating container orchestration frameworks like Kubernetes, ECS, or AKS
  • Proficient in Infrastructure as Code and CI/CD tooling, especially Terraform and Azure DevOps
  • Background in managing and optimising distributed, customer-facing systems in real-time environments
  • Scripting skills in Bash, Ruby, Python, or similar languages
  • Familiarity with observability tools such as Datadog, Sumo Logic, or Grafana
  • Strong understanding of both Windows-based and Linux systems
  • Solid grasp of relational database operations and performance
  • Passion for web operations, automation, and continuous improvement

Applicants must be legally entitled to work in New Zealand. If you are not a NZ Citizen, you must have the right of permanent residence or a valid work visa.

Does this sound like it could be the next role for you? Get in touch with Henry for a confidential discussion: email henry@digitalgarage.co.nz or call 0274991187.

Like the sound of that?

If this sounds like you, please reach out to me at
henry@digitalgarage.co.nz