Staff Cloud DevOps/Site Reliability Engineer (SRE) - Canada
Why Join Inworld
Inworld is the best-funded startup in AI and games, with a $500 million valuation and backing from top-tier investors including Intel Capital, Microsoft’s M12 fund, Lightspeed Venture Partners, Section 32, BITKRAFT Ventures, Kleiner Perkins, Founders Fund, and First Spark Ventures.
Inworld is the leading AI engine for games and interactive media. Inworld’s suite of AI components enables developers to build interactive, responsive, and personalized AI gaming experiences, orchestrate models to create intelligent game behaviors, and unlock enhanced productivity with AI-generated content. Inworld powers experiences built by Ubisoft, NVIDIA, Niantic, NetEase Games and LG, among others, and has partnerships with key industry players such as Microsoft Xbox, Epic Games, and Unity.
Inworld was recognized by CB Insights as one of the 100 most promising AI companies in the world in 2024 and was also named among LinkedIn's Top Startups of 2024 in the USA.
Our Technical Operations team manages the infrastructure, DevOps, and Site Reliability of our platform. We are looking for a Staff Cloud DevOps/Site Reliability Engineer to join our team.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field
- 7+ years of experience as a DevOps, Infrastructure, Operations, or Site Reliability Engineer (or as a software engineer with relevant experience)
- At least 2 years of experience with each of the following:
  - Terraform
  - Helm
  - Kubernetes
  - AWS, Azure, or GCP
  - CI/CD using modern tools (GitOps)
- Optional (not required but considered a plus):
  - MLOps (building, orchestrating, and maintaining Machine Learning Pipelines)
  - Prometheus / Grafana
  - Multi-cloud deployments (2 or more)
  - ArgoCD
  - Network management and VPNs
Responsibilities
- Infrastructure: Maintain and contribute to Infrastructure-as-Code (Terraform)
- DevOps and CI/CD Pipelines: Orchestrate pipelines using GitHub Actions, Helm, and ArgoCD
- Microservices scalability: Kubernetes Administration
- Cloud Administration
- Site Reliability: Measure and monitor availability, latency, and overall service health; drive incident management and post-mortem analysis
Work location: British Columbia, Canada.
The base salary range for this full-time position is CAD $170,000 - $220,000. In addition to base pay, total compensation includes bonus, equity and benefits. Within the range, individual pay is determined by work location and additional factors, including competencies and experience.