Production Support Engineer (Azure Cloud) – Reliability and PROD Engineering
AIT Global Inc.
- Montreal, QC
- Permanent
- Full-time
Location: Montreal, QC (Onsite)
Skill: Production support with Azure Cloud, SRE, DevOps CI/CD pipelines.
Job Description:
We are seeking a Production Support Engineer with expertise in Azure Cloud for the IT Reliability & Production Engineering (RPE) domain. The role focuses on ensuring the stability, performance, and reliability of production systems in a high-availability environment.
Key Responsibilities:
- Monitor, troubleshoot, and resolve production incidents and service disruptions.
- Ensure system reliability, availability, and performance using SRE principles.
- Automate manual processes and optimize cloud infrastructure on Azure.
- Analyze logs, metrics, and alerts to prevent incidents.
- Collaborate with development, DevOps, and infrastructure teams for issue resolution.
- Implement CI/CD pipelines, observability, and proactive monitoring strategies.
- Maintain Azure resources (VMs, AKS, Storage, Networking, etc.).
- Participate in on-call rotations for critical production support.
Skills and Qualifications:
- Strong experience in Azure Cloud (Azure Monitor, App Insights, Log Analytics).
- Scripting & Automation: PowerShell, Python, Terraform, or Ansible.
- Monitoring & Observability: Prometheus, Grafana, Splunk, or Datadog.
- Incident Management: ITIL, SRE principles, RCA methodologies.
- CI/CD & DevOps: Azure DevOps, GitHub Actions, Jenkins.
- Containers & Orchestration: Kubernetes (AKS), Docker.
- Networking & Security: Load balancers, firewalls, IAM, VPNs.
- Experience with large-scale distributed systems.
- Familiarity with SQL/NoSQL databases.
- Knowledge of cloud-native architecture.
We are sorry but this recruiter does not accept applications from abroad.