Production Support Engineer (Azure Cloud) – Reliability and PROD Engineering

AIT Global inc.

  • Montreal, QC
  • Permanent
  • Full-time
Job Title: Production Support Engineer (Azure Cloud) – Reliability and PROD Engineering
Location: Montreal, QC (Onsite)
Skills: Production support with Azure Cloud, SRE, DevOps CI/CD pipelines.
Job Description:
We are seeking a Production Support Engineer with expertise in Azure Cloud for the IT Reliability & Production Engineering (RPE) domain. The role focuses on ensuring the stability, performance, and reliability of production systems in a high-availability environment.
Key Responsibilities:
  • Monitor, troubleshoot, and resolve production incidents and service disruptions.
  • Ensure system reliability, availability, and performance using SRE principles.
  • Automate manual processes and optimize cloud infrastructure on Azure.
  • Analyze logs, metrics, and alerts to prevent incidents.
  • Collaborate with development, DevOps, and infrastructure teams for issue resolution.
  • Implement CI/CD pipelines, observability, and proactive monitoring strategies.
  • Maintain Azure resources (VMs, AKS, Storage, Networking, etc.).
  • Participate in on-call rotations for critical production support.
Required Skills:
  • Strong experience in Azure Cloud (Azure Monitor, App Insights, Log Analytics).
  • Scripting & Automation: PowerShell, Python, Terraform, or Ansible.
  • Monitoring & Observability: Prometheus, Grafana, Splunk, or Datadog.
  • Incident Management: ITIL, SRE principles, RCA methodologies.
  • CI/CD & DevOps: Azure DevOps, GitHub Actions, Jenkins.
  • Containers & Orchestration: Kubernetes (AKS), Docker.
  • Networking & Security: Load balancers, firewalls, IAM, VPNs.
Preferred Qualifications:
  • Experience with large-scale distributed systems.
  • Familiarity with SQL/NoSQL databases.
  • Knowledge of cloud-native architecture.
