
SRE Specialist 3
- Burnaby, BC
- $112,000-$137,000 per year
- Permanent
- Full-time
- Cloud Infrastructure Management:
- Administer and maintain private cloud platforms including OpenStack, Proxmox, and VMware vSphere.
- Design and execute migration plans from public cloud platforms (e.g., GCP, AWS) to private cloud environments.
- Build and manage Infrastructure-as-Code (IaC) frameworks for automating OS, application, and network provisioning.
- Automation:
- Automate recurring operational tasks using tools like Ansible and Terraform to increase reliability and efficiency.
- Create CI/CD pipelines with GitLab.
- Linux System Administration:
- Manage and support Linux servers across multiple distributions such as Ubuntu, Red Hat, and Oracle Enterprise Linux (OEL).
- Monitoring & Troubleshooting:
- Monitor performance across servers, VMs, containers, applications, and networks.
- Troubleshoot and resolve issues related to infrastructure, including network components, security appliances, servers, and storage.
- On-call Support:
- Participate in on-call rotations to provide 24/7 support for critical systems and ensure high service availability.
- 5+ years of experience managing production environments.
- Hands-on experience with OpenStack and Ceph administration (required).
- Expertise in server virtualization technologies such as KVM and VMware.
- Strong Linux server administration skills (RHEL, CentOS, Ubuntu).
- Solid understanding of network administration and standard protocols.
- Experience with monitoring tools like Zabbix or Nagios.
- Proficiency in at least one scripting language (e.g., Python, Bash).
- Skilled in Ansible and Terraform for configuration and provisioning automation.
- Experience troubleshooting complex systems in Linux/OpenStack/Kubernetes environments.
- Strong analytical and problem-solving skills, with the ability to analyze logs and identify root causes.
- Excellent organizational, multitasking, and communication skills.
- RHCE (Red Hat Certified Engineer) certification.
- OpenStack certification.
- Experience with container platforms such as Docker and Kubernetes.
- Familiarity with software-defined storage (Ceph) and software-defined networking (OVN/Open vSwitch).
- Prior experience in maintaining a 24/7 global service environment.