Sr. Systems Administrator
OpenText
- Waterloo, ON
- $76,800-115,200 per year
- Permanent
- Full-time
- Deploy, configure and maintain monitoring and alerting systems including Operations Bridge Manager (OBM), Network Node Manager (NNMi), Zabbix, Prometheus, and Graylog.
- Follow monitoring strategies, standards, and governance for infrastructure, network, and applications.
- Develop enterprise-wide SLIs/SLOs and ensure effective alerting aligned to business impact.
- Serve as the primary engineering escalation point for NOC, operations, and product teams. Lead complex incident resolution, root-cause analysis, and post-incident improvement plans.
- Be part of the 24/7 Tier-3 support team, participate in support and incident calls, and lead post-incident reviews.
- Develop automation for monitoring provisioning using Python, Bash, Terraform, and Ansible. Maintain metrics/log pipelines, exporters, collectors, and discovery rules.
- Support and maintain monitoring infrastructure deployed in DPZ-specific zones in the EMEA region.
- Implement AI-driven improvements such as anomaly detection and AI-assisted troubleshooting. Use AI copilots for automation, documentation, and script generation.
- Follow ITIL processes for Incident, Problem, Change, and Configuration Management.
- Hands-on experience with observability and monitoring technologies (correlation, event pipelines, topology, service modeling), including observability tools such as New Relic and Dynatrace.
- NNMi expertise: network topology, polling strategies, SNMP, traps, RCA.
- Deep understanding of Zabbix architectures, templates, event routing, and automation.
- Expert-level Graylog pipeline design, parsing logic, and log ingestion strategies.
- Proven SRE experience with SLIs/SLOs, reliability architecture, and incident management.
- Strong Linux, Kubernetes, networking, and scaling concepts for production systems.
- Expertise in automation using Python, Bash, Terraform, and Ansible, integrated with GitLab pipelines.
- Strong understanding of ITIL v3/v4 and ability to enforce governance.
- Strong understanding of GDPR principles, data protection rules, and EMEA data residency requirements.
- Ability to leverage LLM-based tools for troubleshooting, documentation, and creative automation.