Intermediate Application Support
Global Relay
- Vancouver, BC
- $85,000-100,000 per year
- Permanent
- Full-time
- Service Reliability: Proactively identify risks to service and remediate them. Reduce deployment risk through improved use of resilience and by ensuring appropriate testing of releases pre- and post-deployment. Provide support and troubleshooting when service incidents occur. Improve time to recover from service-impacting incidents. Identify trends and root causes to reduce the volume of incidents.
- Automation: Identify and deliver on opportunities to use automation to increase efficiency, reduce toil and drive service availability. Use automation and orchestration techniques to provide repeatable solutions and reduce risk of mis-operations.
- Observability: Monitor and ensure smooth operation of all production services. Identify gaps in coverage and improve observability of production services. Ensure appropriate events are generated for service failure or degradation scenarios. Respond to events and alerts in a timely manner, managing them through to resolution.
- Knowledge management: Continuously improve the knowledge of the Application Support team so that its members become subject matter experts on the Product and the technology that runs it. Collaborate with other teams to understand how underpinning services support the Products. Identify opportunities to share knowledge and decrease the time it takes to resolve customer-related incidents.
- Platform and Database tech: Linux, Cassandra, Kafka, ArangoDB
- Containerization/Virtualization: Kubernetes/OpenShift, VMware
- Instrumentation & Monitoring: Splunk, Zabbix, Prometheus, Grafana
- Scripting: PowerShell, Python
- Service focused
- Experience running highly available, critical services, ideally SaaS
- A problem solver who takes initiative
- Effortlessly self-motivates while working on team-based projects or individual tasks
- A well-organized, thorough and detail-oriented person
- Able to keep the "bigger picture" in mind while prioritizing conflicting demands and tasks
- Ability to take ownership in high-pressure situations and provide direction during service incidents, with the tenacity to ensure issues do not get dropped
- Ability to negotiate with, liaise with and influence other teams as required to ensure an appropriate outcome
- Confident enough to voice your opinion, ask questions and suggest a better solution, without being abrasive
- Collaborative and willing to share knowledge, able to engage with and meet the needs of demanding stakeholders
- 3-5+ years' experience as an SRE, Application Support Engineer or in a similar role
- Bachelor's degree in computer science or related field
- Scripting ability in PowerShell, Python, etc.
- Understanding of software systems concepts such as networking, firewalls, protocols, databases and more
- Java debugging exposure - ability to capture and analyze thread dumps
- Experience with monitoring solutions
- Splunk experience - creating dashboards, reports, events & analysis
- Awareness of software delivery practices (CI/CD)
- Experience troubleshooting connectivity issues: TCP/IP, DNS, Telnet, traceroute, tcpdump
- Awareness of load balancing technologies such as HAProxy, Nginx, F5
- Experience with collaboration technologies: email, archiving, instant messaging
- Exposure to supporting voice / SMS technologies (nice to have)