Senior Cloud Engineer, Computer Vision Infrastructure
CHEP
- Canada
- Permanent
- Full-time
- Design, orchestrate, and maintain scalable cloud infrastructure, working with engineers to automate processes and improve efficiency.
- Collaborate with Innovation Squads, Product Success, and other teams to maintain fast, reliable, and resilient CI/CD pipelines, empowering engineers to self-service their infrastructure needs.
- Support the development, testing, and maintenance of disaster recovery scenarios to ensure system availability and business continuity.
- Develop and deploy automated tools that enhance the developer experience, simplifying infrastructure management and deployment processes.
- Monitor performance, capacity, and availability of systems and infrastructure, working cross-functionally to troubleshoot and resolve platform-related issues.
- Create and maintain technical documentation, ensuring it is fit for use in design reviews, incident response, and support processes.
- Ensure best practices in cloud security, governance, and compliance are implemented across all cloud platform services.
- Stay up-to-date on emerging cloud technologies and trends, applying knowledge to continuously improve the scalability and performance of cloud platforms.
- Responsible for experimenting with and implementing machine learning frameworks for data science/machine learning development and operations
- This person will be dedicated to the Computer Vision Data Science team, supporting serialization-related machine learning infrastructure
- Responsible for learning and operating new data science frameworks and technologies and exploring their viability for current and planned projects
- Responsible for learning and operating data storage frameworks and technologies and exploring their viability for current and planned projects
- Responsible for rigorous testing of framework robustness and scalability
- Contribute to data science and engineering team discussions, providing insight as needed on other team members’ approaches and methods, as well as on tools and data repositories
- Liaise with Cloud Team (Global IT) to understand corporate-wide cloud standards and policies and ensure compliance
- Support Serialization and Asset Digitization programs
- Responsible for the Continuous Integration and Deployment pipelines to support data science learning and production software delivery
- Responsible for contributing to capability building of the Cloud Engineering team, including researching and staying up to date on best practices, e.g. GitOps and IaC (infrastructure as code)
- Successful rollout, development, and continuous evolution of cloud-based data science and machine learning platforms, for both research & development and production operation
- Effective support of data science projects
- Reliability of systems
- Adoption of new systems and data science approaches
- Data science and machine learning frameworks: selection and implementation
- Tooling selection and implementation
- Working autonomously in a highly matrixed organization
- BS degree in Data Science, Computer Science, Engineering, Math, Statistics, Physics, or similar formal training or equivalent
- Proven experience operating and maintaining data science environments
- Proven experience operating highly available data storage systems, including database tuning
- Proven experience with FinOps and the ability to optimize spend for CE impact
- Experience with IoT and edge devices interacting with the cloud
- 5 years of relevant experience in Cloud Engineering or adjacent fields
- Installed, operated, and managed several data science and machine learning frameworks, or developed your own data science methodologies
- Experience with Continuous Integration and Continuous Deployment
- Experience operating, optimizing, querying, and administering databases and data stores (such as Postgres with Patroni, TimescaleDB, DuckDB, Iceberg, etc.)
- Comfortable using and working in a polyglot programming environment (Python, Go, Julia, etc.)
- Experience with Amazon Web Services (S3, EKS, ECR, EMR, etc.)
- Experience with containers and orchestration (e.g. Docker, Kubernetes)
- Experience with big data processing technologies (e.g. Spark, Hadoop, Flink, Kubeflow)
- Experience with interactive notebooks (e.g. JupyterHub, Databricks)
- Experience with GitOps-style automation
- Experience with *nix (e.g. Linux, BSD) tooling and scripting
- Participated in projects based on data science methodologies, physical experiments, or statistical analysis, especially in a data engineering and DevOps capacity
- Knowledge of major data science and DevOps frameworks and methods
- Very strong analytical skills and systems thinking
- Strong programming skills, in addition to operational skills, are a plus (ideally in one or more of the following languages: Python, Go, Julia, or C/C++)
- Attention to both the big picture and the details