Machine Learning Operations Specialist

Global Excel

  • Sherbrooke, QC
  • Permanent
  • Full-time
Job Description

The Machine Learning Operations Specialist plays a key technical role within the Data & Analytics team, responsible for building, deploying, and optimizing machine learning models that drive scalable, production-grade AI capabilities. Reporting to the AI Team Lead, the MLOps Specialist transforms data science prototypes into robust, high-performance solutions by implementing end-to-end pipelines, automating deployment, and operationalizing AI within modern cloud and enterprise platforms. This role is essential to delivering reliable, governed, and reusable AI capabilities across the organization.

The MLOps Specialist will also play a central role in advancing the organization's MLOps and LLMOps maturity, ensuring models are not only performant but also secure, monitored, and easily maintainable through automation and best practices.

The MLOps Specialist collaborates closely with data scientists, software engineers, and business stakeholders to ensure that machine learning solutions are aligned with business needs and seamlessly integrated into broader systems and workflows.

What does your typical day look like?

End-to-End ML Pipeline Implementation & Deployment
  • Design and implement robust, reproducible machine learning pipelines from experimentation through deployment (a minimal pipeline sketch follows this list).
  • Develop scalable workflows for data preparation, model training, evaluation, and prediction using production-grade tools and frameworks.
  • Translate prototypes into performant and maintainable code suitable for real-world applications.
  • Optimize training and inference performance for reliability and efficiency at scale.
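To make the pipeline bullet concrete, here is a minimal sketch of a reproducible train-and-evaluate step in Python, using scikit-learn and MLflow (both named later in this posting). The dataset path, the "label" column, and the model choice are illustrative assumptions, not details of the role.

```python
"""Minimal sketch of a reproducible train/evaluate pipeline.

Assumptions (not specified in the posting): a tabular CSV dataset with a
binary "label" column, and a reachable MLflow tracking backend.
"""
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def train(data_path: str = "claims.csv") -> str:
    """Train, evaluate, and log a model; return the MLflow run id."""
    df = pd.read_csv(data_path)                      # hypothetical dataset
    X, y = df.drop(columns=["label"]), df["label"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # One Pipeline object bundles preprocessing and the model, so the exact
    # same steps run at training time and at inference time.
    pipeline = Pipeline(
        steps=[
            ("scale", StandardScaler()),
            ("model", LogisticRegression(max_iter=1000)),
        ]
    )

    with mlflow.start_run() as run:                  # track metrics and artifacts
        pipeline.fit(X_train, y_train)
        auc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])
        mlflow.log_metric("test_auc", auc)
        mlflow.sklearn.log_model(pipeline, artifact_path="model")
        return run.info.run_id


if __name__ == "__main__":
    print("MLflow run:", train())
```

Bundling preprocessing and the estimator in a single pipeline object is the design choice that keeps training and inference consistent, which is most of what "reproducible" means here in practice.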
MLOps & LLMOps Automation
  • Automate model deployment using CI/CD pipelines and containerized environments.
  • Implement and manage a model registry; monitor drift, performance, and versioning (a registry sketch follows this list).
  • Define rollback strategies and model governance policies for responsible AI deployment.
  • Contribute to the development and standardization of GenAI integration workflows and tools.
  • Implement real-time model monitoring, performance alerting, and drift detection to ensure reliability and ongoing model accuracy in production.
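The registry and rollback bullets above lend themselves to a short sketch. This one assumes MLflow (listed in the skills section below) as the registry backend; the registered-model name and the use of stage transitions are illustrative, and the equivalent flow in Dataiku or Snowflake would look different.

```python
"""Minimal sketch of model registration, promotion, and rollback with MLflow.

Assumptions (not from the posting): MLflow backs the model registry, and
"claims-risk-model" is a placeholder registered-model name.
"""
import mlflow
from mlflow.tracking import MlflowClient

MODEL_NAME = "claims-risk-model"   # hypothetical registered-model name


def register_and_promote(run_id: str) -> None:
    """Register the model logged under run_id and promote it to Production."""
    version = mlflow.register_model(
        model_uri=f"runs:/{run_id}/model", name=MODEL_NAME
    )
    client = MlflowClient()
    # Promote the new version; earlier versions stay in the registry, which is
    # what makes the rollback policy below possible.
    client.transition_model_version_stage(
        name=MODEL_NAME, version=version.version, stage="Production"
    )


def rollback(previous_version: int) -> None:
    """Rollback strategy: point Production back at a known-good version."""
    MlflowClient().transition_model_version_stage(
        name=MODEL_NAME, version=str(previous_version), stage="Production"
    )
```

A CI/CD pipeline would typically call these steps only after automated evaluation gates pass, so promotion and rollback stay auditable rather than manual.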
Platform Integration & Performance Optimization
  • Leverage cloud-native tools and platforms to deploy and scale models effectively (e.g., Snowpark, container services, orchestration frameworks, Dataiku); see the Snowpark sketch after this list.
  • Work closely with platform teams to ensure alignment between model requirements and system capabilities.
  • Leverage Azure-native tools such as Azure Data Factory to support upstream data flow into ML workflows where needed.
  • Continuously optimize data and model pipelines for efficiency, resilience, and cost-effectiveness.
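As a hedged illustration of the Snowpark bullet above, this sketch registers a trained model as a Python UDF so scoring runs inside Snowflake, next to the data. The connection parameters, UDF name, stage, feature columns, and table are all placeholders, and the closure-capture approach assumes a small model.

```python
"""Minimal sketch: deploying a trained model as a Snowflake UDF via Snowpark.

Assumptions (not from the posting): placeholder credentials, stage, and table
names; a model small enough to ship inside the UDF closure.
"""
import joblib
from snowflake.snowpark import Session
from snowflake.snowpark.types import FloatType

# Placeholder credentials; in practice these come from a secret store.
connection_parameters = {
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}

session = Session.builder.configs(connection_parameters).create()

model = joblib.load("model.joblib")   # hypothetical local copy of the trained pipeline


def score(feature_a: float, feature_b: float) -> float:
    # The model object is captured in the closure and shipped to Snowflake.
    return float(model.predict_proba([[feature_a, feature_b]])[0][1])


session.udf.register(
    func=score,
    name="predict_claim_risk",            # hypothetical UDF name
    return_type=FloatType(),
    input_types=[FloatType(), FloatType()],
    packages=["scikit-learn"],            # resolved from Snowflake's package channel
    is_permanent=True,
    stage_location="@ml_models",          # hypothetical stage
    replace=True,
)

# Inference then runs inside Snowflake, next to the data:
session.sql(
    "SELECT PREDICT_CLAIM_RISK(feature_a, feature_b) AS risk FROM claims_features"
).show()
```

For larger models, the usual alternative is to upload the artifact to a stage and load it inside the UDF rather than capturing it in the closure.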
Continuous Learning, Collaboration & Innovation
  • Stay ahead of emerging technologies and best practices in AI/ML and GenAI deployment.
  • Share knowledge and collaborate with peers, data scientists, and engineers to elevate overall technical capability.
  • Participate in sprint planning, retrospectives, and cross-team reviews to align on shared objectives and improve team processes.
Major Challenges
  • Building ML systems that balance rapid experimentation with long-term maintainability and governance.
  • Managing increasing complexity in deployment, monitoring, and scaling of models across platforms.
  • Adapting to evolving GenAI trends while maintaining rigorous standards around model evaluation, explainability, and control.
  • Ensuring alignment with enterprise security and compliance requirements without slowing down innovation.
Major Job Accountabilities
  • Translate data science prototypes into robust, scalable, and reusable machine learning pipelines that cover data ingestion, feature engineering, training, evaluation, and inference.
  • Ensure solutions are modular, maintainable, and production-grade by following software engineering best practices.
  • Develop and manage CI/CD workflows for automated model deployment, testing, and monitoring using containerized and cloud-native infrastructure.
  • Operationalize models through proper registration, version control, rollback policies, and real-time monitoring for drift and performance anomalies (a drift-check sketch follows this list).
  • Deploy and integrate models within enterprise platforms such as Snowflake (Snowpark), Dataiku, and other Azure-native tools while ensuring optimal performance, cost-efficiency, and platform alignment.
  • Collaborate with infrastructure and platform teams to ensure system compatibility and performance at scale.
  • Implement processes for secure, explainable, and governed model deployment in alignment with regulatory and enterprise risk requirements.
  • Maintain visibility over model behavior in production through monitoring dashboards, alerts, and audit logs.
  • Contribute to the development and standardization of GenAI workflows and reusable components.
  • Share knowledge and tools across teams to promote scalable best practices in ML engineering.
  • Participate in agile ceremonies, sprint reviews, and architectural discussions to support cross-team alignment.
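One way to make the drift-monitoring accountability concrete: a per-feature statistical check against a training-time baseline, with an alert hook when the distributions diverge. This is a minimal sketch; the Kolmogorov-Smirnov test, the threshold, and the alert channel are assumptions, not the organization's prescribed tooling.

```python
"""Minimal sketch of a drift check for a production model's input feature.

Assumptions (not from the posting): numeric features, a stored training-time
baseline sample, and a placeholder alerting hook.
"""
import numpy as np
from scipy.stats import ks_2samp

P_VALUE_THRESHOLD = 0.01   # illustrative; tune per feature and traffic volume


def check_feature_drift(baseline: np.ndarray, live: np.ndarray) -> bool:
    """Return True if the live distribution has likely drifted from baseline."""
    statistic, p_value = ks_2samp(baseline, live)
    drifted = p_value < P_VALUE_THRESHOLD
    if drifted:
        # Placeholder for the real alerting channel (dashboard, pager, ticket).
        print(f"Drift detected: KS={statistic:.3f}, p={p_value:.4f}")
    return drifted


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    baseline = rng.normal(loc=0.0, scale=1.0, size=5_000)   # training-time sample
    live = rng.normal(loc=0.4, scale=1.0, size=5_000)       # shifted production sample
    check_feature_drift(baseline, live)
```

In production, a check like this would run on a schedule over recent scoring traffic and feed the monitoring dashboards, alerts, and audit logs mentioned elsewhere in this posting.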
Success Measures
  • Delivery of reliable, high-performance ML solutions deployed using standardized pipelines.
  • Increased automation in model deployment and monitoring workflows across AI initiatives.
  • Reduction in time-to-production for AI models without compromising quality or governance.
  • High platform utilization (e.g., Snowpark, container orchestration) and improved ML pipeline performance.
  • Measurable skill growth in automation, deployment, and performance tuning of AI models across the ML team.
What skills and training do you need?
  • Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related technical field.
  • 3–6 years of hands-on experience in ML engineering or applied MLOps roles within enterprise-scale environments.
  • Proven experience designing, deploying, and maintaining machine learning models using modern MLOps and DevOps practices.
  • Hands-on experience with Dataiku as a primary platform for ML orchestration, workflow automation, and operationalization of AI solutions.
  • Strong experience with Azure cloud data platforms, including Snowflake and deployment frameworks like Snowpark or similar container services.
  • Proficiency in building and maintaining CI/CD pipelines for ML, including model versioning, monitoring, and rollback strategies.
  • Familiarity with GenAI tools and practices is considered a strong asset.
  • Relevant certifications (e.g., Dataiku DSS, Snowflake, Azure ML, MLOps platforms) are considered an advantage.
  • Experience working in Agile/SAFe environments and cross-functional product teams.
  • Preferably bilingual (English and French) to support collaboration across regional and global initiatives.
  • Implement projects or training throughout our different sites, including those outside Quebec.
  • Work closely with teams from our Windsor, Ontario site.
  • Strong understanding of ML engineering practices including MLOps, LLMOps, and model lifecycle management.
  • Knowledge of cloud-based ML tools, containerized environments, and orchestration systems.
  • Familiarity with ML observability, model drift monitoring, and pipeline instrumentation.
  • Deep working knowledge of Dataiku as a central AI platform, including its capabilities for orchestrating ML workflows, automating pipelines, and operationalizing AI models at scale.
  • Awareness of AI security, compliance, and governance principles in regulated environments.
  • Familiarity with data engineering concepts and tools, particularly within Azure environments (e.g., Azure Data Factory), for building scalable data ingestion and transformation pipelines.
  • Proficiency in building end-to-end ML pipelines using modern toolchains.
  • Skilled in CI/CD for ML, containerized deployments, and model monitoring systems.
  • Hands-on experience designing and deploying workflows within Dataiku, including plugins, scenarios, custom recipes, and model versioning.
  • Hands-on experience with machine learning frameworks such as TensorFlow, PyTorch, or Keras for developing, training, and optimizing advanced models, as well as with Snowpark or similar frameworks for deploying and operationalizing models within cloud-based data platforms.
  • Strong coding skills in Python and familiarity with data science libraries and frameworks (e.g., scikit-learn, MLflow, Hugging Face).
  • Detail-oriented and committed to production-quality, scalable AI development.
  • Highly collaborative and able to work closely with data scientists, engineers, and product teams.
  • Adaptable to changing technologies and priorities in a fast-paced AI/ML environment.
  • Strong communication skills with the ability to clearly articulate technical concepts to both technical and non-technical stakeholders.
  • Curious, proactive, and passionate about building sustainable and impactful AI systems.
When you apply
If you require assistance or accommodation during our recruitment process, please notify Human Resources so that we can review and consider how we may be able to assist you based on your individual needs.

We offer you
  • Global Excel offers more than a position; we offer a competitive compensation package that includes a base salary, performance bonus, and an extensive benefits package;
  • A professional future with opportunities for development, growth, and advancement;
  • RRSP Match program;
  • Financial assistance to employees who wish to continue their education;
  • Work/life balance, health and wellness initiatives including an excellent EAP program;
  • Employee engagement programs that focus on fitness, food, and fun.
To get a taste of the Global Excel life and for more information on our company, visit our Facebook page and website.
