Lead Data Engineer, Associate Director (Toronto)
Fitch Group
- Toronto, ON
- $120,000-155,000 per year
- Permanent
- Full-time
- Lead the design and architecture of end-to-end data pipelines and solutions on modern cloud-based platforms, including Snowflake, Databricks, and AWS.
- Build and optimize robust, scalable data orchestration workflows using Apache Airflow and implement best practices across multiple agile squads.
- Design and implement data solutions using PostgreSQL for relational data and MongoDB for NoSQL requirements, ensuring optimal performance and scalability.
- Architect and deploy containerized data applications using Docker, Kubernetes, and AWS EKS, incorporating GitHub Actions for automated deployments.
- Design and implement CI/CD pipelines using GitHub Actions, establish branching strategies, and ensure automated testing, code quality checks, and security scanning.
- Collaborate with cross-functional teams, including Data Scientists, Analytics teams, and business stakeholders, to translate requirements into scalable technical solutions.
- Mentor and guide data engineers by promoting technical excellence, establishing coding standards, and conducting architecture reviews.
- Drive data platform modernization initiatives and ensure data quality, reliability, and governance across all data systems.
- Design and implement AI-enhanced data pipelines that leverage LLMs and Agentic AI frameworks to automate data quality checks, anomaly detection, and intelligent data transformation workflows.
- Architect data infrastructure to support AI/ML workloads, including feature stores, vector databases, and real-time inference pipelines integrated with cloud-native services.
- Leverage established standards and best practices to integrate AI agents into data engineering workflows, including the Model Context Protocol (MCP) for seamless AI-to-data-platform communication.
- You have 8+ years of data engineering experience, including 3+ years in a lead role architecting large-scale data platforms.
- You possess expert-level proficiency in Python and Java for building cloud-native data processing solutions.
- You have deep hands-on experience with Apache Airflow, Snowflake (data warehousing, modeling, optimization), and Databricks.
- You have strong AWS expertise, including S3, Lambda, Glue, EMR, Kinesis, EKS, and RDS.
- You have production database experience with PostgreSQL (design, optimization, replication) and MongoDB (document modeling, sharding, replica sets).
- You have solid experience with containerization and orchestration using Docker, Kubernetes, and AWS EKS, including cluster management and autoscaling.
- You have proven CI/CD and GitOps experience using GitHub, GitHub Actions, and ArgoCD for automated deployments and multi-environment management.
- You are proficient with agile tools such as Jira for sprint management and Confluence for technical documentation and knowledge sharing.
- You have excellent analytical, problem-solving, and communication skills, with the ability to explain complex concepts to non-technical stakeholders and drive initiatives in complex environments.
- You have working knowledge of AI/ML frameworks (LangChain, LlamaIndex, AutoGen, etc.) and understand how Agentic AI can enhance data engineering workflows through automated data validation, intelligent orchestration, and self-healing pipelines.
- You have practical understanding of AI integration patterns in data platforms, including prompt engineering, RAG architectures, and vector database implementations.
- You are familiar with Model Context Protocol (MCP) or similar frameworks for enabling AI agents to interact securely and efficiently with data sources, APIs, and tools.
- You have experience with AI-powered development tools such as GitHub Copilot and Amazon Q.
- Experience with code quality metrics and shift-left principles.
- Experience testing container resiliency (Docker/Kubernetes).
- Experience designing large end-to-end performance scenarios.
- Experience building large and high-performing data pipelines.
- Exposure to Playwright and BDD for automated testing.
- Exposure to the financial industry and data platforms (data warehouses, data lakes).
- Experience with modern data stack tools, data mesh/fabric architectures, and streaming platforms (Kafka, Kinesis).
- Proficiency with observability tools (Datadog) and data quality/governance frameworks.
- Understanding of data security and compliance standards (GDPR, SOC 2, CCPA), plus contributions to open-source data projects.
- Relevant certifications (AWS Data Analytics/Solutions Architect, Databricks/Snowflake Data Engineer, CKA).
- Hands-on experience building production Agentic AI systems that operate on data platforms, including multi-agent orchestration and intelligent pipeline optimization.
- Deep expertise with Model Context Protocol (MCP) implementation, including building custom MCP servers or integration patterns for enterprise data platforms.
- Hybrid Work Environment: On-site presence required two days per week.
- A Culture of Learning & Mobility: Access to dedicated training, leadership development, and mentorship programs to support continuous learning.
- Investing in Your Future: Retirement planning and tuition reimbursement programs to help you meet your short- and long-term goals.
- Promoting Health & Wellbeing: Comprehensive healthcare offerings that support physical, mental, financial, social, and occupational wellbeing.
- Supportive Parenting Policies: Family-friendly policies, including a generous global parental leave plan, designed to help you balance work and family life.
- Inclusive Work Environment: A collaborative workplace where all voices are valued, supported by Employee Resource Groups that unite and empower colleagues worldwide.
- Dedication to Giving Back: Paid volunteer days, matched donation programs, and ample opportunities to volunteer in your community.