Senior Staff Software Engineer, Data Infrastructure
Instacart
- Canada
- Permanent
- Full-time
- Define and Drive Data Infrastructure Vision: Own the multi-year technical vision and roadmap for Instacart's core data platform (storage, compute, streaming, orchestration, analytical serving). Translate company data strategy (monetization, federated access, real-time) into a coherent, actionable architecture plan. Align with leadership and proactively evolve the architecture for scale, maturity, and cost.
- Lead Platform Strategy (Build, Buy, Ownership): Architect the ownership strategy for the data platform, determining build vs. buy (including managed services vs. open-source self-hosting). Lead technical/business case evaluations, full cost-benefit modeling, and risk analysis for major investments. Design phased migrations to ensure reliability while achieving long-term independence and cost efficiency.
- Own the Data Lakehouse Foundation: Drive the architecture and delivery of the open lakehouse, including unified table format, compute engine portfolio, and storage governance. Expand multi-engine compute (interactive, batch, stream processing). Define standards for data storage, access, governance, and sharing to enable compute portability and prevent lock-in. Ensure reliable scaling without proportional cost increase.
- Drive Real-Time and Streaming Infrastructure: Own the architecture for streaming data, event-driven pipelines, stream processing, and real-time serving for critical use cases (Ads, Fraud, ML). Make principled decisions on deployment models balancing cost, availability, and operational maturity.
- Pioneer AI-native Data Infrastructure Engineering: Lead the adoption, application, and cultural integration of AI/LLM tools across the data platform lifecycle, setting a high standard for AI-augmented workflows, driving high-leverage opportunities from automation to cost optimization, and partnering with other teams to embed AI-powered capabilities into the platform itself.
- Elevate Engineering Excellence: Serve as the senior technical voice, setting standards for system design and reliability. Lead architecture reviews. Mentor staff/senior engineers, fostering ownership and execution. Be a visible engineering leader, contributing to hiring and cross-org alignment.
- Partner Deeply with Stakeholders: Collaborate with Data Science, ML Platform, Ads Infra, Product Eng, Finance Eng, and Security to translate needs into reliable, self-serve infrastructure. Represent Data Infra in architectural forums, ensuring decisions support business priorities (monetization, compliance, AI). Clearly communicate complex trade-offs to technical and executive audiences.
- 10+ years of software engineering, focused on data infrastructure or distributed systems at scale. Track record of setting technical direction for large-scale data platforms, defining multi-year architecture roadmaps, and influencing strategy. Experience in high-growth, data-intensive environments with significant infrastructure scale and spend.
- Expertise in modern data lakehouse architectures, open table formats (Iceberg, Delta Lake, Hudi), and compute/storage trade-offs; in distributed query/compute systems (Trino, Spark, ClickHouse, etc.), including performance tuning and production reliability; and in event-driven infrastructure (Kafka, Flink, etc.).
- Proven track record owning and executing major infrastructure platform transitions, including build vs. buy, migration design, and risk management.
- Experience building compelling business cases for infrastructure investments, including cost-benefit analysis and TCO modeling.
- Exceptional technical communication, including clear architecture documents, strategy memos, and proposals that drive leadership alignment. Strong ownership, comfort with ambiguity, and organizational influence to drive large, multi-team initiatives from concept to production.
- Familiarity with data governance, compliance frameworks (SOX, CPRA, GDPR), and designing governance controls into the platform architecture.
- Experience with FinOps and data platform cost optimization, including managing multi-million dollar infrastructure budgets and negotiating vendor contracts.
- Deep knowledge of SQL and strong proficiency in Python or Scala for systems-level work.
- Experience with orchestration systems (e.g., Apache Airflow) and data transformation pipelines (e.g., dbt) in large-scale production environments.
- Track record of building and growing high-performing data infrastructure teams.
- Bachelor's, Master's, or PhD in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.