
Machine Learning Framework, Compiler & Performance Engineer (Staff level and Up)
- Markham, ON
- Permanent
- Full-time
Machine Learning Engineering
General Summary:
Today, more intelligence is moving to end devices, and mobile is becoming the pervasive AI platform. Building on the smartphone foundation and the scale of mobile, Qualcomm envisions making AI ubiquitous, expanding beyond mobile and powering other end devices, machines, vehicles, and things. We are inventing, developing, and commercializing power-efficient on-device AI, edge cloud AI, and 5G to make this a reality.
Job Purpose & Responsibilities
As a member of Qualcomm's ML Systems Team, you will participate in two activities:
- Development and evolution of ML/AI compilers (production and exploratory versions) for efficient mappings of ML/AI algorithms on existing and future HW
- Analysis of ML/AI algorithms and workloads to drive future features in Qualcomm's ML HW/SW offerings
OR
Master's degree in Computer Science, Engineering, Information Systems, or related field and 3+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
OR
PhD in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
Key Responsibilities:
- Contributing to the development and evolution of ML/AI compilers within Qualcomm
- Defining and implementing algorithms for compiling ML/AI workloads to achieve high performance and low power on Qualcomm HW
- Creating and implementing algorithms that couple the PyTorch framework efficiently to Qualcomm ML/AI compiler flows
- Understanding trends in ML network design, through customer engagements and the latest academic research, and how these trends affect both SW and HW design
- Exploration and analysis of performance/area/power trade-offs for future HW and SW ML algorithms
- Creation of performance-driven simulation components (using C++, Python) for analysis and design of high-performance HW/SW algorithms on future SoCs
- Pre-Silicon prediction of performance for various ML algorithms
- Running, debugging and analyzing performance simulations to suggest enhancements to Qualcomm hardware and software to tackle framework, compute and system memory-related bottlenecks
- Demonstrated ability to learn, think and adapt in a fast-changing environment
- Detail-oriented with strong problem-solving, analytical and debugging skills
- Strong communication skills (written and verbal)
- Strong background in algorithm development and performance analysis is essential
- Strong grasp of object-oriented design principles
- Strong knowledge of C++
- Strong knowledge of Python
- Experience in compiler and/or GPU design and development is an asset
- Knowledge of network model formats/platforms (e.g., PyTorch, ONNX) is a strong asset
- Knowledge of software development processes (revision control, CI/CD, etc.)
- Familiarity with tools such as git, Jenkins, Docker, clang/MSVC
- On-silicon debug skills for high-performance compute algorithms
- Knowledge of algorithms and data structures
- Knowledge of computer architecture, digital circuits and event-driven transactional models/simulators