
ROCm AI System Software Architect
- Markham, ON
- Permanent
- Full-time
- Develop and deliver advanced AI software solutions for AMD customers and users.
- Enable and optimize the AI software stack for frameworks like PyTorch, vLLM, and GGML/llama.cpp, as well as emerging open-source AI software.
- Implement state-of-the-art AI models and enhance their performance.
- Lead the full AI software development lifecycle, including scoping, implementation, integration, verification, and customer enablement.
- Drive and deliver powerful debugging and profiling tools with a consistent user experience across Radeon GPUs on both Linux and Windows.
- Expertise in system software design and optimization, with experience in cross-platform development and in C/C++ and Python.
- Proven problem-solving skills and effective communication abilities.
- Experience with AI frameworks, inference stacks, and GPU-accelerated development, including GEMM, CONV, Flash Attention, fused MoE, and non-linear operators.
- Experience with compilers and low-level GPU programming is a plus.
- Solid understanding of AI/ML concepts, with knowledge of the performance impact of compute, memory, and communication configurations.
- Open-source software development experience is a plus.
- Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field.