Senior Engineer - GPU Performance Optimization

Advanced Micro Devices

Toronto, ON; Waterloo, ON
Permanent
Full-time

1 month ago

Overview:

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

Responsibilities:

THE ROLE

As a Senior Software Engineer, you will be part of the performance engineering team for AMD’s core deep‑learning libraries (hipDNN, MIOpen, and Composable Kernel (CK)), with a primary focus on new GPU products and on optimization for leading-edge and bleeding-edge hardware.

You will drive performance optimization across these libraries, ensuring strong out‑of‑the‑box performance and a clear path to parity and leadership for compute workloads. This role operates across multiple ASIC generations, requiring strong cross‑architecture software engineering skills and the ability to adapt, analyze, and modify kernels, heuristics, and execution strategies as hardware evolves.

You will work at the intersection of kernel development, library integration, and framework enablement, contributing directly to the success of new ROCm releases and new product introductions.

THE PERSON

You are a performance‑driven engineer who thrives during new hardware bring‑up and ambiguous early‑silicon phases. You are comfortable working across abstraction layers, from kernel code and library APIs to framework‑visible performance, and you enjoy translating architectural characteristics into concrete software optimizations.

You collaborate effectively across teams, communicate performance trade‑offs clearly, and are trusted to take ownership of critical performance paths during time‑sensitive product ramps. You grow influence through technical execution, deep expertise, and mentoring.

KEY RESPONSIBILITIES:

Performance Engineering for New Products: Lead performance optimization efforts for hipDNN, MIOpen, and CK on new AMD GPU architectures. Drive early performance characterization, gap analysis, and optimization plans during pre‑silicon and post‑silicon bring‑up.

Kernel & Library Optimization: Implement and optimize performance‑critical kernels and operators used by hipDNN and MIOpen, leveraging other libraries where appropriate. Improve kernel selection, fusion strategies, and heuristics to maximize efficiency across diverse workloads.

Cross‑ASIC Software Engineering: Adapt library implementations and tuning strategies across multiple ASICs, balancing portability with architecture‑specific optimization. Identify when shared abstractions are sufficient versus when targeted specialization is required.

hipDNN Enablement & Transition: Contribute to hipDNN’s role as the primary execution and fusion layer, including plugin integration and performance validation. Support the transition of functionality from MIOpen into hipDNN while maintaining performance and compatibility.

Framework‑Facing Performance: Work closely with framework teams (e.g., PyTorch, JAX, Triton) to ensure optimized library paths are exercised in real workloads. Validate performance improvements using representative training and inference models.

Performance Validation & Regression Control: Define and execute performance benchmarks for new products. Help detect, diagnose, and resolve performance regressions across releases and architectures.

Collaboration & Mentorship: Partner with principal engineers, architecture teams, and kernel specialists to align optimization efforts. Share best practices in kernel tuning, performance analysis, and cross‑ASIC optimization with the broader organization.

AI‑Assisted Development: Leverage AI‑assisted software development tools to accelerate design, implementation, review, and documentation of complex software libraries. Establish best practices for responsible use of AI assistance, including validation, review, and traceability of generated code and technical artifacts.

PREFERRED EXPERIENCE:

Strong hands‑on experience with GPU performance engineering and kernel optimization. Practical experience with deep‑learning libraries.

Experience supporting new hardware bring‑up or optimizing software across multiple GPU architectures. Solid understanding of deep‑learning operator patterns (GEMM, convolution, attention, normalization, fusion).

Proficiency in C/C++, with Python used for tooling, benchmarking, and analysis.

Experience using GPU profiling, tracing, and performance analysis tools. Familiarity with framework‑level integration and validation (e.g., PyTorch or JAX).

Applied experience using AI‑assisted coding tools in professional software engineering workflows, including code generation, refactoring, test creation, documentation, and design exploration.

ACADEMIC CREDENTIALS:

Advanced degrees, such as M.Sc., M.Eng., or Ph.D., are preferred.

LOCATION:

Hybrid/Remote in the Waterloo-Toronto Corridor, Ontario; Calgary, Alberta; or Vancouver, British Columbia

Qualifications:

Benefits offered are described: .

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available

This posting is for an existing vacancy.
