Senior GPU Kernel Performance Lead Job at NVIDIA, Santa Clara, CA

  • NVIDIA
  • Santa Clara, CA

Job Description

We're now looking for a Senior GPU Kernel Performance Lead. Do you enjoy analyzing and reporting on GPU kernel performance? If so, consider applying for the role of Senior GPU Kernel Performance Analysis Lead! Our team delivers high-performance GPU math kernels to NVIDIA’s cuDNN, cuBLAS, and TensorRT libraries to accelerate deep learning models. The team is proud to play an integral part in enabling breakthroughs in domains such as image classification, speech recognition, natural language processing, and large language models. We’re always striving for peak performance and energy efficiency on current and future-generation GPUs.

As a kernel performance analysis lead, you will oversee all efforts pertaining to the performance of our kernels. Join the team that is building the underlying software used across the world to power the revolution in artificial intelligence! To get a sense of the code we write, check out our CUTLASS open-source project showcasing performant matrix multiply on NVIDIA’s Tensor Cores with CUDA. While there will be the opportunity for hands-on development, this position is specifically to lead a team that validates the performance of the kernels.

What you’ll be doing:

  • Specify test cases, derived from Deep Learning workloads, to provide adequate directed and use-case coverage across all kernels on both simulation and silicon targets
  • Determine performance theory through the development and use of analytical models
  • Track and report on kernel performance throughout the development lifecycle by using and expanding upon current infrastructure
  • Provide feedback to the kernel developers by identifying performance regressions and opportunities to reach the achievable peak performance
What we need to see:
  • PhD degree in Computer Science, Computer Engineering, Applied Math, or related field (or equivalent experience) with 8+ years of relevant industry experience
  • Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design
  • Experience leading or managing a team relating to the performance of CPUs, GPUs, or other DL accelerators
Ways to stand out from the crowd:
  • Experience with analytical models and cycle-accurate HW simulators
  • Knowledgeable about performance tools like Nsight or VTune
  • Programming experience beyond C++ including assembly, MLIR/LLVM, Python, and CUDA/OpenCL
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you a creative and collaborative software leader seeking new challenges? If so, we want to hear from you! Come, join our DL Architecture team and help build the real-time, cost-effective AI computing platform driving our success in this exciting and quickly growing field.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 5, and 272,000 USD - 431,250 USD for Level 6. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until January 13, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
