NVIDIA
Deep Learning Performance Architect
Job Description
NVIDIA has continuously reinvented itself. Our invention of the GPU sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. Today, research in artificial intelligence is booming worldwide, which calls for the highly scalable and massively parallel computation horsepower at which NVIDIA GPUs excel.
NVIDIA is a “learning machine” that constantly evolves by adapting to new opportunities that are hard to solve, that only we can address, and that matter to the world. This is our life’s work, to amplify human creativity and intelligence. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join our diverse team and see how you can make a lasting impact on the world!
Intelligent machines powered by Artificial Intelligence computers that can learn, reason and interact with people are no longer science fiction. GPU Deep Learning has provided the foundation for machines to learn, perceive, reason and solve problems. NVIDIA’s GPUs run AI algorithms, simulating human intelligence, and act as the brains of computers, robots and self-driving cars that can perceive and understand the world. Increasingly known as “the AI computing company”, NVIDIA wants you. Come, join our Deep Learning Architecture team, where you can help build real-time, cost-effective computing platforms driving our success in this exciting and rapidly growing field!
What you’ll be doing:
- Benchmark and analyze AI workloads in single- and multi-node configurations.
- Develop high-level simulators and debuggers in C++/Python.
- Evaluate PPA (performance, power, area) for hardware features and system-level architectural trade-offs.
- Work closely with the wider architecture teams and with architecture and product management to help with trade-off analysis at every stage of the project.
- Keep abreast of emerging trends and research in deep learning.
What we need to see:
- MS or PhD in a relevant discipline (CS, EE, Math).
- 2+ years of experience in parallel computing architectures, interconnect fabrics and deep learning applications.
- Strong programming skills in C, C++ and Python.
- Proficiency in architecture analysis and performance modeling.
- Curious mindset with excellent problem-solving skills.
Ways to stand out from the crowd:
- Understanding of modern transformer-based model architectures.
- Experience with benchmarking, projection methodologies, workload profiling and correlation.
- Ability to simplify and communicate rich technical concepts to a non-technical audience.
#LI-Hybrid