C++ Software Engineer - GPU Performance

Number of employees

4100

Foster City, CA, United States

Posted on: 2025-11-12

Category: emobility

Expired

Employment type:

Full time

Experience required:

Intermediate

Salary

Salary not provided

About the company:

Zoox is transforming mobility-as-a-service by developing a fully autonomous, purpose-built fleet designed for AI to drive and humans to enjoy.


Zoox is building the world's most advanced self-driving hardware and software solution. The efficiency demands of such a system require expert fine-tuning of both the compute hardware architecture and the algorithms and middleware that run on it, to achieve maximum throughput at optimal power levels.

The Software Core Performance team's mission is to analyze and optimize performance and to provide guidance to the software and hardware teams in order to meet the required specifications.

As a GPU performance software engineer within the Software Performance team, you will instrument, monitor, analyze, and optimize GPU-based algorithms that are performance-critical for our solution. The scope for GPU usage ranges from traditional computer vision and deep learning architectures to complex geometric reasoning and multi-agent decision making. Your work will strongly influence design decisions for future compute platforms and resource allocation.

In this role, you will:

  • Build real-time instrumentation for performance monitoring (CPU, GPU, latency, memory) and develop offline benchmarking frameworks, tools, and scripts to evaluate & analyze performance at scale in CI/vehicle, and establish budgets for next-gen architectures.
  • Analyze performance metrics to identify GPU hotspots and root causes, and propose and co-implement actionable solutions with component teams.
  • Support teams on bringing serial algorithms to the GPU to maximize compute utilization and improve overall latency.
  • Work as part of the Core team to design a middleware framework that promotes efficient and performant code development by default by maximizing CPU and GPU utilization.

Qualifications

  • BS in computer science or related field and 3+ years of experience.
  • Strong knowledge of CUDA as applied to recent GPU microarchitectures (e.g., Ampere, Blackwell) and experience debugging/optimizing GPU kernels using tools like Nsight.
  • Strong knowledge of C++ and experience in large code bases, comfortable in Linux development environments.
  • Experience in development, debugging, and profiling of complex multiprocess systems (e.g., robotic systems, game engines).

Bonus Qualifications

  • Experience with GPU kernel development in a real-time environment, including PTX-level programming, CPU SIMD instructions (e.g., AVX intrinsics), and custom CUDA layers with frameworks like TensorRT & XLA.
  • Hands-on work with ML model optimization (post-training quantization, layer pruning, etc.) or hand-tuning GPU kernels (in OpenGL, CUDA, ROCm, or similar).
  • Proficiency with SQL, Databricks, Looker, or other business intelligence tools.