Employment type:
Full time
Experience required:
Intermediate
Salary
Salary not provided
About the company:
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.
Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.
The Crusoe Cloud Software Development team is seeking a passionate and experienced Senior/Staff Software Engineer specializing in Systems Applications. This role is pivotal to the design and development of our compute platform, with a specific focus on building compute applications for virtualized AI platforms. An understanding of the Linux kernel, virtualization, hardware tuning, distributed systems, object-oriented programming, and low-level systems programming is critical to this role. Excellent communication skills and a desire to work with a wide range of technologies across the Linux stack are both a must. This is a full-time position.
Compute Application Development & Scaleout: Design highly reliable and performant Linux applications used to manage our virtualization stack across thousands of AI compute servers in multiple global datacenters.
AI Hardware Platform Integration: Integrate Crusoe applications with a wide variety of hardware and software AI chip-vendor stacks. Build solutions to optimize and monitor virtualized hardware (GPUs, InfiniBand/RoCE NICs, Ephemeral Storage, etc.) in cutting-edge AI/HPC environments.
Kernel & Hypervisor Integration: Work side by side with our Linux Kernel and Hypervisor teams to ensure our Crusoe applications are seamlessly integrated with a variety of kernels and hypervisors.
Performance Analysis & Tuning: Analyze and enhance the performance of the entire virtualization stack, from the hypervisor to the virtualized guest OS, with a specific focus on optimizing AI/ML workloads. This includes profiling, bottleneck identification, and implementing low-level optimizations.
System-Level Troubleshooting: Diagnose and resolve complex system issues across our virtualization stack (drivers, kernel, hypervisor, guest OS, and Crusoe applications). Work closely with kernel and hypervisor teams to debug and resolve integration challenges.
Code Review and Quality Assurance: Conduct thorough code reviews to ensure the highest level of software quality, reliability, and security within compute applications and the virtualization stack.
Cross-Functional Collaboration: Collaborate with other engineering teams, including hardware design, OS development, and AI/ML application teams, to ensure cohesive and integrated product development.
Technical Leadership: Provide technical guidance and mentorship to junior engineers, fostering a culture of technical excellence and collaborative problem-solving within the compute applications team.
Linux Systems Familiarity: Experience building applications on Linux kernels, specifically pertaining to virtualization, device drivers, memory management, and process scheduling.
Hardware Integration: Solid understanding of hardware devices such as GPUs, CPUs, InfiniBand and Ethernet NICs, Ephemeral Disks, and PCI Express.
Systems Design: Strong grasp of distributed applications and highly scalable systems design, with a specific focus on communications protocols (gRPC, REST, TCP/IP, etc.), databases (Postgres, Redis), and messaging systems (Pub/Sub, Kafka).
Software Architecture: Strong experience building software applications in both higher-level (Golang, Java, Python) and lower-level (C, C++, Rust) languages. Keen eye for clean, maintainable code, and a unit-test-driven mindset.
Excellent Communication Skills: Ability to collaborate with teams across an organization, block out noise, and focus on what needs to get done to get a project across the line.
Rapid and Agile Learner: Capable of adapting quickly, eager to research new technology, and not overwhelmed by unfamiliar tech stacks.
Virtualization Concepts: General knowledge of hypervisors, virtual machine lifecycles, and Linux KVM tooling.
CI/CD and Validation: Understanding of how to build GitLab or GitHub CI/CD pipelines that deliver bug-free code across a multitude of compute platforms.
Experience with virtualization specifically for AI/ML workloads, including GPU virtualization.
Previous work debugging or contributing to kernel or hypervisor code, particularly around device management.
Experience with configuring thousands of live compute nodes in a bare-metal production environment.
Industry competitive pay
Restricted Stock Units in a fast-growing, well-funded technology company
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Employer contributions to HSA accounts
Paid Parental Leave
Paid life insurance, short-term and long-term disability
Teladoc
401(k) with a 100% match up to 4% of salary
Generous paid time off and holiday schedule
Cell phone reimbursement
Tuition reimbursement
Subscription to the Calm app
MetLife Legal
Company-paid Commuter FSA benefit of $300 per month
Compensation will be paid in the range of $137,000 - $161,000. Restricted Stock Units are included in all offers. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data.
Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.