Software Engineer, Machine Learning Systems
We are looking for a machine learning systems expert who is eager to work across the entire ML system stack, iterate quickly on building new ML cloud systems, and build and own major contributions.
About the role:
ML System engineers in our team are responsible for one or more of the following:
Deployment and management of high-performing compute clusters.
Enhancing inference and training performance through optimizations across the system stack, encompassing high-level mechanisms such as queuing and scheduling, medium-level optimizations within inference and training engines, and low-level optimizations targeting GPU kernel efficiency.
Qualifications:
Experience building and rapidly prototyping production cloud-based software
Demonstrated fluency with data structures, algorithms, architecture, and agile software best practices in any language
Experience in Python and C++/Rust
Understanding of the latest technologies in LLMs, such as LoRA and Mamba
Understanding of, or willingness to learn about, the entire system stack
Desire to work in an inclusive and collaborative environment
An interest in continually learning from others, teaching others, and digging into new challenges
Nice to have:
Desire to create speed-of-light training and inference systems for next-generation AI
Deep technology expertise in machine learning systems, e.g. TinyML, Triton, CUDA, ROCm, Exo, MLIR, Halide, etc.
We believe in hiring passionate individuals who believe in the AI revolution to make software accessible to all. If you’re excited about this role but are not sure whether your past experience aligns perfectly, we still encourage you to apply and meet with us.