ML Engineering Manager, Model Graph

About the Team

The Workload ML team builds the core ML libraries of our state-of-the-art internal training framework used to train our cutting-edge models. The Model Graph team within ML works primarily on optimizing distributed model execution.

Our priorities are to maximize training/inference throughput (how quickly we can train a new model) and researcher throughput (how quickly we can develop new models) with the goal of accelerating progress towards AGI.

About the Role

We are looking for an experienced engineering manager to help lead critical work on model definition and efficient distributed execution via a graph compiler within our shared internal training stack. Our internal training stack is used by Research for both large-scale and small-scale runs.

In this role, you will:

Reduce the time it takes to try out new architecture ideas for training new models and increase the robustness of model code.
Collaborate closely with researchers and other systems engineers to maximize the benefits of our shared internal training stack.
Enable SOTA throughput for our most important research models.
Hire world-class AI systems engineers in one of the most competitive hiring markets.
Coordinate the training/inference needs of OpenAI's research teams.
Create a diverse, equitable, and inclusive culture that makes everyone feel welcome while enabling radical candor and the challenging of groupthink.

You might thrive in this role if you:

Have 3+ years of experience in engineering management and 5+ years as an IC working with high-scale distributed systems and ML systems.
Have experience with ML systems, particularly large-scale distributed training or inference of modern LLMs, as well as graph compilers.
Have familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.
Care deeply about diversity, equity, and inclusion, and have a track record of building inclusive teams.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
