Applied ML Engineer, Speech

About us

Our mission is to reinvent the way people learn, starting with language. We begin by teaching the next billion people English, Spanish, and French.

English is the global language of business, culture, and communication, and over 1.5 billion people around the world are actively trying to learn it right now. Others dream of communicating with the half-billion native Spanish speakers across the globe. The problem is that it's nearly impossible to learn to speak a language without constant access to a speaking partner. Grammar and vocab apps don't really help – you need to actually converse with someone.

Speak is on a journey to fix this. We're creating an AI-powered experience that replicates the flow of a conversation, without needing a human on the other end. The goal is to make conversation practice in a foreign language radically more accessible and eventually help hundreds of millions of people gain fluency who otherwise wouldn't be able to.

We started on this journey over five years ago and we've still got a long way to go. We're thoughtfully adding new team members only when we think they can truly play a big role in our mission.

Speak launched first in South Korea, where we have quickly grown to become the top-grossing education app in the country. We have now delivered this winning product to more than 30 countries globally and are continuing to expand to more markets in the coming months. The company is well funded, raising a recent Series B backed by investors like OpenAI, Founders Fund, Y Combinator, Khosla Ventures, Lachy Groom, Josh Buckley, and others. We’re a team of 75 based primarily in SF, Seoul, Tokyo, and Ljubljana.

About this role

We are looking for an experienced Machine Learning Engineer to join our team and help develop cutting-edge phoneme recognition models to provide learners with effective pronunciation feedback. In this role you will take ownership of the end-to-end modeling pipeline, from training and experimentation to deployment and monitoring. You will also work closely with Product teams to design innovative learning experiences and measure the efficacy of production models as they affect our end users. We are a small, dynamic team where you will likely also contribute as a developer and thought partner on broader team projects like ASR, assessment, content personalization, and much more. This is an incredibly exciting time to join an ML team designing a personalized learning experience that will revolutionize language learning for millions of learners worldwide — come join us!

What you'll be doing

  • Training and deploying phoneme recognition models end-to-end, including monitoring, performance tracking, and retraining
  • Expanding our Pronunciation Coach to provide precise feedback and integrate more broadly across our learning platform
  • Tracking metrics to measure performance of existing phoneme models across markets
  • Building out an assessment system to provide nuanced feedback on pronunciation, intonation, prosody, and more
  • Building and maintaining data infrastructure, e.g. audio data pipelines, training/evaluation dataset creation and management, and labeling/active learning loops
  • Supporting the broader Speech & ML team (e.g. developing and localizing ASR models)

What we're looking for

  • Extensive experience training and deploying custom deep learning models to production (experience with audio/speech strongly preferred)
  • Proficiency in Python and common Deep Learning frameworks like PyTorch
  • Strong communication skills and the ability to explain complex ML concepts to non-technical stakeholders
  • Sharp product sense and an ability to think broadly and cross-functionally about model quality in the context of user experience
  • 4-10 years of industry experience


  • San Francisco, CA

Why work at Speak

  1. Join a fantastic, tight-knit team at the right time: we're growing very quickly, we've raised our Series B and an additional extension from some of the top investors in the valley, and we've achieved product-market fit in our initial markets. You'd join at a magical time when a single person could significantly change the course of the company.
  2. Do your life's work with people you’ll love working with: we care strongly about our craft and want every person at Speak to feel like they're growing every day. We believe in the idea that working with people you both enjoy and have respect for makes everything better. We hire thoughtfully and only work with people we admire deeply.
  3. Global in nature: We're live in over 40 countries and launching in a number of new markets soon. We have dedicated offices in San Francisco, Ljubljana, Seoul, and Tokyo, and you’ll have the opportunity to talk to users in each of these regions on a regular basis as well as travel.
  4. Impact people's lives in a major way: Learning a language is one of the single most life-changing skills one can learn, and right now 99% of people never achieve their goal because the process is broken. We’re helping millions of people achieve their goals and improve their lives.

Speak does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.
