ML Data Engineer - Generative AI

We’re looking for an ambitious ML Engineer to play a leading role in managing the infrastructure to efficiently store & process hundreds of TB of audio data. You’ll also collaborate with Vatsal (ex-Amazon Alexa & Cambridge) to train or fine-tune generative models on this data. The resulting models will be used to enhance our core products to better meet our customers’ needs.

You’ll be the first full-time ML hire at a well-funded startup and play a critical role in shaping the future of MetaVoice as a core member of the founding team. We’re particularly interested in candidates who have been founders themselves or built impressive side projects.

If you think you have what it takes to be a part of an ambitious & high-performing team, please share links to GitHub code you’ve written, contributions you’ve made & papers you’ve published. Check out our Notion careers page for more information & FAQs on MetaVoice.

KEY RESPONSIBILITIES

Set up cost-efficient & highly performant data infrastructure for storing & transforming large quantities of audio data
Work with ML Audio or Digital Signal Processing techniques to analyse, clean, segment & filter speech data
Manage the storage & usage of terabytes of cloud data & deeply understand the cost/benefit tradeoffs of each solution
Build efficient ML model training pipelines in PyTorch that utilise large datasets
Participate in research activities, including the application and evaluation of generative voice & speech-to-speech techniques
Research and implement novel ML and statistical approaches to add value to the business
Set up testing pipelines to evaluate ML model performance on audio data

BASIC REQUIREMENTS

PhD in fields such as Deep Generative Models, STS, Deep Learning, TTS, ASR, NLU. Bachelor’s/Master’s degree considered with existing applied experience in industry
Deep knowledge in fields such as Voice Conversion, Deep Generative Models, Machine Learning, Deep Learning, TTS, ASR, NLU or Statistical Modelling
Hands-on experience with machine learning frameworks such as PyTorch, Keras, TensorFlow
Significant experience with scalable data processing tools like PySpark, Kubernetes, Databricks, Apache Arrow etc.
Experience with Python & C/C++
Experience managing GPU-intensive data processing jobs
4 years of applied research experience
Creative thinker & problem solver who can execute independently & quickly with a bias for action
An unrelenting desire to build world-class products which delight users
Outstanding written, spoken & interpersonal communication skills

PREFERRED REQUIREMENTS

Extensive applied research experience, ideally developing voice conversion, speech synthesis and natural language processing models
PhD with specialisation in voice conversion, text-to-speech, natural language processing, or machine learning
Scientific thinking and the ability to invent, with a track record of thought leadership and contributions that have advanced the field
