Research Engineer
We’re on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has changed enormously from 2016, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!
As a Research Engineer, you will innovate and translate cutting-edge research into user experiences. Do you find yourself thinking about any of these questions?
- How to prompt a model like GPT-4 effectively?
- How to build a foundation model for a specific domain like medical records?
- How to use models like LayoutLMv3 for weakly labeling documents at scale?
- How to blend prompt engineering, retrieval augmentation, and fine-tuning to customize models with the least human time and effort?
We’re looking for a talented generalist ML engineer with software development skills to join the team and work on foundational multimodal problems, with a focus on data development techniques.
Main Responsibilities
- Establish and empirically demonstrate state-of-the-art approaches for data-centric model iteration and analysis
- Prototype end-to-end workflows with novel techniques and algorithms, synthesize results, and help to transfer learnings into Snorkel products
- Work closely with design partners to validate your work on real-world use cases with measurable impact
- Contribute to novel research on topics of interest to Snorkel AI by collaborating with other Snorkel Research scientists and affiliate scientists (academic, government, and industry researchers)
- Work remotely or in our Bay Area offices!
Preferred Qualifications
- PhD and 1+ years of experience in applied machine learning (ML) on computer vision (CV) or natural language processing (NLP) tasks
- MS and 3+ years of work experience with a focus on delivering products (including internships and co-ops)
- Strong coding and problem-solving skills
- Experience as a researcher/developer of ML taking projects from conception to production
- Strong ML experience in building datasets and models
- Work focused on multimodal data (a combination of image, text, video, audio, point cloud, MRI, etc.) using techniques such as active learning and semi-supervised, self-supervised, or weakly supervised learning
- Work with larger models (e.g. 1B+ parameters)
- Experience with SQL databases (PostgreSQL, MySQL, etc.)
- Experience with front-end prototyping tools such as Streamlit and Dash
- Experience with cloud ML platforms such as Vertex AI, Azure AI, and SageMaker
- Publications in top-tier conferences (such as ICML, ICLR, AAAI, NeurIPS, CVPR, ICCV, ICRA, ECCV, ACCV, EMNLP, and CoRL) are highly desirable