Staff Software Engineer — Enterprise & Data Infra
We’re on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has changed dramatically from 2016, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!
As a Staff Software Engineer on the Enterprise Infrastructure team, you'll accelerate the Snorkel AI team and our customers by improving our developer platform and services for user and data management across the stack. You’ll work closely with other engineers, researchers, and product management to align on the highest-leverage improvements for enterprise readiness and maturity, security, observability, supportability, authentication/authorization/audit, and more. You are expected to lead and mentor junior members of the team and set technical direction and roadmaps.
Main Responsibilities
- Design, develop and maintain enterprise features for the platform (including but not limited to authentication/authorization, RBAC, Single Sign-On, data governance)
- Build effective logging, monitoring and alerting for platform observability and supportability
- Improve platform security by maintaining regular scanning and patching of CVEs and applying security best practices in source code
- Collaborate with enterprise customers to understand product use cases, translate them into engineering specifications, and deliver high-quality solutions
- Lead and mentor junior members of the team and set technical directions and roadmaps in collaboration with internal stakeholders (e.g. PM, GTM)
- Participate in on-call responsibilities in rotation with the engineering team
- Work a hybrid schedule with three days per week in our Redwood City HQ or the SF office, and work remotely on "No Meeting" Tuesdays and Thursdays
Required Qualifications
- Bachelor's degree in Computer Science or related field, or equivalent demonstrated experience
- 8+ years of experience in distributed systems and cloud-native applications
- Strong experience with IAM, Data Security or Data Governance
- Strong experience with structured and unstructured data storage technologies
- Consistently follows software engineering best practices and holds a high bar for the team by leading design, code and test plan reviews
- Proven ability to lead and mentor teams of engineers
Preferred Qualifications
- Strong development experience in Python or another language such as Java, Go or Scala
- Extremely well versed in building and managing cloud infrastructure for enterprise platforms on AWS, GCP or Azure, using services such as EC2, EKS and VPC
- Experience with one or more build tools such as Bazel, Gradle or Make. Extra points for hands-on experience building and managing large code bases with these tools
- Experience designing and implementing developer-friendly APIs or tools to boost developer productivity
- Familiarity with deployment, monitoring and maintenance of large-scale enterprise software products
- Follows software development best practices and holds a high engineering bar for the team by regularly leading design, code and test plan reviews
- Experience working cross-functionally across teams including product, design, customer success and support
- Familiarity with developing and releasing infrastructure software for SaaS and on-prem platforms
- A voracious and intrinsic desire to learn and fill in missing skills, and an equally strong talent for sharing learnings clearly and concisely with others
- [Nice to have]: Hands-on experience setting up and operating Kubernetes clusters in production at scale
- [Nice to have]: Experience leading teams building large-scale distributed computing systems for ML training or serving, e.g. Ray, Spark or TensorFlow
- [Nice to have]: Hands-on experience in creating and maintaining metrics and dashboards on observability platforms such as New Relic, DataDog, Chronosphere, or similar tools
- [Nice to have]: Experience building services and infrastructure for machine learning and AI systems
The salary range for this position based in the San Francisco Bay Area is $200,000.00 - $270,000.00. All offers include equity compensation in the form of employee stock options.