Research Engineer, Metaprompting

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role:

Anthropic’s AI technology is amongst the most capable and safe in the world. However, large language models are a new type of intelligence, and the art of instructing and evaluating them in a way that delivers the best results is still in its infancy: it’s a hybrid between research, engineering, and behavioral science. We’re bringing rigor to this new discipline by applying a range of techniques to systematically discover and document prompting best practices, using our models to improve training and evaluation, developing prompt self-improvement techniques to automatically optimize the model’s performance on any given task, and finding ways of making it easy for our customers to do the same.

Given that this is a nascent field, we ask that you share with us in your application a specific prompting, model evaluation, synthetic data generation, or model finetuning project, or an application built on LLMs, that you're proud of! Ideally this project should show off a complex and clever prompting architecture, a systematic evaluation of an LLM's behavior in response to different prompts, or an example of using LLMs for a relevant ML task such as careful dataset curation and processing. There is no preferred task; we just want to see how you create and experiment with prompts. You can also include a short description of the process you used or any roadblocks you hit and how you dealt with them, but this is not a requirement.


Responsibilities:

  • Develop automated prompting techniques for our models (e.g., extensions to the Metaprompter).
  • Finetune new capabilities into Claude that maximize Claude’s performance or ease of use given particular prompting innovations.
  • Lead automated evaluation of Claude models and prompts across the training and product lifecycle.
  • Help create and optimize data mixes for model training.
  • Develop and systematically test new, creative, and original prompting strategies for a wide range of research tasks relevant to our fine-tuning and end product efforts.
  • Help to create and maintain the infrastructure required for efficient prompt iteration and testing.
  • Develop future Anthropic products built on top of Claude.
  • Stay up-to-date with the latest research in prompting and model orchestration, and share knowledge with the team.

You may be a good fit if you:

  • Have significant ML research or software engineering experience.
  • Have at least a high-level familiarity with the architecture and operation of large language models.
  • Have extensive prior experience exploring and testing language model behavior.
  • Have spent time prompting and/or building products with language models.
  • Have good communication skills and an interest in working with other researchers on difficult prompting tasks.
  • Have a passion for making powerful technology safe and societally beneficial.
  • Stay up-to-date and informed by taking an active interest in emerging research and industry trends.
  • Enjoy pair programming (we love to pair!).

Strong candidates may also have:

  • Advanced degree in computer science, mathematics, statistics, physics, or a related technical field, or an advanced degree in a relevant non-technical field alongside evidence of programming experience.
  • Experience with large-scale model training and evaluation.
  • Language modeling with transformers.
  • Reinforcement learning.
  • Large-scale ETL.

Representative projects:

  • Building the prompting and model orchestration for a production application backed by a language model.
  • Finetuning Claude to maximize its performance when a particular prompting technique is used.
  • Building and testing an automatic prompt optimizer or automatic LLM-driven evaluation system for judging a prompt’s performance on a task.
  • Implementing a novel retrieval, tool use, sub-agent, or memory architecture for language models.
  • Building a scaled model evaluation framework driven by model-based evaluation techniques.

Deadline to apply: None. Applications will be reviewed on a rolling basis. 

The expected salary range for this position is:

Annual Salary:
$315,000 - $510,000 USD


Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

US visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate; operations roles are especially difficult to support. But if we make you an offer, we will make every effort to get you into the United States, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Compensation and Benefits*

Anthropic’s compensation package consists of three elements: salary, equity, and benefits. We are committed to pay fairness and aim for these three elements collectively to be highly competitive with market rates.

Equity - For eligible roles, equity will be a major component of the total compensation. We aim to offer higher-than-average equity compensation for a company of our size, and communicate equity amounts at the time of offer issuance.

US Benefits - The following benefits are for our US-based employees:

  • Optional equity donation matching.
  • Comprehensive health, dental, and vision insurance for you and all your dependents.
  • 401(k) plan with 4% matching.
  • 22 weeks of paid parental leave.
  • Unlimited PTO – most staff take between 4 and 6 weeks each year, sometimes more!
  • Stipends for education, home office improvements, commuting, and wellness.
  • Fertility benefits via Carrot.
  • Daily lunches and snacks in our office.
  • Relocation support for those moving to the Bay Area.

UK Benefits - The following benefits are for our UK-based employees:

  • Optional equity donation matching.
  • Private health, dental, and vision insurance for you and your dependents.
  • Pension contribution (matching 4% of your salary).
  • 21 weeks of paid parental leave.
  • Unlimited PTO – most staff take between 4 and 6 weeks each year, sometimes more!
  • Health cash plan.
  • Life insurance and income protection.
  • Daily lunches and snacks in our office.

* This compensation and benefits information is based on Anthropic’s good faith estimate for this position as of the date of publication and may be modified in the future. Employees based outside of the UK or US will receive a different benefits package. The level of pay within the range will depend on a variety of job-related factors, including where you place on our internal performance ladders, which is based on factors including past work experience, relevant education, and performance on our interviews or in a work trial.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us!

Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.
