Anthropic Taps 1,000 Contract Engineers To Train Claude Code & Make It Think Like A Developer

Anthropic’s Claude Code may look like another fast-moving AI coding breakthrough, but behind the tool’s rise is a quiet global workflow powered by human software engineers.

A project run through Snorkel AI is using contractor feedback to refine how Claude Code writes, reviews, and maintains code.

TL;DR

Anthropic’s Claude Code is reportedly being improved through a Snorkel AI project called Marlin.
Around 1,000 software engineers are helping test, compare, and refine AI-generated code.
Some contractors said they are paid $280 per task, with each task taking about an hour.
The work focuses on cleaner, safer, more maintainable code.

How Project Marlin Is Helping Claude Code Improve AI Coding

Training an AI coding tool to behave more like an experienced developer takes more than model upgrades. According to contractor interviews and training material reviewed by Business Insider, Anthropic’s Claude Code is being strengthened through a Snorkel AI project internally known as Marlin.

The project uses feedback from about 1,000 human software engineers who create prompts, review code, and compare outputs from two different models. The aim is to fine-tune Claude Code’s answers so they more closely resemble what a professional developer would deliver.

Two contractors working on the project said they were paid $280 per task. They added that a task usually takes about an hour, though some submissions require additional review and back-and-forth with Snorkel’s approval layer.

Why Human Software Engineers Are Still Central To AI Coding Tools

Project Marlin shows how AI coding systems still depend heavily on specialist human expertise. Contractors with software engineering backgrounds are asked to A/B test code written by two models, select the stronger answer, and evaluate whether the response meets the level of detail requested in the prompt.

One contractor said the goal was to train Claude Code to produce simplified, easier-to-maintain code. That focus matters because coding tools are no longer being judged only on whether they can generate working snippets. They are increasingly expected to produce production-ready code that is readable, reliable, secure, and practical for real software teams.

The contractors did not know which versions of the models they were testing. The project is also ongoing, meaning the work may continue feeding into future Claude Code improvements.

What Contractors Are Testing Inside Claude Code’s Workflow

The tasks are designed to mirror real software development scenarios. Contractors were instructed to pick a GitHub repository from a list of thousands, create a pull request, and write a prompt explaining what the model should do.

In one example, a contractor asked the model to reorganize how a system stores and handles execution metadata. The goal was to make the code clearer and easier for developers to work with, without changing the underlying product or feature behavior.

The model then returned two sets of code, and the contractor selected the version they considered more efficient. Contractors were also asked to issue follow-up prompts to test how the models handled conversational context across a coding task.
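As a rough illustration of the comparison step described above, the Python sketch below records which of two blinded responses a reviewer preferred and tallies the results. The `Comparison` fields and the `win_rates` helper are hypothetical names chosen for this example; the source does not describe how Snorkel or Anthropic actually store or aggregate this feedback.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One blinded A/B judgment (hypothetical schema, not from the source)."""
    prompt: str
    response_a: str
    response_b: str
    preferred: str  # "a" or "b"; the reviewer does not know which model is which
    reason: str = ""

    def __post_init__(self):
        if self.preferred not in ("a", "b"):
            raise ValueError("preferred must be 'a' or 'b'")


def win_rates(comparisons):
    """Return the share of judgments won by each side, or {} if there are none."""
    counts = Counter(c.preferred for c in comparisons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in ("a", "b")}
```

A follow-up prompt in the same task could be stored as another `Comparison` with the earlier turns included in `prompt`, which is one simple way to keep conversational context attached to each judgment.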

Another task focused on security. A contractor prompted the model to fix how MLflow, an open-source machine learning platform, downloads Python packages when loading certain models.

The task instructions told the contractor to “evaluate production-ready code based on correctness, security, reliability, and maintainability. The fix must properly block command injection attempts while still allowing all legitimate whitelisted pip options.”

That detail highlights the bar AI coding tools now need to clear. For enterprise developers, clean code is only part of the equation. The output also needs to avoid security pitfalls, preserve legitimate functionality, and withstand real-world production use.
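To make that requirement concrete, here is a minimal Python sketch of one common way to block command injection while still accepting an allowlist of pip options. It is not MLflow's code and not the fix the contractors reviewed; the option sets, the `build_pip_command` name, and the validation rules are assumptions made for illustration.

```python
from __future__ import annotations

import re
import shlex
import sys

# Hypothetical allowlists: the source does not say which pip options are permitted.
ALLOWED_FLAGS = frozenset({"--no-deps", "--upgrade", "--pre", "--no-cache-dir"})
ALLOWED_VALUE_OPTIONS = frozenset({"--index-url", "--extra-index-url", "--constraint"})

# Deliberately strict pattern: a package name with an optional version pin.
REQUIREMENT_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*((==|>=|<=|~=|!=)[A-Za-z0-9.*+!]+)?")
UNSAFE_CHARS = frozenset(";&|`$<>\n\r")


def _check_value(option: str, value: str) -> str:
    """Reject empty values, values that look like options, and shell metacharacters."""
    if not value or value.startswith("-") or UNSAFE_CHARS & set(value):
        raise ValueError(f"rejected value for {option}: {value!r}")
    return value


def build_pip_command(requirement: str, extra_options: str = "") -> list[str]:
    """Return an argv list for `pip install`, raising ValueError on anything not allowlisted."""
    if not REQUIREMENT_RE.fullmatch(requirement):
        raise ValueError(f"rejected requirement: {requirement!r}")
    argv = [sys.executable, "-m", "pip", "install"]
    tokens = iter(shlex.split(extra_options))
    for token in tokens:
        name, sep, inline_value = token.partition("=")
        if token in ALLOWED_FLAGS:
            argv.append(token)
        elif name in ALLOWED_VALUE_OPTIONS:
            # Accept both `--opt value` and `--opt=value` forms.
            value = inline_value if sep else next(tokens, "")
            argv.extend([name, _check_value(name, value)])
        else:
            raise ValueError(f"option not in allowlist: {token!r}")
    argv.append(requirement)
    return argv
```

The returned list is meant to be passed to `subprocess.run(argv, check=True)` without `shell=True`, so the arguments never pass through a shell. The allowlist and character checks act as a second layer on top of that, which is why legitimate options still go through while anything unrecognized is refused.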

How AI Data Work Is Becoming More Specialized

The Claude Code project also reflects a broader shift in the AI training industry. Data-labeling platforms that once leaned on generalist contractors are increasingly hiring experts with technical, legal, medical, and research backgrounds.

Snorkel says it works with people who hold advanced degrees such as PhDs, MDs, and JDs, or who have equivalent experience. The company also says top experts earn more than $3,000 a week.

Software engineering has become one of the clearest examples of this specialization. Besides Snorkel, platforms such as Scale AI and Mercor also offer up to $110 an hour for software engineers’ AI training work.

The AI Training Economy Behind Claude Code

Snorkel AI, founded in 2019 by Stanford researchers, creates datasets and evaluation systems to improve AI models and test chatbots for AI companies. Its customer list includes Google, Mistral, and Anthropic.

The company raised $100 million in Series D funding at a $1.3 billion valuation in May 2025. It later cut 13% of its workforce in September, according to Business Insider.

Snorkel is part of a wider ecosystem that includes Scale AI, Mercor, and Handshake, where large pools of contractors help rank, filter, and train AI responses for major technology companies. This kind of data work supports everything from autonomous vehicles to AI chatbots from companies such as OpenAI and Meta.

Neither Anthropic nor Snorkel responded to requests for comment.

What This Means For AI Coding Tools

Claude Code’s progress may appear to come from model intelligence alone, but Project Marlin suggests a more layered reality. Human engineers are still shaping how AI systems reason through software tasks, identify cleaner solutions, and respond to complex technical prompts.

As AI coding tools become more central to software development, the invisible labor behind them is becoming more specialized, better paid, and more critical. The result is a new kind of coding pipeline, where AI writes the code, but human engineers are still teaching it what “good” looks like.