Does AI Make You a Worse Developer? Anthropic's Study Has Answers

    Anthropic ran a randomized controlled trial on 52 software engineers to find out if AI coding tools hurt skill development. The results are worth reading before you hand everything to Copilot.

    Tob

    Backend Developer

    6 min read · AI Engineering

    We already know AI speeds up coding; Anthropic's own data shows it can cut task time by up to 80%. But a new randomized controlled trial from the same lab asks a harder question: what happens to your skills when you let AI do the work?

    TL;DR: AI users scored 17% lower on a coding quiz right after completing a task with AI help. The speed gain was not statistically significant. But how you use AI matters a lot, and not all AI reliance leads to skill loss.

    Study Design

    Anthropic recruited 52 junior- to mid-level software engineers, all active Python users, none of them familiar with the Trio async library. That last part matters because Trio was the subject of the task, meaning no one had prior experience to fall back on.

    Participants were split into two groups:

    • AI group: Coded using an online platform with an AI assistant in the sidebar that had full access to their code and could produce correct code on request
    • No AI group: Coded the same tasks by hand with no AI assistance

    Each participant implemented two features exercising Trio's asynchronous programming concepts, then immediately took a quiz. The quiz tested four skills: debugging, code reading, code writing, and conceptual understanding. Debugging, code reading, and conceptual questions were weighted more heavily, since those are the skills you need to actually oversee AI-generated code.

    Results

    The speed difference was small and not statistically significant. The AI group finished about two minutes faster on average.

    The quiz scores told a different story. The AI group averaged 50%; the no-AI group averaged 67%. That 17-point gap (Cohen's d = 0.738, p = 0.01) is nearly two letter grades. The biggest difference showed up on debugging questions, which is notable because debugging is exactly the skill you need when AI-generated code fails silently or does something subtly wrong.
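As a back-of-envelope sanity check on those numbers: Cohen's d is the difference in group means divided by the pooled standard deviation, so the reported means and effect size together imply a pooled spread of roughly 23 percentage points on the quiz. (The study doesn't report the standard deviation in this summary; this is just the arithmetic implied by the figures above.)

```python
# Cohen's d = (mean1 - mean2) / pooled_sd
# => pooled_sd = (mean1 - mean2) / d. Values from the study summary.
ai_mean, no_ai_mean = 50.0, 67.0   # average quiz scores, in percent
d = 0.738                          # reported effect size

implied_pooled_sd = (no_ai_mean - ai_mean) / d
print(round(implied_pooled_sd, 1))  # -> 23.0 percentage points
```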

    Anthropic also annotated screen recordings of every participant to understand how they actually used the AI. What they found was that interaction style predicted outcomes almost as much as whether someone used AI at all.

    Low-scoring patterns (average quiz score below 40%):

    • AI delegation: Handed everything to the AI, completed tasks fastest, learned the least
    • Progressive AI reliance: Started solo but gradually offloaded everything to AI
    • Iterative AI debugging: Used AI to fix errors without trying to understand why they happened

    High-scoring patterns (average quiz score 65% or higher):

    • Generation-then-comprehension: Let AI generate code, then asked follow-up questions to understand it
    • Hybrid code-explanation: Asked AI for code and explanations at the same time
    • Conceptual inquiry: Only asked AI conceptual questions, wrote code themselves, resolved their own errors

    The fastest overall group was AI delegation (full offload). The second fastest was conceptual inquiry (AI for understanding only). The difference is that one group learned almost nothing while the other retained strong understanding.

    Conclusion

    The study makes a specific and important point. AI probably does not hurt productivity on tasks where you already know what you are doing. The 80% speed gain Anthropic observed in earlier research was measured on tasks where participants had existing skills. This study measured what happens when you are learning something new.

    The two findings can both be true: AI accelerates work on familiar tasks and slows skill acquisition on unfamiliar ones.

    For junior developers, the risk is real. If you are in the stage where most of your work involves learning new concepts, tools, or patterns, and you are offloading that to AI, you might be completing tickets faster while building a much weaker foundation. When something breaks at 2am and the AI gives you the wrong answer, you need to know enough to catch it.

    The practical takeaway is not "stop using AI." It is about how you use it. Asking AI to explain what it generated, asking conceptual questions before asking for code, resolving errors yourself before reaching for AI help: these behaviors correlated with retained understanding in the study.

    Both Claude and ChatGPT have learning-focused modes (Claude Code Explanatory mode, ChatGPT Study Mode) built for exactly this. If you are learning something new, using those instead of the fastest-output mode is probably worth the extra time.

    Source: Anthropic Research, "How AI assistance impacts the formation of coding skills" (2026) — arxiv.org/abs/2601.20245
