---
title: "How AI Impacts Skill Formation"
type: source
tags: [ai-skill-formation, cognitive-offloading, software-engineering, rct, anthropic, learning]
authors: [Shen, Judy Hanwen; Tamkin, Alex]
year: 2026
venue: "arXiv:2601.20245 (cs.CY); Anthropic Fellows Program"
kind: paper
raw_path: "raw/AI Capabilities & Adoption/How AI Impacts Skill Formation.pdf"
sources: []
key_claims:
  - "Randomised experiment with developers learning the Trio asynchronous Python library with vs. without AI assistance; AI group scored 17 percentage points lower (~two grade points) on a post-task competency quiz, Cohen's d = 0.738, p = 0.010."
  - "Competency deficit spans conceptual understanding, code reading, and debugging — the skills needed to supervise AI-generated code."
  - "No statistically significant completion-time speedup from AI assistance on average; participants who fully delegated saw some productivity gains but at the cost of learning."
  - "Six AI-interaction patterns identified; three preserve skill formation (Generation-then-Comprehension, Hybrid Code-Explanation, Conceptual Inquiry) and three do not (AI Delegation, Progressive AI Reliance, Iterative AI Debugging)."
  - "Ratio of debugging queries to total queries correlates negatively with quiz score (r = -0.41, p = 0.043) — heavy debugging-via-AI predicts weaker skill acquisition."
  - "Control-group skill gains come from the process of encountering and independently resolving errors."
  - "Productivity gains from AI do not equal competence; workflows must be designed to preserve skill formation, especially in safety-critical domains."
  - "Findings extend overreliance / cognitive-offloading literature (Buçinca 2021; Gerlich 2025; Lee 2025) with a causal randomised design rather than observational survey."
created: 2026-04-20
updated: 2026-04-20
---

# How AI Impacts Skill Formation

## Summary
Shen and Tamkin (Anthropic Fellows Program, January 2026) run a randomised controlled trial to measure whether AI coding assistants hinder junior developers' acquisition of new technical skills. Software engineering is used as the testbed because (a) productivity gains from AI are well-documented and largest for novices (Peng et al. 2023; Cui et al. 2024) and (b) the supervisor-of-automation framing (Autor et al. 2001; Bleher & Braun 2022) makes deep skill the key safety prerequisite.

**Design.** Participants complete coding tasks using Trio, a relatively new asynchronous Python library. Treatment participants have access to a Claude-powered assistant ("Cosmo"); control participants do not. After the coding phase, all participants take a 30-minute closed-book evaluation covering seven Trio concepts (async/await, starting Trio functions, error handling, coroutines, memory channels, nurseries, sequential vs. concurrent execution) across three question types: **conceptual understanding, code reading, debugging**.

**Main findings.** The AI group averaged 50% on the quiz against 67% for the control group, a gap of 17 percentage points or nearly two letter grades (Cohen's d = 0.738, p = 0.010). The debugging sub-score showed the largest gap. Completion time was not significantly shorter with AI: some AI users sent 15 or more queries or spent more than 30% of the task window composing prompts. The authors attribute the control group's skill gains to the learning value of encountering and independently resolving errors.
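
The headline numbers can be related with a little arithmetic. The sketch below uses only the figures quoted above and assumes the standard definition of Cohen's d (mean difference divided by pooled standard deviation) applied to the rounded group means. The implied pooled SD is therefore a back-of-envelope estimate, not a value reported in this summary.

```python
# Figures quoted in the summary above.
ai_mean = 50.0       # AI group mean quiz score, in percent
control_mean = 67.0  # control group mean quiz score, in percent
cohens_d = 0.738     # reported standardised effect size

# Absolute gap in percentage points, and the same gap relative to control.
gap_points = control_mean - ai_mean
relative_drop = gap_points / control_mean

# Assumption: d = gap / pooled SD, so pooled SD = gap / d.
implied_pooled_sd = gap_points / cohens_d

print(f"gap: {gap_points:.0f} percentage points")             # gap: 17 percentage points
print(f"relative drop: {relative_drop:.1%}")                  # relative drop: 25.4%
print(f"implied pooled SD: {implied_pooled_sd:.1f} points")   # implied pooled SD: 23.0 points
```

The distinction matters when quoting the result: the gap is 17 percentage points, which corresponds to roughly a quarter of the control group's score.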

**Taxonomy of AI usage.** A qualitative review of every screen recording yields six interaction patterns:

*High-skill-retention (cognitively engaged):*
- **Generation-then-Comprehension** — generate code, then ask follow-up clarification questions.
- **Hybrid Code-Explanation** — interleave code and explanation requests.
- **Conceptual Inquiry** — ask only conceptual questions, resolve errors independently.

*Low-skill-retention (cognitive offloading):*
- **AI Delegation** — hand the task to the AI.
- **Progressive AI Reliance** — start engaged, slide into delegation.
- **Iterative AI Debugging** — outsource error-resolution to AI.
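
For tagging or querying sessions in a notes system, the taxonomy above can be captured as a small lookup table. This is an illustrative sketch only: the pattern names and the two groups come from the list above, while the `Retention` enum, the `PATTERN_RETENTION` dict, and the `patterns_with` helper are hypothetical names introduced here.

```python
from enum import Enum


class Retention(Enum):
    HIGH = "cognitively engaged"
    LOW = "cognitive offloading"


# Pattern names follow the list above; the dict layout is an assumption.
PATTERN_RETENTION = {
    "Generation-then-Comprehension": Retention.HIGH,
    "Hybrid Code-Explanation": Retention.HIGH,
    "Conceptual Inquiry": Retention.HIGH,
    "AI Delegation": Retention.LOW,
    "Progressive AI Reliance": Retention.LOW,
    "Iterative AI Debugging": Retention.LOW,
}


def patterns_with(retention):
    """Return the pattern names in one retention group, in listed order."""
    return [name for name, group in PATTERN_RETENTION.items() if group is retention]


print(patterns_with(Retention.LOW))
```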

**Implications.** Productivity and skill formation can be decoupled; AI-enhanced output is not a shortcut to competence. As human ability to supervise AI becomes the scaling bottleneck (Bowman et al. 2022), workflow design — learning modes, cognitive-engagement nudges, time budgets for independent debugging — must actively preserve the cognitive process that produces expertise.

## Connections
- Hub evidence for [[concepts/ai-skill-formation]] — with companion blog post [[sources/2026-shen-anthropic-coding-skills-post]].
- Compare with the real-repository productivity findings in [[sources/2025-becker-metr-ai-developer-productivity]] (experienced developers in large OSS repos were actually slower with AI).
- Bottom-up usage complement: [[sources/2025-handa-which-economic-tasks-ai]] (57% augmentation / 43% automation split).
- Enterprise-side skill-atrophy concern mirrored in [[sources/2025-korst-wharton-gen-ai-enterprise-adoption]] (43% of leaders warn of skill declines).
- Anthropic internal-usage hub links to [[entities/alex-tamkin]].
- New entities: [[entities/judy-hanwen-shen]], [[entities/alex-tamkin]].