A note before we begin. I have spent the last few weeks publishing Talking to Machines, a nine-chapter serial publication on my own practice with AI: what worked, what failed, and what each failure revealed. It is available in three installments: Ch. 1-2, Ch. 3-4, Ch. 5-9. This piece turns from practice to theory, or rather, to the theory underneath everyone else's practice.
The debate about AI and human cognition has followed predictable patterns since June 2025, when the MIT paper dropped. A study appears showing that students who lean on AI show weaker neural engagement, and the headlines declare that ChatGPT is making us dumber. A counter-study appears showing that strategic AI delegation produces deeper learning, and the response is that the doomers were wrong all along. But how do different definitions of learning shape studies and the interpretation of data?
What follows is an attempt to name these critical factors and to position the different definitions of learning embedded in these studies as sequential rather than mutually exclusive. In other words, the conflicting evidence is partly an artifact of definitional and methodological disconnects between studies whose empirical data are reported as diametrically opposed.
What Do We Mean by Learning?
The if... constructions below are not rhetorical hedges. Each one is a definition of learning in disguise.
If learning is primarily about retention and recall, the relevant question is whether the learner can reproduce and build on what they encountered. The Kosmyna et al. (2025) MIT study is alarming by this measure. Using EEG to measure neural connectivity across multiple sessions, the researchers found that students who wrote essays with ChatGPT showed the weakest brain connectivity of any group, and that 83 percent could not quote from essays they had just written when subsequently asked to write without AI. Passively consuming AI output leaves no trace in memory. On this definition, AI use during the generative phase is costly regardless of how it is designed.
If learning is primarily about perspective transformation, the relevant question is whether the learner comes out thinking differently than they went in. The Wang and Zhang (2026) study, published in the International Journal of Educational Technology in Higher Education, looks promising by this measure. Surveying 912 students across three continents, Wang and Zhang found that treating AI as an intellectual partner predicted simultaneous increases in critical vigilance toward AI outputs and strategic delegation of tasks, and that both independently predicted transformative learning. A U-shaped curve suggested that scattered, partial AI use produces worse outcomes than either no AI or committed strategic delegation. On this definition, how AI is used matters more than whether it is used.
If learning requires cognitive ownership, the felt sense of having done the thinking yourself, then the generative phase is where ownership is either secured or surrendered. This is not the friction argument, which says difficulty is the active ingredient. It is the more specific claim that you need to have been the one generating before you can meaningfully evaluate whether someone else generated well. On this definition, well-intentioned AI partnership may compromise something that better sequencing cannot restore.
If effort is the active ingredient in learning, the prescription follows directly: preserve productive friction, restrict AI, make it harder. This is the folk theory underlying Vivienne Ming’s widely circulated claims about AI and cognitive atrophy. The behavioral literature on desirable difficulties and spaced repetition documents real phenomena, but the friction model mistakes a correlate for a cause. Plenty of hard tasks produce no learning. Plenty of easy-feeling tasks produce profound learning.
If learning is fundamentally affective and social, grounded in embodied communal meaning-making, then the terms of the debate shift at a deeper level. The evolutionary and neuroanatomical record locates human cognitive flexibility in an affective and social substrate. Abstract concepts are grounded in interoceptive and communal experience. As Terry Underwood notes in a recent article, a challenging passage of Nietzsche, read with absorption and pleasure, can rearrange your thinking more profoundly than a task made artificially harder for a grade. The mechanism is engagement, not friction, and the peer-reviewed literature on human cognition already describes it with considerably more precision than any metaphor borrowed from physics.
If AI integration into education is the situation we are in and the task is to navigate it wisely, practical design questions are the right ones to pursue. If the inevitability of AI integration is itself a narrative worth examining, the design-first approach looks less like wisdom and more like a concession to institutional and commercial momentum.
The Definitions Are a Sequence
These definitions are not equally foundational, and they are not simply parallel. They describe different stages of a complex, multi-phased and multi-dimensional learning process.
The affective and social substrate is where learning usually begins. Before retention, before transformation, before ownership, there is the question of whether the learner is genuinely engaged with the material as a social and embodied being. This is the most fundamental level, and it is the one most dependent on conditions that have nothing to do with AI: the quality of the learning environment, the presence of other minds, the learner’s felt sense of why the material matters.
Cognitive ownership is what develops when that engagement is sustained through the work of generating. You have to have made something, even provisionally, before you can evaluate what AI makes. Ownership is not a virtue to be cultivated; it is a functional precondition for the critical vigilance that productive AI use requires.
Perspective transformation is what becomes possible when ownership is secure. The learner who has done the generative work can interrogate AI outputs because they have something to interrogate from. This is the stage Wang and Zhang are measuring, and their findings make sense at this stage: a learner with sufficient ownership can use AI partnership to push their thinking further than unassisted study typically allows.
Durable recall is a downstream consequence. What you have owned and transformed tends to stay. What you have consumed passively tends not to. This is the stage Kosmyna et al. are measuring, and their findings also make sense: students who delegated the generative work to AI had nothing to recall because they had not passed through the earlier stages.
Read this way, the studies are not in conflict. They are looking at different points in a sequence and finding, predictably, that AI use at each point has different effects.
The Methodological Problem
The sequence matters for a reason that goes beyond theory. Each stage requires a different research instrument, and the instruments are not interchangeable, though the ongoing debate assumes they are.
EEG in a lab measures neural connectivity during a specific task. It is well suited to detecting whether the brain is doing generative work at the ownership stage. It cannot see perspective transformation, which unfolds over weeks and requires the kind of reflective survey instrument Wang and Zhang used. A cross-continental survey of 912 students can detect shifts in how learners understand their subjects. It cannot detect what is happening in the brain during a 20-minute writing session.
The press reports these studies as if they are testing the same hypothesis with different results. They are testing different hypotheses with instruments appropriate to each. When a headline announces that AI damages learning, it is usually citing evidence from the ownership or recall stage. When a headline announces that AI deepens learning, it is usually citing evidence from the transformation stage. The conflict is real, but it is a conflict between stages, not between findings about the same thing.
This has a practical consequence for how the field should develop. The question is not which study is right. It is whether we can build research programs that follow learners through the sequence, tracking what happens at each stage when AI is introduced in different ways at different points. That is a harder and more expensive research program than any single study, and it is the one the debate actually requires.
Four Researchers, Three Stages
Dr. Philippa Hardman is working at the transformation stage. Her three-zone model and her recovery of the generation effect as a design principle are most useful to learners who already have sufficient ownership to interrogate AI outputs critically. Her practical recommendations presuppose that precondition.
Dr. Terry Underwood is working at the substrate level. His argument is that a better-evidenced account of human cognition already exists in the peer-reviewed literature, and that public prescriptions about AI should be grounded there rather than in unpublished experiments and physics metaphors. He is also raising an epistemic concern that cuts across all stages: when unpublished claims travel through credentialed venues, repetition begins to function as corroboration.
Leon Furze, a PhD researcher studying AI’s implications for writing instruction, is working at the ownership stage. His argument that resistance to AI can be a coherent pedagogical stance is most legible here: you need to have done the work before you can meaningfully judge whether AI has done it well. The gradual release of responsibility model is disrupted when a third party absorbs all the responsibility without intending to return it.
Tina Austin, whose LAK'26 paper develops a framework for metacognition in AI-saturated learning environments, is also working at the ownership stage. Where Furze frames resistance as a pedagogical stance, Austin operationalizes both resistance and visibility as measurable outcomes, with resistance understood as an explicit Level 5 achievement alongside Create. In her Substack essays, Austin argues that students who generate first develop ownership over their thinking that makes subsequent AI use meaningful. Ownership, on this view, is what makes metacognition possible, and her goal is to make it measurable.
None of these positions is sufficient alone because no single stage is the whole of learning.
The Common Footing
We are not standing outside this transition looking in. Educators, students, and researchers are negotiating a shared shift in the conditions of thinking, doing it in real time without a settled account of what thinking is.
The question the debate needs is not which theory of learning is correct, but how the field builds research programs that can actually follow a learner through the sequence. That requires instruments suited to each stage, longitudinal designs that can track transfer across stages, and enough methodological transparency that a press cycle cannot mistake a finding about one stage for a verdict about learning as a whole.
Progress will come from getting clearer about what each study is actually measuring, and from being honest about which stage of learning is doing the work behind each conclusion.
Nick Potkalitsky, Ph.D.
References
Hardman, P. (2026, April 16). The “Cognitive Offloading” Paradox. Dr Phil’s Newsletter.
Underwood, T. (2026, April 26). Cognitive Friction: The Folk Intuition About Learning That Distorts the LLM Debate. Learning to Read, Reading to Learn.
Furze, L. (2026). AI Resistance Training Toolkit. leonfurze.com.
Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X. H., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv preprint arXiv:2506.08872.
Wang, X., & Zhang, J. (2026). Pedagogical partnerships with generative AI in higher education: how dual cognitive pathways paradoxically enable transformative learning. International Journal of Educational Technology in Higher Education.
Austin, T. (2025). The IKEA Effect and process-based learning: Why students stop caring when AI does the thinking. Tina Austin’s Substack.