Early Thoughts on A.I. Research in Schools
Andrew Watson

I hope that one of my strengths as a blogger is: I know what I don’t know — and I don’t write about those topics.

While I DO know a lot about cognitive science — working memory, self-determination theory, retrieval practice — I DON’T know a lot about technology. And: I’m only a few miles into my own A.I. journey; no doubt there will be thousands of miles to go. (My first foray along the ChatGPT path, back in February of this year, did not go well…)


Recently I came across research that looks at A.I.’s potential benefits for studying. Because I know studying research quite well, I feel confident enough to describe this particular experiment and consider its implications for our work.

But before I describe that study…

Guiding Principles

Although I’m not a student of A.I., I AM a student of thinking. Few cognitive principles have proven more enduring than Dan Willingham’s immortal sentence: “memory is the residue of thought.”

In other words, if teachers want students to remember something, we must ensure that they think about it.

More specifically:

  • they should think about it successfully (so we don’t want to overload working memory)
  • they should think about it many times (so spacing and interleaving will be important cognitive principles)
  • they should think hard about it (so desirable difficulty is a thing)

And so forth.

This core principle — “memory is the residue of thought” — prompts an obvious concern about A.I. in education.

In theory, A.I. simplifies complex tasks. In other words, it reduces the amount of time I think about that complexity.

If artificial intelligence reduces the amount of time that I’m required to think about doing the thing, it necessarily reduces the amount of learning I’ll do about the thing.

If “memory is the residue of thought,” then less thinking means less memory, and less learning…

Who Did What?

Although discussions of generative A.I. often sound impenetrable to me, this study followed a clear and sensible design.

Researchers from the University of Pennsylvania worked with almost 1000 students at a high school in Turkey. (In this kind of research, 1000 is an unusually high number.)

These students spent time REVIEWING math concepts they had already learned. This review happened in three phases:

Phase 1: the teacher re-explained math concepts.

Phase 2: the students practiced independently.

Phase 3: the students took a test on those math concepts. (No book; no notes; nada.)

For all students, phases 1 and 3 were identical. Phase 2, however, gave researchers a chance to explore their question.

Some students (let’s call them Group A) practiced in the usual way: the textbook, their notes, paper and pencil.

Group B, on the other hand, practiced with ChatGPT at hand. They could ask it questions to assist with their review.

Group C practiced with a specially designed ChatGPT tutor. This tutor was programmed not to give answers to students’ questions, but to provide hints. (There were other differences between ChatGPT and the ChatGPT tutor, but this difference strikes me as most pertinent.)
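Out of curiosity — and this is purely my own illustration, NOT the researchers’ actual tutor — here’s roughly what “give hints, not answers” can look like if you build it yourself. The sketch assumes the OpenAI Python client and a hypothetical model name; as noted above, the study’s tutor differed from plain ChatGPT in other ways too.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The whole "hints, not answers" idea lives in the system prompt.
HINT_ONLY_PROMPT = (
    "You are a math tutor helping a high-school student review. "
    "Never state the final answer to a problem. Instead, give ONE short hint "
    "that points the student toward the next step, then ask them to try that "
    "step on their own."
)

def ask_tutor(student_question: str) -> str:
    """Send one student question to the hint-only tutor and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice, for illustration only
        messages=[
            {"role": "system", "content": HINT_ONLY_PROMPT},
            {"role": "user", "content": student_question},
        ],
    )
    return response.choices[0].message.content

# Example: the student still has to do the remembering and the thinking.
# print(ask_tutor("What is the solution to 2x + 6 = 14?"))

The point isn’t the code; it’s that the design choice lives in the instructions we give the model — and that choice determines how much thinking the student still has to do.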

So: did ChatGPT help?

Did the students in Groups B and C have greater success on the practice problems, compared to Group A?

Did they do better on the test?

Intriguing Results

The students who used A.I. did better on the practice problems.

Those who used ChatGPT scored 48% higher than their peers in Group A.

Those who used the ChatGPT tutor scored (are you sitting down?) 127% higher than their peers in Group A.

Numbers like these really get our attention!

And yet…we’re more interested in knowing how they did on the test; that is, how well they did when they couldn’t look at their books or ask Chatty questions.

In brief: had they LEARNED the math concepts?

The students who used regular ChatGPT scored 17% lower than their notes-n-textbook peers.

Those who used the ChatGPT tutor scored the same as those peers.

In brief:

A.I. helped students succeed during practice.

But, because it reduced the amount of time they had to THINK about the problems, it didn’t help them learn.

Case closed.

Case Closed?

In education, we all too easily rush to extremes. In this case, we might easily summarize this study in two sentences:

“A.I. certainly didn’t help students learn; in some cases it harmed their learning. Banish A.I.!”

While I understand that summary, I don’t think it captures the full message that this study gives us.

Yes: if we let students ask ChatGPT questions, they think less and therefore learn less. (Why do they think less? Probably they simply ask for the answer to the question.)

But: if we design a tutor that offers hints not answers, we reduce that problem … and eliminate the difference in learning. (Yes: the researchers have data showing that the students spent more time asking the tutor questions; presumably they had to think harder while doing so.)

As a non-expert in this field, I suspect that — sooner or later — wise people somewhere will be able to design A.I. tutors that are better at offering thought-provoking hints. That is: perhaps an A.I. tutor might cause students to think even MORE than other students practicing the old-fashioned way.

That two-sentence summary above might hold true today. But we’ve learned this year that A.I. evolves VERY rapidly. Who knows what next month will bring?

TL;DR

Although THIS study suggests that A.I. doesn’t help (and might harm) learning, it also suggests that more beneficial A.I. tutors might exist in the future.

If — and this is the ESSENTIAL “if” — if A.I. can prompt students to THINK MORE than they currently do while practicing, then well-established cog-sci principles suggest that our students will learn more.


* A note about the publication status of this study. It has not yet been peer reviewed and published, although it is “under review” at a well-known journal. So, it’s technically a “working paper.” If you want to get your research geek on, you can check out the citation below.


Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, O., & Mariman, R. (2024). Generative AI can harm learning. Available at SSRN 4895486.

Again with the Questions (Second of a Series)
Andrew Watson

Three weeks ago, I started a short series of blog posts about asking questions.

After all, we’ve got SO MUCH RESEARCH about questions, we need to keep track and make sense of it all.

To structure these posts, I’ve been focusing on these three topics:

When to ask a particular kind of question?

Who benefits most immediately from doing so?

What do we do with the answers?

So, for questions that we ask BEFORE learning starts (“before” is the “when”):

We teachers check our students’ prior knowledge for our own benefit; now we know how best to plan an upcoming unit.

and

We ask “prequestions” for students’ benefit; it turns out that — even though students don’t know the answers to prequestions — they benefit from trying to answer them.

So: here’s the next “when”…

DURING Class

We’ve explored the questions to ask BEFORE students start learning (prior knowledge, “prequestions”).

What about DURING the learning process?


Here again, I think we’ve got two essential categories. Importantly, we should plan and think about these questions differently.

First: checking for understanding.

Of course, we want our students to understand the ideas and processes that we’re discussing in class today. Alas, “understanding” happens invisibly, inside our students’ minds.

The only way to ensure that they understand: ask them questions. Their answers will make that invisible understanding visible — or, perhaps, audible.

When checking for understanding, we should keep some key principles in mind:

We should check for understanding frequently throughout a lesson. The correct number of times will vary depending on context. As a high school teacher, I rarely go more than seven or eight minutes without some kind of check.

As Doug Lemov says: “reject self report.” Our students don’t understand today’s topic well enough to know whether or not they understand — so it NEVER helps to ask students “got it?”

Be sure that everyone answers the checking-for-understanding questions. Whether we use mini-whiteboards or cold calling or Quizlet, we want as broad a sample as possible of our students before we move on to the next step of the topic.

We should ask tricky questions, but not trick questions. That is: the questions should be difficult enough to ensure that students genuinely understand the topic (that’s “tricky”), but we’re not trying to fool them (“trick”).

Of course, wise thinkers have LOTS more to say about checking for understanding, but these few principles give us a strong start.

Important Distinctions

So, “who benefits” from checking for understanding? And: “what do we do with the answers”?

Roughly speaking, the teacher benefits from checking for understanding. If I C4U and discover that my students DO understand, I know:

a) that my teaching method for those several minutes worked as I had hoped, and

b) that I can continue the lesson.

If my students DON’T understand, I know:

a) it didn’t, and

b) I shouldn’t.

In other words: checking for understanding provides useful feedback to the teacher.

What should I do with the answers to these questions?

The right and wrong answers I see/hear will guide me as I decide what to do next.

If, for instance, my students answer a question about direct objects incorrectly, I might not provide the correct answer right away. But I will draw on that feedback when I think about revising my lesson plan for the next 10 minutes.

During Class, Part 2

Of course, not all in-class questions focus on checking for understanding new material. I might — in fact, I should — devote some class time to reviewing and consolidating ideas that students have already learned.

To meet this goal, I will almost certainly rely on retrieval practice.

This blog has written EXTENSIVELY about retrieval practice, so I won’t do a deep dive here. You can check out previous posts, or savor this awesome website.

The headline is: students learn more not by reviewing material but by actively trying to retrieve it.

Rather than say: “Remember, the Ideal Gas Law says that PV = nRT.”

I should say:

“Okay, please try to remember the Ideal Gas Law. Don’t shout out; I’ll ask you to write it on your mini-whiteboards in a moment.”
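(In case the law itself is hazy — this gloss is mine, not part of the lesson script above:

PV = nRT \qquad P = \text{pressure},\; V = \text{volume},\; n = \text{moles of gas},\; R = \text{the gas constant},\; T = \text{absolute temperature}

The retrieval prompt asks students to pull that whole equation back out of memory.)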

We’ve got heaps o’ research showing that the extra mental effort required by retrieval helps consolidate memories.

Notice: I’m NOT trying to see if students have an initial understanding. When I taught this concept last week, I checked for understanding. My students DID understand it.

Instead, I’m trying to consolidate the understanding they had back then.

Important Distinctions

Once again: “who benefits” from retrieval practice? And: “what do we do with the answers”?

Whereas I, the teacher, benefit from checking for understanding, MY STUDENTS benefit from retrieval practice. That mental effort helps them consolidate and transfer the ideas they retrieved.

(Of course, I do get useful feedback about the stickiness of their prior learning, but that’s not the primary goal of RP.)

What should I do with the answers? Especially wrong answers?

This question leads to a surprisingly intricate answer. The short version goes like this:

If I have time, it’s helpful to correct wrong answers to retrieval practice questions ASAP.

I should do so ESPECIALLY if the question touches on an important core idea or procedure.

But:

Students get the benefit of retrieval practice even if they get the answer wrong. As long as they come across the correct answer eventually, they’ll benefit.

This topic gets nuanced quickly, but the headline is: wrong answers aren’t tragedies in retrieval-practice world.

To Sum Up

We ask students questions BEFORE learning; we take stock of their prior knowledge, and seed future learning with prequestions.

DURING class, we frequently check for understanding to ensure that current learning is happening as we hoped.

And we ask retrieval practice questions about ideas and procedures learned before, in order to help them consolidate that learning.

If we understand how these questions differ in purpose, and in what we do with the answers, we will use them more effectively.

Getting the Details Just Right: Retrieval Practice
Andrew Watson

As we gear up for the start of a new school year, we’re probably hearing two words over and over: retrieval practice.

That is: students have two basic options when they go back over the facts, concepts, and procedures they’ve learned.

Option 1: they could review it; that is, reread a passage, or rewatch a video, or review their notes.

Option 2: they could retrieve it; that is, ask themselves what they remember about a passage, a video, or a page of notes.

Well, the research verdict is clear: lots of research shows that OPTION 2 is the winner. The more that students practice by retrieving, the better they remember and apply their learning in the long term.

This clear verdict, however, raises lots of questions.

How, exactly, should we use retrieval practice in classrooms?

Does it work in all disciplines and all grades?

Is its effectiveness different for boys and girls?

Does retrieval practice help students remember material that they didn’t practice?

Do multiple choice questions count as retrieval practice?

And so forth.

Given that we have, literally, HUNDREDS of studies looking at these questions, we teachers would like someone to sort through all these sub-questions and give us clear answers.


Happily, a research team recently produced just such a meta-analysis. They looked at 222 studies including more than 48,000 students, and asked nineteen specific questions.

These numbers are enormous.

Studies often get published with a few dozen participants – which is to say, a lot less than 48,000.

Researchers often ask 2 or 3 questions – or even 1. I don’t recall ever seeing a study or meta-analysis considering nineteen questions.

As a result, we’ve got a lot to learn from this meta-analysis, and we can feel more confident than usual about its conclusions.

The Big Picture

For obvious reasons, I won’t discuss all nineteen questions in detail. Instead, I’ll touch on the big-picture conclusions, highlight some important questions about practical classroom implementation, and point out a few surprises.

The high-level findings of this meta-analysis couldn’t be more reassuring.

YES: retrieval practice enhances long-term memory.

YES: in fact, it enhances memory of facts and concepts, and improves subsequent problem solving. (WOW.)

YES: it benefits students from kindergarten to college, and helps in all 18 (!!) disciplines that the researchers considered.

NO: the student’s gender doesn’t matter. (I was honestly a little surprised they studied this question, but since they’ve got an answer I’m reporting it here.)

I should note that these statistical results mostly fall in the “medium effect size” range: a Hedges’ g of roughly 0.50. Because I’m commenting on so many findings, I won’t comment on statistical values unless they’re especially high or low.
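(If you want to see what’s behind that number, here’s the standard formula — my gloss, not something the meta-analysis spells out. Hedges’ g is the difference between the two group means, divided by their pooled standard deviation, with a small correction for sample size:

g = \frac{\bar{x}_{\text{RP}} - \bar{x}_{\text{control}}}{s_{\text{pooled}}}\left(1 - \frac{3}{4(n_1 + n_2) - 9}\right),
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}

Roughly speaking, g = 0.50 means the average student in the retrieval practice condition outscored the average control student by about half a standard deviation.)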

So the easy headline here is: retrieval practice rocks.

Making Retrieval Practice Work in the Classroom

Once teachers know that we should use retrieval practice, we’ve got some practical questions about putting it into practice.

Here again, this meta-analysis offers lots of helpful guidance.

Does it help for students to answer similar questions over multiple days?

Yes. (Honestly, not really surprising – but good to know.)

More specifically: “There is a positive relationship between the number of [retrieval practice] repetitions and the [ultimate learning outcome], indicating that the more occasions on which class content is quizzed, the larger the learning gains.”

Don’t just use retrieval practice; REPEAT retrieval practice.

Is feedback necessary?

Feedback significantly increases the benefit of retrieval practice – but the technique provides benefits even without feedback.

Does the mode matter?

Pen and paper, clicker quizzes, online platforms: all work equally well.

Me: I write “do now” questions on the board and my students write down their answers. If you want to use Quizlet or mini-whiteboards, those strategies will work just as well.

Does retrieval practice help students learn untested material?

This question takes a bit of explaining.

Imagine I design a retrieval exercise about Their Eyes Were Watching God. If I ask my students to recall the name of Janie’s first husband (Logan Killicks), that question will help them remember his name later on.

But: will it help them remember the name of her second husband? Or, her third (sort-of) husband?

The answer is: direct retrieval practice questions help more, but this sort of indirect prompt has a small effect.

In brief, if I want my students to remember the names Jody Starks and Vergible Woods, I should ask them direct questions about those husbands.

Shiver Me Timbers

So far, these answers reassure me, but they don’t surprise me.

However, the meta-analysis did include a few unexpected findings.

Does the retrieval question format matter? That is: is “matching” better than “short answer” or “free recall” or “multiple choice”?

To my surprise, “matching” and “fill-in-the-blank” produce the greatest benefits, and “free recall” the least.

This finding suggests that the popular “brain dump” approach (“write down everything you remember about our class discussion yesterday!”) produces the fewest benefits.

I suspect that “brain dumps” don’t work as well because, contrary to the advice above, they don’t directly target the information we want students to remember.

Which is more effective: a high-stakes or a low-stakes format?

To my astonishment, both worked (roughly) equally well.

So, according to this meta-analysis, you can grade or not grade retrieval practice exercises. (I will come back to this point below.)

Should students collaborate or work independently on retrieval practice answers?

The studies included in the meta-analysis suggest no significant difference between these approaches. However, the researchers note that they don’t have all that many studies on the topic, so they’re not confident about this answer. (For a number of reasons, I would have predicted that individual work helps more.)

Beyond the Research

I want to conclude by offering an opinion that springs not from research but from experience.

Historically, “retrieval practice” went by a different name. Believe it or not, it was initially called “the testing effect.” (In fact, the authors of this meta-analysis use this term.)

While I understand why researchers use it, I think we can agree that “the testing effect” is a TERRIBLE name.

No student anywhere wants to volunteer for more testing. No teacher anywhere either.

And – crucially – the benefits have nothing to do with “testing.” We don’t need to grade these exercises. Students don’t need to study for them. The retrieving itself IS the studying.

For that reason, I think teachers and schools should focus as much as possible on the “retrieval” part, and as little as possible on the “testing.”

No, HONESTLY, students don’t need to be tested/graded for this effect to work.

TL;DR

Retrieval practice — in almost any form — helps almost everybody learn, remember, and use almost anything.

As long as we don’t call it “testing,” schools should employ retrieval strategically and frequently.


Yang, C., Luo, L., Vadillo, M. A., Yu, R., & Shanks, D. R. (2021). Testing (quizzing) boosts classroom learning: A systematic and meta-analytic review. Psychological Bulletin, 147(4), 399.

“Seductive Details” meet “Retrieval Practice”: A Match Made in Cognitive Heaven
Andrew Watson

Here’s a common problem: your job today is to teach a boring topic. (You don’t think it’s boring, but your students always complain…)

What’s a teacher to do?

One plausible strategy: You might enliven this topic in some entertaining way.

You’ve got a funny video,

or a clever cartoon,

or a GREAT anecdote about a colleague’s misadventure.

Okay, so this video/cartoon/anecdote isn’t one of today’s learning objectives. BUT: it just might capture your students’ interest and help them pay attention.

However tempting, this strategy does create its own problems. We’ve got lots of research showing that these intriguing-but-off-topic details can get in the way of learning.

That is: students remember the seductive details (as they’re known in the research literature), but less of the actual content we want them to know.

Womp womp.

Some time ago, I wrote about a meta-analysis showing that — yup — seductive details ACTUALLY DO interfere with learning: especially for beginners, especially in shorter lessons.

What could we do to fix this problem? If we can’t use our anecdotes and cartoons, do we just have to bore our students?

“Right-Sized” Retrieval Practice

Here’s one approach we might try: right-sized retrieval practice.

What does “right-sized” mean? Here goes:

One retrieval practice strategy is a brain dump. The instructions sound something like this: “write down everything you remember about today’s grammar lesson.”

Another retrieval practice strategy calls for more specific questions: “What’s the difference between a gerund and a participle?” “How might a participle create a dangling modifier?”

A group of scholars in Germany studied this hypothesis:

If teachers use the brain dump approach, students will remember the seductive detail — and it will become a part of their long-term memory.

If, on the other hand, teachers ask specific questions, students will remember the important ideas of the lesson — and not consolidate memory of the seductive detail.

They ran a straightforward study, considering a topic close to every teacher’s heart: coffee.

100+ college students in Germany read a lengthy passage on coffee: information about the coffee plant, its harvesting, its preparation, and its processing.

Half of them read a version including fun-but-extraneous information. For instance: do you know how coffee was discovered?

Turns out: goat herders noticed that their goats ate the coffee beans and then did a kind of happy dance. Those herders wondered: could we get the same happy effects? Thus was born today’s coffee industry…

Remembering the GOAT

After reading these coffee passages — with or without seductive details — students answered retrieval practice questions.

Some got a “brain dump” prompt: “What do you remember about coffee?”

Others got the specific questions: “What harvesting methods do you remember, and how do they differ?”

So, what effect did those specific questions have on memory of seductive details one week later?

Sure enough, as the researchers had hypothesized, students who answered specific retrieval practice questions remembered MORE of the lesson’s meaningful content.

And, they remembered LESS (actually, NONE) of the seductive details. (Of course, the details get complicated, but this summary captures the main idea.)

BOOM.

So, what’s a classroom teacher to do?

As is so often the case, we should remember that researchers ISOLATE variables and teachers COMBINE variables.

We always have to think about many (many!) topics at once, while research typically tries to find out the importance of exactly one thing.

Putting all these ideas together, I’d recommend the following path:

If I have to teach a topic my students find dull, I can indeed include some seductive details (Ha ha! Goats!) to capture their interest — as long as I conclude that lesson with some highly specific retrieval practice questioning.

And, based on this earlier post on seductive details, this extra step will be especially important if the lesson is short, or the students are beginners with this topic.

TL;DR

Seductive details can capture students’ interest, but also distract them from the important topics of the lesson.

To counteract this problem, teachers should plan for retrieval practice including specific questions — not just a brain dump.


By the way: I first heard about this “retrieval practice vs. seductive details” study from Bradley Busch (Twitter: @BradleyKBusch) and Jade Pearce (Twitter: @PearceMrs). If you’re not familiar with their work, be sure to look them up!


Eitel, A., Endres, T., & Renkl, A. (2022). Specific questions during retrieval practice are better for texts containing seductive details. Applied Cognitive Psychology, 36(5), 996-1008.

Sundararajan, N., & Adesope, O. (2020). Keep it coherent: A meta-analysis of the seductive details effect. Educational Psychology Review, 32(3), 707-734.

The Limitations of Retrieval Practice (Yes, You Read That Right)
Andrew Watson

Last week, I wrote that “upsides always have downsides.”


That is: anything that teachers do to foster learning (in this way) might also hamper learning (in that way).

We should always be looking for side effects.

So, let me take a dose of my own medicine.

Are there teaching suggestions that I champion that have both upsides and conspicuous downsides?

Case in Point: Retrieval Practice

This blog has long advocated for retrieval practice.

We have lots (and LOTS) of research showing that students learn more when they study by “taking information out of their brains” than by “putting information back into their brains.” (This phrasing comes from Agarwal and Bain.)

So:

Students shouldn’t study vocabulary lists; they should make flash cards.

They shouldn’t review notes; instead, they should quiz one another on their notes.

Don’t reread the book; try to outline its key concepts from memory.

In each of these cases (and hundreds more), learners start by rummaging around in their memory banks to see if they can remember. All that extra mental work results in more learning.

SO MUCH UPSIDE.

But wait: are there any downsides?

Let the Buyer Beware: Retrieval-Induced Forgetting

Sure enough, some researchers have focused on “retrieval-induced forgetting.”

Yup. That means remembering can cause forgetting.

How on earth can that be? Here’s the story…

Step 1: Let’s say I learn the definitions of ten words.

Step 2: I use retrieval practice to study the definitions of five of them. That is, I practice remembering five of the words.

Step 3: Good news! Retrieval practice means I’ll remember the five words that I practiced better.

Step 4: Bad news! Retrieval-induced forgetting means I’ll remember the five words I didn’t practice worse. Yes: worse than if I hadn’t practiced those other five words.

In brief: when I remember part of a topic, I’m likelier to FORGET the part I didn’t practice. (Although, of course, I’m likelier to REMEMBER the part I did practice.)

So, retrieving induces forgetting. Now that’s what I call a downside.

Potential solution?

How do our students get the good stuff (memories enhanced by retrieval practice) without the bad stuff (other memories inhibited by retrieval practice)?

Here’s an obvious solution: tell our students about retrieval-induced forgetting.

Heck, let’s go one step further: tell them about it, and encourage them to resist its effects.

One research group — led by Dr. Jodi Price — tried just this strategy.

The research design here gets quite complicated, but the headline is:

They ran the same “retrieval-induced forgetting” study that others had run, and this time added a brief description of the problem.

In some cases, they added encouragement on how to overcome this effect.

So, what happened when they warned students?

Nothing. Students kept right on forgetting the un-practiced information (although they kept right on remembering the practiced information).

In brief: warnings about retrieval-induced forgetting just didn’t help. (Heck: in some cases, they seemed to promote even more forgetting.)

Alternative Solutions?

Much of the time, we benefit our students by telling them about research in cognitive science.

I routinely tell my high-school students about retrieval practice. I show them exactly the same studies and graphs that I show teachers in my consulting work.

In this case, however, it seems that sharing the research doesn’t help. Telling students about retrieval-induced forgetting didn’t stop retrieval-induced forgetting.

Conclusion: it’s up to teachers to manage this side effect.

How? We should require retrieval of all essential elements.

For example:

When I teach my students about comedy and tragedy, the definitions of those terms include lots of moving pieces.

I know that ALL THE PIECES are equally important. So I need to ensure that my retrieval practice exercises include ALL THE PARTS of those definitions.

Students don’t need to remember everything I say. But if I want them to remember, I need to ensure retrieval practice happens.

Each of us will devise different strategies to accomplish this goal. But to get the upside (from retrieval practice) we should mitigate the downside (from retrieval-induced forgetting).

TL;DR

Retrieval practice is great, but it might cause students to forget the parts they don’t retrieve.

Alas, we can’t solve this problem simply by warning our students.

So, we should structure our review sessions so that students do in fact retrieve EVERYTHING we want them to remember.

If we create such comprehensive retrieval, students can get the upsides without the downsides.

 

 


Price, J., Jones, L. W., & Mueller, M. L. (2015). The role of warnings in younger and older adults’ retrieval-induced forgetting. Aging, Neuropsychology, and Cognition, 22(1), 1-24.

How To Make Sure Homework Really Helps (a.k.a.: “Retrieval Practice Fails”)
Andrew Watson

Most research focuses narrowly on just a few questions. For instance:

“Does mindful meditation help 5th grade students reduce anxiety?”

“How many instructions overwhelm college students’ working memory?”

“Do quizzes improve attention when students learn from online videos?”

Very occasionally, however, just one study results in LOTS of teaching advice. For instance, this recent research looks at data from ELEVEN YEARS of classroom teaching.


Professor Arnold Glass (writing with Mengxue Kang) has been looking at the benefits of various teaching strategies since 2008.

For that reason, he can draw conclusions about those strategies. AND, he can draw conclusions about changes over time.

The result: LOTS of useful guidance.

Here’s the story…

The Research

Glass has been teaching college courses in Memory and Cognition for over a decade. Of course, he wants to practice what he preaches. For instance:

First, when Glass’s students learn about concepts, he begins by asking them to make plausible predictions about the topics they’re going to study.

Of course, his students haven’t studied the topic yet, so they’re unlikely to get the answers right. But simply thinking about these questions helps them remember the correct answers that they do learn.

In research world, we often call this strategy “pretesting” or “prequestions.”

Second, after students learn the topics, he asks them to answer questions about them from memory.

That is: he doesn’t want them to look up the correct answers, but to try and remember the correct answers.

In research world, we call this technique “retrieval practice” or “the testing effect.”

Third, Glass spreads these questions out over time. His students don’t answer retrieval practice questions once; they do so several times.

In research world, we call this technique “spacing.”

Because Glass connects all those pretesting and retrieval practice questions to exam questions, he can see which strategies benefit his students.

And, because he’s been tracking data for years, he can see how those benefits change over time.

The Results: Good & Bad

Obviously, Glass’s approach generates LOTS of results. So, let’s keep things simple.

First Headline: these strategies work.

Pretesting and retrieval practice and spacing all help students learn.

These results don’t surprise us, but we’re happy to have confirmation.

Second Headline: but sometimes these strategies don’t work.

In other words: most of the time, students get questions right on the final exam more often than they did for the pretesting and the retrieval practice.

But, occasionally, students do better on the pretest question (or the retrieval practice question) than on the final exam.

Technically speaking, that result is BIZARRE.

How can Glass explain this finding?

Tentative Explanations, Alarming Trends

Glass and Kang have a hypothesis to explain this “bizarre” finding. In fact, this study explores their hypothesis.

Glass’s students answer the “pretesting” questions for homework. What if, instead of speculating to answer those pretesting questions, the students look the answer up on the interwebs?

What if, instead of answering “retrieval practice” questions by trying to remember, the students look up the answers?

In these cases, the students would almost certainly get the answers right — so they would have high scores on these practice exercises.

But they wouldn’t learn the information well, so they would have low scores on the final exam.

So, pretesting and retrieval practice work if students actually do it.

But if the students look up the answer instead of predicting, they don’t get the benefits of prequestions.

If they look up the answer instead of trying to remember, they don’t get the benefit of retrieval practice.

And, here’s the “alarming trend”: the percentage of students who look up the answers has been rising dramatically.

How dramatically? In 2008, it was about 15%. In 2018, it was about 50%.

Promises Fulfilled

The title of this blog post promises to make homework helpful (and to point out when retrieval practice fails).

So, here goes.

Retrieval practice fails when students don’t try to retrieve.

Homework that includes retrieval practice won’t help if students look up the answers.

So, to make homework help (and to get the benefits of retrieval practice), we should do everything we reasonably can to prevent this shortcut.

Three strategies come quickly to mind.

First: don’t just use prequestions and retrieval practice. Instead, explain the logic and the research behind them. Students should know: they won’t get the benefits if they don’t do the thinking.

Second: as much as is reasonably possible, make homework low-stakes or no-stakes. Students have less incentive to cheat if doing so doesn’t get them any points. (And, they know that it harms their learning.)

Third: use class time for both strategies.

In other words: we teachers ultimately can’t force students to “make educated predictions” or “try to remember” when they’re at home. But we can monitor them in class to ensure they’re doing so.

These strategies, to be blunt, might not work well as homework — especially not at the beginning of the year. We should plan accordingly.

TL;DR

Prequestions and retrieval practice do help students learn, but only if students actually do the thinking these strategies require.

We teachers should be realistic about our students’ homework habits and incentives, and design assignments that nudge them in the right directions.

 

Glass, A. L., & Kang, M. (2022). Fewer students are benefiting from doing their homework: an eleven-year study. Educational Psychology, 42(2), 185-199.

Getting the Order Just Right: When to “Generate,” When to “Retrieve”?
Andrew Watson

When teachers get advice from psychology and neuroscience, we start by getting individual bits of guidance. For instance…

… mindful meditation reduces stress, or

… growth mindset strategies (done the right way) can produce modest benefits, or

… cell phones both distract students and reduce working memory.

Each single suggestion has its uses. We can weave them, one at a time, into our teaching practices.

After a while, we start asking broader questions: how can we best combine all those individual advice bits?

For instance: might the benefits of growth mindset strategies offset the detriments of cell phones?

Happily, in recent years, researchers have started to explore these combination questions.

Retrieval Practice, Generative Learning

Long-time readers know about the benefits of retrieval practice. Rather than simply review material, students benefit when they actively try to recall it first.

So too, generative learning strategies have lots of good research behind them. When students have to select, organize, and integrate information on their own, this mental exercise leads to greater learning. (Check out a handy book review here.)

Now that we have those individual bits of guidance, can we put them together? What’s the best way to combine retrieval practice with generative learning?

A recent study explored exactly this question.

Researchers in Germany had college students study definitions of 8 terms in the field of “social attribution.”

So, for instance, they studied one-sentence definitions of “social norms” or “distinctiveness” or “self-serving bias.”

One group — the control group — simply studied these definitions twice.

A second group FIRST reviewed these words with retrieval practice, and THEN generated examples for these concepts (that’s generative learning).

A third group FIRST generated examples, and THEN used retrieval practice.

So, how well did these students remember the concepts — 5 minutes later, or one day later?

The Envelope Please

The researchers wanted to know: does the order (retrieval first? generation first?) matter?

The title of their study says it all: “Sequence Matters! Retrieval practice before generative learning is more effective than the reverse order.”

Both 5 minutes later and the next day, students who did retrieval practice first remembered more than those who came up with examples first (and, more than the control group).

For a variety of statistical reasons, I can’t describe how much better they did. That is: I can’t say “These students scored a B, and those scored a B-.” But, they did “better enough” for statistical models to notice the difference.

And so, very tentatively, I think we teachers can plan lessons in this way: first instruct, then have students practice with retrieval, then have them practice with generation.

Wait, Why “Tentatively”?

If the research shows that “retrieval first” helps students more than “generation first,” why am I being tentative?

Here’s why:

We can’t yet say that “research shows” the benefits of a retrieval-first strategy.

Instead, we can say that this one study with these German college students who learned definitions of words suggests that conclusion.

But: we need many more studies of this question before we can spot a clear pattern.

And: we’d like to see some 1st grade students in Los Angeles, and some 8th grade students in Reykjavik, and some adult learners in Cairo before we start thinking of this conclusion as broadly applicable.

And: we’d like to see different kinds of retrieval practice, and different kinds of generative learning strategies, before we reach a firm conclusion.

After all, Garvin Brod has found that different generative learning strategies have different levels of effectiveness in various grades. (Check out this table from this study.)

To me, it seems entirely plausible that students need to retrieve ideas fluently before they can generate new ideas with them: hence, retrieval practice before generative learning.

But, “entirely plausible” isn’t a research-based justification. It’s a gut feeling. (In fact, for various reasons, the researchers had predicted the opposite finding.)

So, I think teachers should know about this study, and should include it in our thinking.

But, we shouldn’t think it’s an absolute conclusion. If our own students simply don’t learn well with this combination, we might think about switching up the order.

TL;DR

Students learn more from retrieval practice, and they learn more from generative learning strategies.

If we want to combine those individual strategies, we’ll (probably) help students more if we start with retrieval practice.

And: we should keep an eye out for future research that confirms — or complicates — this advice.


Roelle, J., Froese, L., Krebs, R., Obergassel, N., & Waldeyer, J. (2022). Sequence matters! Retrieval practice before generative learning is more effective than the reverse order. Learning and Instruction, 80, 101634.

Brod, G. (2020). Generative learning: Which strategies for what age?. Educational Psychology Review, 1-24.

Let’s Get Practical: What Works Best in the Classroom?
Andrew Watson

At times, this blog explores big-picture hypotheticals — the “what if” questions that can inspire researchers and teachers.

And, at times, we just want practical information. Teachers are busy folks. We simply want to know: what works? What really helps my students learn?

That question, in fact, implies a wise skepticism. If research shows a teaching strategy works well, we shouldn’t just stop with a study or two.

Instead, we should keep researching and asking more questions.

Does this strategy work with …

… older students as well as younger students?

… history classes as well as music classes as well as sports practice?

… Montessori classrooms, military academies, and public school classrooms?

… this cultural context as well as that cultural context?

And so forth.

In other words, we want to know: what have you got for me lately?

Today’s News

Long-time readers know of my admiration for Dr. Pooja Agarwal.

Her research into retrieval practice has helped clarify and deepen our understanding of this teaching strategy.

Her book, written with classroom teacher Patrice Bain, remains one of my favorites in the field.

And she’s deeply invested in understanding the complexity of translating research into the classroom.

That is: she doesn’t just see if a strategy works in the psychology lab (work that’s certainly important). Instead, she goes the next step to see if that strategy works with the messiness of classrooms and students and schedule changes and school muddle.

So: what has she done for us lately? I’m glad you asked.

Working with two other scholars, Agarwal asked all of those questions I listed above about retrieval practice.

That is: we think that retrieval practice works. But: does it work with different ages, and various subjects, in different countries?

Agarwal and Co. wanted to find out. They went through an exhaustive process to identify retrieval practice research in classrooms, and studied the results. They found:

First: yup, retrieval practice really does help. In 57% of the studies, the Cohen’s d value was 0.50 or greater. That’s an impressively large result for such a simple, low-cost strategy.

Second: yup, it works in different fields. By far the most research is done in science and psychology (19 and 16 studies), but it works in every discipline where we look — including, say, history or spelling or CPR.

Third: yup, it works at all ages. Most research is done with college students (and, strangely, medical students), but it works in K-12 as well.

Fourth: most retrieval practice research is done with multiple choice. (Yes: a well-designed multiple choice test can be retrieval practice. “Well-designed” = “students have to THINK about the distractors.”)

Fifth: we don’t have enough research to know what the optimal gap is between RP and final test.

Sixth: surprisingly, not enough classroom research focused on FEEDBACK. You’d think that would be an essential component…but Team Agarwal didn’t find enough research here to draw strong conclusions.

Seventh: Of the 50 studies, only 3 were from “non-Western” countries. So, this research gap really stands out.

In brief: if we want to know what really works, we have an increasingly clear answer: retrieval practice works. We had good evidence before; we’ve got better evidence now.

Examples Please

If you’re persuaded that retrieval practice is a good idea, you might want to be sure exactly what it is.

You can always use the “tags” menu on the right; we blog about retrieval practice quite frequently, so you’ve got lots of examples.

But, here’s a handy description (which I first heard in Agarwal and Bain’s book):

When students review, they put information back into their brains. So: “rereading the textbook” = “review,” because students try to redownload the book into their memory systems.

When students use retrieval practice, they take information out of their brains. So, “flashcards” = “retrieval practice,” because students have to remember what that word means.

So:

Reviewing class notes = review.

Outlining the chapter from memory = retrieval practice.

Short answer questions = retrieval practice.

Watching a lecture video = review.

When you strive for retrieval practice, the precise strategy is less important than the cognitive goal. We want students to try to remember before they get the correct answer. That desirable difficulty improves learning.

And, yes, retrieval practice works.

To Grade or Not to Grade: Should Retrieval Practice Quizzes Be Scored? [Repost]
Andrew Watson

We’ve seen enough research on retrieval practice to know: it rocks.

When students simply review material (review their notes; reread the chapter), that mental work doesn’t help them learn.

However, when they try to remember (quiz themselves, use flashcards), this kind of mental work does result in greater learning.

In Agarwal and Bain’s elegant phrasing: don’t ask students to put information back into their brains. Instead, ask them to pull information out of their brains.

Like all teaching guidance, however, the suggestion “use retrieval practice!” requires nuanced exploration.

What are the best methods for doing so?

Are some retrieval practice strategies more effective?

Are some frankly harmful?

Any on-point research would be welcomed.

On-Point Research

Here’s a simple and practical question. If we use pop quizzes as a form of retrieval practice, should we grade them?

In other words: do graded pop quizzes result in more or less learning, compared to their ungraded cousins?

This study, it turns out, can be run fairly easily.

Dr. Maya Khanna taught three sections of an Intro to Psychology course. The first section had no pop quizzes. In the second section, Khanna gave six graded pop quizzes. In the third, six ungraded pop quizzes.

Students also filled out a questionnaire about their experience taking those quizzes.

What did Khanna learn? Did the quizzes help? Did grading them matter?

The Envelope Please

The big headline: the ungraded quizzes helped students on the final exam.

Roughly: students who took the ungraded pop quizzes averaged a B- on the final exam.

Students in the other two groups averaged in the mid-to-high C range. (The precise comparisons require lots of stats speak.)

An important note: students in the “ungraded” group scored higher even though the final exam did not repeat the questions from those pop quizzes. (The same material was covered on the exam, but the questions themselves were different.)

Of course, we also wonder about our students’ stress. Did these quizzes raise anxiety levels?

According to the questionnaires, nope.

Khanna’s students responded to this statement: “The inclusion of quizzes in this course made me feel anxious.”

A 1 meant “strongly disagree.”

A 9 meant “strongly agree.”

In other words, a LOWER rating suggests that the quizzes didn’t increase stress.

Students who took the graded quizzes averaged an answer of 4.20.

Students who took the ungraded quizzes averaged an answer of 2.96.

So, neither group felt much stress as a result of the quizzes. And, the students in the ungraded group felt even less.

In the Classroom

I myself use this technique as one of a great many retrieval practice strategies.

My students’ homework sometimes includes retrieval practice exercises.

I often begin class with some lively cold-calling to promote retrieval practice.

Occasionally — last Thursday, in fact — I begin class by saying: “Take out a blank piece of paper. This is NOT a quiz. It will NOT be graded. We’re using a different kind of retrieval practice to start us off today.”

As is always true, I’m combining this research with my own experience and classroom circumstances.

Khanna gave her quizzes at the end of class; I do mine at the beginning.

Because I’ve taught high school for centuries, I’m confident my students feel comfortable doing this kind of written work. If you teach younger grades, or in a different school context, your own experience might suggest a different approach.

To promote interleaving, I include questions from many topics (Define “bildungsroman.” Write a sentence with a participle. Give an example of Janie exercising agency in last night’s reading.) You might focus on one topic to build your students’ confidence.

Whichever approach you take, Khanna’s research suggests that retrieval practice quizzes don’t increase stress and don’t require grades.

As I said: retrieval practice rocks!

“Compared to What”: Is Retrieval Practice Really Better?
Andrew Watson

When teachers turn to brain research, we want to know: which way is better?

Are handwritten notes better than laptop notes?

Is cold-calling better than calling on students who raise their hands?

Is it better to spread practice out over time, or concentrate practice in intensive bursts?

For that reason, we’re excited to discover research that shows: plan A gets better results than plan B. Now we know what to do.

Right?

Better than What?

More often than not, research in this field compares two options: for instance, retrieval practice vs. rereading.

Often, research compares one option to nothing: starting class WITH learning objectives, or starting class WITHOUT learning objectives.

These studies can give us useful information. We might find that, say, brief exercise breaks help students concentrate during lectures.

However, they DON’T tell us what the best option is. Are exercise breaks more helpful than retrieval practice? How about video breaks? How about turn-n-talks?

When research compares two options, we get information only about the relative benefits of those two options.

For that reason, we’re really excited to find studies that compare more than two.

Enriching Encoding

A recent podcast* highlighted this point for me.

A 2018 study compared THREE different study strategies: rereading, enriched encoding, and retrieval practice.

Participants studied word pairs: say, “moon-galaxy.” Some of them studied by reviewing those pairs. Some studied with retrieval practice (“moon-__?__”).

Some studied with enriched encoding. This strategy urges students to connect new information to ideas already in long-term memory. In this case, they were asked: “What word do you associate with both ‘moon’ and ‘galaxy’?”

My answer to that question: “planet.” Whatever answer you came up with, you had to think about those two words and their associated ideas. You enriched your encoding.

Because this experiment looked at three different study strategies, it gives us richer insights into teaching and learning.

For instance, students who reviewed remembered 61% of the word pairs, whereas those who enriched their encoding remembered 75% (Cohen’s d = 0.72). Clearly, enriched encoding is better.

But wait, what about students who used retrieval practice?

Even Richer

Students in the retrieval practice group remembered 84% of their word pairs.

So, yes: “research shows” that enriched encoding is “better than review.” But it’s clearly not better than retrieval practice. **

In fact, this point may sound familiar if you read last week’s blog post about learning objectives. As that post summarized Dr. Faria Sana’s research:

Starting class with traditional learning objectives > starting class without traditional learning objectives

but

Starting class with learning objectives phrased as questions > starting class with learning objectives phrased as statements

In fact, Sana looked at a fourth choice:

Teachers immediately answer the questions posed in the learning objectives >?< teachers don’t immediately answer the questions posed in the learning objectives.

It turns out: providing answers right away reduces students’ learning.

Because Sana studied so many different combinations, her research really gives us insight into our starting question: which way is better?

Friendly Reminders

No one study can answer all the questions we have. We ALWAYS put many studies together, looking for trends, patterns, exceptions, and gaps.

For instance, boundary conditions might limit the applicability of a study. Sana’s research took place in a college setting. Do her conclusions apply to 10th graders? 6th graders? 1st graders? We just don’t know (yet).

Or, if you teach in a school for children with a history of trauma, or in a school for students with learning differences, or in a culture with different expectations for teachers and students, those factors might shape the usefulness of this research.

By comparing multiple studies, and by looking for studies that compare more than two options, we can gradually uncover the most promising strategies to help our students learn.


* If you’re not following The Learning Scientists — their website, their blog, their podcast — I HIGHLY recommend them.

** To be clear: this study focuses on a further question: the participants’ “judgments of learning” as a result of those study practices. Those results are interesting and helpful, but not my primary interest here.