EduTwitter Can Be Great. No, Really…
Andrew Watson

Twitter has a terrible reputation, and EduTwitter isn’t an exception.

The misinformation.

The name-calling.

The “team” rivalries: all heat and little light.

Did I mention the misinformation?

You might wonder: why bother? Honestly, I wouldn’t blame you if you didn’t. I myself was hesitant to sign up.

Despite all these flaws — none of which is exaggerated, by the way — I do find lots of benefits. One recent experience got my attention.

The Setup

On my personal Twitter account, I posted a link to research that had me puzzled. According to a small study, the motor cortex does not “remap” to represent prosthetic limbs.

Given all the research we have into neuroplasticity, I was genuinely shocked by that finding.

In fact, I’m currently reading Barbara Tversky’s book Mind in Motion, which talks about brains remapping in response to TOOL USE.

If brains remap because of tools, but not because of prosthetics — which are, from one perspective, tools that have been attached to the body — well: that’s very strange.

But, people on Twitter know things I don’t. I thought: maybe someone knows more about this research pool than I…

Rising Action

Soon after I posted that link, my Twitter friend Rob McEntarffer (@rmcenta) retweeted it, sharing my curiosity. (By the way: “Twitter friends” are really a thing. I know LOTS of people — too many to name here — whom I have come to respect and like entirely by “meeting” them on Twitter. I would NOT have predicted that.)

One of his Twitter followers — someone I have never met and don’t know — retweeted Rob’s retweet, with a question to her professor.

So, we’re now at 3 or 4 degrees of separation. What happens next?

The Payoff

Turns out: this professor — whom I also don’t know — has lots of expertise in this research field. He briskly explained why the study couldn’t draw strong conclusions. (If I understand him correctly, its measurement methodology doesn’t allow it to make those claims.)

In other words: within a few hours, I went from…

being ASTONISHED because a research finding dramatically contradicted my (fairly basic) understanding of neural remapping,

to…

having a SUCCINCT AND CLEAR EXPLANATION why that research shouldn’t concern me,

and…

feeling RELIEVED that my understanding of neuroplasticity wasn’t so wrongheaded.

And, what made those changes possible — or, at least, a whole lot easier? Twitter.

Caveats

To be clear, Twitter really does include (and produce) foul, cruel nonsense. If you look for that, you’ll find it. (Tom Lehrer says: “Life is like a sewer. What you get out of it depends [at least in part] on what you put into it.”)

At the same time, I routinely come across generous teachers & researchers. They freely share perspectives and resources and contacts and information.

If you can stand the background noise, you might give it a look.

One place to start: @LearningAndTheB. Perhaps I’ll see you there.

How Psychologists and Teachers Can Talk about Research Most Wisely
Andrew Watson

Dr. Neil Lewis thinks a lot about science communication: in fact, his appointment at Cornell is in both the Psychology AND the Communications departments. (For a complete bio, click here.)

He and Dr. Jonathan Wai recently posted an article focusing on a troubling communication paradox:

Researchers are encouraged to “give science away”; however, because of the “replication crisis,” it’s hard to know which science is worth giving away.

Here at Learning and the Brain, we think about that question frequently — so I was delighted that Dr. Lewis agreed to chat with me about his article.

In this conversation, we talk about…

… how teachers can ask psychologists good questions

… the dangers of “eminence”

… what we should think about growth mindset research

… the research “hype cycle.”

I hope you enjoy this conversation as much as I did.


Andrew Watson:

Thank you, Dr. Lewis, for sharing your ideas with our readers.

In your recent article, you and Dr. Wai write about tensions between two imperatives in the field of psychology.

First, psychologists are being asked to “give research away.” And second, our field worries about the “replication crisis.”

Both of those phrases mean more or less what they say. Could you define them a little more precisely, and talk about the tensions that these imperatives are creating?

Dr. Lewis:

There has been a long-standing call in psychology—going back, really, to the 1960s, when George Miller first issued this call—to “give psychology away.”

As scholars, we spend our time doing all this research: we should try to communicate it to the world so that people can use it and improve lives.

Professional psychology societies and organizations really encourage researchers to “get our work out there.”

But at the same time, over the past decade or so, there has been a movement to reflect on what we really know in psychology.

A “replication crisis” has occurred—not only in psychology; it’s been happening in many areas.

We are having a hard time replicating many research findings. And that [failure] is making us, the scientists, wrestle with: what do we know? How do we know it? How robust are some of our findings?

And so there’s a tension here. We’re supposed to be “giving our findings away,” but at the same time we’re not sure which ones are robust enough to be worth giving away.

Andrew Watson:

That does sound like a problem. In that tension, do you see any special concerns about the field of education?

Dr. Lewis:

One of the things I’ve been thinking about for education researchers is: how do we know what we know? We have to look very closely at the details of the paper to figure those things out.

Which students are being studied in the papers you’re reading?

What kinds of schools?

What kind of teachers?

At least in the US, there’s so much segregation in our school systems that schools look very different.

If studies are run—let’s say—with kids in the Ithaca school district where I live in upstate New York: those kids, those parents, those schools are very different than studies run—let’s say—in the Detroit public school district, which is the district I thought a lot about during my graduate training when I lived in Michigan.

There are big differences between these districts. We have to figure out: are the schools we’re trying to intervene in similar to the schools in the studies that were run? Or are they different?

Andrew Watson:

I have a question about that process.

Here’s a problem: to know what questions teachers ought to be asking, we need expert knowledge. Because we’re teachers, not psychologists, it’s hard to know the right questions.

So: what’s the best question that a nonspecialist teacher can ask of a researcher, in order to get an answer that we can genuinely understand?

Dr. Lewis:

I think there are some basic things that teachers can ask of researchers.

Teachers can ask what kinds of schools these studies were run in. Are they urban schools or rural schools?

What percentage of the students are on free lunch? (That’s an indicator of poverty levels of the school. Research findings are often influenced by background characteristics about the students.)

What do we know about the kinds of students that were involved in studies?

What do we know about the teachers?

Those are basic things that the researchers should be able to tell you. And then you can figure out whether those are similar to:

the students that you’re working with,

the kinds of schools that you have,

the kind of leadership in your school district, and the like.

Those basic characteristics about how the study was done will help you figure out whether or not you can use it.

Andrew Watson:

I spend a lot of time talking with teachers about this concern. Most psychology research is done with college undergraduates. That research is obviously important. But if you’re teaching reading to third graders, maybe that research translates to your context and maybe it doesn’t.

Dr. Lewis:

Right.

Andrew Watson:

One of the more intriguing points you made in the article has to do with the idea of eminence.

In the world of education, we’re often drawn to Big Names. You argue that the things scholars do to achieve eminence don’t necessarily help them produce high quality research.

As teachers, how do we sort through this paradox? How can we be wise when we think about that?

Dr. Lewis:

We brought up eminence to reinforce what I just noted. Look at the details of the study and don’t rely on the “cue” of eminence as your signal that research must be good.

Researchers are judged by many metrics. Once you put those metrics in place, people do what they can to… I hesitate to use the word “game,” but to optimize their standing in those metrics.

Andrew Watson:

Which is a lot like “gaming,” isn’t it?

Dr. Lewis:

Yes. In the research world, there are a few metrics that don’t necessarily help [produce meaningful results]. One of them, for instance, is that researchers are incentivized to publish as much as we can.

Unfortunately, publishing fast is the way to rise up the ranks. But sometimes figuring out these differences that I have been talking about—like, between contexts and samples—it takes some time. It slows you down from churning out papers; and unfortunately, researchers often aren’t incentivized to take that slower, more careful approach.

And so there’s that tension again too. I don’t want to leave the impression that we just shouldn’t trust eminent people. That’s not the point I want to make.

The point is: eminence in and of itself is not a useful signal of quality. You have to look very closely at the details of the studies in front of you. Then compare those details to your own situation and judge the work on that. Judge the work, don’t judge based on how famous the person is.

Andrew Watson:

It occurs to me, as you’re explaining this, that there’s a real problem with the emphasis on rapid publication. One of the consistent findings in education research is that short-term performance isn’t a good indicator of long-term learning.

But if scholars are incentivized to publish quickly, they’re incentivized to study the short term, which doesn’t tell us much about what we really want to know: learning that lasts.

Dr. Lewis:

Absolutely right. As I’ve written in other articles, we don’t have enough longitudinal studies for the very reasons we’re talking about: longitudinal studies take forever—and, again, the incentive is to publish fast, publish often.

The outcomes that are often measured in psychology studies are these shorter term things. You have the student do something, and you measure at the end of the session. Maybe you look again at the end of the semester.

But [we should] look next year, two years, three years, because we know some of these effects take time to accumulate.

Some older studies have looked at long-term outcomes. I’ve seen a few fascinating studies showing, initially, no significant findings. But if you look far enough down the road, you start to see meaningful effects. It just takes time for the benefits to accumulate.

In education, we shouldn’t assume that research results “generalize.” [Editor: That is, we shouldn’t assume that research with 1st graders applies to 10th graders; or that short term findings will also be true in the long term.]

Now, until I see more evidence, I assume findings are context-specific. [Editor: That is, research with 1st graders applies to 1st graders—but not much beyond that age/grade. Research from the United States applies to the US cultural context, but not—perhaps—to Korea.]

For instance: “growth mindset.” In recent studies, authors have been looking at how much the effect varies by context and by population. Those details matter in thinking about mindset studies.

Andrew Watson:

Yes, I think mindset is a really interesting case study for the topic we’re talking about. My impression is that teachers got super excited about growth mindset. We went to a highly simplistic “poster-on-the-wall” version of the theory.

And in the last 18 months or so, there has been a real backlash. Now we hear: “growth mindset means nothing whatsoever! Why are you wasting your time?”

We need to find our way to a nuanced middle ground. No, growth mindset is not a panacea. But nothing is a panacea. At the same time, in a specific set of circumstances, mindset can help certain students in specific ways.

It can be hard to steer the conversation toward that balanced conclusion.

Dr. Lewis:

Yes, issues like that motivated us to write our paper.

If we [researchers] are able to communicate those nuances clearly, then I think we avoid these misunderstandings. It’s not that mindset is useless; instead, mindset will have a small effect under certain conditions. We should just say that.

We have a problem with the “hype cycle.”

If something is over-hyped one day, then you’re really setting people’s expectations unreasonably high. Later, when the research doesn’t meet those expectations, teachers are disappointed.

And so researchers should set expectations appropriately. Mindset is not a panacea. We shouldn’t expect enormous impacts. And that’s fine. Let’s just say that.

Andrew Watson:

I think this “hype cycle” is part of the challenge that we’re facing.

For instance, with learning styles, teachers thought the theory had a lot of scientific backing. We embraced it because it was “research based.”

Now the message is: “no, research got that wrong; learning styles aren’t a thing. But here’s another research-based thing instead.”

And teachers are saying: “wait, if I shouldn’t have followed research about learning styles, why should I believe new research about new teaching suggestions?”

Dr. Lewis:

That’s a tricky problem.

One way to think about science is: science is a way of reducing uncertainty.

We had this idea about learning styles. We gathered some initial evidence about it. It seemed like a good idea for a while.

But as we continued studying it, we realized, well, maybe there is not as much good evidence as we thought.

And that’s part of the scientific process. I think it’s important to explain that.

But: that shift without an explanation naturally leads teachers to be suspicious.

Teachers think: “Why are you just telling me to make this change? You have to explain to me what is going on and why I should make it.”

This explanation does take more time. But that’s what is necessary to get people to update their understanding of the world.

Something that we all have to keep in mind: just as every year teachers are learning new ways to teach the new generations of students, scientists are doing the same thing too. We’re constantly trying to update our knowledge.

So there will be changes in the recommendations over time. If there weren’t changes, none of us would be doing our best. So we’re learning and improving constantly.

But we have to have that conversation. How are we updating our knowledge? And what are ways that we can implement that new knowledge into curriculum?

And, the conversation has to go both ways. Researchers communicate things to teachers, but teachers also need to be telling things to researchers. So we can keep that real classroom context in mind as we’re developing research advice.

Andrew Watson:

In your article, you and Dr. Wai remind researchers that they’re not communicating with one undifferentiated public. They are talking with many distinct, smaller audiences—audiences which have different interests and needs.

Are there difficulties that make it especially hard to communicate with teachers about psychology research? Is there some way that we’re an extra challenging audience? Or maybe, an especially easy audience?

Dr. Lewis:

I think what’s hard for presenters is not knowing details about the audience, where they’re coming from. That section of the paper is really about getting to know your audience, and tailoring your message from there.

If I’m going to go explain psychology findings to a group of STEM teachers, that talk might be different than if the audience is a broader cross-section of teachers.

In the university setting, it’s easier to figure out those distinctions because you know which department invited you to speak.

In broader K-12 settings you don’t always know. A school district invites you. You can do some Googling to try to figure something out about the district. But you don’t know who’s going to be in the room, and what is happening [in that district]. So you might end up giving too broad a talk, one that’s less informative than if you had gotten more information beforehand.

Andrew Watson:

Are there questions I haven’t asked that I ought to have asked?

Dr. Lewis:

The key point for me is: when we communicate about science in the world, we really have to look at key research details and have serious conversations about them. Nuances matter, and we just can’t gloss over them.

Andrew Watson:

Dr. Lewis, I very much appreciate your taking the time to talk with me today.

Dr. Lewis:

Thank you.

Laptop Notes or Handwritten Notes? Even the New York Times Has It Wrong [Reposted]
Andrew Watson

You’ll often hear the claim: “research says students remember more when they take notes by hand than when they use laptops.”

The best-known research on the topic was done in 2014.

You’ll be surprised to discover that this conclusion in fact CONTRADICTS the researchers’ own findings. Here’s the story, which I wrote about back in 2018…


Here’s a hypothetical situation:

Let’s say that psychology researchers clearly demonstrate that retrieval practice helps students form long-term memories better than rereading the textbook does.

However, despite this clear evidence, these researchers emphatically tell students to avoid retrieval practice and instead reread the textbook. These researchers have two justifications for their perverse recommendation:

First: students aren’t currently doing retrieval practice, and

Second: they can’t possibly learn how to do so.

Because we are teachers, we are likely to respond this way: “Wait a minute! Students learn how to do new things all the time. If retrieval practice is better, we should teach them how to do it, and then they’ll learn more. This solution is perfectly obvious.”

Of course it is. It’s PERFECTLY OBVIOUS.

Believe It Or Not…

This hypothetical situation is, in fact, all too real.

In 2014, Pam Mueller and Dan Oppenheimer did a blockbuster study comparing the learning advantages of handwritten notes to laptop notes.

Their data clearly suggest that laptop notes ought to be superior to handwritten notes as long as students learn to take notes the correct way.

(The correct way is: students should reword the professor’s lecture, rather than simply copy the words down verbatim.)

However — amazingly — the study concludes:

First: students aren’t currently rewording their professor’s lecture, and

Second: they can’t possibly learn how to do so.

Because of these two beliefs, Mueller and Oppenheimer argue that — in their witty title — “The Pen is Mightier than the Laptop.”

But, as we’ve seen in the hypothetical above, this conclusion is PERFECTLY OBVIOUSLY incorrect.

Students can learn how to do new things. They do so all the time. Learning to do new things is the point of school.

If students can learn to reword the professor’s lecture when taking notes on a laptop, then Mueller and Oppenheimer’s own data suggest that they’ll learn more. And yes, I do mean “learn more than people who take handwritten notes.”

(Why? Because laptop note-takers can write more words than handwriters, and in M&O’s research, more words lead to more learning.)

And yet, despite the self-evident logic of this argument, the belief that handwritten notes are superior to laptop notes has won the day.

That argument is commonplace in the field of psychology. (Here’s a recent example.)

Even the New York Times has embraced it.

The Fine Print

I do need to be clear about the limits of my argument:

First: I do NOT argue that a study has been done supporting my specific hypothesis. That is: as far as I know, no one has trained students to take reworded laptop notes, and found a learning benefit over reworded handwritten notes. That conclusion is the logical hypothesis based on Mueller and Oppenheimer’s research, but we have no explicit research support yet.

Second: I do NOT discount the importance of internet distractions. Of course students using laptops might be easily distracted by Twinsta-face-gram-book. (Like everyone else, I cite Faria Sana’s research to emphasize this point.)

However, that’s not the argument that Mueller and Oppenheimer are making. Their research isn’t about internet distractions; it’s about the importance of reworded notes vs. verbatim notes.

Third: I often hear the argument that the physical act of writing helps encode learning more richly than the physical act of typing. When I ask for research supporting that contention, people send me articles about 1st and 2nd graders learning to write.

It is, I suppose, possible that this research about 1st graders applies to college students taking notes. But, that’s a very substantial extrapolation–much grander than my own modest extrapolation of Mueller and Oppenheimer’s research.

And, again, it’s NOT the argument that M&O are making.

Before I believe that the kinesthetics of handwriting make an essential difference to learning, I want to see a study showing that the physical act of writing helps high school and college students taking handwritten notes learn more. Absent that research, this argument is even more hypothetical than my own.

Hopeful Conclusion

The field of Mind, Brain, & Education promises that the whole will be greater than the sum of the parts.

That is: if psychologists and neuroscientists and teachers work together, we can all help each other understand how to do our work better.

Frequently, advice from the world of psychology gives teachers wise guidance. (For example: retrieval practice.)

In this case, we teachers can give psychology wise guidance. The founding assumption of the Mueller and Oppenheimer study — that students can’t learn to do new things — simply isn’t true. No one knows that better than teachers do.

If we can keep this essential truth at the front of psychology and neuroscience research, we can benefit the work that they do, and improve the advice that they give.

The Limits of “Desirable Difficulties”: Catching Up with Sans Forgetica
Andrew Watson

We have lots of research suggesting that “desirable difficulties” enhance learning.

That is: we want our students to think just a little bit harder as they practice concepts they’re learning.

Why is retrieval practice so effective? Because it requires students to think harder than mere review does.

Why do students learn more when they space practice out over time? Because they have to think back over a longer stretch — and that’s more difficult.

We’ve even had some evidence for a very strange idea: maybe the font matters. If students have to read material in a hard-to-read font, perhaps the additional effort and concentration involved will boost their learning.

As I wrote last year, a research team has developed a font designed for exactly that reason: Sans Forgetica. (Clever name, no?) According to their claims, this font creates the optimal level of reading difficulty and thereby could enhance learning.

However — as noted back then — their results weren’t published in a peer-reviewed journal. (All efforts to communicate with them go to their university’s publicity team. That’s REALLY unusual.)

So: what happens when another group of researchers tests Sans Forgetica?

Testing Sans Forgetica

Testing this question is unusually straightforward.

Researchers first asked participants to read passages in Sans Forgetica and similar passages in Arial. Sure enough, participants rated Sans Forgetica harder to read.

They then ran three more studies.

First, they tested participants’ memory of word pairs.

Second, they tested memory of factual information.

Third, they tested conceptual understanding.

In other words, they were SUPER thorough. This research team didn’t just measure one thing and claim they knew the answer. To ensure they had good support behind their claims, they tested the potential benefits of Sans Forgetica in many ways.

So, after all this thorough testing, what effect did Sans Forgetica have?

Nada. Bupkis. Nuthin.

For example: when they tested recall of factual information, participants remembered 74.73% of the facts they read in Sans Forgetica. They remembered 73.24% of the facts they read in Arial.

When they tested word pairs, Sans Forgetica resulted in lower results. Participants remembered 40.26% of the Sans Forgetica word pairs, and 50.51% of the Arial word pairs.

In brief, this hard-to-read font certainly doesn’t help, and it might hurt.

Practical Implications

First, don’t use Sans Forgetica. As the study’s authors write:

If students put their study materials into Sans Forgetica in the mistaken belief that the feeling of difficulty created is benefiting them, they might forgo other, effective study techniques.

Instead, we should encourage learners to rely on the robust, theoretically-grounded techniques […] that really do enhance learning.

Second, to repeat that final sentence: we have LOTS of study techniques that do work. Students should use retrieval practice. They should space practice out over time. They should manage working memory load. Obviously, they should minimize distractions — put the cell phone down!

We have good evidence that those techniques work.

Third, don’t change teaching practices based on unpublished research. Sans Forgetica has a great publicity arm — it was trumpeted on NPR! But publicity isn’t evidence.

Now more than ever, teachers should keep this rule in mind.

Unbearable Irony: When Dunning-Kruger Bites Back…
Andrew Watson

More than most psychology findings, the Dunning-Kruger effect gets a laugh every time.

Here goes:

Imagine that I give 100 people a grammar test. If my test is well-designed, it gives me insight into their actual knowledge of grammar.

I could divide them into 4 groups: those who know the least about grammar (the 25 who got the lowest scores), those who know the most (the 25 highest scores), and two groups of 25 in between.

I could also ask those same 100 people to predict how well they did on that test.

Here’s the question: what’s the relationship between actual grammar knowledge and confidence about grammar knowledge?

John Cleese — who is friends with David Dunning — sums up the findings this way:

In order to know how good you are at something requires exactly the same skills as it does to be good at that thing in the first place.

Which means — and this is terribly funny — that if you’re absolutely no good at something at all, then you lack exactly the skills that you need to know that you’re absolutely no good at it. [Link]

In other words:

The 25 students who got the lowest scores averaged 17% on that quiz. And, they predicted (on average) that they got a 60%.

Because they don’t know much grammar, they don’t know enough to recognize how little they know.

In Dunning’s research, people who don’t know much about a discipline consistently overestimate their skill, competence, and knowledge base.
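To make the structure of that comparison concrete, here’s a minimal sketch in Python. The numbers are invented for illustration (they are not Dunning and Kruger’s data); the point is simply how you’d split test-takers into quartiles by actual score and compare each quartile’s average score to its average self-estimate.

```python
import random
from statistics import mean

random.seed(1)

# Invented data: 100 test-takers, each with an actual score (0-100)
# and a self-estimate that is only loosely tied to the actual score,
# loosely mimicking the pattern Dunning and Kruger report.
people = []
for _ in range(100):
    actual = random.uniform(0, 100)
    predicted = min(100, max(0, random.gauss(60, 10) + 0.2 * (actual - 50)))
    people.append((actual, predicted))

# Sort by actual score and split into four quartiles of 25.
people.sort(key=lambda p: p[0])
quartiles = [people[i:i + 25] for i in range(0, 100, 25)]

for label, group in zip(["Bottom", "Second", "Third", "Top"], quartiles):
    avg_actual = mean(p[0] for p in group)
    avg_predicted = mean(p[1] for p in group)
    print(f"{label} quartile: actual {avg_actual:.0f}%, self-estimate {avg_predicted:.0f}%")
```

With numbers like these, the bottom quartile’s self-estimates land far above its actual scores, while the top quartile’s land below — the shape of the relationship the graph below depicts.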

Here’s a graph, adapted from figure 3 of Dunning and Kruger’s 1999 study, showing that relationship:

Adapted from figure 3 of Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6), 1121-1134.

Let the Ironies Begin

That graph might surprise you. In fact, you might be expecting a graph that looks like this:

Certainly that was the graph I was expecting to find when I looked at Kruger & Dunning’s 1999 study. After all, you can find that graph — or some variant — practically everywhere you look for information about Dunning-Kruger.

It seems that the best-known Dunning-Kruger graph wasn’t created by Dunning or Kruger.

If that’s true, that’s really weird. (I hope I’m wrong.)

But this story gets curiouser. Check out this version:

This one has thrown in the label “Mount Stupid.” (You’ll find that on several Dunning-Kruger graphs.) And, amazingly, it explicitly credits the 1999 study for this image.

That’s right. This website is calling other people stupid while providing an inaccurate source for its graph of stupidity. It is — on the one hand — mocking people for overestimating their knowledge, while — on the other hand — demonstrating the conspicuous limits of its own knowledge.

Let’s try one more:

I am, quite honestly, praying that this is a joke. (The version I found is behind a paywall, so I can’t be sure.)

If it’s not a joke, I have some suggestions. When you want to make fun of someone else for overestimating their knowledge,

First: remember that “no nothing” and “know nothing” don’t mean the same thing. Choose your spelling carefully. (“No nothing” is how an 8-year-old responds to this parental sentence: “Did you break the priceless vase and what are you holding behind your back?”)

Second: The Nobel Prize in Psychology didn’t write this study. Kruger and Dunning did.

Third: The Nobel Prize in Psychology doesn’t exist. There is no such thing.

Fourth: Dunning and Kruger won the Ig Nobel Prize in Psychology in 2000. The Ig Nobel Prize is, of course, a parody.

So, either this version is a coy collection of jokes, or someone who can’t spell the word “know” correctly is posting a graph about others’ incompetence.

At this point, I honestly don’t know which is true. I do know that the god of Irony is tired and wants a nap.

Closing Points

First: Karma dictates that in a post where I rib people for making obviously foolish mistakes, I will make an obviously foolish mistake. Please point it out to me. We’ll both get a laugh. You’ll get a free box of Triscuits.

Second: I haven’t provided sources for the graphs I’m decrying. My point is not to put down individuals, but to critique a culture-wide habit: passing along “knowledge” without making basic attempts to verify the source.

Third: I really want to know where this well-known graph comes from. If you know, please tell me! I’ve reached out to a few websites posting its early versions — I hope they’ll pass along the correct source.

Training in Effective Skepticism: Retraction Watch
Andrew Watson

When we teachers first get interested in research, we regularly hear this word of caution: “you should base your teaching on research — but be skeptical!”

Of course, we should be skeptical. But, like every skill, skepticism requires practice. And experience.

How can we best practice our skepticism?

The Company We Keep

Of course, the more time we spend listening to effective skeptics, the likelier we are to learn from their methodologies.

Many well-known sources frequently explore the strengths and weaknesses of research suggestions.

Dan Willingham regularly takes a helpfully skeptical view of research. (He’s also a regular, amusing twitter voice.)

Ditto The Learning Scientists.

Certainly this blog takes on the topic frequently.

Today, I’d like to add to your skepticism repertoire: Retraction Watch.

Unlike the other sources I mentioned, Retraction Watch doesn’t focus on education particularly. Instead, it takes in the full range of scientific research — focusing specifically on published research that has been (or should be?) retracted.

If you get in the habit of reading their blog, you’ll learn more about the ways that researchers can dissemble — even cheat — on their way to publication. And, the ways that their deceptions are unmasked.

You’ll also learn how much research relies on trust, and the way that such trust can be violated. That is: sometimes researchers retract their work when they learn a colleague — without their knowledge — fudged the data.

In brief, if you’d like to tune up your skepticism chops, Retraction Watch will help you do so.

And: the topic might sound a bit dry. But, when you get into the human stories behind the clinical-sounding “retraction,” you’ll be fascinated.


Back in December, I wrote about another website that can help you see if a study has been cited, replicated, or contradicted. You can read about that here.

A Fresh Approach to Evaluating Working Memory Training
Andrew Watson

Because working memory is SO IMPORTANT for learning, we would love to enhance our students’ WM capacity.

Alas, over and over, we find that WM training programs just don’t work (here and here and here). I’ve written about this question so often that I’ve called an informal moratorium. Unless there’s something new to say, or a resurgence of attempts to promote such products, I’ll stop repeating this point.

Recently I’ve come across a book chapter that does offer something new. A research team led by Claudia C. von Bastian used a very powerful statistical method to analyze the effectiveness of WM training programs.

This new methodology (which I’ll talk about below) encourages us to approach the question with fresh eyes. That is: before I read von Bastian’s work, I reminded myself that it might well contradict my prior beliefs.

It might show that WM training does work. And, if it shows that, I need to announce that conclusion as loudly as I’ve announced earlier doubts.

In other words: there’s no point in reading this chapter simply to confirm what I already believe. And, reader, the same applies for you. I hereby encourage you: prepare to have your beliefs about WM training challenged. You shouldn’t read the rest of this post unless you’re open to that possibility.

New Methodology

One problem with arguments about WM training is that sample sizes are so small. In one recent meta-analysis, the average sample size per study was 20 participants.

In a recent book on cognitive training, von Bastian, Guye, and De Simoni note that small sample sizes lead to quirky p-values. In other words, we struggle to be sure that the findings of small studies don’t result from chance or error.

Instead, von Bastian & Co. propose using Bayes factors: an alternate technique for evaluating the reliability of a finding, especially with small sample sizes. The specifics here go WAY beyond the level of this blog, but the authors summarize handy tags for interpreting Bayes factors:

1-3: Ambiguous

3-10: Substantial

10-30: Strong

30-100: Very Strong

100+: Decisive

They then calculate Bayes factors for 28 studies of WM training.
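As an illustration only (this is my sketch, not code from the chapter), here’s how those interpretive labels map onto Bayes factor values:

```python
def interpret_bayes_factor(bf: float) -> str:
    """Map a Bayes factor (evidence for the alternative hypothesis
    over the null) onto the verbal labels summarized above."""
    if bf < 1:
        return "Evidence favors the null instead"
    if bf < 3:
        return "Ambiguous"
    if bf < 10:
        return "Substantial"
    if bf < 30:
        return "Strong"
    if bf < 100:
        return "Very Strong"
    return "Decisive"

# Invented example values, not the 28 studies' actual Bayes factors:
for bf in [0.4, 1.5, 4.0, 12.0, 45.0, 150.0]:
    print(f"BF = {bf:>6.1f}  ->  {interpret_bayes_factor(bf)}")
```

The cutoffs themselves (3, 10, 30, 100) are conventional interpretive thresholds, not sharp statistical boundaries.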

Drum Roll, Please…

We’ve braced ourselves for the possibility that a new analytical method will overturn our prior convictions. Does it?

Well, two of the 28 studies “very strongly” suggest WM training works. One of the 28 “substantially” supports it. Nineteen are “ambiguous.” And six “substantially” suggest that WM training has no effect.

In other words: three of the 28 show meaningful support for the hypothesis. The other 25 are neutral or negative.

So, in a word: “no.” Whichever method you use to evaluate the success of WM training, we just don’t have good reason to believe that it works.

Especially when such training takes a long time and costs lots of money, schools should continue to be wary.

Three Final Notes

First: I’ve focused on p-values and Bayes factors in this blog post. But, von Bastian’s team emphasizes a number of problems in this field. For instance: WM training research frequently lacks an “active” control group. And, it often lacks a substantial theory, beyond “cognitive capacities should be trainable.”

Second: This research team is itself working on an intriguing hypothesis right now. They wonder whether working memory capacity can’t be trained but working memory efficiency can be. That’s a subtle but meaningful distinction, and I’m glad to see they’re exploring this question.

So far they’re getting mixed results, and don’t make strong claims. But, I’ll keep an eye on this possibility — and I’ll report back if they develop helpful strategies.

Third: I encouraged you to read von Bastian’s chapter because it might change your mind. As it turns out, the chapter probably didn’t. Instead it confirmed what you (and certainly I) already thought.

Nonetheless, that was an important mental exercise. Those of us committed to relying on research for teaching guidance should be prepared to change our approach when research leads us in a new direction.

Because, you know, some day a new WM training paradigm just might work.


von Bastian, C. C., Guye, S., & De Simoni, C. (2019). How strong is the evidence for the effectiveness of working memory training? In M. F. Bunting, J. M. Novick, M. R. Dougherty & R. W. Engle (Eds.), Cognitive and Working Memory Training: Perspectives from Psychology, Neuroscience, and Human Development (pp. 58–75). Oxford University Press.

Whose Online Teaching Advice Do You Trust?
Andrew Watson

Many people who offer teaching advice cite psychology and neuroscience research to support their arguments.

If you don’t have time to read that underlying research — or even the expertise to evaluate its nuances — how can you know whom to trust?

There are, of course, MANY answers to that question (for instance: here and here and here). I want to focus on a very simple one today.

My advice comes in the form of a paradox: You should be likelier to TRUST people who tell you to DOUBT them.

Example #1

I thought of this paradox last week when reading a blogger’s thoughts on Jeffrey Bowers. Here’s the third paragraph of the blogger’s post:

I am a researcher working in the field of cognitive load theory. I am also a teacher, a parent and a blogger with a lot of experience of ideological resistance to phonics teaching and some experience of how reading is taught in the wild. All of these incline me towards the systematic teaching of phonics. I am aware that Bowers’ paper will be used by phonics sceptics to bolster their argument and that predisposes me to find fault in it. Bear that in mind.

In this paragraph, the blogger gives the reader some background on his position in an ongoing argument.

He does not claim to read Bowers’s work as an above-the-fray, omniscient deity.

Instead, he comes to this post with perspectives — let’s just say it: biases — that shape his response to Bowers’s research.

And he explicitly urges his reader to “bear [those biases] in mind.”

Of course, in the world of science, “bias” doesn’t need to have a negative connotation. We all have perspectives/biases.

By reminding you of these perspectives — that is, his limitations — the blogger gives you reasons to doubt his opinion.

And my argument is: because he reminded you to doubt him, you should be willing to trust him a little bit more.

The blogger here is Greg Ashman, who writes a blog entitled Filling the Pail. Lots of people disagree with Ashman quite vehemently, and he disagrees right back.

My point in this case is not to endorse his opinions. (I never write about reading instruction, because it’s so complicated and I don’t know enough about it to have a developed opinion.)

But, anyone who highlights his own limitations and knowledge gaps in an argument gets my respect.

Example #2

Over on Twitter, a professor recently tweeted out a quotation from the executive summary of a review. (The specific topic isn’t important for the argument I’m making.)

Soon after, he tweeted this:

“When I tweeted out [X’s] new review of [Y] a few days ago, I pulled a non-representative quote from the exec summary.

It seemed to criticize [Y] by saying [Z] … [However, Z is] not the key criticism in the review. Here I’ve clipped more serious concerns.”

He then posted 4 substantive passages highlighting the review’s core critiques of Y.

In other words, this professor told you “I BLEW IT. I created an incorrect impression of the review’s objections.”

You know what I’m about to say now. Because this professor highlighted reasons you should doubt him — he blew it — I myself think you should trust him more.

We all make mistakes. Unlike many of us (?), this professor admitted the mistake publicly, and then corrected it at length.

In this case, the professor is Daniel Willingham — one of the most important scholars working to translate cognitive psychology for classroom teachers.

He’s written a book on the subject of skepticism: When Can You Trust the Experts? So, it’s entirely in character for Willingham to correct his mistake.

But even if you didn’t know he’d written such a book, you would think you could trust him because he highlighted the reasons you should not.

In Sum

Look for thinkers who highlight the limitations of the research. Who acknowledge their own biases and gaps in understanding. Who admit the strengths of opposing viewpoints.

If you hear from someone who is ENTIRELY CERTAIN that ALL THE RESEARCH shows THIS PSYCHOLOGICAL PRINCIPLE WORKS FOR ALL STUDENTS IN ALL CLASSROOMS — their lack of self-doubt should result in your lack of trust.

Starting the Year Just Right: Healthy Skepticism
Andrew Watson

I regularly tell teachers: if you want to be sure you’re right, work hard to prove yourself wrong.

If, for example, you think that dual coding might be a good idea in your classroom, look for all the best evidence you can find against this theory.

Because you’ll find (a lot) more evidence in favor of dual coding than against, you can be confident as you go forward with your new approach.

Well: I got a dose of my own medicine today…

People Prefer Natural Settings. Right?

If you’re a regular reader, you know that I’m a summer camp guy. I’ve spent many of the happiest hours of my life hiking trails and canoeing lakes and building fires.

Many of the best people I know devote their summers to helping children discover their strengths and values surrounded by pines and paddles.

And: I’m not the only one. We’ve got LOTS of research showing that people prefer natural settings to urban ones. Some of that research shows this preference cross-culturally. It’s not just Rousseau-influenced Westerners who feel this way, but humans generally.

In fact, it’s easy to speculate about an evolutionary cause for this preference. Our species has been around for about 250,000 years; only a tiny fraction of that time has included substantial urban development.

If our preference for natural environments has an evolutionary base, then we would expect children to share it. They don’t need adult coaxing to enjoy the natural beauties to which their genes incline them.

Right?

Trying to Prove Ourselves Wrong

If we’re going to follow the advice above — that is, if we’re going to seek out evidence at odds with our own beliefs — we might wonder: can we find research contradicting this line of thought?

Can we find evidence that children prefer urban settings to rural ones? That they adopt adult preferences only slowly, as they age?

Yes, we can.

Researchers in Chicago worked with children and their parents, asking them to say how much they liked (and disliked) images of natural and urban settings.

In every category, children liked the urban images more than adults (including their own parents) did, and disliked the natural images more than adults did. (Check out figure 3 in the study.)

And: that preference changed — almost linearly — as the children aged.

That is: the four-year-olds strongly preferred the urban images, whereas that preferential difference decreased as the children got older. (Figure 4 shows this pattern.)

You might reasonably wonder: doesn’t this depend on the environment in which the children grew up and attended school?

The researchers wondered the same thing. The answer is, nope.

They used zip codes to measure the relative urbanization of the places where these children lived. And, that variable didn’t influence their preferences.

So, contrary to my confident predictions, children (in this study, with this research paradigm) don’t share adults’ preferences. They prefer urban to natural settings.

Lessons to Learn

To be clear: this study does NOT suggest that we should give up on outdoor education.

The researchers aren’t even asking that question.

Instead, they’re examining a plausible hypothesis: “our adult love of nature might be an evolutionary inheritance, and therefore we’ll find it in children too.”

These data do not support that plausible hypothesis.

But, they also don’t contradict the many (many!) benefits that humans — adults and children — get from interacting with the natural world.

So, for me, the two key lessons here are:

First: when introducing young children to natural environments, don’t be surprised if they don’t love them at first. We might need to plan for their discomfort, anxiety, and uncertainty.

Second: even if we really want to believe something to be true; even if that “something” is super plausible — we really should look for contradictory evidence before we plan our teaching world around it.

By the way: here’s a handy resource to aid you in your quest for more effective skepticism.

A Holiday Present for the Teacher/Skeptic (in Beta)
Andrew Watson

Teachers who rely on research to inform our teaching–presumably, that’s YOU–routinely hear that we must be skeptical.

“Don’t just believe everything you hear. When someone says that their suggestion is ‘brain based,’ you’ve got to kick the tires.”

Yes. Of course. But: how EXACTLY do we do that? What’s the most effective method for skepticism?

Pens and Laptops

Let’s take a specific example.

You’ve probably heard “research shows” that handwritten notes are more effective than notes taken on laptops. That is: students who take notes by hand remember more than those who take notes on their computers.

If you hunt down the source of that information, you’ll almost certainly end up at Mueller and Oppenheimer’s wittily titled study The Pen is Mightier than the Laptop. It made a big splash when it came out in 2014, and its waves have been lapping over us ever since.

Once you find that source, your thought process might go like this:

Step 1: “Look! Research shows that handwritten notes are superior. I shall forbid laptops forthwith!”

Step 2: “Wait a minute…I’ve been told to be skeptical. Just because Mueller and Oppenheimer say so (and have research to support their claim), I shouldn’t necessarily believe them.”

Step 3: “Hmmm. How exactly can I be skeptical?”

So, here’s my holiday present for you: a website that makes effective skepticism noticeably easier…

Check the Scitation

The website scite.ai leads with this catchy slogan: “making science reliable.”

At least, it’s helping make science reliabler. Or, more reliable.

Here’s how. Surf over to the website, put in the name of the article, and press the magic button.

Scite will then tell you…

…how many later studies have confirmed its findings,

…how many simply mention its findings,

…and how many contradict its findings.

In this case, you’ll discover that 24 studies mention Mueller and Oppenheimer’s study, 1 contradicts it, and 0 confirm it. That’s right. Zero.

So, according to Scite, you’ve got as much research encouraging laptop notes as you do decrying them. But: one of those studies is remarkably famous. And, the other simply isn’t known.

Next Steps

What should you do with this initial information?

At this point, I think the obvious answer is that we don’t have an obvious answer.

Probably, you want to keep looking for further evidence on both sides of the case.

You might find this summary over at the Learning Scientists, where Dr. Megan Sumeracki walks through the nuances and complexities of the research.

You might also find my own article arguing that the Mueller and Oppenheimer research makes sense only if you believe that students can’t learn to do new things. (That’s a strange belief for a teacher to have.) If you believe students can learn new things, then their own data suggest that laptop notes ought to be better.

At a minimum, I hope, you’ll feel empowered in your skepticism. Now you–unlike most people who quote Mueller and Oppenheimer–have a broader picture of the research field. You can start using your judgment and experience to guide your thinking.

An Important Caveat

I don’t know how long scite.ai has been around, but it’s in beta. And, truth be told, it’s not wholly reliable.

For instance: in 2011, Ramirez and Beilock did a study showing that writing about stress before an exam can reduce that stress (for anxious students).

In 2018, Camerer et al. tried and failed to replicate those results (and several other studies as well).

When I searched on Ramirez’s study in scite, it showed only one contradiction: a study about “aural acupuncture.” In other words: scite missed an important non-replication, and included an irrelevant finding.

So, you shouldn’t use this website as your only skepticism strategy.

But, as of today, you’ve got one more than you did before. Happy Holidays!


If you’re looking for other skepticism strategies for the holidays, check out this work by Blake Harvard.