Last year, a paper in Science put a public spotlight on the scientific process. It pointed to a problem, now widely called the replication crisis (or reproducibility crisis), that has led many to wonder: Is science broken?
Here’s what happened: The Open Science Collaboration asked labs across the nation to repeat others’ experiments as closely as possible and share their results. The original experiments were taken from papers published in three widely respected, peer-reviewed journals in psychology and cognitive science. Of the 100 experiments that were chosen for replication, 97 had statistically significant results when initially published.
But only 36 of the replications of those studies reported significant results.1
Most research studies use statistical tests to gauge how likely it is that the results researchers found are due to chance. “Significant results” means that the results passed those tests according to a generally agreed-upon rule of thumb, usually a “p-value” below .05. A p-value of .05 roughly translates to “there is a 5% chance of getting results at least this extreme even if there is no real effect.” Using p-values to determine “significant results” is the standard (though there is longstanding controversy about this practice2), so this was one of the primary measures in the big replication study.
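The threshold logic can be made concrete with a short simulation. This is a toy example, not part of the replication study: suppose we see 60 heads in 100 coin flips and want to know whether the coin is fair. We estimate the p-value as the fraction of fair-coin experiments that come out at least that lopsided.

```python
import random

random.seed(1)

def simulated_p_value(observed_heads, n_flips=100, n_sims=20_000):
    """Two-sided p-value by simulation: how often does a fair coin
    produce a result at least as lopsided as the one observed?"""
    observed_dev = abs(observed_heads - n_flips / 2)
    extreme = 0
    for _ in range(n_sims):
        heads = sum(random.random() < 0.5 for _ in range(n_flips))
        if abs(heads - n_flips / 2) >= observed_dev:
            extreme += 1
    return extreme / n_sims

p = simulated_p_value(60)  # 60 heads in 100 flips
print(f"p = {p:.3f}")
print("significant at the .05 level" if p < 0.05 else "not significant at the .05 level")
```

With these numbers the estimated p-value lands near .06, just above the conventional cutoff: a result that looks suspicious but would not be labeled “significant.”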
So, statistically speaking, replicating “significance” in only 36 of the original results doesn’t mean that the other 64 studies were wrong. If all of the studies were replicated a third time, we would probably see yet another array of studies with significant results. This is one reason why replication over time is such an important theoretical part of the scientific process, though replication studies, especially costly ones, are rarely a priority given the pressures on modern researchers.
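To see why a second round of replications would pick out a different set of “significant” studies, here is a toy simulation. The study counts, statistical power, and false-positive rate below are illustrative assumptions, not the Open Science Collaboration’s actual data:

```python
import random

random.seed(7)

# Toy model: 100 studies, 40 with a real effect and 60 with none.
# A real effect is detected with probability POWER; a null effect
# comes out falsely "significant" with probability ALPHA.
N_STUDIES, N_REAL, POWER, ALPHA = 100, 40, 0.5, 0.05
has_effect = [i < N_REAL for i in range(N_STUDIES)]

def run_round():
    """Run every study once; return which ones came out significant."""
    return [random.random() < (POWER if real else ALPHA) for real in has_effect]

round1, round2 = run_round(), run_round()
print(sum(round1), "significant in round 1")
print(sum(round2), "significant in round 2")
print(sum(a and b for a, b in zip(round1, round2)), "significant in both")
```

Each round typically turns up a similar count of significant results but a somewhat different set of studies, even though nothing about the underlying studies changed between rounds.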
This mega-experiment does suggest that many (perhaps even a majority!) of psychology’s published results might be due to error or chance. And this problem isn’t limited to psychology: the biomedical research community is dealing with serious replication challenges, too.3
Of course, even replication studies can be prone to messiness and error, and many researchers, such as Dr. Dan Gilbert of Harvard University, have contested these results in recent weeks. You can follow the ongoing debate here.
The replication crisis brings to light the reality that the answers to many important questions are buried in messy evidence. Educators will influence how the next generation of scientists and citizens make decisions on challenging issues (sometimes called “wicked problems”) at the intersection of science and society, including climate change and global health crises. In classrooms everywhere, students from Pre-K to college are learning how to understand, integrate, and evaluate evidence.
Here are three ideas on how we can do this better, in all kinds of classrooms.
- Tackle conflicting evidence
In one classroom, students listened to the popular podcast Serial, which reports on the true story of Adnan Syed, convicted of murdering his girlfriend Hae Min Lee in 1999.4 The students dissected its transcripts, mapping out a maze of inconsistent claims and evidence to examine their beliefs about Syed’s guilt or innocence.
Can students learn from this approach? Many teachers worry that introducing conflicting information only confuses students. However, research in higher education suggests that tasks with “cognitive conflict” (involving different viewpoints and no single answer) can lead to better mastery of the basic concepts5, though it’s unclear whether this is true for younger children.
Tackling conflicting information might support deeper learning of the content material, while giving students a chance to develop critical thinking skills in ways that closely mirror the challenges of ambiguity in the professional workplace.
- Consider students’ developing ideas about causation, probability, and statistics
What were Juliet’s motives? What started the War of 1812? And how do kidneys work, anyway? Causes and effects are discussed across the sciences and humanities, but little attention is paid to the structures of causal reasoning.
One distinction worth being aware of is the difference between deterministic and probabilistic causation. In deterministic causation, the effect always follows the cause. In probabilistic causation, a cause makes its effect more likely but doesn’t guarantee it: smoking causes lung cancer, but not always; plants grow from seedlings, but not always.
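The distinction can be sketched in a few lines of code. The probabilities here are invented for illustration; they are not real epidemiological rates:

```python
import random

random.seed(3)

def deterministic_effect(cause: bool) -> bool:
    """The effect always follows the cause."""
    return cause

def probabilistic_effect(cause: bool, p_if_cause=0.2, p_baseline=0.01) -> bool:
    """The cause raises the chance of the effect but doesn't guarantee it.
    (These probabilities are made up for illustration.)"""
    return random.random() < (p_if_cause if cause else p_baseline)

n = 10_000  # a population of exposed individuals
det = sum(deterministic_effect(True) for _ in range(n))
prob = sum(probabilistic_effect(True) for _ in range(n))
print(f"deterministic: {det} of {n} show the effect")   # all of them
print(f"probabilistic: {prob} of {n} show the effect")  # only a fraction
```

Under the deterministic model, every exposed individual shows the effect; under the probabilistic one, only a fraction do, which is why reasoning from individual cases (“my grandfather smoked and never got cancer”) so often misleads.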
Many of the challenges that we face as a society involve complex probabilistic causation, including our changing climate, the collapse of ecosystems, and the global transmission of disease.6 And children struggle to learn and apply models of probabilistic causation (among other types of causal models) in science classrooms.6 Some research recommends probing students’ developing ideas about causality via explicit discussions, introducing and paying careful attention to causal language.6
Others are calling for a greater general emphasis on statistics and probability in mathematics education.7 These subjects present a structured approach to evaluating claims and grappling with uncertainty, while opening the door to interdisciplinary learning as students use mathematical approaches to answer empirical questions.
- Do experiments
A report on a 2011 survey conducted by the National Assessment of Educational Progress states:
Although doing grade 8 hands-on science activities is nearly universal, carrying out the steps of an investigative process is not. Twenty-four percent of the grade 8 students never discuss their results, thirty-five percent never discuss measurement for their science project and thirty-nine percent of the grade 8 students don’t design an experiment.8
This suggests that many of the hands-on activities occurring in science classrooms stop short of complete experiments. Limited time, resources, and the pressure to cover content can make it hard to prioritize experimentation.
However, experiments and inquiry are integral to science education,9 supporting content knowledge and fostering critical thinking.10 Other hands-on learning activities (building models, observing demonstrations, etc.) don’t give students experience with the process and tools needed to answer questions for themselves. The opportunity to conduct experiments pushes students to grapple with the challenges of measurement and with deciding when to treat new evidence as “proof.”
Close examination of issues related to experimental error can spark the conceptual breakthroughs that push students toward more complex ideas. When first confronted with unanticipated results, results that look different from one day to the next, or results that differ between groups, students might be tempted to explain away their data or patch their current understandings. But looking more closely at error in discussion and written reports can refine students’ mental models by falsifying certain ideas, or by giving students room to build from counter-evidence.6
Finally, embracing failure has received tremendous attention in education for building character. Viewing “error” in experimentation as a learning experience may have similar potential.
Understanding the replication crisis is a complex, authentic challenge for science and society. It’s the kind of issue that students might examine in a classroom striving to deeply engage students in understanding the nature of science. It takes depth and nuance to reconcile the notion that science is a limited, biased, human endeavor with the idea that it’s also a powerful tool for understanding the world.
If you’re interested in learning more about skills that are critical for evaluating evidence, follow the Twenty-first Century Information Literacy Tools initiative via The People’s Science. Run by the Learning and the Brain Blog Editor, Stephanie Fine Sasse, this non-profit organization is developing a framework for tackling these issues. You can read more about their model and other models in the recently released book, “Four-Dimensional Education,” whose authors include one of our own contributors, Maya Bialik.
The ideas presented here—including student opportunities for experimentation, tackling conflicting evidence, considering causality, and a different outlook on error—can be used across grade and subject levels to help students understand the nature of science and its place in society more deeply.
For a few starting points on how to carry these out in the classroom, check out the teacher resources below. If you know of other resources, feel free to share in the comments!
References & Further Reading
- Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251). [Paper]
- Cohen, J. (1990). Things I Have Learned (So Far). American Psychologist, 45(12), 1304–1312. [Paper]
- Prinz, F., Schlange, T., & Asadullah, K. (2011). Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews. Drug Discovery, 10(9), 712. [Paper]
- Flanagan, L. (2015, March 11). What Teens are Learning From ‘Serial’ and Other Podcasts. KQED: Mindshift. [Link]
- Springer, C. W., & Borthick, A. F. (2007). Improving Performance in Accounting: Evidence for Insisting on Cognitive Conflict Tasks. Issues in Accounting Education, 22(1), 1–19. [Paper]
- Perkins, D. N., & Grotzer, T. A. (2000). Models and Moves: Focusing on Dimensions of Causal Complexity to Achieve Deeper Scientific Understanding. [Paper]
- Fadel, C. (2014). Mathematics for the 21st Century: What should students learn? Boston, MA. [Paper]
- Ginsburg, A., & Friedman, A. (2013). Monitoring What Matters About Context and Instruction in Science Education : A NAEP Data Analysis Report. [Paper]
- National Science Teachers Association. (2007, February). The Integral Role of Laboratory Investigations in Science Instruction. [Link]
- Committee on the Development of an Addendum to the National Science Education Standards on Scientific Inquiry; Board on Science Education; Division of Behavioral and Social Sciences and Education; National Research Council. (2000). Inquiry and the National Science Education Standards: A Guide for Teaching and Learning. (S. Olson & S. Loucks-Horsley, Eds.). The National Academies Press. [Link]
- Causal Cognition in a Complex World, Teacher Resources. [Link]
- Critical Media Literacy, Teacher Resources [Link]
- Ongoing Reproducibility Debate, Harvard University [Link]