For this week’s discussion board, I found a peer-reviewed article published in the Journal of Statistics Education. The article, written by Nicholas P. Maxwell of the University of Washington, uses a coin-flipping exercise to introduce the p-value.

The p-value can be introduced with a coin-flipping exercise. One person, A, flips a coin ten times and has the other, B, call each flip. B records their thoughts after each flip. A reports that B calls every flip correctly. In this exercise B intuitively rejects a null hypothesis because the p-value is too small. B is reassured to learn from this concrete example that they intuitively followed the logic of statistical inference before they studied statistics.

The meaning of the p-value is essential for understanding statistical inference. Nonetheless, many people have trouble keeping track of what a p-value is. It is common to confuse the p-value with the probability that the null hypothesis is right, or with the probability that the alternative, experimental hypothesis is wrong (Phillips 1971, p. 80; Freedman et al. 1991, p. 435).

Popham and Sirotnik (1992) provide a compelling example of an intuition that follows the logic of statistical inference. They write:

Suppose Joe and June are betting for cups of coffee on the basis of a tossed coin, with the loser buying. Joe does all the coin flipping, and June decides to call tails every time, figuring to win approximately half of the cups of coffee. If the coin turns up heads ten times in a row, June … might begin to suspect that there is something suspicious about the coin — or the person flipping it! (Popham and Sirotnik 1992, p. 48)

Anyone can follow this story, even without a background in statistics or probability. This example can be brought into the classroom, and students can be placed in June’s position. They can see for themselves that, like June, they would reject an idea when something happens that would have been very unlikely had the idea been true.

Professor Maxwell uses a classroom version of Popham and Sirotnik’s (1992) example. To introduce statistical inference, he flips a coin ten times and has a student call each flip. After each flip, he reports how the caller did and then asks the class to record their thoughts about what is happening. While the students record their thoughts, he records on the blackboard whether the caller was correct, leaving space to add the students’ thoughts later. The trick for teaching statistical inference is that he lies: he reports that the caller calls every flip correctly.

Initially, this is a fairly bland exercise, but as the caller correctly calls three flips, four flips, and then five flips in a row, the class quickly becomes agitated. Some students get so excited that he must ask them to be quiet so that others can write down their thoughts as the exercise progresses. During the exercise, he tries to act surprised, but a great acting job is not necessary. The exercise depends on the students figuring out that something funny is going on, so if his expression betrays the ruse, no harm is done. After the last flip, he asks the students what they thought and when they thought it, and he writes their thoughts about each flip on the blackboard.

So, what can we conclude about what was happening? The students report that they concluded that he was lying or that he had prearranged some sort of magic trick with the caller. He asks, “Why did you reach that conclusion?” They report that they reached this conclusion because there is too small a chance that the caller could call ten coin flips in a row correctly. When pressed, they report that, assuming it is a fair coin, and assuming that he’s not lying, and assuming that the caller is not telepathic, then it is extremely unlikely that the caller would call ten flips correctly. He asks, “Seeing that there are several aspects to your initial conception of what was happening, what is the best way to summarize your conclusions?” Most agree (with some prompting) that all they can conclude from the experiment itself is that at least one aspect of their initial conception is likely to have been false; to reach a final conclusion, they have to consider what they know about each of the assumptions to decide which to discard.
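The students’ intuition that the chance is “too small” is easy to make exact. If the coin is fair and each call is an independent guess, the probability of calling all ten flips correctly is (1/2)^10. A minimal calculation (not from the article, just arithmetic on its setup):

```python
# Probability that a guessing caller gets all ten fair, independent
# coin flips right: each call succeeds with probability 1/2.
n_flips = 10
p_all_correct = 0.5 ** n_flips
print(p_all_correct)  # 0.0009765625 — just under one in a thousand
```

A result this rare under the students’ initial assumptions is exactly what drives them to reject at least one of those assumptions.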

He points out that they started with a particular idea of what was going on: the idea was that it was a fair coin, that the caller was not clairvoyant, and that he was reporting the caller’s successes honestly. He explains that an initial conception of what is going on is called the “null hypothesis,” that the probability of the results occurring, if the null hypothesis is true, is called the “p-value,” and that the cut-off used to decide whether to reject the null hypothesis is called “alpha.”
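The three terms he defines can be put together in a small simulation. This is my own sketch, not code from the article: it simulates the null hypothesis (a fair coin, honestly reported, no clairvoyance, so each call is right with probability 1/2), estimates the p-value of ten correct calls, and compares it to a conventional alpha of 0.05.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def estimate_p_value(n_flips=10, n_trials=100_000):
    """Estimate the chance a guessing caller gets every flip right,
    by simulating many runs of the exercise under the null hypothesis."""
    all_correct = 0
    for _ in range(n_trials):
        # Each call is an independent coin flip with a 1/2 success chance.
        correct = sum(random.random() < 0.5 for _ in range(n_flips))
        if correct == n_flips:
            all_correct += 1
    return all_correct / n_trials

p_value = estimate_p_value()
alpha = 0.05  # the cut-off for rejecting the null hypothesis

print(f"estimated p-value: {p_value:.5f}")
if p_value < alpha:
    print("reject the null hypothesis")
else:
    print("fail to reject the null hypothesis")
```

The estimate lands near the exact value of about 0.001, far below alpha, which mirrors the students’ intuitive rejection of their initial conception.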

Professor Maxwell uses this exercise just before introducing significance testing and then refers back to it throughout the course whenever students seem to have lost track of what the p-value is. Because the exercise is very memorable, it takes only brief reminders to bring students back to a clear understanding of the p-value.

I would like someone to read this paper and write a one-page paper based on their own insight and experience regarding this topic.