Saturday, March 24, 2012
The Problem With Ad Hoc Hypotheses (Bayes' Theorem And Coin Flips)
This post was originally part of a longer post I'm writing which is a follow up to these two posts. But I thought it would be good to dedicate a single post to it because I think it's an important concept.
For Bayes' theorem, evidence can only support hypotheses. Hypotheses can be used to support other hypotheses, but unlike evidence, this usually brings down the probability. Right now I'm having a bit of a discussion on my Facebook with someone who believes in ghosts. He says he knows he's been haunted and been in haunted houses, therefore there is more to life than this universe because there are other planes of existence. My response was that I thought it was more likely that he was mistaken about his encounter with ghosts than that the human mind is immortal and entire other planes of existence are real.
It's hard to get people to understand why this kind of reasoning is wrong. He is stacking multiple hypotheses to explain an event. I didn't disagree that he experienced something; his interpretation of that experience is what I was questioning. I kept my explanation simple, since I didn't need to posit any hypothesis besides human fallibility, which we all know is highly probable.
When you add hypotheses to explain some event, the probabilities of those hypotheses have to be multiplied together. When a hypothesis is no longer hypothetical, that is, when we have confirmation of it and it becomes unquestioned fact, *that* is when you can add it as evidence for some initial hypothesis. And that is when you use Bayes' theorem.
For example, in order to flip a coin and get three flips of heads in a row, I would have to first flip two heads in a row. In order to flip two heads in a row I have to flip heads on the first flip. Three heads depends on two heads which depends on one heads. Since the probability of flipping heads once is .5, and each additional heads depends on the previous heads, they all multiply together: .5 * .5 * .5 = .125.
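The multiplication can be checked by brute force. The following is only a small sketch in Python, assuming a fair coin; the variable names are my own for illustration. It lists every possible result of three flips, counts how many are all heads, and compares that with multiplying .5 three times.

```python
from fractions import Fraction
from itertools import product

# Every equally likely result of three fair coin flips: HHH, HHT, ..., TTT.
outcomes = list(product("HT", repeat=3))

# Probability of three heads by direct counting (1 outcome out of 8).
all_heads = [o for o in outcomes if o == ("H", "H", "H")]
p_counted = Fraction(len(all_heads), len(outcomes))

# Probability of three heads by multiplying the single-flip probability.
p_heads = Fraction(1, 2)
p_multiplied = p_heads * p_heads * p_heads

print(p_counted, p_multiplied)  # both are 1/8, which is .125
```

Both routes give the same number, which is the point: every extra condition that has to hold shrinks the share of outcomes that satisfy all of them.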
In this case, .125 is the prior probability of flipping three heads in a row. What then happens once I flip heads once? It becomes:
P(H) = prior probability of flipping three heads in a row, .125
P(E) = probability of flipping heads, .5
P(E | H) = probability of flipping heads given that I will flip three heads in a row, 1.00 (it is absolutely necessary to flip heads given that I will flip three heads in a row)
P(E | ~H) = probability of flipping heads given that I won't flip three heads in a row, ??? (not really necessary since I already know P(E), though it can be figured out as I demonstrated in the other two posts).
What is the probability that I will flip three heads in a row given that I have flipped heads once?
P(Flipping Three Heads In A Row | Flipping Heads Once) = P(E | H) * P(H) / P(E)
= 1.00 * .125 / .5
= .125 / .5
= .25
Given that I have flipped heads once, my probability has moved from the prior of .125 to a posterior of .25. Which is what we would expect, since all we are really doing is removing one of the .5 probabilities from the three coin flips .5 * .5 * .5 and ending up with only two flips to go -- .5 * .5 -- which equals .25.
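The same update can be written as a few lines of Python. This is a sketch under the assumptions of the coin example (fair coin, independent flips), and the function name `posterior` is just one I picked for illustration. It also recovers P(E | ~H), which I left as ??? above, by rearranging the law of total probability: P(E) = P(E | H) * P(H) + P(E | ~H) * P(~H).

```python
from fractions import Fraction

def posterior(p_e_given_h, p_h, p_e):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e

p_h = Fraction(1, 8)     # prior: three heads in a row
p_e = Fraction(1, 2)     # evidence: the first flip is heads
p_e_given_h = Fraction(1)  # heads on the first flip is required for HHH

# Posterior probability of three heads after seeing the first heads.
p_h_given_e = posterior(p_e_given_h, p_h, p_e)

# P(E | ~H), solved from P(E) = P(E | H) * P(H) + P(E | ~H) * P(~H).
p_e_given_not_h = (p_e - p_e_given_h * p_h) / (1 - p_h)

print(p_h_given_e)      # 1/4, which is .25
print(p_e_given_not_h)  # 3/7
```

The 3/7 also makes sense by counting: of the seven results that are not three heads, three of them (HHT, HTH, HTT) start with heads.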
And of course, absence of evidence is evidence of absence; I posted the Bayes' theorem formula for that adage:
P(H | ~E) = P(~E | H) * P(H) / P(~E)
P(~E | H) is the complement of P(E | H); the two have to sum to 1.00. Since P(E | H) in this example is already 1.00, this leaves nothing for P(~E | H). Now we go through the anti-Bayes' for absence of evidence:
P(H | ~E) = P(~E | H) * P(H) / P(~E)
= 0 * .125 / .5
= 0 / .5
= 0
So upon flipping tails, which is the absence of the evidence of flipping heads, my probability of flipping three heads in a row plummets from the prior of .125 to zero.
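Here is the absence-of-evidence version as a rough Python sketch, again assuming the fair-coin numbers from this example. The function name `posterior_given_absence` is made up for illustration; it just takes the complements of P(E | H) and P(E) before applying Bayes' theorem.

```python
from fractions import Fraction

def posterior_given_absence(p_e_given_h, p_h, p_e):
    """P(H | ~E) = P(~E | H) * P(H) / P(~E), using complements of the inputs."""
    p_not_e_given_h = 1 - p_e_given_h
    p_not_e = 1 - p_e
    return p_not_e_given_h * p_h / p_not_e

p_h = Fraction(1, 8)       # prior: three heads in a row
p_e = Fraction(1, 2)       # probability the first flip is heads
p_e_given_h = Fraction(1)  # heads on the first flip is required for HHH

# Probability of three heads in a row after the first flip comes up tails.
p_h_given_not_e = posterior_given_absence(p_e_given_h, p_h, p_e)

print(p_h_given_not_e)  # 0
```

The drop to exactly zero happens only because P(E | H) is 1.00 here. With a P(E | H) below 1.00, missing evidence would lower the probability without eliminating it.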
Back to the ghost hypothesis that my friend uses to explain whatever it was that he experienced: he is doing the equivalent of flipping heads three times in a row. I have only flipped heads once. Notice the chain of probability:
1. Experience I can't explain
2. It must be ghosts
3. Other planes of existence
Only 1 is in evidence. 2 is the explanation for 1, and 3 is the explanation for 2. The experience happened, so that is in evidence. The existence of ghosts is a hypothetical used to explain the experience, and the other planes of existence is a hypothetical used to explain ghosts. Since those two hypotheses aren't in evidence, their probabilities -- whatever they are -- get multiplied together just like the coin flips. If each of them was 60% probable, his ghost hypothesis used to explain the event is only 36% probable.
On the other hand, my chain of reasoning went like this:
1. Experience he can't explain
2. The ghost explanation is probably hyperactive agency detection
Again, only 1 is in evidence. 2 is just the alternative to his ghost explanation and is 1 - P(Ghosts). In the above I assumed 60%, so this would be 40%. Since my total hypothesis has a 40% chance of being true, and his total hypothesis has a 36% chance of being true, my explanation is more probable even though I favored his hypothesis much more than I should have (60% probability that ghosts exist? 60% probability of another plane of existence? Really?).
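The comparison between the two chains can be sketched in Python. The 60% figures are the made-up, overly generous numbers from this post and not real estimates, and the variable names are mine. The loop at the end shows what happens as more unconfirmed hypotheses at 60% each get stacked onto a chain.

```python
from fractions import Fraction
from math import prod

# Illustrative numbers from the post, not real estimates.
p_ghosts = Fraction(60, 100)
p_other_planes = Fraction(60, 100)

# His chain: both unconfirmed hypotheses have to be true, so they multiply.
p_his_explanation = prod([p_ghosts, p_other_planes])

# My chain: a single alternative to the ghost explanation, 1 - P(Ghosts).
p_my_explanation = 1 - p_ghosts

print(float(p_his_explanation))  # 0.36
print(float(p_my_explanation))   # 0.4

# Each extra unconfirmed link at 60% shrinks the total further.
for links in range(1, 5):
    print(links, float(p_ghosts ** links))
```

With four stacked hypotheses at 60% each, the chain is already below 13%.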
In discussions with people, it's important to distinguish between evidence and hypotheses. Evidence is anything that is factual, and the hypothesis is whatever framework is used to explain those facts. Lots of people equivocate between the two. If someone keeps adding hypotheticals to explain some event, this will exponentially lower the probability of their initial hypothesis being true. That is the problem with ad hoc hypotheses, and why Occam's Razor makes sense.
For example, in historical Jesus research, scholars apply criteriology to discover facts, but this is equivocating between fact and hypothesis. Criteriology can only determine hypotheticals: the probability that some saying or event actually happened. Anything discovered via criteriology is not firmly in the fact bin but in the hypothetical bin. People can disagree about the cogency of some hypothetical, but no one should disagree about certain facts ("people are entitled to their own opinions but not their own facts").
Even though both facts and hypotheticals have a certain probability, their probabilities are not used in the same way, as the coin flip analogy hopefully showed.