I have a buddy who has one child, a son, with a unique yet familiar name. His son’s name is Bayes. When my buddy and his wife named him Bayes and put it on his birth certificate, his Social Security card…and pretty soon after, his 529 plan, he became the sole proprietor of the first name Bayes. I guess that makes my buddy and his wife original. Weird, too, I guess, but who asked you, anyway?
They could be original, or weird, or it could mean that not enough people know the legend of Thomas Bayes. He wrote a paper (An Essay towards solving a Problem in the Doctrine of Chances), which his good friend Richard Price discovered two years after Bayes died, and which the genius mathematician Pierre-Simon Laplace later greatly extended. Without that paper, how would we ever have the advances we have in medicine, evolutionary biology, environmental biology, manufacturing, criminology, and hundreds of other applied fields?
If you are currently one of my graduate students, or you wear Nirvana shirts and have never even listened to Nirvana, you likely never had to deal with the spam epidemic. When I was in undergrad, it was out of control, to the point where checking email was a complete waste. Then came the Bayesian Spam Filter.
How does a Bayesian Spam Filter work? I am so glad you asked!
So any spam filter these days learns from millions and millions of observations of email messages. These messages were coded as Spam or Safe. Once these email observations were coded, we could then go back and run some models to identify what content is more likely to be indicative of Spam (using an Explanatory Model, of course!). But do you and I, who check our email somewhere between one and three hundred eighty times a day, care? No, we just want the model to choose the best predictors and give us a reasonable probability, or odds, that an email coming through our mailbox is legit or otherwise.
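To make that a little more concrete, here is a minimal sketch of the idea in Python. Everything in it is an assumption for illustration: the four training messages are made up, the function names `train` and `spam_probability` are hypothetical, and a real filter learns from millions of messages with far more careful text handling. It counts how often each word shows up in Spam versus Safe mail, then uses those counts to nudge the odds for a new message.

```python
import math
from collections import Counter

# Hypothetical hand-labeled messages, invented purely for illustration.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting moved to monday", "safe"),
    ("lunch on monday", "safe"),
]

def train(messages):
    """Count words per label and how many messages carry each label."""
    word_counts = {"spam": Counter(), "safe": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts

def spam_probability(text, word_counts, label_counts):
    """Estimate P(spam | words), treating words as independent (naive Bayes)."""
    vocab = set(word_counts["spam"]) | set(word_counts["safe"])
    spam_total = sum(word_counts["spam"].values())
    safe_total = sum(word_counts["safe"].values())
    # Start from the prior odds: how common spam is before reading a word.
    log_odds = math.log(label_counts["spam"] / label_counts["safe"])
    for word in text.split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence here
        # Add-one smoothing keeps an unseen word/label pair from zeroing out.
        p_word_spam = (word_counts["spam"][word] + 1) / (spam_total + len(vocab))
        p_word_safe = (word_counts["safe"][word] + 1) / (safe_total + len(vocab))
        log_odds += math.log(p_word_spam / p_word_safe)
    # Convert log-odds back to a probability.
    return 1 / (1 + math.exp(-log_odds))

word_counts, label_counts = train(TRAINING)
spammy = spam_probability("free money", word_counts, label_counts)
friendly = spam_probability("lunch monday", word_counts, label_counts)
print(f"free money: {spammy:.2f}, lunch monday: {friendly:.2f}")
```

With this toy data, "free money" should land well above 50% and "lunch monday" well below it, which is all we really ask of the filter.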
Allow me to try to offer a simple breakdown of Bayes Rule.
So why did I name my son Bayes, you might be asking. Well, let me first explain to you how Bayes Rule works in simple terms.
First, we need an event. Let’s say the event we are considering is whether or not our favorite football team will win their game this weekend. That’s our event, and what we have in this event is a level of uncertainty, because there are multiple things that could happen. Your team could win, could lose, the game could be postponed, forfeited, etc.
Second, we need a probability target. This is called a prior, or prior probability. It is often used interchangeably with the term base rate. We could scour the web for all types of places to get a proper estimate. Oddsmakers set their lines based on stringent criteria. Or you could be a typical fan and look at how your team has fared so far this season, how your team has performed against the other team in other years, look at the rosters and the like, and come up with the chances you think your team will win.
Let’s say, for hypothetical purposes, you give your team a 75% chance of winning, which converts to 3:1 odds (75%/(1-75%) = 3).
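The probability-to-odds conversion is easy to sketch in a few lines of Python. The function names `prob_to_odds` and `odds_to_prob` are just illustrative names, not from any library.

```python
def prob_to_odds(p):
    """Convert a probability (0 <= p < 1) to odds in favor, e.g. 0.75 -> 3.0."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Convert odds in favor back to a probability, e.g. 3.0 -> 0.75."""
    return odds / (1 + odds)

odds = prob_to_odds(0.75)
print(odds)                # 3.0, i.e. 3:1 in favor of your team
print(odds_to_prob(odds))  # 0.75
```

Going back and forth between the two comes in handy, because Bayes Rule is often easiest to apply on the odds scale.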
Great. So the coin is tossed, your team wins the toss. Your team elects to kick. The other team picks an arbitrary side to start from which is virtually meaningless and would be better switched for what food should be served after the event.
So your team kicks. The opponent rumbles the kickoff to their own 40. OK, not ideal, but let’s GO DEFENSE. First and 10. The opponent completes a pass for 30 yards, now to your thirty. Then two long runs and your opponent is first and goal. WHAT ARE WE DOING! You are probably yelling. Two plays later, the team punches it in for six. Makes the extra point. 7-0 Bad Guys.
It’s OK. It’s fine. It’s just one drive. We get the ball, and it’s our turn.
The Bad Guys kick off to the Good Guys. The returner for the Good Guys fumbles the football! HOW DO YOU FUMBLE THE BALL ON YOUR OWN 24 YARD LINE! Not good….Not good.
But it’s OK. This is how defenses build character, right? Until three plays later and the Bad Guys score AGAIN! As the ball sails through the uprights, it is now 14-0 Bad Guys only two minutes into the first quarter.
Ouch. You still feeling good on that 75% chance guess? What about when your quarterback drives the ball to the 50, then throws an interception, and the Bad Guys return it for a touchdown? A PICK SIX! SERIOUSLY!
For what it is worth, if you ever watch any level of football, this is how Game Probabilities work. With every play, the probability shifts. If the score stays the same while the clock winds down, the numbers favor the team in the lead more than they did two minutes ago. If a 14-0 game goes to 14-14, we are going to see the game probability (GP) come back close to even.
All of this discussion about football gets us closer to closing out Bayes Rule. The third thing we need is a new observation. Or even a set of observations. It might confirm the probability target, or completely speak against it. Regardless, this is important, because once the calculation comes out of Bayes Rule, we get a more realistic probability that your event will occur.
The Bayes Rule calculation takes your initial event, your probability target, and your new observation, and generates that more realistic probability, often referred to as the posterior probability.
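Here is a rough sketch of how those three pieces fit together, using the odds form of Bayes Rule: posterior odds = prior odds × likelihood ratio. The 75% prior comes from the football example. The likelihood ratio of 0.1 is a number I made up purely for illustration; it assumes a start this ugly is ten times more likely in games your team goes on to lose than in games it goes on to win. The function name `update` is hypothetical too.

```python
def update(prior_prob, likelihood_ratio):
    """Odds-form Bayes update.

    likelihood_ratio = P(observation | event happens) / P(observation | it doesn't).
    """
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Prior: the 75% chance you gave your team before kickoff.
# Likelihood ratio: 0.1 is an invented value, not a real football statistic.
posterior = update(0.75, 0.1)
print(f"{posterior:.0%}")  # 23%
```

A likelihood ratio of 1 would leave your 75% untouched, because an observation that is equally likely either way tells you nothing.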
Bayes Rule - probability and conditional probability
So let’s first go through some examples of how Bayes Rule works. One of the most commonly used examples is the probability estimate that someone has a certain disease given a positive test.
Let’s call this hypothetical disease QC Disease, QC, of course, standing for Qualitative Chemistry. Those diagnosed with QC Disease are trapped forever in a career in Qualitative Chemistry.
So, let’s say your cousin goes in for a routine check-up, and the doctor says, there is a new disease floating around and it is simply awful. It’s called….QUALITATIVE CHEMISTRY DISEASE! Your cousin gasps. No, no way will I have QC Disease!
So the doc puts your cousin through this battery of tests, putting those weird colored molecular models in front of her, checking cognitive logic for dots behind letters connected to other letters with or without dots, etc. It’s a grueling process, but worth it if it shows that she does not, in fact, have QC Disease.
Your cousin is back in the waiting room. Knees buckled, mind racing, heart pounding, ears thrashing. I can’t be a qualitative chemist! I just can’t be!
Then the nurse calls her back. The doctor is waiting in Room 2. He doesn’t look happy. So your cousin says, “just give it to me straight, doc.”
The doc leans in, face white. That’s when your cousin knows. I can’t believe it. I am going to be a qualitative chemist.
The doctor confirms. The test came up positive for QC Disease. NOOOOOOOOOOOOOOOOOOOOOOO!
What follows is an hour of slow-mo reactions through the five stages of grief. The head shaking NO, IT CAN’T BE of denial. Then the anger, turned to rage, at which point the doctors are considering the tranq. But thankfully, your cousin gives in to bargaining before sinking deeply into depression. As the doc returns, acceptance sets in. This is a part of my life now. This IS my life now!
Until she remembers something that you told her. That beautiful something is called the base rate. “Doc, how common is QC disease?” she asks.
“QC only affects about 1% of the entire population,” the doc returns.
Then she remembers the other thing you told her, something that seemed random at the time but is highly relevant now. “Doc, how often does this test miss?” she asks. As in, what is the false negative rate, meaning the percentage of people who have the disease but are told by the test that they don’t?
The doc consults his notes. Turns out, the test misses 5% of the people who have QC disease, which means it catches the other 95%.
Your cousin smiles now, asking the next question. “Well, what about the people that don’t have the disease? How many test positive?” The false positive rate.
It might seem like the same question, but it is not, and I will get there. So, the doc consults his notes, and he says the rate is 10%.
She rushes out the door to call you. Run some calculations for me. 1% of everyone has QC disease, which means 99% of people don’t. When someone has the disease, the test misses it 5% of the time. And when someone doesn’t have the disease, the test says they do 10% of the time.
Take note here that we have something very important, and that is the condition, also known as the conditional probability.
So you do some quick math. There are 500,000 people in this town. If 1% have QC disease, that means 5,000 have the disease. Which also means 495,000 don’t.
Now to those that have the disease. 95% of those with the disease test positive for it. That means 4,750 people in that group.
On the other side, among the 495,000 people who don’t, 10% are going to be told they have it. That’s 49,500 people!
WOW! So only 5,000 people have this ridiculous disease. Yet they are telling nearly 50K people that don’t have it that they do, and are correct on only 4,750!
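If you want to check the head count yourself, here is a small Python sketch. The numbers are the ones from the story (a town of 500,000, a 1% base rate, 95% of people with the disease testing positive, and 10% of people without it testing positive anyway); the variable names are my own.

```python
population = 500_000
base_rate = 0.01            # share of the town with QC disease
sensitivity = 0.95          # share of people WITH the disease who test positive
false_positive_rate = 0.10  # share of people WITHOUT it who test positive

sick = round(population * base_rate)
healthy = population - sick
true_positives = round(sick * sensitivity)
false_positives = round(healthy * false_positive_rate)

print(sick, healthy)                     # 5000 495000
print(true_positives, false_positives)   # 4750 49500
```

Counting actual people like this, instead of juggling percentages, makes it much harder to forget how rare the disease is in the first place.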
This is important to consider. First, tests are flawed. They miss on both ends. Second, the false positive end makes this whole thing ridiculous, and really skews this whole thing pretty big time. Not to mention, how could we ignore the base rate, and how rare this thing really is?
So you do the calculation. What are the chances your cousin is doomed to a life of Qualitative Chemistry? Is it 95%, like the test suggested?
No. The answer, according to Bayes Rule, is about 9%. The math goes like this:
We take those that have the disease and tested positive, which is 4,750 people. That’s the numerator (the one on top of the division sign).
Then, for the denominator, we have to add that 4,750 to those that did not have the disease but tested positive anyway, which is 49,500, and we get 54,250.
We finish the equation like so: 4,750/(4,750 + 49,500) = ~9%.
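The same calculation can be written with rates alone, which shows that the size of the town never actually mattered. This is a minimal sketch; the function name `posterior_given_positive` and its parameter names are mine, and the inputs are the 1% base rate, the 95% catch rate, and the 10% false positive rate from the story.

```python
def posterior_given_positive(base_rate, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes Rule."""
    true_pos = base_rate * sensitivity                 # sick AND test positive
    false_pos = (1 - base_rate) * false_positive_rate  # healthy AND test positive
    return true_pos / (true_pos + false_pos)

p = posterior_given_positive(0.01, 0.95, 0.10)
print(f"{p:.1%}")  # 8.8%, i.e. about 9%
```

Try bumping the base rate up to 50% and the same positive test becomes far more worrying, which is the whole point of not ignoring the base rate.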