Bayes' Theorem in Plain Language with a Simple Numerical Example
Let's break down Bayes' Theorem in plain language with a simple numerical example.

---

What is Bayes' Theorem?
Imagine you have a certain belief about something, like "It might rain today." Then, you get some new information, like "I see dark clouds." Bayes' Theorem is a mathematical way to update your initial belief (your prior) based on this new evidence to get a new, improved belief (your posterior). In simpler terms, it answers the question: "How much should I change my mind about something given new evidence?"
The Formula (don't worry, we'll explain it!):
$P(H|E) = \frac{P(E|H) \times P(H)}{P(E)}$
Let's unpack each part:

* \(P(H|E)\) (pronounced "P of H given E"): This is your **Posterior Probability.**
    * This is what you want to find! It's the probability that your Hypothesis (H) is true given the new Evidence (E) you've observed. This is your updated belief.
* \(P(H)\) (pronounced "P of H"): This is your **Prior Probability.**
    * This is your initial belief, or the general probability that your Hypothesis (H) is true before you see any new evidence.
* \(P(E|H)\) (pronounced "P of E given H"): This is the **Likelihood.**
    * This tells you how likely it is to observe the Evidence (E) if your Hypothesis (H) were actually true. The evidence supports your hypothesis when it is more likely under H than under "not H".
* \(P(E)\) (pronounced "P of E"): This is the **Evidence (also called the Marginal Likelihood).**
    * This is the overall probability of observing the Evidence (E), regardless of whether your hypothesis is true or not. It acts as a normalizing factor, ensuring your probabilities add up correctly. We often calculate it with the law of total probability:
    * \(P(E) = P(E|H)P(H) + P(E|\text{not } H)P(\text{not } H)\)
    * (This means the probability of seeing the evidence is the probability of seeing it if H is true, weighted by how likely H is, plus the probability of seeing it if H is NOT true, weighted by how likely "not H" is.)

---

### Plain English Summary:

Bayes' Theorem says: "The updated probability of my idea being true (after seeing evidence) is proportional to how likely the evidence would be if my idea were true, multiplied by my initial belief in the idea." And then you divide by the overall probability of seeing that evidence to keep everything properly scaled.
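To make the formula concrete, here is a minimal Python sketch of it, with \(P(E)\) expanded via the law of total probability. The function name `bayes_posterior` and its parameter names are illustrative choices, and the rain numbers are invented purely for demonstration.

```python
def bayes_posterior(prior, likelihood, likelihood_if_false):
    """Return P(H|E) given P(H), P(E|H), and P(E|not H)."""
    # Law of total probability: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)
    if evidence == 0:
        raise ValueError("P(E) is zero: the evidence cannot occur with these inputs")
    # Bayes' Theorem: P(H|E) = P(E|H)P(H) / P(E)
    return likelihood * prior / evidence


# Made-up numbers for the rain example (assumptions, not real weather data):
# P(rain) = 0.30, P(dark clouds | rain) = 0.90, P(dark clouds | no rain) = 0.20
p_rain_given_clouds = bayes_posterior(
    prior=0.30, likelihood=0.90, likelihood_if_false=0.20
)
print(f"P(rain | dark clouds) = {p_rain_given_clouds:.3f}")  # 0.659
```

With these assumed inputs, seeing dark clouds moves the belief in rain from 30% to about 66%.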
Numerical Demo: A Rare Disease Test

Let's imagine a scenario:

* A rare disease affects 1 in 1,000 people in the general population.
* There's a test for this disease.
* If a person has the disease, the test is positive 99% of the time (it's quite accurate).
* If a person does not have the disease, the test is still positive 5% of the time (a false positive).

You take the test, and it comes back positive. How likely is it that you actually have the disease?

**Step 1: Define our Hypothesis (H) and Evidence (E)**

* H (Hypothesis): You have the disease.
* E (Evidence): Your test result is positive.

**Step 2: List the Known Probabilities**

* \(P(H)\) (Prior Probability): The probability of having the disease before you take the test.
    * \(P(H) = 1/1000 = \textbf{0.001}\)
* \(P(\text{not } H)\) (Probability of NOT having the disease):
    * \(P(\text{not } H) = 1 - P(H) = 1 - 0.001 = \textbf{0.999}\)
* \(P(E|H)\) (Likelihood): The probability of a positive test given that you actually have the disease. (This is the test's sensitivity.)
    * \(P(E|H) = \textbf{0.99}\)
* \(P(E|\text{not } H)\) (False-positive rate): The probability of a positive test given that you do NOT have the disease.
    * \(P(E|\text{not } H) = \textbf{0.05}\)
**Step 3: Calculate \(P(E)\) (The overall probability of a positive test)**

We need to consider two ways you could get a positive test:

1. You have the disease AND the test is positive (\(P(E|H) \times P(H)\))
2. You DON'T have the disease AND the test is positive (\(P(E|\text{not } H) \times P(\text{not } H)\))

\(P(E) = [P(E|H) \times P(H)] + [P(E|\text{not } H) \times P(\text{not } H)]\)

\(P(E) = (0.99 \times 0.001) + (0.05 \times 0.999)\)

\(P(E) = 0.00099 + 0.04995\)

\(P(E) = \textbf{0.05094}\)
So, about 5.094% of all people tested will get a positive result, whether they have the disease or not.
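The same figure can be checked by counting people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 100,000 people, a size chosen only so that every count is a whole number.

```python
# Assumed cohort size (hypothetical; picked so all counts are integers).
population = 100_000

sick = population // 1_000             # 1 in 1,000 have the disease -> 100
healthy = population - sick            # everyone else -> 99,900

true_positives = sick * 99 // 100      # 99% of sick people test positive -> 99
false_positives = healthy * 5 // 100   # 5% of healthy people test positive -> 4,995

all_positives = true_positives + false_positives
print(all_positives)                   # 5094
print(all_positives / population)      # 0.05094
```

Out of 100,000 people, 5,094 test positive, which matches \(P(E) = 0.05094\).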
**Step 4: Apply Bayes' Theorem to find \(P(H|E)\)**

Now we have all the pieces:

\(P(H|E) = \frac{P(E|H) \times P(H)}{P(E)}\)

\(P(H|E) = \frac{0.99 \times 0.001}{0.05094}\)

\(P(H|E) = \frac{0.00099}{0.05094}\)

\(P(H|E) \approx \textbf{0.0194}\)
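As a check on the arithmetic, the calculation can be repeated with exact fractions so that no rounding creeps in. This is a small sketch using Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

prior = Fraction(1, 1000)               # P(H)
sensitivity = Fraction(99, 100)         # P(E|H)
false_positive_rate = Fraction(5, 100)  # P(E|not H)

# P(E) by the law of total probability
evidence = sensitivity * prior + false_positive_rate * (1 - prior)

# P(H|E) by Bayes' Theorem
posterior = sensitivity * prior / evidence

print(evidence)                   # 2547/50000
print(posterior)                  # 11/566
print(f"{float(posterior):.4f}")  # 0.0194
```

The exact posterior is 11/566, which rounds to the 0.0194 found above.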
Interpretation:

If your test result is positive, the probability that you actually have the disease is about 0.0194, or roughly 1.94%.

This might seem surprisingly low! Even though the test catches 99% of true cases, the disease is very rare (a 0.1% chance initially), so a positive result doesn't mean you almost certainly have it. The 5% false-positive rate applies to the much larger healthy population, so false positives outnumber true positives by roughly 50 to 1.

Bayes' Theorem helps us see how our initial low belief (0.1% chance of having the disease) gets updated by the evidence (a positive test) to a belief about 19 times higher, but still low in absolute terms (1.94% chance).
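To see how strongly the result depends on how rare the disease is, the sketch below reruns the same calculation for a few prevalences. Only the 0.001 prior comes from the scenario; the other priors are hypothetical values added for comparison, and the function name is an illustrative choice.

```python
def posterior_after_positive(prior, sensitivity=0.99, false_positive_rate=0.05):
    """P(disease | positive test) for a given prevalence (prior)."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# 0.001 is the prevalence from the example; the rest are hypothetical.
for prior in (0.001, 0.01, 0.1, 0.5):
    print(f"prior = {prior:<5}  posterior = {posterior_after_positive(prior):.4f}")
```

With the same test, the posterior climbs from about 1.9% at a prevalence of 1 in 1,000 to about 16.7% at 1 in 100, 68.8% at 1 in 10, and 95.2% at 1 in 2.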