Intuition vs. mathematical facts

November 14, 2023 7 minutes reading time statistics

I will be discussing the effectiveness of tests for detecting infections. Think of antigen tests for detecting COVID-19 infections. Antigen tests are a rapid testing method used to detect specific proteins from the virus, providing results within minutes. Although they offer quick and scalable testing solutions, their accuracy and reliability in mass screenings like those conducted in schools during the pandemic may not be as clear-cut as one might assume.

Base rates and conditional probability

Let’s begin with some basics. The base rate is the prevalence of a condition in a population before any additional evidence is considered. For this example, I arbitrarily assume an infection rate of 0.5%. Conditional probability is the probability that an event occurs given that another event has already occurred, which is crucial for interpreting test results.

Sensitivity and specificity

Sensitivity refers to the effectiveness of a test in accurately identifying individuals who have a specific disease. For the test in question, its sensitivity is 97%. This implies that out of 100 people who are infected, the test will correctly identify 97 as having the disease. This 97% figure represents a conditional probability, calculated under the presumption that the person being tested is indeed infected. The mathematical expression for this is $p(\mathit{test positive} \mid \mathit{infected})$.

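Specificity, on the other hand, tells us how well the test identifies those without the disease. For our test, it is 95%: out of 100 non-infected individuals, 95 will correctly test negative and the remaining 5 will falsely test positive. Like sensitivity, this is a conditional probability, namely $p(\mathit{test negative} \mid \mathit{not infected})$.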
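As a minimal sketch, both definitions can be written as ratios of counts. The function names are my own and purely illustrative; the counts are the 100 infected and 100 non-infected people used in the examples above.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of infected people that the test correctly flags as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of non-infected people that the test correctly reports as negative."""
    return true_negatives / (true_negatives + false_positives)


# Illustrative counts from the text: 100 infected and 100 non-infected people.
print(f"sensitivity: {sensitivity(true_positives=97, false_negatives=3):.0%}")
print(f"specificity: {specificity(true_negatives=95, false_positives=5):.0%}")
```

Note that both ratios are computed within one group only (infected or non-infected), so neither says anything yet about how large those groups are in the population.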

Superficially, these numbers seem to indicate that results with this specific test are highly accurate. I will prove this assumption wrong by using probability trees and Bayes’ theorem to highlight the high potential for false positives.

Probability tree diagram

To visualize these concepts, we can draw a probability tree diagram. It first splits the population into infected and not infected, and then splits each group into test positive and test negative. Multiplying the probabilities along a path gives the probability of that combined outcome, which is shown at the bottom of the diagram. This helps us see the proportion of each outcome in a simple, visual form.


%%{ init: {'theme': 'forest'} }%%
flowchart TD
A[Start] --> |"p(infected) =\n0.5%"| B(infected)
A --> |"p(not infected) =\n99.5%"| C(not infected)
B --> |"p(test positive | infected) =\n97%"| D["true positive\n(sensitivity)"]
B --> |"p(test negative | infected) =\n3%"| E[false negative]
C --> |"p(test positive | not infected) =\n5%"| F[false positive]
C --> |"p(test negative | not infected) =\n95%"| G["true negative\n(specificity)"]
D -.- |"p(infected ∩ test positive)"| H[0.485%]
E -.- |"p(infected ∩ test negative)"| I[0.015%]
F -.- |"p(not infected ∩ test positive)"| J[4.975%]
G -.- |"p(not infected ∩ test negative)"| K[94.525%]

style D fill:#b00,stroke:#000,stroke-width:2px,color:#fff
style G fill:#0b0,stroke:#000,stroke-width:2px,color:#fff
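
The multiplication along each path of the tree can be sketched in a few lines of Python. This is only an illustration under the article's assumed values (0.5% base rate, 97% sensitivity, 95% specificity); the function and key names are hypothetical.

```python
# Values from the text: 0.5% base rate, 97% sensitivity, 95% specificity.
P_INFECTED = 0.005
SENSITIVITY = 0.97   # p(test positive | infected)
SPECIFICITY = 0.95   # p(test negative | not infected)


def joint_probabilities(p_infected, sensitivity, specificity):
    """Multiply along each path of the tree to get the four joint probabilities."""
    p_not_infected = 1 - p_infected
    return {
        "true_positive": p_infected * sensitivity,
        "false_negative": p_infected * (1 - sensitivity),
        "false_positive": p_not_infected * (1 - specificity),
        "true_negative": p_not_infected * specificity,
    }


joint = joint_probabilities(P_INFECTED, SENSITIVITY, SPECIFICITY)
for outcome, p in joint.items():
    print(f"{outcome:>15}: {p:.3%}")
```

The four printed values should match the leaves of the diagram, and because the four paths cover every possible outcome, they add up to 100%.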



Contingency table

Moving on to the contingency table, we can organize our outcomes into four categories: true positives, false positives, true negatives, and false negatives, with their respective probabilities. Additionally, we sum up the rows and columns to get the total probabilities. This table helps in understanding the distribution and frequency of each result type.

The generic form of the contingency table looks like this:

| | $B$ | $\bar{B}$ | total |
|---|---|---|---|
| $A$ | $p(A \cap B)$ | $p(A \cap \bar{B})$ | $p(A)$ |
| $\bar{A}$ | $p(\bar{A} \cap B)$ | $p(\bar{A} \cap \bar{B})$ | $p(\bar{A})$ |
| total | $p(B)$ | $p(\bar{B})$ | $1$ |

If we use the following substitutions and the four probabilities from the diagram above, we can construct our final contingency table:

- $A$ = infected
- $\bar{A}$ = not infected
- $B$ = test positive
- $\bar{B}$ = test negative

| | test positive | test negative | total |
|---|---|---|---|
| infected | 0.485% | 0.015% | 0.5% |
| not infected | 4.975% | 94.525% | 99.5% |
| total | 5.46% | 94.54% | 100% |

Reverse tree diagram

In the previous sections, we calculated the overall probabilities for getting a positive or negative test result. Now, we’ll use these probabilities as starting points for our reverse tree diagram. This diagram helps us understand the likelihood of someone being actually sick or not after getting a test result.

Here’s how it works: we begin with the total percentage of positive and negative test results that we computed earlier. Then, we use these percentages to figure out two things: how many of these results are likely to come from people who are actually infected and how many are from those who are not. This step helps us untangle the probabilities and see the real chances of being sick or not after a positive or negative test. It’s like backtracking from the test result to the actual condition of the person.

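The backtracking step amounts to dividing each cell of the contingency table by its column total. Here is a rough sketch of that idea in Python, using the four joint probabilities from the table as fractions; the dictionary layout and function names are assumptions made for illustration.

```python
# Joint probabilities from the contingency table (as fractions, not percent).
joint = {
    ("infected", "positive"): 0.00485,
    ("infected", "negative"): 0.00015,
    ("not infected", "positive"): 0.04975,
    ("not infected", "negative"): 0.94525,
}


def marginal_by_result(joint):
    """Sum the joint probabilities column-wise to get p(test result)."""
    totals = {}
    for (_, result), p in joint.items():
        totals[result] = totals.get(result, 0.0) + p
    return totals


def condition_on_result(joint):
    """Divide each cell by its column total to get p(status | test result)."""
    totals = marginal_by_result(joint)
    return {(status, result): p / totals[result]
            for (status, result), p in joint.items()}


reverse = condition_on_result(joint)
for (status, result), p in reverse.items():
    print(f"p({status} | test {result}) = {p:.3%}")
```

The column totals are the first-level branches of the reverse tree, and the conditional probabilities are its second-level branches.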
%%{ init: {'theme': 'forest'} }%%
flowchart TD
A[Start] --> |"p(test positive) =\n5.46%"| B(positive)
A --> |"p(test negative) =\n94.54%"| C(negative)
B --> |"p(infected | test positive) =\n8.883%"| D[true positive]
B --> |"p(not infected | test positive) =\n91.117%"| F[false positive]
C --> |"p(infected | test negative) =\n0.016%"| E[false negative]
C --> |"p(not infected | test negative) =\n99.984%"| G[true negative]
D -.- |"p(test positive ∩ infected)"| H[0.485%]
E -.- |"p(test negative ∩ infected)"| I[0.015%]
F -.- |"p(test positive ∩ not infected)"| J[4.975%]
G -.- |"p(test negative ∩ not infected)"| K[94.525%]

style F fill:#b00,stroke:#000,stroke-width:2px,color:#fff

As you can see, of those who received a positive test result, only 8.883% are actually infected, while 91.117% are not. That’s a lot of false positives.

Bayes’ theorem calculation

As an alternative, we can use Bayes’ theorem to calculate the probability that an individual who tested positive is actually not infected. This lets you skip the reverse tree diagram and calculate $p(\mathit{not infected} \mid \mathit{test positive})$ directly using the formula below. The formula tells us to multiply the probability of testing positive when not infected by the overall probability of being not infected, and to divide the product by the overall probability of testing positive.

$$p(\mathit{not infected} \mid \mathit{test positive}) = \frac{p(\mathit{test positive} \mid \mathit{not infected}) \cdot p(\mathit{not infected})}{p(\mathit{test positive})}$$

If we take the values from the first tree diagram and the table above, we get:

$$p(\mathit{not infected} \mid \mathit{test positive}) = \frac{0.05 \cdot 0.995}{0.0546} \approx 0.91117 = 91.117\%$$
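
For completeness, here is a worked version of the complementary case. The denominator can be expanded with the law of total probability, so only the base rate, the sensitivity, and the specificity stated earlier are needed:

$$p(\mathit{test positive}) = p(\mathit{test positive} \mid \mathit{infected}) \cdot p(\mathit{infected}) + p(\mathit{test positive} \mid \mathit{not infected}) \cdot p(\mathit{not infected}) = 0.97 \cdot 0.005 + 0.05 \cdot 0.995 = 0.0546$$

$$p(\mathit{infected} \mid \mathit{test positive}) = \frac{p(\mathit{test positive} \mid \mathit{infected}) \cdot p(\mathit{infected})}{p(\mathit{test positive})} = \frac{0.97 \cdot 0.005}{0.0546} \approx 0.08883 = 8.883\%$$

The two conditional probabilities for a positive result add up to 100%, as they must.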

The reverse tree diagram and the application of Bayes’ theorem both lead to an interesting conclusion. Despite high sensitivity and specificity in testing, the probability of an individual not being infected after a positive test result can be surprisingly high. This situation exemplifies the base rate fallacy, a common error where the base rate (prevalence of disease) is frequently ignored in the interpretation of diagnostic test results.
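To see how strongly the result depends on the base rate, the calculation can be repeated for several prevalence values. The sketch below keeps the article's 97% sensitivity and 95% specificity fixed; every base rate other than 0.5% is a hypothetical value chosen only for illustration, and the function name is my own.

```python
def p_infected_given_positive(prevalence, sensitivity=0.97, specificity=0.95):
    """Bayes' theorem: probability of infection after a positive test."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical base rates; only 0.5% is used in the article itself.
for prevalence in (0.001, 0.005, 0.05, 0.20, 0.50):
    ppv = p_infected_given_positive(prevalence)
    print(f"base rate {prevalence:6.1%} -> p(infected | test positive) = {ppv:6.1%}")
```

With the same test, a positive result means an infection in roughly 9% of cases at a 0.5% base rate, but in roughly 95% of cases at a 50% base rate. The test did not change; only the population did.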

Societal impact and public health policy implications

While antigen tests are useful, their effectiveness as a mass screening tool is brought into question by the potential for high rates of false positives relative to the actual infection rate. The implications of these statistical misunderstandings extend far beyond individual cases of misdiagnosis. False positives on a large scale can lead to significant public health and economic consequences. Unnecessary quarantines based on incorrect test results can strain public health resources, disrupt educational and workplace environments, and contribute to public anxiety and mistrust in health systems.
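To get a feeling for the scale, the probabilities can be translated into expected head counts. This is a rough sketch for a single screening round; the population of 10,000 people is a made-up number, and the function name is hypothetical.

```python
def expected_screening_counts(population, prevalence=0.005,
                              sensitivity=0.97, specificity=0.95):
    """Expected number of people in each outcome group for one screening round."""
    infected = population * prevalence
    not_infected = population - infected
    return {
        "true_positives": infected * sensitivity,
        "false_negatives": infected * (1 - sensitivity),
        "false_positives": not_infected * (1 - specificity),
        "true_negatives": not_infected * specificity,
    }


# Hypothetical screening of 10,000 people (the population size is an assumption).
counts = expected_screening_counts(10_000)
for group, n in counts.items():
    print(f"{group:>16}: {n:8.1f}")
```

Under these assumptions, about 546 people would test positive, and roughly 498 of them would be sent into quarantine without being infected.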

Furthermore, the reliance on such tests without a proper understanding of their limitations can lead to complacency in other crucial public health measures. It underscores the importance of comprehensive public health strategies that incorporate multiple layers of testing, tracing, and preventive measures.

In light of these findings, it’s crucial for policymakers and health authorities to be guided not just by the apparent accuracy of these tests but also by an understanding of their limitations in the context of prevailing infection rates. This calls for a nuanced approach to mass screening, where antigen tests are used judiciously and in combination with other diagnostic methods, like PCR tests, especially when initial test results are positive.

Educating the public about the nature of these tests and their interpretation is also vital. It is essential to foster a broader understanding that a positive result in a low-prevalence setting does not equate to a definitive diagnosis. This awareness can play a significant role in managing public expectations and responses during health crises.

By addressing these issues, we can enhance the effectiveness of our public health response, reduce unnecessary burdens on individuals and society, and build a more resilient healthcare system capable of facing future challenges.

Relying on intuition and gut feeling is a double-edged sword. Sometimes, it can be a powerful tool, but it can also lead to disastrous results. If there are tools to double-check your intuition, by all means, use them. Most of the time, they exist.