There’s a lot of talk about various tests for Covid-19 and it can get confusing. Discussions of whether tests are useful throw about terms like sensitivity and specificity that often don’t mean what people assume they do. In this post, I will start by defining some important terms and then explain how using them helps us understand the effectiveness of various tests and testing strategies. I will also explain why the prevalence of Covid-19 in the population being tested is such an important factor.
When we test someone for Covid-19, the result can either be positive (test says person has Covid-19) or negative (test says person doesn’t have Covid-19). “Positive” and “negative” just describe the test result, not whether it is correct. If the test correctly says someone has Covid-19, it’s a true positive. If it correctly says the person doesn’t have Covid-19, it’s a true negative.
Please note that while I use Covid-19 testing as the example here, the principles described apply to tests of any type for any disease. All medical tests have sensitivity and specificity, and the number of false vs. true positives and negatives for any test varies the same way described below.
Unfortunately, tests can give wrong results for a variety of reasons. For example, when someone who has Covid-19 is tested using a swab, the swab may not hit a spot in the nose or throat that has virus on it, so the result says they don’t have Covid-19. On the other hand, a swab that doesn’t have virus on it may still produce a result that says it does.
It’s important to distinguish between these two types of errors:
- A false positive test says a person has Covid-19 when they really don’t
- A false negative test says a person doesn’t have Covid-19 when they really do
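This table shows the possible combinations:

| | Person has Covid-19 | Person doesn’t have Covid-19 |
| --- | --- | --- |
| Test is positive | True positive | False positive |
| Test is negative | False negative | True negative |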
For a test to be useful, it needs to have most results be either true negatives or true positives. Whether that happens depends on three things:
- Sensitivity is the proportion of people with Covid-19 who the test correctly identifies as having it.
Sensitivity = the probability of getting a positive test if someone has Covid-19
- Specificity is the proportion of people without Covid-19 who the test correctly identifies as not having Covid-19
Specificity = the probability of getting a negative test if someone doesn’t have Covid-19
- Prevalence is the proportion of the population that has Covid-19
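To make these three definitions concrete, here is a small Python sketch that works each one out from the four possible outcomes counted up for a group of tested people. The function name and the counts are made up purely for illustration; they don’t come from any real test.

```python
def summarize(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity, prevalence) from the four outcome counts."""
    have_it = true_pos + false_neg        # everyone who really has Covid-19
    dont_have_it = false_pos + true_neg   # everyone who really doesn't
    sensitivity = true_pos / have_it      # share of people with it who test positive
    specificity = true_neg / dont_have_it # share of people without it who test negative
    prevalence = have_it / (have_it + dont_have_it)
    return sensitivity, specificity, prevalence


# Hypothetical counts: 1,000 people tested, 100 of whom really have Covid-19.
sens, spec, prev = summarize(true_pos=90, false_neg=10, false_pos=45, true_neg=855)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}, prevalence={prev:.0%}")
```

Notice that sensitivity only looks at people who have the disease and specificity only looks at people who don’t, which is why neither one, on its own, tells you how often a positive result is right.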
Here’s a non-medical example of why prevalence is so important in determining how many errors we’ll end up with. Imagine you have a baseball umpire who calls 95% of balls (bad pitches) correctly; his sensitivity is 95%. He also calls 95% of good pitches correctly but miscalls 5% of them as balls (his specificity is 95%). If an amazing pitcher throws 1 ball and 99 good pitches (the prevalence of bad pitches is 1%), this ump is likely to call 6 balls overall (the 1 true ball plus about 5 good pitches). Of these, 5 will be wrong calls. So about 83% of the pitches he calls balls will actually be good pitches.
Now let’s take a case where the umpire is calling a lousy pitcher who throws balls half the time (the prevalence of bad pitches is 50%). In a 100-pitch game he’ll correctly call 47 or 48 of the 50 balls and incorrectly call 2 or 3 of the 50 good pitches as balls. In this game, 95% of his ball calls will be correct.
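If you would like to check the umpire arithmetic yourself, here is a minimal Python sketch. The function name is my own invention, and it works with expected (average) counts, so it reports fractions of a pitch rather than whole pitches.

```python
def wrong_ball_calls(pitches, share_bad, sensitivity=0.95, specificity=0.95):
    """Expected (number of balls called, share of those calls that are wrong)."""
    bad = pitches * share_bad
    good = pitches - bad
    correct_calls = bad * sensitivity        # real balls that get called balls
    wrong_calls = good * (1 - specificity)   # good pitches that get called balls
    called = correct_calls + wrong_calls
    return called, wrong_calls / called


for label, share_bad in [("amazing pitcher", 0.01), ("lousy pitcher", 0.50)]:
    called, share_wrong = wrong_ball_calls(100, share_bad)
    print(f"{label}: about {called:.1f} balls called, {share_wrong:.0%} of them wrong")
```

For the amazing pitcher the share of wrong calls comes out near 84% rather than 83%. The small gap is only because the example in the text rounds to whole pitches (5 wrong calls out of 6).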
Most of the time that people are tested for a disease, it’s because a medical practitioner suspects they may have it. If the test has 95% specificity and sensitivity, half the people who are tested have it, and we test 200 people, most of the positive tests will be people with the disease. We’ll expect about 95 of the 100 people with the disease and 5 of the 100 people who don’t have it to test positive, in which case 5 of 100 positives will be false positives (only 5%). In the picture below, the pink diamonds represent people who have Covid-19 but test negative (false negatives) while the dark circles represent people who don’t have Covid-19 but test positive (false positives). The picture below that shows all the positive tests – you can see that there are 95 true positives and 5 false positives.
However, if we decide to test a random sample of people in the population for the disease and only 10% of the population has it (prevalence=10%), we’re going to end up with a lot of false positives. When we test 200 people and only 20 of them have the disease, we’ll end up detecting 19 out of 20 (95%) of them, but we’ll also incorrectly diagnose 9 of the 180 (5%) people without the disease. That means that 9 out of 28 diagnoses (about one third) will be wrong.
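The two scenarios above (200 people tested, with half of them or only 10% of them having the disease) can be laid out side by side with a short Python sketch. The function name is hypothetical, and the numbers are expected averages, so a real group of 200 people would vary a bit around them.

```python
def expected_outcomes(people, prevalence, sensitivity=0.95, specificity=0.95):
    """Expected number of people in each of the four outcome groups."""
    sick = people * prevalence
    healthy = people - sick
    return {
        "true positives": sick * sensitivity,
        "false negatives": sick * (1 - sensitivity),
        "false positives": healthy * (1 - specificity),
        "true negatives": healthy * specificity,
    }


for prevalence in (0.50, 0.10):
    counts = expected_outcomes(200, prevalence)
    positives = counts["true positives"] + counts["false positives"]
    share_false = counts["false positives"] / positives
    rounded = {name: round(value, 1) for name, value in counts.items()}
    print(f"prevalence {prevalence:.0%}: {rounded}")
    print(f"  share of positive tests that are false: {share_false:.0%}")
```

The test itself is identical in both runs. The only thing that changes is how many of the people being tested actually have the disease, and that alone moves the share of false positives from about 5% to about a third.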
This is the kind of situation epidemiologist Zachary Binney discusses in his Twitter thread on the new COVID-19 antibody test from Cellex. The test has sensitivity of 93.8% and specificity of 95.6%. If we use it to test a bunch of people chosen randomly and only 5% have had COVID-19 and developed antibodies, a positive test will only be right about half the time. If 30% were infected, a positive test will be right about 90% of the time.
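Here is a rough Python sketch of the calculation behind those two figures. It uses the sensitivity and specificity quoted above and the same reasoning as the earlier examples; the function name is mine, not something from Binney’s thread. Statisticians call the quantity it computes the positive predictive value.

```python
def chance_positive_is_right(prevalence, sensitivity, specificity):
    """Probability that a person with a positive test really had the disease."""
    true_pos = prevalence * sensitivity               # infected and test positive
    false_pos = (1 - prevalence) * (1 - specificity)  # not infected but test positive
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity as quoted above for the Cellex antibody test.
for prevalence in (0.05, 0.30):
    ppv = chance_positive_is_right(prevalence, sensitivity=0.938, specificity=0.956)
    print(f"prevalence {prevalence:.0%}: a positive test is right {ppv:.0%} of the time")
```

You can plug in other prevalence values to see how quickly a positive result becomes more trustworthy as more of the tested group has been infected.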
As Binney discusses, if we’re trying to find out what proportion of the population has been infected, epidemiologists have methods to correct for this problem, so the test will still be useful for getting that information. Also, if we’re testing a group who are highly likely to have caught COVID-19, such as health care providers, then we’ll likely be correct more of the time, since this will be like the umpire calling balls on the lousy pitcher. And if we have a second test that works a little differently (so both tests don’t tend to be wrong on the same people), we can screen people with the first test and then give them the second test to confirm that the first one was right.
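To see why a confirming second test helps so much, here is a minimal Python sketch under two loud assumptions of mine: the second test is hypothetical and happens to have the same sensitivity and specificity as the first, and the two tests make their mistakes independently of each other. Real tests may not meet either assumption, so treat the output as an illustration only.

```python
def update_after_positive(prior, sensitivity, specificity):
    """Chance a person was infected after one more positive test (Bayes' rule)."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


prevalence = 0.05  # the low-prevalence scenario discussed above
after_first = update_after_positive(prevalence, sensitivity=0.938, specificity=0.956)
# Hypothetical second test: same accuracy as the first, errors assumed independent.
after_second = update_after_positive(after_first, sensitivity=0.938, specificity=0.956)
print(f"after one positive: {after_first:.0%}; after two positives: {after_second:.0%}")
```

The trick is that the people who reach the second test are no longer a random sample. About half of them really were infected, so the second test is working in a high-prevalence group, where a positive result is much more reliable.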
I hope this has been helpful for better understanding some of the issues in doing widespread testing for COVID-19. Don’t be discouraged if you find you need to read this a couple of times before fully grasping how it works; that happens to most of us learning these concepts for the first time. Please let me know if you have any questions or catch any errors.
Huge thanks to Katherine Boothby for the wonderful graphics and very helpful editorial suggestions.