Russell’s teapot, or is serology suitable for historical reconstructions of COVID-19?

Jan 21, 2021

“SARS-CoV-2 was circulating outside of China back in 2019”. Sound familiar? One of the most recent claims of this sort came about three weeks ago from Italy. A joint team from Istituto Nazionale Tumori (Milan), the University of Siena, and the company VisMederi Srl re-analyzed blood samples left over from a cancer-related project with around 1000 participants. Apolone and co-authors reported that up to 15% of the samples might have contained antibodies to the receptor-binding domain (RBD) of SARS-CoV-2 as early as September 2019. Can this be true, and if not, how did the result emerge? Let’s start from the famous article by John Ioannidis, “Why Most Published Research Findings Are False”.

A quick answer to Ioannidis’ “why” is “because we test too many cases under weaker significance criteria, while negative cases strongly dominate the population”. In such situations, cases that test positive are in fact mostly false positives. The argument rests on four quantities:

True positive rate (TPR), or sensitivity: the percentage of SARS-CoV-2-exposed cases correctly declared positive.
Specificity: the percentage of SARS-CoV-2-unexposed cases correctly declared negative.
False positive rate (FPR): the complement of the above, i.e. 1 − specificity.
Prevalence: the fraction of the population actually exposed to SARS-CoV-2.
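These four quantities combine, via Bayes’ rule, into the share of positive results that are true positives (the positive predictive value). Below is a minimal Python sketch. The function name `positive_predictive_value` and the input numbers (85% sensitivity, 90% specificity, 0.2% prevalence) are illustrative assumptions, not figures reported by any of the studies discussed here.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence                    # exposed and flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # unexposed but flagged
    return true_pos / (true_pos + false_pos)


# Illustrative, assumed inputs: a test with a 10% false positive rate
# applied to a population in which 0.2% have actually been exposed.
ppv = positive_predictive_value(sensitivity=0.85, specificity=0.90, prevalence=0.002)
print(f"Share of positives that are real: {ppv:.1%}")
```

When prevalence is far below the false positive rate, as in this assumed example, only a small minority of positive results are real.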

In case of PCR-based tests that detect viral RNA in swabs, “exposed” denotes “currently infected”. For serologic tests on blood samples, “exposed” means “infected a while ago and bearing antibodies”. Naturally, the latter figure should be growing from month to month.

It is by now well understood that serologic tests tolerate rather large false positive rates. This is deliberate, so as not to miss too many true cases. The numeric consequences, however, are less well appreciated, and the “pre-pandemic discovery” reports provide a handy example of what exactly happens. The first oddity in the Italian data was the high rate of SARS-CoV-2 seropositivity reported for the very first month, September 2019. Paradoxically, the following months show no notable increase:

Figure 1 (from Apolone et al., 2020). Frequency of immunoglobulin M (red columns) and immunoglobulin G (blue columns) receptor-binding domain (RBD)–positive cases in respect to the total number of screening participants (green columns) throughout the 24 weeks from September 2019 to February 2020.

This is quite counterintuitive after seeing all those empirical curves of exponential growth during Spring 2020.

I tried to model the counts reported by Apolone and colleagues under two alternative scenarios: the hypothetical “1) COVID-19 already in September 2019” and the commonly accepted “2) No COVID-19 until January 2020”. In this example, the sample size was set to 1000, which is close to the total N of the SMILE sub-cohort used (although their serum samples were split into monthly enrollment subsets of 10…30% of N). We also have to model COVID-19 prevalence in the Italian population realistically. For scenario (2), available real-world figures could be used, whereas for (1) we have to hypothesize somewhat, allowing for a non-zero number of infected persons already in September. Both series should then converge to the same level by March 2020. I achieved acceptable convergence by setting the doubling time (the interval over which prevalence doubles) to 2 months in scenario (1) and to 1/2 month in (2). In a population of N=1000, the monthly counts of persons who have had a COVID-19 infection would then be:

Number* of true seropositives:
1) COVID-19 already in September, 2019: [0 0 0 0 0 0 2 7]
2) No COVID-19 until January, 2020: [0 0 0 0 0 4 16 64]

Number* of reported positives by the RBD-antibody tests (scenario 1):
IgM: [20 20 20 20 20 20 21 25]
IgG: [100 100 100 100 100 101 101 101]

* out of total N=1000, in a series [September...March]

From the two top lines we see that, regardless of scenario, prevalence must be negligible, and true positive (TP) counts in this small group would round to zero. The reported counts (the sum of true and false positives, TP+FP) from the IgM and IgG tests would, on the contrary, be substantial from the beginning and would not grow much during the eight months. These two patterns are reflected by the green and blue curves in FIGURE A. We see that the figures for autumn 2019 reported by Apolone and colleagues could well be obtained in the complete absence of SARS-CoV-2. The IgM and IgG testing could not distinguish between the two scenarios. Furthermore, it is not suitable for monitoring the pandemic’s development until prevalence reaches levels comparable to the tests’ false positive rates (which might be the case now, after the second wave, but was not in March).
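The TP+FP arithmetic can be sketched in a few lines of Python. The false positive rates of 2% (IgM) and 10% (IgG) are my reading of the baseline counts of 20 and 100 per 1000 in the listing above, and 85% is the sensitivity estimate quoted for the authors’ assay; all three are assumptions of this sketch, as is the helper name `reported_positives`. Because the true counts are entered here as rounded integers, the output is not expected to match the listing digit for digit.

```python
N = 1000  # cohort size used in the example above

# True seropositives per month, copied from the two scenarios above
true_positives = {
    "scenario 1": [0, 0, 0, 0, 0, 0, 2, 7],
    "scenario 2": [0, 0, 0, 0, 0, 4, 16, 64],
}

# Assumed test characteristics (hypothetical, see the lead-in)
SENSITIVITY = 0.85
FALSE_POSITIVE_RATE = {"IgM": 0.02, "IgG": 0.10}


def reported_positives(true_counts, sensitivity, fpr, n=N):
    """Expected reported count per month: detected true cases plus false alarms."""
    return [round(sensitivity * t + fpr * (n - t)) for t in true_counts]


for scenario, counts in true_positives.items():
    for antibody, fpr in FALSE_POSITIVE_RATE.items():
        print(scenario, antibody, reported_positives(counts, SENSITIVITY, fpr))
```

Under either scenario the series starts at the false positive floor (20 or 100 per 1000) and departs from it only once true cases become comparable to that floor.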

So why was a technique with such a high false positive rate used? The authors referred to a separate manuscript, uploaded to bioRxiv in August 2020, in which their custom ELISA assay was presented. The estimate of specificity was a reassuring 98.1%. Applied to the main study, it promised no more than 100.0 − 98.1 = 1.9% false positives, a much lower figure than the reported 10–15%. However, Fig. 2 in the first manuscript reveals that the 98.1% estimate was based on just one negative control case testing positive, and that the number of positive control cases was as few as seven. The latter was the basis for estimating sensitivity as 85%. Clearly, a confident evaluation of TPR and FPR from such a small test sample was not possible.
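To get a feeling for how little seven positive controls can say, here is a hedged sketch of a Wilson score interval for a binomial proportion. It assumes that the 85% sensitivity corresponds to 6 detected out of 7 positive controls, which is my reading and is not stated explicitly above. The choice of the Wilson interval is also mine and is not taken from the manuscript.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1.0 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumption: 6 of 7 positive controls were detected
low, high = wilson_interval(6, 7)
print(f"Sensitivity 6/7 = {6/7:.0%}, 95% interval roughly {low:.0%} to {high:.0%}")
```

Under that assumption the interval spans roughly 49% to 97%, so the point estimate alone says very little.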

The authors also compared the geographic distribution of their findings with that of epidemic severity by March 2020 (Fig. 2). They found it encouraging that these patterns matched closely. However, our FIGURE B (panels A, B, C) shows that this was rather a consequence of the regions’ population sizes and their proximity to Milan, where the institute’s trial was centered and which, unluckily, became the epidemic hotspot. If we instead look at the proportions (FIGURE B, panel D), the correlation disappears.

Figure 2 (from Apolone et al., 2020). Comparison of the distribution of patients with coronavirus disease 2019 (COVID-19) identified up to March 10, 2020, according to data of the Italian Ministry of Health (www.salute.gov.it), with the distribution of recruited screening subjects (blue dots) and SARS-CoV-2 receptor-binding domain (RBD)–positive screening subjects (red dots) of the SMILE trial (Screening and Multiple Intervention on Lung Epidemics). The national distribution includes 10,149 patients with COVID-19, the 959 recruited screening subjects, and the 111 SARS-CoV-2 RBD-positive screening subjects across the 20 Italian regions (A). The regional distribution includes 5791 patients with COVID-19, the 491 recruited screening subjects, and the 59 SARS-CoV-2 RBD-positive screening subjects across the 12 provinces of Lombardy (B).
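The counts-versus-proportions point can be checked roughly with the numbers in the caption alone. The sketch below assumes that the Lombardy counts are a subset of the national ones, so that the rest of Italy is obtained by subtraction; the dictionary keys and the helper `positive_fraction` are hypothetical names.

```python
# Counts quoted in the Figure 2 caption
national = {"covid_cases": 10149, "recruited": 959, "rbd_positive": 111}
lombardy = {"covid_cases": 5791, "recruited": 491, "rbd_positive": 59}

# Assumption: the Lombardy counts are a subset of the national ones
rest = {key: national[key] - lombardy[key] for key in national}


def positive_fraction(region):
    """Share of recruited subjects who tested RBD-positive."""
    return region["rbd_positive"] / region["recruited"]


for name, region in [("Lombardy", lombardy), ("Rest of Italy", rest)]:
    print(f"{name}: {region['covid_cases']} COVID-19 cases, "
          f"{region['recruited']} recruited, "
          f"{positive_fraction(region):.1%} RBD-positive")
```

Under that assumption, Lombardy holds more than half of both the recruited subjects and the confirmed COVID-19 cases, yet the RBD-positive fraction is about 12% in Lombardy and about 11% elsewhere.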

A couple of days later, an American Red Cross and CDC team published a report on the identification of SARS-CoV-2-reactive antibodies in blood samples from December 2019 to January 2020. These authors were more cautious in choosing the time span and in interpreting the results. Across the two-week time intervals, regions, and ELISA test options, they reported positive rates ranging between 1.2% and 2.0%. The combinations of IgM versus IgG rates seemed rather contradictory, so the authors did not draw straightforward conclusions, but still found the results worth publishing. And so did a number of mass media outlets.

However, the blog Medical Facts di Roberto Burioni, popular and much respected in Italy, stated immediately and decisively:

“At the moment, there is no convincing proof of any SARS-CoV-2 circulation in Italy in autumn 2019.

P.S. It is not up to me to prove that there was no virus in autumn 2019; it is those who made the claim who should prove that there was. Russell’s teapot, that is.”

Written by Andrey Alexeyenko