Could Increases in Influenza Vaccination Rates Give Rise to Exponentially-Related Corresponding Increases in Mortality Rates From Covid-19?
Dr Grahame Blackwell
A Pan-European Study incorporating data from over 350 million subjects
See also this Pan-American Study incorporating data from 200 million subjects


A preliminary analysis based on a document published in the BMJ indicated that overall death rates from Covid-19 in different European countries could be exponentially linked to those countries' influenza vaccination rates (percentages) for over-65s. [This is consistent with a Jan 2020 study in the journal Vaccine linking influenza vaccinations with 36% increased susceptibility to coronavirus (pre-Covid-19).]

This follow-up in-depth analysis, involving over 350 million subjects, considers all European countries for which reliable vaccination data was available on the OECD website at time of preparation.
Data, links and instructions are provided to enable readers to check this analysis for themselves. The steps are as follows:
(1) Identify which data should and shouldn't be used, in accordance with statistical best practice; calculate the Correlation Coefficient, R;
(2) Test the null hypothesis ("no actual connection") by checking the probability that this value for R could occur by chance;
(3) This probability turns out to be less than 0.00001 - so we must reject that hypothesis in favour of the alternative hypothesis:
            "Overall mortality rates from Covid-19 do increase exponentially with increasing percentage vaccination rates of over-65s".
At this point we can add the information given by R², the Coefficient of Determination: the significance of this figure is given at the bottom of this page.

Conclusion: Rate of total deaths/million of population from Covid-19 rises exponentially with increasing rate of percentage influenza vaccinations for over-65s across Europe. This could be cause-and-effect, or it could be due to some as-yet unidentified third factor linking these two.

It's difficult to imagine what that third factor might be, to give an exponential connection.


[See notes on Statistical Inferencing, in relation to this analysis, here.]

This is the detailed version. Click here for the short version.

BMJ publication: Dr Allan Cunningham has collected data from reliable sources for 20 European countries. He's listed this in a Rapid-Response document published in the BMJ. [See it here.]

Dr Cunningham suggests analysing these 20 data pairs to see whether there's any connection between percentage of over-65s receiving the Influenza vaccine in a country and the death rate per million in that country from Covid-19. That analysis can be seen here, with step-by-step instructions on how to do it yourself. To avoid any suggestion of selection bias (cherry-picking sets of figures to give a preferred result), this analysis is updated below, again with step-by-step instructions on how to do it yourself (no specialist expertise needed), using all the latest reliable data as at 1st August 2020.

Figures used here are taken from the OECD website (for over-65s vaccination percentages) and the Worldometer site for mortality rates per million population. First we need to be sure that we have reliable properly-accredited figures. With regard to over-65 percentage vaccination figures, neither Austria nor Poland has a vaccination percentage documented for later than 2014; Turkey's latest figure is for 2016. It seems possible, then, that any of these three figures could introduce an element of unreliability, so they should be omitted.

Another consideration is that of outliers - data points that lie so far out of the pattern set by the other points that they are clearly being heavily influenced by other factors, and so can't be regarded as part of that pattern. Outliers can be identified to some degree by eye - looking at the scattergram of the graph points - or more accurately by calculating whether they lie outside statistically-defined boundaries and so are most unlikely to be valid elements of the set.

First we need to identify the straight-line relationship between percentage influenza vaccination of over-65s and logs of mortality rates per million, over different countries (using logs to base 10 in this case; any base will give equivalent results). That previous study over 20 countries made it quite clear that any relationship between vaccination rates and death rates will be exponential, so the relationship between vaccinations and logs of death rates will be linear - if this assumption is wrong then our results will make that very clear by not giving a significant correlation.

Four points appear by eye to be outside of the general pattern: Greece, Slovakia, Belgium and Iceland (the four furthest-out points on the scattergram below); this may be why three of them were omitted from the earlier analysis. For this in-depth study this requires further investigation, using well-defined principles by which outliers are identified.


[Data table, one row per country. Columns: Vacc % | deaths/m | log(d/m)]

The columns of figures to the left give: percentage vaccinations of over-65s (for 2018, or in two cases 2017); deaths per million of total population; logs of those death rates (to base 10). Stats are given for the 26 European countries listed by the OECD, in the order they're listed (apart from Greece, marked ***), less the three countries noted above as having possibly unreliable figures: Austria, Poland, Turkey - i.e. 23 sets of figures in all.

We can get a best straight-line fit for these 23 sets of figures here. Simply copy-and-paste the 'Vacc %' column of figures into the 'XValues' box and the 'log(d/m)' figures into the 'YValues' box, skip past the 'Estimate' box and press the 'Calculate' button. You'll get the Regression Equation: y(hat) = 0.022X + 1.11802.

You'll also see the graph shown here (but without the blue circle).
This shows the 'line of best fit' through the 23 data points.
That line has the equation: Y = 0.022X + 1.11802.
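If you'd rather check a best-fit line yourself than rely on the online tool, the calculation can be sketched in a few lines of Python. This is a minimal sketch of ordinary least squares, which is assumed (not confirmed) to be what the tool does; the function name `fit_line` and the demo points are made up for illustration and are not the study's figures.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Placeholder points lying exactly on y = 0.5x + 1 (NOT the study's data)
demo_x = [10.0, 20.0, 30.0, 40.0]
demo_y = [6.0, 11.0, 16.0, 21.0]
slope, intercept = fit_line(demo_x, demo_y)
print(slope, intercept)
```

To use it on the real figures, put the 'Vacc %' values in the first list and the 'log(d/m)' values in the second, in the same country order.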

The point circled in blue is significantly further from the best-fit line than any of the other points. You can confirm this simply by holding a ruler up to your computer screen: the distance from that point to the line is half as much again as the distance from the next-furthest point (the one down by '12' on the baseline).

This suggests that the circled point (for Greece) may be an outlier.
The next step is to check mathematically for any outliers.

[The following text, in maroon, can be skipped unless you're particularly interested in calculations for outliers.]
This involves first calculating distances of all the points from the line. It's not intended to cover the maths for that here; those interested (and slightly mathematical!) can find the necessary info here.

To check for outliers we next have to calculate upper and lower quartiles for this set of distances, and from these the interquartile range (having first put those distances into order, either way round). Again, info on these can be found here: in simple terms, upper and lower quartiles (UQ and LQ) are values that mark off the top and bottom of the 'middle half' of the values - i.e. cutting off the top quarter and the bottom quarter. The interquartile range (IQR) is UQ - LQ; outliers are defined as values greater than UQ + 1.5 x IQR (or smaller than LQ - 1.5 x IQR, where small values could also show abnormal situations; that can't apply here, as small values represent data points very close to the regression line). Note that where the number of gaps between lowest and highest point doesn't divide exactly by four, it's necessary to interpolate (work out 'in-between' values) to calculate LQ and UQ.
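For anyone who does want to follow the outlier check through, here is a rough Python sketch of the steps just described. It rests on two assumptions not spelled out above: that 'distance' means perpendicular distance from the line (vertical distance would be the other obvious choice), and that quartiles are found by linear interpolation over the gaps between ordered values. All function names and the demo figures are illustrative only.

```python
import math

def distances_from_line(xs, ys, slope, intercept):
    """Perpendicular distance of each (x, y) point from y = slope*x + intercept."""
    denom = math.sqrt(slope ** 2 + 1)
    return [abs(slope * x - y + intercept) / denom for x, y in zip(xs, ys)]

def quartile(sorted_vals, q):
    """Value a fraction q of the way along the gaps of an ordered list, interpolated."""
    pos = q * (len(sorted_vals) - 1)      # position measured in 'gaps'
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower                    # how far into the next gap we are
    return sorted_vals[lower] + frac * (sorted_vals[upper] - sorted_vals[lower])

def upper_outliers(values):
    """Indices of values greater than UQ + 1.5 x IQR."""
    ordered = sorted(values)
    lq, uq = quartile(ordered, 0.25), quartile(ordered, 0.75)
    fence = uq + 1.5 * (uq - lq)
    return [i for i, v in enumerate(values) if v > fence]

# Made-up distances (NOT the study's): only the last one is far from the rest
demo = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 1.5]
print(upper_outliers(demo))
```

Only the upper boundary is tested, for the reason given above: a very small distance just means a point sitting close to the line.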

Long story short: the point for Greece (circled) does turn out to be an outlier, no other points do.
So our correlation calculations shouldn't include the figures for Greece, as they will distort the result.

To find a correlation coefficient, and its level of significance, for the data for the remaining 22 countries, simply copy-and-paste the 'Vacc %' column of figures into the 'XValues' box and the 'log(d/m)' figures into the 'YValues' box of a statistical analysis tool you can find here, omitting the final figures for Greece, marked ***.
[Don't copy the X or Y, and be sure to copy all the numbers (apart from Greece) - preferably in one go for each column.]

Once you've got both sets of 22 figures into the 'X Value' and 'Y Value' boxes (you'll see slider bars on both boxes, that's ok) just click on the 'Calculate R' button. This will produce the Pearson's Correlation Coefficient, referred to as 'R', for logs of death rates compared against over-65 vaccination percentages: it should give you a value of 0.7975, which is high for a set of 22 data pairs (the word they use is 'strong').

R can vary between +1 and -1. +1 is a 100% positive correlation between X and Y, meaning that as X goes up Y also goes up in exact correspondence with it; -1 is a 100% negative correlation between X and Y, meaning that as X goes up Y goes down in exact correspondence with it. Values in between show differing degrees of linking between X and Y, values nearer to zero showing weaker correlation than those nearer +1 or -1.
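If you'd like to see what the 'Calculate R' button is doing, Pearson's R can be worked out directly from the two columns. The sketch below uses the standard formula; the name `pearson_r` and the demo lists are just for illustration and are not the study's figures.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up lists showing the two extremes described above
rising = pearson_r([1, 2, 3, 4], [10, 20, 30, 40])    # Y rises exactly in step with X
falling = pearson_r([1, 2, 3, 4], [40, 30, 20, 10])   # Y falls exactly in step with X
print(rising, falling)
```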

Scroll down past the calculations (which you don't need - they've done them for you) to get a figure for statistical significance of this result: click on the link that says: "Click here to calculate a p value". You'll be asked to input the R value (0.7975) and the number of data pairs (22).
[You can also choose a significance level if you like - if so, choose 0.01 - but we'd really need a 0.00001 option for the significance of this result!]
The calculator will give you a probability rating for this R for 22 data pairs: less than 0.00001. That's less than 1 in 100,000 (less than 10 in a million) likelihood that a correlation this strong would turn up by chance if there were no real link between over-65s influenza vaccination rate and logs of Covid-19 death rate per million of population - in other words, the link is statistically significant at better than the 99.999% level.
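The p value can also be checked without the online calculator. The usual route, assumed here to be what the calculator does, is to turn R into a t statistic with n - 2 degrees of freedom and then find the two-tailed area under the t distribution. The sketch below does that area by simple numerical integration (Simpson's rule) so that it needs nothing beyond standard Python; the function names are illustrative.

```python
import math

def t_statistic(r, n):
    """t value corresponding to correlation r from n data pairs."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(r, n, steps=20000):
    """Two-tailed p value for r from n pairs (steps must be even)."""
    df = n - 2
    t = abs(t_statistic(r, n))
    h = t / steps
    # Simpson's rule for the area under the density between 0 and t
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    # Whatever lies outside -t..t is the two-tailed probability
    return max(0.0, 1 - 2 * central_half)

t_value = t_statistic(0.7975, 22)
p_value = two_tailed_p(0.7975, 22)
print(t_value, p_value)
```

Treat this as a cross-check on the calculator rather than a replacement for it: a purpose-built stats package will handle extreme values more carefully than this hand-rolled integration.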

To put that another way: these figures show that there's a strong correlation between an increasing percentage of over-65s being given influenza vaccinations and an exponentially increasing number of deaths per million (of total population) from Covid-19. The likelihood of this being a coincidence (happening by chance) is less than ten in a million.

We can now re-calculate that straight-line fit of the log values (excluding Greece), and from that the exponential fit of the death rates against percentage vaccinations for those 22 data pairs. First simply copy-and-paste the above 'Vacc %' column of figures into the 'XValues' box and the 'log(d/m)' figures into the 'YValues' box here, as before, but omitting the *** figures for Greece. This gives us the straight-line graph below for death-rate log values against % vaccination figure and tells us that the equation for this graph is: y(hat) = 0.02384X + 1.09086.

If you wish (and you know how to) you can then use the graph facilities in Excel to produce the scatter-plot shown in the second graph below. Converting the equation above to the equivalent exponential equation (by raising 10 to the power of each side, which undoes the log), we get:
                Y = 10^1.09086 * 10^(0.02384X) ,             which can be written more tidily as:            Y = 12.3271 * 1.0564^X             [* is 'times', ^ is 'to the power'].
You could superimpose this line on the scattergraph (Excel again), as below - or you could skip all that and just go to here. Here you'll find another stats tool where you can copy-and-paste the 'Vacc %' and 'deaths/m' figures in (omitting Greece) to see an identical graph to the one below. This tool also confirms the equation for the exponential curve and the value for the Correlation Coefficient, R. It also gives the value of R² - see below the graphs for the relevance of this.
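The step from the straight-line equation for the logs to the exponential equation for the death rates is easy to check for yourself. The sketch below takes the slope and intercept quoted above (0.02384 and 1.09086) and converts them; the function names are made up for illustration.

```python
def log10_line_to_exponential(slope, intercept):
    """Turn log10(Y) = slope*X + intercept into Y = a * b**X and return (a, b)."""
    a = 10 ** intercept   # value of Y when X is 0
    b = 10 ** slope       # factor Y is multiplied by for each 1-point rise in X
    return a, b

def predict(x, a, b):
    """Value of the exponential curve Y = a * b**x at x."""
    return a * b ** x

a, b = log10_line_to_exponential(0.02384, 1.09086)
print(round(a, 4), round(b, 4))
```

Read this way, the 1.0564 says that the fitted death rate goes up by a little over 5.6% for every one-point rise in percentage vaccination rate.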

R²: The Coefficient of Determination

R-Squared - which is literally the square of the Correlation Coefficient, R - is a measure of how much of the variation in the dependent variable (in our case deaths/million, taken as logs) is related to variation in the independent variable (in our case % vaccination of over-65s). In this case R = 0.7975, giving R² = 0.636; this in turn gives an Adjusted R² = 0.618*. In other words, over 60% of the variation in (log) death rate from Covid-19 across European countries is statistically accounted for by its highly significant exponential link with vaccination rates of over-65s in those countries. Under 40% of that variation is left to other factors.
[* Giving increased precision.]
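Both figures can be checked with a couple of lines of arithmetic. The sketch below uses the standard Adjusted R² formula for a fit with one independent variable, which is assumed to be the formula behind the 0.618 quoted above; the function name is illustrative.

```python
def adjusted_r_squared(r, n, predictors=1):
    """Adjusted R-squared from correlation r, n data pairs and the number of predictors."""
    r2 = r * r
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)

r_squared = 0.7975 ** 2
adjusted = adjusted_r_squared(0.7975, 22)
print(round(r_squared, 3), round(adjusted, 3))
```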

Note the careful wording here: The stats do not tell us that higher vaccination rates are the cause of higher death rates; stats cannot tell us causes - they can only tell us about connections. But the whole point of identifying connections - and the strength of those connections - is to inform us about things that merit our serious attention. The very fact that this strong connection exists tells us that we need to investigate it: either it is causal or there is some third factor linking these two. To ignore this warning, or to assume "It can't be that, there must be another factor" without checking is both highly irresponsible and highly un-scientific. Without clear evidence to the contrary (such as an obvious provable third factor linking these two) the Precautionary Principle requires that we act on the basis that it's very likely that higher vaccination rates could be causing higher mortality rates - exponentially higher mortality rates.

If this is so, then that exponential factor would imply that those vaccinated are not just at risk themselves, they are also acting as 'propagators' to spread the disease.

The Bottom Line: There is very significant evidence that death rate from Covid-19 is exponentially linked to over-65s influenza vaccination rate, compared across Europe.