Connecting world electricity production with air pollution
It is reasonable to assume that particularly dirty energy production, e.g. coal, will be associated with higher death rates from air pollution. This notebook investigates the relationship between different kinds of electricity production and deaths from air pollution using data from Our World in Data. I find that the relationship between electricity production and air pollution deaths is not quite so straightforward. You can also find this notebook, including the code and data, on GitHub.
Load and prepare data
I have two shapefile sets, but they both have issues. The one in world_shapefiles is old and is missing at least one new country (South Sudan). The one in world_countries_generalized is better, but for some reason it does not separate China from Taiwan. So I’ll combine them by hand by loading the geometries for China and Taiwan from world_shapefiles into world_countries_generalized.
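The row swap described above can be sketched with plain pandas. This is only an illustration, not the notebook's actual code: the two tables here are tiny stand-ins with placeholder strings instead of real geometries, and I am assuming that both files label the rows "China" and "Taiwan" in a NAME column. With real GeoDataFrames the same filtering and concatenation steps should apply.

```python
import pandas as pd

# Toy stand-ins for the two shapefile tables. The geometry values are
# placeholder strings, not real polygons (hypothetical data).
world_countries_generalized = pd.DataFrame({
    "NAME": ["China", "Ethiopia", "South Sudan"],
    "geometry": ["china_and_taiwan_merged", "ethiopia", "south_sudan"],
})
world_shapefiles = pd.DataFrame({
    "NAME": ["China", "Taiwan", "Ethiopia"],
    "geometry": ["china_only", "taiwan_only", "ethiopia_old"],
})

to_swap = ["China", "Taiwan"]

# Drop whatever the newer table has under these names...
kept = world_countries_generalized[
    ~world_countries_generalized["NAME"].isin(to_swap)
]
# ...and take those rows from the older table instead.
donor = world_shapefiles.loc[
    world_shapefiles["NAME"].isin(to_swap), ["NAME", "geometry"]
]

world = pd.concat([kept, donor], ignore_index=True)
print(world)
```

Every other country keeps its geometry from the newer file; only the two swapped rows come from the older one.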
Here is a sample from world_countries_generalized:
| | FID | NAME | ISO | COUNTRYAFF | AFF_ISO | SHAPE_Leng | SHAPE_Area | geometry |
---|---|---|---|---|---|---|---|---|
73 | 74 | Ethiopia | ET | Ethiopia | ET | 46.810315 | 92.722761 | POLYGON ((45.48940 5.48976, 45.37446 5.36392, ... |
247 | 248 | Wallis and Futuna | WF | France | FR | 0.700608 | 0.013414 | MULTIPOLYGON (((-178.06082 -14.32389, -178.137... |
56 | 57 | Côte d'Ivoire | CI | Côte d'Ivoire | CI | 31.576752 | 26.340497 | MULTIPOLYGON (((-5.33971 5.19775, -5.31977 5.1... |
10 | 11 | Armenia | AM | Armenia | AM | 12.161117 | 3.142291 | MULTIPOLYGON (((46.54037 38.87559, 46.51639 38... |
246 | 247 | Vietnam | VN | Viet Nam | VN | 66.866802 | 27.556082 | MULTIPOLYGON (((107.07896 17.10804, 107.08333 ... |
228 | 229 | Trinidad and Tobago | TT | Trinidad and Tobago | TT | 4.384972 | 0.413753 | MULTIPOLYGON (((-61.07945 10.82416, -61.07556 ... |
69 | 70 | Equatorial Guinea | GQ | Equatorial Guinea | GQ | 8.191007 | 2.188207 | MULTIPOLYGON (((10.41505 1.00250, 10.30861 1.0... |
40 | 41 | Cameroon | CM | Cameroon | CM | 41.960596 | 37.972713 | POLYGON ((10.18107 2.16786, 10.07389 2.16778, ... |
162 | 163 | Niue | NU | New Zealand | NZ | 0.541413 | 0.021414 | POLYGON ((-169.89389 -19.14556, -169.93088 -19... |
34 | 35 | Brunei Darussalam | BN | Brunei Darussalam | BN | 4.918828 | 0.468299 | MULTIPOLYGON (((115.01844 4.89579, 114.98915 4... |
Here is a sample of world_shapefiles:
| | NAME | geometry |
---|---|---|
203 | United Republic of Tanzania | MULTIPOLYGON (((39.68250 -7.99333, 39.65305 -7... |
17 | Burma | MULTIPOLYGON (((98.03581 9.78639, 98.03027 9.7... |
59 | Finland | MULTIPOLYGON (((23.70583 59.92722, 23.64944 59... |
166 | Guinea-Bissau | MULTIPOLYGON (((-15.88583 11.05222, -15.92556 ... |
225 | Netherlands Antilles | MULTIPOLYGON (((-68.19528 12.22111, -68.19278 ... |
214 | Viet Nam | MULTIPOLYGON (((106.60027 8.64778, 106.59248 8... |
60 | Fiji | MULTIPOLYGON (((-178.70776 -20.67444, -178.715... |
21 | Bulgaria | POLYGON ((27.87917 42.84110, 27.89500 42.80250... |
6 | American Samoa | MULTIPOLYGON (((-170.54251 -14.29750, -170.546... |
72 | Guam | POLYGON ((144.70941 13.23500, 144.70245 13.235... |
The data table containing the death types is quite large and contains many different kinds of deaths. For now we are only going to look at Outdoor Air Pollution deaths.
| | NAME | Code | Year | Deaths - Cause: All causes - Risk: Outdoor air pollution - OWID - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: High systolic blood pressure - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Diet high in sodium - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Diet low in whole grains - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Alcohol use - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Diet low in fruits - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Unsafe water source - Sex: Both - Age: All Ages (Number) | ... | Deaths - Cause: All causes - Risk: High body-mass index - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Unsafe sanitation - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: No access to handwashing facility - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Drug use - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Low bone mineral density - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Vitamin A deficiency - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Child stunting - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Discontinued breastfeeding - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Non-exclusive breastfeeding - Sex: Both - Age: All Ages (Number) | Deaths - Cause: All causes - Risk: Iron deficiency - Sex: Both - Age: All Ages (Number) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Afghanistan | AFG | 1990 | 3169 | 25633 | 1045 | 7077 | 356 | 3185 | 3702 | ... | 9518 | 2798 | 4825 | 174 | 389 | 2016 | 7686 | 107 | 2216 | 564 |
1 | Afghanistan | AFG | 1991 | 3222 | 25872 | 1055 | 7149 | 364 | 3248 | 4309 | ... | 9489 | 3254 | 5127 | 188 | 389 | 2056 | 7886 | 121 | 2501 | 611 |
2 | Afghanistan | AFG | 1992 | 3395 | 26309 | 1075 | 7297 | 376 | 3351 | 5356 | ... | 9528 | 4042 | 5889 | 211 | 393 | 2100 | 8568 | 150 | 3053 | 700 |
3 | Afghanistan | AFG | 1993 | 3623 | 26961 | 1103 | 7499 | 389 | 3480 | 7152 | ... | 9611 | 5392 | 7007 | 232 | 411 | 2316 | 9875 | 204 | 3726 | 773 |
4 | Afghanistan | AFG | 1994 | 3788 | 27658 | 1134 | 7698 | 399 | 3610 | 7192 | ... | 9675 | 5418 | 7421 | 247 | 413 | 2665 | 11031 | 204 | 3833 | 812 |
5 rows × 31 columns
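Narrowing the table down to outdoor air pollution might look something like the sketch below. The two long column names and the first two Afghanistan rows are copied from the sample above. The short name outdoor_air_pollution_deaths is my own choice for illustration and not necessarily what the notebook uses.

```python
import pandas as pd

# Column names copied from the table above.
OUTDOOR = ("Deaths - Cause: All causes - Risk: Outdoor air pollution - OWID"
           " - Sex: Both - Age: All Ages (Number)")
SODIUM = ("Deaths - Cause: All causes - Risk: Diet high in sodium"
          " - Sex: Both - Age: All Ages (Number)")

# First two rows of the sample above, keeping only two of the risk columns.
deaths = pd.DataFrame({
    "NAME": ["Afghanistan", "Afghanistan"],
    "Code": ["AFG", "AFG"],
    "Year": [1990, 1991],
    OUTDOOR: [3169, 3222],
    SODIUM: [1045, 1055],
})

# Keep the identifying columns plus the one risk of interest, and give
# that risk a shorter name (the short name is an assumption).
air = deaths[["NAME", "Code", "Year", OUTDOOR]].rename(
    columns={OUTDOOR: "outdoor_air_pollution_deaths"}
)
print(air)
```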
There are a bunch of missing values in the electricity production data table. For some countries, like Afghanistan, there does not appear to be electricity production data dating all the way back to 1900, which is fine. We will simply ignore those periods of time for countries that do not have them.
| | NAME | Year | Code | population | gdp | biofuel_cons_change_pct | biofuel_cons_change_twh | biofuel_cons_per_capita | biofuel_consumption | biofuel_elec_per_capita | ... | solar_share_elec | solar_share_energy | wind_cons_change_pct | wind_cons_change_twh | wind_consumption | wind_elec_per_capita | wind_electricity | wind_energy_per_capita | wind_share_elec | wind_share_energy |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Afghanistan | 1900 | AFG | 4832414.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | Afghanistan | 1901 | AFG | 4879685.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | Afghanistan | 1902 | AFG | 4935122.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | Afghanistan | 1903 | AFG | 4998861.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | Afghanistan | 1904 | AFG | 5063419.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 129 columns
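One way to "ignore those periods" is to join the two tables on country code and year and then drop rows where the metric of interest is missing. The sketch below uses a made-up country code and made-up numbers. The column wind_electricity comes from the header above, while outdoor_air_pollution_deaths and deaths_per_capita are names I am assuming for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows for a hypothetical country "XXX". The early year has no
# electricity data, mimicking the NaN-filled rows shown above.
energy = pd.DataFrame({
    "Code": ["XXX", "XXX", "XXX"],
    "Year": [1900, 2000, 2001],
    "population": [5.0e6, 2.0e7, 2.1e7],
    "wind_electricity": [np.nan, np.nan, 0.5],
})
deaths = pd.DataFrame({
    "Code": ["XXX", "XXX"],
    "Year": [2000, 2001],
    "outdoor_air_pollution_deaths": [4000, 4100],
})

# An inner join keeps only the country-years present in both tables.
merged = energy.merge(deaths, on=["Code", "Year"], how="inner")
merged["deaths_per_capita"] = (
    merged["outdoor_air_pollution_deaths"] / merged["population"]
)

# Drop rows only where the metric being analyzed is missing, so that
# other columns being NaN does not throw away usable years.
usable = merged.dropna(subset=["wind_electricity"])
print(usable)
```

Dropping on a subset of columns matters here: with 129 columns, a blanket dropna() would likely discard nearly every row.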
Analysis of electricity production and air pollution deaths
Now we are ready to analyze this data set.
This plot shows the per-capita electricity use by country in 2019. It mostly tracks how you might expect, with less wealthy countries using less electricity. There are some outliers, like Iceland. This is apparently because there is a very strong aluminum industry in Iceland, which uses a lot of electricity. That, combined with Iceland’s very small population, leads to very high per-capita electricity usage.
We can also see how much coal each country burns to generate electricity on a per capita basis. Australia is a clear outlier with a handful of other countries (China, the United States, South Africa, Kazakhstan, and some Eastern/Southeastern European countries) in a second usage tier.
Next let’s look at annual deaths from outdoor air pollution per capita. There is some correlation between this plot and the previous one showing coal usage for electricity per capita (compare China and Eastern Europe), but there are plenty of places where outdoor air pollution is bad even though there is not much coal being used (Africa, the Middle East, etc.).
Let’s make some more plots to investigate the degree to which deaths from outdoor air pollution are related to fossil fuel use. Because the total number of countries in the world is large, we will focus on analyzing data from the six most populous and the six richest countries. These are nearly mutually exclusive sets (only the United States appears in both), which is nice for comparative purposes. There are several very wealthy but very small countries with unusual economies that are not particularly representative (e.g. Qatar, Luxembourg, Singapore), so I am excluding any country with a population less than 10 million. I am also excluding Saudi Arabia because, while its population is greater than 10 million, as an oil exporter its electricity source mix is unusually skewed towards oil.
Six most populous countries in the world:
China
India
United States
Indonesia
Pakistan
Brazil
Six richest countries in the world (GDP per capita) with a population
greater than 10 million people (excl. Saudi Arabia):
United States
Australia
Netherlands
Germany
Sweden
Canada
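The selection rules behind the two lists above can be sketched as follows. This is a minimal illustration with made-up numbers for hypothetical countries A through F, assuming the gdp column holds total GDP (so GDP per capita is gdp divided by population). The helper name pick_countries is mine.

```python
import pandas as pd

# Made-up single-year snapshot for hypothetical countries.
snapshot = pd.DataFrame({
    "NAME": ["A", "B", "C", "D", "E", "F"],
    "population": [1.0e9, 5.0e8, 5.0e7, 5.0e5, 3.0e7, 3.5e7],
    "gdp": [1.0e13, 1.0e12, 2.5e12, 5.0e10, 1.2e12, 2.1e12],
})

def pick_countries(df, n, min_population, exclude):
    """Return (n most populous, n richest per capita among eligible)."""
    df = df.assign(gdp_per_capita=df["gdp"] / df["population"])
    most_populous = df.nlargest(n, "population")["NAME"].tolist()
    # The population floor and manual exclusions apply to the rich list only.
    eligible = df[
        (df["population"] > min_population) & ~df["NAME"].isin(exclude)
    ]
    richest = eligible.nlargest(n, "gdp_per_capita")["NAME"].tolist()
    return most_populous, richest

most_populous, richest = pick_countries(
    snapshot, n=2, min_population=10_000_000, exclude=["F"]
)
print(most_populous, richest)
```

In this toy data, D has the highest GDP per capita but is filtered out by the population floor, and F is removed by name, which mirrors how the small wealthy countries and Saudi Arabia are handled above.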
Below we see a series of plots comparing the annual number of deaths per capita from outdoor air pollution to a variety of metrics, mostly related to electricity production. However, before we start looking at those comparisons, I want to show this plot that motivates what I said above: that the most populous countries are generally not the wealthiest countries. They basically fall into three categories, with China, Indonesia, and Brazil having achieved essentially middle-income status, while India and Pakistan are still relatively poor (but getting richer quickly, particularly India!) and the US is much wealthier than the rest. Among the rich countries, we see that they are all pretty much the same, with the US (and to a lesser extent Australia) being a bit wealthier than the rest.
It’s also instructive to look at the scales of the x-axes. The wealthy countries are all within ~20% of each other in GDP per capita, while Brazil, Indonesia, and China are all more than twice as wealthy as India and Pakistan.
Now we’ll look at the relationships that outdoor air pollution deaths have with different electricity production metrics.
The first plot compares deaths to the annual amount of electricity generated per capita from coal. One might expect that this would be correlated with deaths from outdoor air pollution, and indeed we see that this is the case for both populous and rich countries. The lightness of color for each country shows the progression over time, with later years being darker. We can see that, apart from the US, the most populous countries in the world have mostly seen increasing deaths from outdoor air pollution over time, which largely corresponds to an increase in electricity generation from coal. The rich countries are doing the opposite: reducing their coal electricity generation and simultaneously reducing their rates of outdoor air pollution deaths. The Netherlands is somewhat of an outlier among the rich countries, as it seems to be increasing its coal electricity generation over time, but this has not yet led to an increase in air pollution deaths. It turns out that there are factors besides coal electricity generation that affect air pollution deaths!
The second plot makes the same comparison with solar electricity generation per capita. Here we see that both groups of countries are deploying more solar power, but in the populous countries this is not accompanied by a decrease in deaths from outdoor air pollution. Building out renewables is not enough; we need to reduce emissions!
If we compare to the fraction of electricity that is generated by fossil fuels, we see somewhat different relationships. I’ll note that these plots have a logistically scaled x-axis to better show the behavior of countries with either very high or very low fractions of fossil fuel electricity generation.
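To see why a logistic scale helps, here is a small sketch of the logit transform that such an axis applies to a fraction between 0 and 1. It uses only the standard library. Matplotlib offers this as set_xscale("logit"), though whether the plots here were made that way is an assumption on my part.

```python
import math

def logit(p: float) -> float:
    """Map a fraction strictly between 0 and 1 onto the whole real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is only defined for 0 < p < 1")
    return math.log(p / (1.0 - p))

# Equal steps near the middle are compressed, while steps near 0 or 1
# are stretched out.
for share in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"{share:.2f} -> {logit(share):+.2f}")
```

A move from 90% to 99% fossil share covers more distance on this axis than a move from 51% to 60%, which is what makes the nearly clean and nearly all-fossil grids readable on the same plot.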
Even though China is increasing its coal usage for electricity, it is simultaneously reducing the fraction of its electricity that comes from fossil fuels. This is because it is building renewables even faster! We see that the other populous countries are mostly holding steady in fossil fuel use, with the exceptions of the US and Brazil. Brazil is unfortunately increasing its fossil fuel use, although it is important to note that this is from an incredibly low baseline.
The rich countries are all reducing their fossil fuel use overall, but there is wide variation in where they are in that process. Sweden has a nearly completely clean grid, while the Netherlands and Australia have comparatively dirty grids.
Finally, let’s compare to the fraction of electricity generated from renewable energy sources. Again, I plot the x-axis on a logistic scale. These are basically just mirror images of the previous plots. We can see that the Netherlands and Germany have really scaled up their renewable energy over the past 30 years or so, while other countries have been slower.
Extracting some numbers from the above plots
Some of the relationships in the above plots look vaguely linear, so let’s see if we can successfully apply that hypothesis to the rest of our data set. First we’ll take a look at the relationship between the share of electricity generated from fossil fuels and the number of deaths from outdoor air pollution per capita. Because the plot we made showing this relationship had log-logistic axes, the relationship in linear space is complicated. Suffice it to say that a positive slope (in the log-logistic space) indicates that deaths from outdoor air pollution generally go up when the fraction of electricity from fossil fuels increases, and a negative slope indicates the reverse. I plot only those countries that had sufficient data to perform a fit that returned a nonzero slope with a p-value less than 0.1. Countries that do not meet these criteria are shown in gray. I’m using a relatively high p-value threshold because this is not a particularly rigorous study and we are more interested in seeing possible relationships than quantifying anything.
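A per-country fit along the lines described above might look like the sketch below: take the log of deaths per capita, the logit of the fossil share, fit a straight line, and keep the slope only if it is nonzero with a p-value under 0.1. This is my reading of the procedure rather than the notebook's exact code. The function name, the three-point minimum, and the synthetic data are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import linregress

def log_logit_fit(fossil_share, deaths_per_capita, alpha=0.1):
    """Slope of log(deaths) against logit(share), or None if not usable."""
    share = np.asarray(fossil_share, dtype=float)
    deaths = np.asarray(deaths_per_capita, dtype=float)
    # Keep only years where both transforms are defined.
    ok = (np.isfinite(share) & np.isfinite(deaths)
          & (share > 0) & (share < 1) & (deaths > 0))
    # "Sufficient data" is taken here to mean at least three usable years
    # with some variation in the share (an assumed threshold).
    if ok.sum() < 3 or np.ptp(share[ok]) == 0:
        return None
    x = np.log(share[ok] / (1.0 - share[ok]))
    y = np.log(deaths[ok])
    fit = linregress(x, y)
    if fit.slope == 0 or not fit.pvalue < alpha:
        return None  # such a country would be drawn in gray
    return fit.slope

# Synthetic country: fossil share falls from 90% to 60% while deaths
# follow a log-logit slope of 0.5 plus a little noise.
rng = np.random.default_rng(0)
share = np.linspace(0.9, 0.6, 20)
log_deaths = (np.log(1e-4) + 0.5 * np.log(share / (1 - share))
              + rng.normal(0.0, 0.05, share.size))
slope = log_logit_fit(share, np.exp(log_deaths))
print(slope)
```

A positive slope here means deaths per capita fall as the fossil share falls, which is the pattern the synthetic data was built to have.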
In my opinion, the above figure shows that, in general, wealthier countries see a positive relationship between fossil fuel electricity share and deaths from outdoor air pollution. That is, deaths decrease as fossil fuel use decreases. In particular, I am looking at Europe, North America, and Australia/New Zealand. There are a few outliers in that group, namely Norway, Latvia, Switzerland, Serbia, and North Macedonia. Conversely, many countries in the developing world have a negative relationship between these two variables. I suspect that this is related to the fact that as they develop, they are simultaneously increasing fossil fuel usage and improving their public health and healthcare so that illnesses from outdoor air pollution are less likely to be fatal.
Let’s see to what degree this relationship between this slope and a country’s wealth holds up. Below I plot these slopes against each country’s GDP per capita.
We can see that there is some relationship between the two, primarily for countries with a GDP per capita above ~$30,000 per year.
Now let’s look at the relationship between each country’s annual electricity generation from coal and the number of deaths from outdoor air pollution per capita. Again, I plot only those countries that had sufficient data to perform a fit that returned a nonzero slope with a p-value less than 0.1. Countries that do not meet these criteria are shown in gray.
Here we see that most countries have a positive relationship between coal electricity generation and deaths from outdoor air pollution. Norway is clearly an outlier, and several countries with apparently negative power-law exponents are also outliers. Below I show a plot comparing some of those countries.
We see that Norway is an outlier because it has had very little coal in its electricity generation mix for the entirety of this data set. That, combined with its great strides in reducing deaths from air pollution, means that the slope of this relationship is very large for Norway. Italy and Kazakhstan are also outliers on this plot in that they have negative slopes. For Italy we can see that, even though its fitted slope apparently has a p-value < 0.1, its evolution over time does not seem to be particularly linear. Kazakhstan does seem to be reducing air pollution deaths while increasing its coal usage, perhaps because it is still a developing country.
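The mention of power-law exponents suggests that the coal fits are straight lines in log-log space. Under that reading (an assumption, since the fitted form is not spelled out here), with d the deaths per capita, c the coal electricity generation per capita, and k the fitted slope, the model and its slope are roughly:

```latex
d = A\,c^{k}
\quad\Longleftrightarrow\quad
\log d = \log A + k \log c,
\qquad
k \approx \frac{\Delta \log d}{\Delta \log c}
```

If coal generation barely changes over the data set, the denominator is small, so even a moderate change in deaths produces a very large fitted k. That is consistent with the Norway outlier described above.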