The Effects of Increasing Population Granularity in PM2.5 Population-Weighted Exposure and Mortality Risk Assessment

Introduction
The Global Burden of Diseases, Injuries, and Risk Factors Study (Vos et al. 2020) is a global assessment of morbidity and mortality from publicly available data and information on 369 diseases and injuries. According to the Global Burden of Disease study (GBD), 4.14 million people died from exposure to ambient PM2.5 (fine particulate matter with a diameter of 2.5 μm or less) in 2019 (GBD 2019a). Developments in earth observation have created opportunities for more granular analysis of relative risk (RR) attributed to PM2.5. Our study assessed the potential value and feasibility of improved spatial and geographic resolution of population density along with more highly resolved spatial concentrations of PM2.5 in Russia, a country of nearly 146 million (UNDESA n.d.).
Methods
We developed a stepwise process with two aims. First, we searched for a level of population granularity that would better estimate population exposure to PM2.5, especially because air pollution can be more concentrated in more highly populated areas. Second, we wanted to explore the usefulness of the more highly spatially resolved PM2.5 data. Our first step was to map the population of Russia's many large regions more precisely. Russia is an extremely large country, but its people live in a very small part of it, heavily clustered in cities and towns. For this mapping, we used the official Russian national population statistics from ROSSTAT (https://rosstat.gov.ru/folder/210/document/13207; https://rosstat.gov.ru/compendium/document/13282), which provide total population data for regions, subregions, and municipalities.
For the first step, we aimed to generally replicate the GBD results by distributing the population evenly across each of Russia's large regions. Next, we looked at ways to locate the population more precisely. For the second step, we used municipalities, which are much smaller than regions. The populations of all municipalities in a region add up to the total population of the region. It should be noted, however, that Russia's two largest cities, Moscow and St. Petersburg, are themselves classified as regions in these statistics and are separate from their surrounding regions. For the third step, we used OpenStreetMap (OpenStreetMap Contributors n.d.) to map the population more finely through the use of buildings, taking the density of buildings as a proxy for population density. Building density was estimated from each building's areal footprint, without taking into account either building height or function. Finally, this analysis excluded the area north of 70° latitude, an area without satellite coverage and with only approximately 6,800 inhabitants according to ROSSTAT data (https://rosstat.gov.ru/compendium/document/13282).
To calculate the population-weighted PM2.5 concentration, we conducted the following analyses. For regions, we calculated the arithmetic average PM2.5 concentration of all grid cells belonging to a particular region and then used that region's share of the total Russian population as the weighting coefficient. The identical approach was used for municipalities, using the arithmetic average PM2.5 concentration of all grid cells belonging to a particular municipality. These calculations, for both regions and municipalities, were performed only at the lower spatial resolution of 0.1° × 0.1°.
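The unit-level weighting described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name, the input layout (one PM2.5 value and one unit ID per grid cell), and the toy numbers are all assumptions made for the example.

```python
from collections import defaultdict


def population_weighted_concentration(cell_pm25, cell_unit, unit_population):
    """Population-weighted PM2.5, with population spread evenly inside each unit.

    cell_pm25       -- PM2.5 concentration of each grid cell
    cell_unit       -- ID of the unit (region or municipality) each cell belongs to
    unit_population -- dict mapping unit ID -> total population of that unit
    """
    # Group grid-cell concentrations by the unit that contains them.
    cells_by_unit = defaultdict(list)
    for value, unit in zip(cell_pm25, cell_unit):
        cells_by_unit[unit].append(value)

    total_population = sum(unit_population.values())
    weighted = 0.0
    for unit, population in unit_population.items():
        values = cells_by_unit[unit]
        unit_mean = sum(values) / len(values)  # arithmetic mean over the unit's cells
        weighted += unit_mean * (population / total_population)  # population share as weight
    return weighted


# Hypothetical toy data: unit "A" is large and clean, unit "B" is small and polluted.
pm25 = [5.0, 7.0, 15.0]
units = ["A", "A", "B"]
population = {"A": 100_000, "B": 300_000}

result = population_weighted_concentration(pm25, units, population)
print(round(result, 2))
```

With these toy inputs the population-weighted value is 12.75, compared with a simple mean over grid cells of 9.0, because the more polluted unit holds three quarters of the population.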
For buildings, populations were assigned proportionately, based on the buildings' areal extent. Within each municipality, for each grid cell, we divided the total footprint of buildings in that grid cell by the total footprint of all buildings within the municipality. This coefficient is the proportion of the municipality's population that belongs to each grid cell. Multiplying the coefficient by the total population of the municipality gives a population for each grid cell, which could then be combined with the PM2.5 concentration in each grid cell to estimate the population-weighted PM2.5 concentration. This calculation for buildings was done at each of the two spatial resolutions (0.1° × 0.1° and 0.01° × 0.01°).
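The footprint-based allocation can be sketched as follows. This is a hedged illustration under simplifying assumptions: buildings are given as hypothetical (municipality, grid cell, footprint area) tuples, each building is assumed to fall entirely within one grid cell, and all names and numbers are invented for the example.

```python
from collections import defaultdict


def allocate_population_by_footprint(buildings, municipality_population):
    """Distribute each municipality's population over grid cells by building footprint.

    buildings               -- iterable of (municipality_id, cell_id, footprint_m2)
    municipality_population -- dict mapping municipality_id -> total population
    Returns a dict mapping cell_id -> allocated population.
    """
    footprint_by_cell = defaultdict(float)          # keyed by (municipality, cell)
    footprint_by_municipality = defaultdict(float)
    for municipality, cell, footprint in buildings:
        footprint_by_cell[(municipality, cell)] += footprint
        footprint_by_municipality[municipality] += footprint

    cell_population = defaultdict(float)
    for (municipality, cell), footprint in footprint_by_cell.items():
        share = footprint / footprint_by_municipality[municipality]  # the coefficient
        cell_population[cell] += share * municipality_population[municipality]
    return dict(cell_population)


def weighted_concentration(cell_population, cell_pm25):
    """Population-weighted PM2.5 from per-cell populations and concentrations."""
    total = sum(cell_population.values())
    return sum(pop * cell_pm25[cell] for cell, pop in cell_population.items()) / total


# Hypothetical municipality "M" with 10,000 people and buildings in two grid cells.
buildings = [("M", "c1", 900.0), ("M", "c1", 300.0), ("M", "c2", 300.0)]
cell_pop = allocate_population_by_footprint(buildings, {"M": 10_000})
pwac = weighted_concentration(cell_pop, {"c1": 20.0, "c2": 10.0})
print(cell_pop, round(pwac, 2))
```

In this toy case 80% of the footprint, and therefore 8,000 of the 10,000 people, falls in the more polluted cell, so the weighted concentration is 18 rather than the 15 obtained by spreading the population evenly over both cells.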
These analyses provided three levels of population granularity. The least fine-grained had the population evenly distributed within each region, and the medium granularity had the population evenly distributed within each municipality. The most fine-grained had the population distributed by buildings.
To ensure comparability with the GBD, we calculated the country-wide population-weighted concentration and country-wide mortality risk using GBD data and RR functions. GBD (2019b) estimates of PM2.5-attributable mortality are based on RR functions for six diseases: lower respiratory infections; type 2 diabetes; chronic obstructive pulmonary disease; tracheal, bronchial, and lung cancers; ischemic heart disease (IHD); and stroke. The numbers of deaths from each of these diseases due to PM2.5 in Russia were downloaded from GBD (2019a) and are available in the Open Science Framework (https://osf.io/egn87/). For each disease and each age group, the RR is a function of the ambient PM2.5 concentration and was calculated using the GBD RR tables (GBD 2019a). For concentrations in the range of 5-20 μg/m³, the RR could be approximated as a linear function of PM2.5 concentration.
Address correspondence to Michael Brody, Institute for Sustainable Development Studies, Russian Presidential Academy of the National Economy and Public Administration, Moscow, Russia. Email: brody-m@ranepa.ru
The authors declare they have no actual or potential competing financial interests.

The population mortality attributable to PM2.5 pollution was calculated using a simplified formula, not a multi-risk model. It was assumed that the risk attributable to PM2.5 pollution increased proportionally to the increase in the attributable fraction of the background mortality that corresponded to the increase in RR:

$$MN = \sum_{i} \sum_{j} MB_i^j \cdot \frac{1 - 1/RRN_i^j}{1 - 1/RRB_i^j},$$

where $MN$ denotes a new estimate of mortality attributed to ambient PM2.5 pollution, calculated for each of the six diseases in the GBD analysis (lower respiratory infections; type 2 diabetes; chronic obstructive pulmonary disease; tracheal, bronchial, and lung cancers; IHD; and stroke). Additionally, for IHD and stroke, the same specific age groups were used (<5, and then in 5-y cohorts starting at age 25-29 y and ending with the cohort age 90-94 y). In addition, $MB_i^j$ is the mortality (from disease $i$ for age group $j$) attributed to ambient PM2.5 pollution reported in GBD; $RRN_i^j$ stands for the RR at the recalculated population-weighted concentration (for disease $i$ and age group $j$); and $RRB_i^j$ was calculated for the concentration estimated in GBD (11.6 μg/m³).
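The mortality rescaling described above can be illustrated with a short Python sketch. It assumes that the attributable fraction takes the standard form 1 − 1/RR and that the new estimate is summed over diseases and age groups; both are interpretations of the text. The function name, dictionary layout, and all numbers are hypothetical and are not GBD values.

```python
def rescaled_mortality(baseline_deaths, rr_baseline, rr_new):
    """Rescale GBD-attributed deaths by the ratio of attributable fractions.

    All three arguments are dicts keyed by (disease, age_group):
    baseline_deaths -- deaths attributed to PM2.5 at the GBD concentration (MB)
    rr_baseline     -- RR at the GBD concentration (RRB); must be greater than 1
    rr_new          -- RR at the recalculated concentration (RRN)
    """
    total = 0.0
    for key, deaths in baseline_deaths.items():
        af_baseline = 1.0 - 1.0 / rr_baseline[key]  # assumed attributable-fraction form
        af_new = 1.0 - 1.0 / rr_new[key]
        total += deaths * af_new / af_baseline
    return total


# Hypothetical inputs chosen for easy arithmetic; they are not taken from GBD.
deaths = {("IHD", "60-64"): 1000.0, ("stroke", "60-64"): 500.0}
rr_b = {("IHD", "60-64"): 1.25, ("stroke", "60-64"): 2.0}
rr_n = {("IHD", "60-64"): 1.5, ("stroke", "60-64"): 2.5}

total = rescaled_mortality(deaths, rr_b, rr_n)
print(round(total, 1))
```

When the new and baseline RR values are equal, the ratio is 1 and the function returns the GBD-reported deaths unchanged, which is the behavior expected when the lowest-granularity estimate reproduces the GBD concentration.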

Results and Discussion
In this analysis, we were able to use satellite data to better map population densities using regions, municipalities, and buildings. The more granular population data resulted in higher estimated population risk. The effects of these different population aggregations can be seen in Table 1. The four estimates of population-weighted PM2.5 concentrations for all of Russia are shown in the first column. The table shows that our estimated population-weighted PM2.5 concentration and mortality using the lowest granularity and spatial resolution are virtually identical to the State of the Global Air (SOGA) estimate of 11.6 μg/m³. SOGA is an annual report that is part of the GBD and is a peer-reviewed air quality analysis (HEI 2020). Table 1 also shows that stepwise increases in granularity led to higher estimates of PM2.5 exposures and thus to higher estimates of PM2.5-attributable mortality. The higher spatial resolution also resulted in higher estimates of PM2.5 exposure and attributable mortality. The single largest increase in population-weighted PM2.5 concentration came from the change from regions to municipalities. Figure 1 shows that for any location, population-weighted PM2.5 concentrations were almost always higher in the municipality than in the larger region, and similarly that population-weighted PM2.5 concentrations were higher when resolved by buildings than by municipalities. Using regional population densities, and therefore regional population-weighted concentrations, introduced a systematic underestimation: all but one municipality is above the reference line (which starts at the origin and runs at a 45° angle), and many are substantially higher.

Table 1. Population-weighted PM2.5 concentration (micrograms per cubic meter) and attributable mortality for all of Russia at three levels of population granularity and by spatial resolution. Population data were from 2018. Population granularity and estimates of PM2.5 concentrations and attributable mortality were calculated as described in the "Methods" section of the text. Note: PM, particulate matter; PM2.5, particulate matter with aerodynamic diameter 2.5 μm or less.
[Table body not recovered; surviving column headers: Population unit; Population granularity.]

Figure 1. Comparison of the precision of population-weighted PM2.5 concentrations using population allocation by buildings, municipalities, and regions. PWAC of buildings in this figure were based on the calculations using the 0.01° × 0.01° spatial resolution satellite data from Washington University in St. Louis [WUSTL (https://sites.wustl.edu/acag/datasets/surface-pm2-5/)]. The estimates for this figure are described in the "Methods" section in the main text. The black dots represent the PM2.5 concentrations (micrograms per cubic meter). Each dot on the graph represents PWAC calculated by two alternative methods. The left panel shows the PWAC using average population density in all Russian municipalities (horizontal axis) and the PWAC computed with population density allocated by building footprint within each of the municipalities (vertical axis). In the right panel, the horizontal axis is the PWAC calculated using the average population density in each region, and the vertical axis shows the PWAC using the average population density of each municipality belonging to that region. The diagonal reference line runs from the origin at a 45° angle. If a given dot is on the diagonal line, then both methods provide the same estimate. If a black dot is above the diagonal line, then the more granular method provides a more precise estimation of the PWAC and therefore a higher overall precision for the exposure calculations. Virtually all of the black dots are on or above the diagonal line and many are substantially higher, showing that increased granularity leads to higher PM2.5 concentration estimates. Note: PM, particulate matter; PM2.5, particulate matter with aerodynamic diameter 2.5 μm or less; PWAC, population-weighted average PM2.5 concentrations.
Use of more localized, fine-grained population data made a difference in estimates of air pollution health risk in Russia in comparison with the GBD estimates, while using the same RR functions. Russia is a country with a large proportion of its population living in relatively high urban densities. In the Khabarovsk Region, for example, the two largest cities hold 65% of the region's population but occupy only 0.09% of its area. When estimated by region, the population-weighted concentration of PM2.5 was 6.9 μg/m³, but when estimated by municipalities it was 11.6 μg/m³. Another example is the Alagir subregion of North Ossetia, an area of more than 2,000 km². More than half of its population lives in the city of Alagir, an area of only 26.7 km². The calculated population-weighted concentration in the Alagir subregion is 12.1 μg/m³, but when calculated using the population distributed by buildings, the population-weighted PM2.5 concentration rises to 18 μg/m³. The use of the more highly spatially resolved concentration data from the WUSTL database also increased estimates of risk, but the effect was smaller than that of more granular population data. This was an unanticipated finding from this study.
The remarkable aspect of the GBD is its comparability across the countries of the world. Nonetheless, there may be ways to improve the accuracy of its assessments. This example for Russia suggests that improved population mapping, based on readily available satellite data on settlements and buildings, is a potential approach to updating the GBD while maintaining a globally uniform method to estimate health risk from air pollution. The methodology for increased precision of the exposure assessment could be replicated in other countries and could result in more accurate estimates of the health risks associated with air pollution.
As more of the world's population lives in cities, improving air pollution exposure assessment is important for improving estimates of air pollution-attributable mortality and for air pollution regulation itself. These results make the case that more stringent management of PM2.5 concentrations is required to reduce population-level risk. The higher PM2.5 spatial resolution could also be used within cities to identify hotspots with higher-than-average concentrations. The contribution of this methodology is to show the potential for more accurate representation of population distribution by using readily available data from national statistics together with building information available from open sources. These methods are just a first step and can clearly be improved, but even this proposed approximation can benefit air pollution health risk analyses. Mapping populations by buildings is just a start: populations are clearly not distributed across buildings by footprint alone, and this approach did not account for building height or use. Accounting for these issues could lead to further improvements in estimates of population exposure. Nonetheless, for Russia, we showed an approach to more precise population mapping that led to an approximately 17% higher estimate of population-weighted PM2.5 concentrations in comparison with the GBD.