The Impact of Exposure Measurement Error on the Estimated Concentration–Response Relationship between Long-Term Exposure to PM2.5 and Mortality
Publication: Environmental Health Perspectives
Volume 130, Issue 7
CID: 077006
Abstract
Background:
Exposure measurement error is a central concern in air pollution epidemiology. Given that studies have been using ambient air pollution predictions as proxy exposure measures, the potential impact of exposure error on health effect estimates needs to be comprehensively assessed.
Objectives:
We aimed to generate wide-ranging scenarios to assess the direction and magnitude of bias caused by exposure errors under plausible concentration–response relationships between annual exposure to fine particulate matter [PM ≤2.5 μm in aerodynamic diameter (PM2.5)] and all-cause mortality.
Methods:
In this simulation study, we use daily PM2.5 predictions at 1 km × 1 km spatial resolution to estimate annual PM2.5 exposures and their uncertainties for ZIP Codes of residence across the contiguous United States between 2000 and 2016. We consider scenarios in which we vary the error type (classical or Berkson) and the true concentration–response relationship between PM2.5 exposure and mortality (linear, quadratic, or soft-threshold, i.e., a smooth approximation to the hard-threshold model). In each scenario, we generate numbers of deaths using error-free exposures and confounders of concurrent air pollutants and neighborhood-level covariates and perform epidemiological analyses using error-prone exposures under correct specification or misspecification of the concentration–response relationship between PM2.5 exposure and mortality, adjusting for the confounders.
Results:
We simulate 1,000 replicates of each of 162 scenarios investigated. In general, both classical and Berkson errors can bias the concentration–response curve toward the null. The biases remain small even when using three times the predicted uncertainty to generate errors and are relatively larger at higher exposure levels.
Discussion:
Our findings suggest that the causal determination for long-term PM2.5 exposure and mortality is unlikely to be undermined when using high-resolution ambient predictions given that the estimated effect is generally smaller than the truth. The small magnitude of bias suggests that epidemiological findings are relatively robust against the exposure error. In practice, the use of ambient predictions with a finer spatial resolution will result in smaller bias. https://doi.org/10.1289/EHP10389
Introduction
Exposure measurement error is a central concern in air pollution epidemiology.1 The problem of exposure measurement error arises from using mismeasured exposure in the analysis given that error in exposure assessment is unavoidable. For decades, researchers have studied and taken steps to control the consequences of exposure error through careful research design and data collection, by evaluating the sensitivity of health effect estimates with respect to some degree of exposure error,2–4 and by adjusting for exposure error with any number of methods, including regression calibration,5 Bayesian approaches,6 and simulation extrapolation estimation (SIMEX).7,8 In the context of air pollution, ideally, monitoring an individual’s personal exposure (e.g., the amount of air pollution inhaled into the lungs) would minimize the exposure error and provide the most accurate measurement of exposure. In practice, however, it is usually not feasible to monitor personal exposure for a long period of time, especially for a large population.9
In the past, researchers have used air pollutant concentrations measured at monitoring stations, primarily located in urban areas, as population-averaged exposure measurements.10 Recently, significant advances in satellite remote sensing technology have enabled exposure assessment at a finer spatial resolution, and many spatiotemporal prediction models (e.g., hierarchical models, fusion models, machine learning models) have been developed to obtain high-resolution predictions of ambient pollutant concentrations.11–13 Epidemiological studies then have relied on these spatiotemporal predictions at an individual’s residence as surrogate exposures.14,15 The use of spatiotemporal predictions, although more precise, is still subject to measurement error regarding the extent to which the surrogate is representative of the personal exposure.9
Zeger et al.16 divided the difference between the spatiotemporal ambient air pollution prediction and the error-free personal exposure into three components: the differences between personal and population-averaged exposures, between population-averaged exposure and true ambient level, and between true ambient level and the predicted level. This partition suggests the existence of both classical and Berkson errors. Classical error is assumed to be independent of the true exposure, which occurs when the error-prone exposure is randomly distributed around the truth. Classical error arises when the predicted level is accurate but not precise to the true exposure.17 Conversely, Berkson error is assumed to be independent of the error-prone exposure and occurs when the true exposure is randomly scattered around the error-prone exposure. Berkson error may result from the difference between individual and population-averaged exposures (i.e., the individual exposure may randomly vary around the population-averaged exposure) and also from the spatial smoothing in the modeling of ambient predictions.18 In addition to classical and Berkson errors, which are random by themselves, systematic error may also occur owing to the spatial correlation between neighboring areas, particularly for studies that have ambient predictions with coarse spatial resolution.19 Previous studies have proposed several methodologies to correct for spatially correlated errors, including asymptotic approximation or parametric bootstrap for optimizing exposure model parameters,20,21 but the performance has been mixed. 
One challenging part of evaluation and correction for spatially correlated errors is to separate the systematic and random components, whose structures are different from each other.22 In a measurement error setting, it is important to characterize both random and systematic errors, especially the type and degree of errors, as well as their spatial correlations, to comprehensively address measurement error problems.
In reality, the specific impact of exposure error on epidemiological findings varies with the error type, and it is therefore informative to discuss each of them in turn.16 Classical error, in which the measured exposure is expected to have more variation than the truth, tends to bias the effect estimate toward the null, whereas under Berkson error, the measurement is less variable than the truth, which results in an unbiased effect estimate but greater variance, provided that the true relationship between exposure and outcome is linear.19 In the presence of confounders, the direction of bias becomes unclear in principle and depends on the correlation of exposure with the confounders.19 Often, such findings are observed under the condition that the concentration–response relationship is correctly specified in epidemiological analysis: for example, investigators simulate the study outcome based on the error-free exposure and a prespecified concentration–response relationship (mostly linear) and then assume the true relationship when estimating the health effect of error-prone exposure.23,24 However, investigators may fail to specify the true concentration–response relationship, and the practice of assuming a linear relationship restricts discussion of the impact of exposure error to only that case.
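The linear-case contrast between the two error types can be made concrete with a standard textbook derivation. This is only a sketch under simplifying assumptions that are ours rather than the study's: a single exposure, no confounders, an identity-link outcome model $Y = \beta_0 + \beta_1 X + \varepsilon$, and a mean-zero error $U$ with variance $\sigma_U^2$. The symbols $\beta_0$, $\beta_1$, $\sigma_X^2$, $\sigma_W^2$, and $\kappa$ are illustrative.

```latex
% Classical error: W = X + U with U independent of X, so Var(W) = sigma_X^2 + sigma_U^2.
% Berkson error:   X = W + U with U independent of W, so Cov(X, W) = sigma_W^2.
\begin{align*}
\hat{\beta}_1^{\text{classical}}
  \;\xrightarrow{p}\; \frac{\operatorname{Cov}(Y, W)}{\operatorname{Var}(W)}
  &= \frac{\beta_1 \sigma_X^2}{\sigma_X^2 + \sigma_U^2}
   = \kappa\,\beta_1, \qquad 0 < \kappa \le 1, \\
\hat{\beta}_1^{\text{Berkson}}
  \;\xrightarrow{p}\; \frac{\operatorname{Cov}(Y, W)}{\operatorname{Var}(W)}
  &= \frac{\beta_1 \sigma_W^2}{\sigma_W^2}
   = \beta_1 .
\end{align*}
```

For example, with $\sigma_X^2 = 9$ and $\sigma_U^2 = 1$, the classical-error slope is attenuated by $\kappa = 0.9$, a 10% bias toward the null. Under Berkson error the slope is unchanged, but the residual becomes $\beta_1 U + \varepsilon$, which is why the estimate is noisier. Neither result is guaranteed once confounders or a nonlinear relationship enter the model.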
In this work, we focus on long-term exposure to fine particulate matter [PM ≤2.5 μm in aerodynamic diameter (PM2.5)] because it is one of the most deadly air pollutants based on current research findings.25,26 We conduct an extensive simulation study to assess the impact of exposure error on the estimated concentration–response relationship between long-term exposure to PM2.5 and all-cause mortality across the contiguous United States using high-resolution ambient PM2.5 predictions as proxy exposure measures. We incorporate a variety of confounders to ensure our simulations are close to reality. The true exposure level is defined as the ambient PM2.5 concentration at an individual’s ZIP Code of residence. The goal is to generate various plausible scenarios of classical and Berkson errors to assess the possible direction and magnitude of bias in the estimated concentration–response relationship between long-term PM2.5 exposure and mortality.
Methods
General Notation and Simulation Overview
We use the following notation throughout the paper: $X$ and $W$, respectively, denote the error-free and error-prone PM2.5 exposures; $D$ denotes the difference between predicted and monitored PM2.5 levels; $\sigma$ denotes the exposure uncertainty estimated from $D$; $U$ denotes the generated exposure error, either classical or Berkson; $Y$ denotes the generated all-cause mortality outcome; $f(X)$ denotes a function of $X$ that specifies the assumed true concentration–response relationship between long-term PM2.5 exposure and mortality; $\mathbf{C}$ denotes the vector of error-free covariates that confounds the association between long-term PM2.5 exposure and mortality; $\boldsymbol{\gamma}$ denotes the vector of coefficients of $\mathbf{C}$ and is obtained from prior literature.26 Let $z$ index ZIP Codes, $g$ index grid cells, $t$ index years, and $d$ index days, which are nested within years. Therefore, for example, $D_{g,t,d}$ is the difference between predicted and monitored PM2.5 at grid cell $g$ in year $t$ on day $d$.
The general workflow of the simulation process is presented in Figure 1. Specifically, we use daily ambient PM2.5 predictions at grid cells to estimate annual PM2.5 exposures and their uncertainties for ZIP Codes of residence across the contiguous United States between 2000 and 2016. For each ZIP Code in each year, we generate exposure errors of different magnitudes accommodating the spatial correlations among neighboring errors. We consider scenarios in which we vary the type of exposure error and the true concentration–response relationship between PM2.5 exposure and mortality. In each scenario, we generate the numbers of deaths using error-free exposures and confounders of concurrent air pollutants and neighborhood-level covariates and perform epidemiological analyses using error-prone exposures under correct specification or misspecification of the concentration–response relationship between PM2.5 and mortality, adjusting for the confounders. We simulate 1,000 replicates of each scenario.
Annual PM2.5 Prediction
We fit an ensemble model to integrate cross-validated predictions of three machine learning algorithms (neural network, random forest, and gradient boosting) to obtain daily ambient PM2.5 levels at grid cells across the contiguous United States.27,28 As predictors we include ground monitoring data, satellite-derived measurements of aerosol optical depth, meteorological conditions (i.e., air temperature, relative humidity, wind speed, and height of planetary boundary layer), chemical transport model simulations, and land-use terms. The ensemble model demonstrates good predictive performance with a 10-fold cross-validated $R^2$ of 0.89 for annual predictions. The slope of regression splines comparing the predicted and monitored PM2.5 at annual level is almost 1 and the intercept is very close to 0, indicating little bias between the predicted and monitored levels.
These high-resolution predictions allow us to estimate ZIP Code-level PM2.5 with a high degree of accuracy. For a general ZIP Code, which has a normal street delivery route and therefore can be represented by a polygonal area, we estimate the ZIP Code-level PM2.5 by averaging the predictions of grid cells whose centroids lie inside the polygon of that ZIP Code; for the other ZIP Codes that do not have polygon representations (for example, an apartment building, a military base, or a post office), we consider each of them as a single point and estimate their ZIP Code-level PM2.5 by assigning the prediction of the nearest grid cell.25
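The aggregation rule described above can be sketched in a few lines. This is a minimal illustration, not the study's code: it assumes planar (projected) coordinates rather than latitude and longitude, uses a plain ray-casting point-in-polygon test instead of a GIS library, and the function names `point_in_polygon` and `zip_level_pm25`, the toy grid, and the fallback for polygons that contain no centroid are all hypothetical.

```python
import numpy as np

def point_in_polygon(px, py, polygon):
    """Ray-casting test: True if (px, py) lies inside `polygon` (list of (x, y))."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def zip_level_pm25(grid_xy, grid_pred, polygon=None, point=None):
    """Polygon ZIP Code: mean of cells whose centroids fall inside the polygon.
    Point ZIP Code: prediction of the nearest grid cell."""
    grid_xy = np.asarray(grid_xy, dtype=float)
    grid_pred = np.asarray(grid_pred, dtype=float)
    if polygon is None and point is None:
        raise ValueError("provide either a polygon or a point")
    if polygon is not None:
        mask = np.array([point_in_polygon(x, y, polygon) for x, y in grid_xy])
        if mask.any():
            return float(grid_pred[mask].mean())
        # Assumed fallback (ours): no centroid inside, so treat the ZIP Code as a point.
        point = np.mean(np.asarray(polygon, dtype=float), axis=0)
    dist2 = ((grid_xy - np.asarray(point, dtype=float)) ** 2).sum(axis=1)
    return float(grid_pred[np.argmin(dist2)])

# Toy grid of five cell centroids with hypothetical annual predictions.
grid_xy = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (2.5, 0.5)]
grid_pred = [8.0, 10.0, 12.0, 14.0, 20.0]
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

polygon_zip = zip_level_pm25(grid_xy, grid_pred, polygon=square)      # mean of 4 cells
point_zip = zip_level_pm25(grid_xy, grid_pred, point=(2.4, 0.6))      # nearest cell
```

In this toy case the polygon contains the first four centroids, so its value is their mean, while the point ZIP Code simply inherits the prediction of the closest cell.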
Exposure Uncertainty Estimation
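Let $D_{g,t,d}$ denote the daily difference between predicted and monitored PM2.5 levels for grid cell $g$ in year $t$ on day $d$. The uncertainty of annual PM2.5 predictions for grid cell $g$ in year $t$ is defined as the standard deviation of the annual average of $D_{g,t,d}$, denoted by $\sigma_{g,t}$. For each grid cell in each year, we estimate temporal correlations in $D_{g,t,d}$ and find that the $D_{g,t,d}$ are temporally correlated, with the lag-1 serial correlation being 0.2 and decaying toward 0 as the lag increases (Figure S1). Therefore, we sample $D_{g,t,d}$ every other day and treat the sampled values as independent and identically distributed within year $t$. The $\sigma_{g,t}$ is then estimated by
$$\hat{\sigma}_{g,t} = \sqrt{\frac{1}{n_{g,t}\,(n_{g,t}-1)} \sum_{D_{g,t,d} \in S_{g,t}} \left(D_{g,t,d} - \bar{D}_{g,t}\right)^{2}},$$
where $S_{g,t}$ is the set of sampled $D_{g,t,d}$ for grid cell $g$ in year $t$, $n_{g,t}$ is the number of sampled $D_{g,t,d}$, and $\bar{D}_{g,t}$ is their mean.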
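The thinning-and-averaging step can be illustrated with a short sketch. It assumes that the standard deviation of the annual average is estimated by the usual standard error of the mean of the thinned series; the function name `annual_uncertainty`, the `step` argument, and the toy data are hypothetical and only serve the illustration.

```python
import numpy as np

def annual_uncertainty(daily_diff, step=2):
    """Standard deviation of the annual mean of daily (predicted - monitored)
    differences, using every `step`-th day and treating those as i.i.d."""
    d = np.asarray(daily_diff, dtype=float)
    sampled = d[::step]                       # keep every other day by default
    sampled = sampled[~np.isnan(sampled)]     # drop sampled days with no monitor value
    n = sampled.size
    if n < 2:
        return float("nan")                   # not enough data for a sample SD
    # Sample SD (ddof=1) divided by sqrt(n) is the standard error of the mean.
    return float(sampled.std(ddof=1) / np.sqrt(n))

# Toy series: days 0, 2, 4, 6 are kept, i.e. the values 1, 3, 5, 7.
toy_diff = [1.0, 9.0, 3.0, 9.0, 5.0, 9.0, 7.0]
toy_sigma = annual_uncertainty(toy_diff)
```

Thinning trades sample size for weaker serial dependence; with a lag-1 correlation near 0.2, the every-other-day values are much closer to independent than the full daily series.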
We start by quantifying the uncertainty of annual PM2.5 predictions for grid cells that have monitoring sites. For grid cells that have more than one monitor, we take the average of the monitors as the PM2.5 level for that grid cell. We construct random forests by regressing the uncertainty against predictors, including location (latitude and longitude of the centroid), elevation, percentage urban land-use area, tree canopy, relative humidity, normalized difference vegetation index, annual PM2.5 prediction, and year for grid cells with monitors and then predict the uncertainty for those without monitors using the constructed random forests. The random forests’ variable selection and tuning of parameters are performed by cross-validation among monitored grid cells.
For a general ZIP Code in year , we use formula to estimate its uncertainty, where is the set of grids whose centroids are inside the polygon area of ZIP Code and is the number of grids in set . For the other ZIP Codes, we estimate their uncertainty by linking them to the nearest grid cells.
Generation of Exposure Measurement Error
For ZIP Code and year , we generate the independent, annual exposure measurement error univariately from a normal distribution , where , , or . The value of characterizes the magnitude of error, with a higher indicating larger error. We then create spatially correlated exposure error for ZIP Code and year by integrating the surrounding errors within a radius (Figure S2),where is the independent error for ZIP Code and year ; , , and are averaged errors for ZIP Code-level whose centroids lie respectively within the buffer zones of 10, 10–20, and around ZIP Code ; , , , and are regression coefficients obtained by regressing the exposure error for annual monitored against those within the buffer zones of 10, 10–20, and . Values of , , , and are provided in Table S1.
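One way to picture this construction is sketched below. It is an assumption-laden illustration rather than the study's procedure: the linear form (own independent error plus coefficient-weighted ring averages), the outer ring bounds, the coefficient values, the use of kilometers, and the names `spatially_correlated_errors`, `coefs`, and `rings` are all hypothetical. The actual coefficients are those reported in Table S1.

```python
import numpy as np

def spatially_correlated_errors(xy_km, indep_err,
                                coefs=(1.0, 0.5, 0.2, 0.1),
                                rings=((0.0, 10.0), (10.0, 20.0), (20.0, 30.0))):
    """Combine each location's independent error with the average independent
    error of its neighbors in successive distance rings (hypothetical form)."""
    xy = np.asarray(xy_km, dtype=float)
    e = np.asarray(indep_err, dtype=float)
    # Pairwise Euclidean distances between centroids, shape (n, n).
    dist = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # a location is not its own neighbor
    out = coefs[0] * e
    for a, (lo, hi) in zip(coefs[1:], rings):
        in_ring = (dist >= lo) & (dist < hi)
        counts = in_ring.sum(axis=1)
        sums = in_ring.astype(float) @ e
        # Ring average; locations with no neighbor in the ring contribute 0.
        ring_mean = np.divide(sums, counts, out=np.zeros_like(e), where=counts > 0)
        out = out + a * ring_mean
    return out

# Independent errors scaled by a multiple k of a hypothetical uncertainty sigma.
rng = np.random.default_rng(0)
sigma = np.full(3, 0.4)
k = 3
indep = rng.normal(0.0, k * sigma)

# Deterministic toy case: two neighbors 5 km apart and one isolated location.
toy_xy = [(0.0, 0.0), (5.0, 0.0), (100.0, 0.0)]
toy_u = spatially_correlated_errors(toy_xy, [1.0, 2.0, 3.0])
```

In the toy case the two nearby locations borrow half of each other's error, while the isolated location keeps its own error unchanged.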
Error-Free and Error-Prone Exposures
We assume an additive rather than multiplicative exposure error structure because a) the uncertainty of annual PM2.5 predictions is relatively homogeneous and normally distributed across the contiguous United States and over time, and b) the confidence interval for the regression spline comparing the predicted and monitored PM2.5 is extremely narrow throughout the range of PM2.5 levels, indicating that the uncertainty of annual PM2.5 predictions does not spread proportionally with the mean.27 Assuming that the true exposure is the ambient PM2.5 level at an individual’s ZIP Code of residence, under the additive structure, we obtain the error-free and error-prone exposures for classical and Berkson errors based on their definitions.19
Classical error.
For classical error, we assume that the predicted PM2.5 for ZIP Code $z$ and year $t$ is the error-free exposure, denoted by $X_{z,t}$, and we obtain the error-prone exposure $W_{z,t}$ simply by adding the spatially correlated error $U_{z,t}$ to $X_{z,t}$:
$$W_{z,t} = X_{z,t} + U_{z,t}.$$
Berkson error.
Conversely, for Berkson error, we assume that the predicted PM2.5 for ZIP Code $z$ and year $t$ is error-prone, denoted by $W_{z,t}$, and we obtain the error-free exposure $X_{z,t}$ by adding $U_{z,t}$ to $W_{z,t}$:
$$X_{z,t} = W_{z,t} + U_{z,t}.$$
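The two constructions differ only in which quantity the prediction is taken to represent, which a small sketch makes visible. The distributions and their parameters below are hypothetical stand-ins (independent normal draws rather than the study's predictions and spatially correlated errors), chosen so that the variance ratio of error-prone to error-free exposure is easy to read.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
pred = rng.normal(10.0, 3.0, size=n)   # stand-in for predicted annual PM2.5
u = rng.normal(0.0, 1.0, size=n)       # stand-in for the generated error

# Classical: the prediction is the truth; the error-prone exposure adds noise to it.
x_classical = pred
w_classical = x_classical + u

# Berkson: the prediction is error-prone; the truth scatters around it.
w_berkson = pred
x_berkson = w_berkson + u

# Variance ratio of error-prone to error-free exposure in each setting.
ratio_classical = w_classical.var() / x_classical.var()   # expected near 10/9
ratio_berkson = w_berkson.var() / x_berkson.var()         # expected near 9/10
```

Under classical error the error-prone exposure is more variable than the truth (ratio above 1), and under Berkson error it is less variable (ratio below 1), matching the definitions used in the text.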
Covariates
To make our simulations closer to the reality in which the association is confounded by other factors, we incorporate various potential confounders based on existing literature.14,26 Concurrent air pollution exposures include warm-season ozone (O3) and annual nitrogen dioxide (NO2) at ZIP Codes of residence.12,29–31 As with PM2.5, the ambient O3 and NO2 levels are predicted with a spatial resolution of 1 km × 1 km and are aggregated to the ZIP Code level across the contiguous United States. Meteorological factors include annual surface air temperature and specific humidity, which are obtained from the National Aeronautics and Space Administration North American Land Data Assimilation System phase 2 project.32 Annual socioeconomic status factors at ZIP Code Tabulation Areas (ZCTAs), including poverty rate, high school graduation rate, median household income, homeownership rate, median value of owner-occupied housing, population density, percentage Blacks, and percentage Hispanics, are linearly interpolated between the 2000 and 2010 U.S. Censuses and are extracted from the American Community Survey for years after 2010.33–35 Annual behavioral variables at ZCTAs, including percentage ever smokers, average body mass index, and lung cancer rate, are obtained from the Behavioral Risk Factor Surveillance System.36 Annual health care use variables at ZCTAs, including rates of having an annual eye exam, an annual low-density lipoprotein cholesterol test, and an annual mammography exam, as well as distance to the nearest hospital, are obtained from the Dartmouth Health Atlas.37 We also include calendar year and regions of the United States (Northeast, Southwest, West, Southeast, and Midwest) as factors. These covariates are used to generate the mortality outcome and are then adjusted for in the epidemiological analysis.
Generation of All-Cause Mortality Outcome
We extract all the ZIP Codes of residence in the contiguous United States between 2000 and 2016 from ArcGIS Data and Maps (formerly Esri Data & Maps)38 and link error-free and error-prone exposures and covariates to each ZIP Code in each year. In each replicate, we construct a data set where each record represents a unique ZIP Code–year combination. For ZIP Code $z$ and year $t$, the number of deaths $Y_{z,t}$ is generated assuming that it follows a Poisson distribution with mean $\mu_{z,t}$,
$$Y_{z,t} \sim \mathrm{Poisson}(\mu_{z,t}), \qquad \mu_{z,t} = \lambda_{z,t}\,P_{z,t}, \qquad \log \lambda_{z,t} = f(X_{z,t}) + \mathbf{C}_{z,t}^{\top}\boldsymbol{\gamma}.$$
Here, $\lambda_{z,t}$ is the mortality rate, which is determined by the error-free exposure and covariates; $P_{z,t}$ is the population, which is set to be 1 for simplicity; $X_{z,t}$ is the error-free exposure; $f(X_{z,t})$ is a function of $X_{z,t}$, which specifies the true concentration–response relationship between PM2.5 and mortality; $\mathbf{C}_{z,t}^{\top}$ is the transpose of the covariate vector; and $\boldsymbol{\gamma}$ is the coefficient vector obtained from prior literature,26 with values provided in Table S2.
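The generate-then-refit logic can be sketched end to end for the linear case. Everything below is a simplified illustration under our own assumptions: independent (not spatially correlated) normal errors, a single hypothetical confounder with coefficient `gamma`, illustrative values for `beta0` and `beta1`, and a small Newton-Raphson Poisson fit written with NumPy only (the helper `fit_poisson` is hypothetical, not the authors' R code).

```python
import numpy as np

def fit_poisson(design, y, n_iter=50, tol=1e-10):
    """Poisson log-linear regression by Newton-Raphson.
    Column 0 of `design` must be the intercept column of ones."""
    X = np.asarray(design, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                 # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)                  # score vector
        hess = X.T @ (X * mu[:, None])         # Fisher information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(2022)
n = 200_000
beta0, beta1, gamma = np.log(12.0), 0.019, 0.1   # illustrative values only
pred = rng.normal(10.0, 3.0, size=n)             # stand-in for predicted PM2.5
conf = rng.normal(0.0, 1.0, size=n)              # one hypothetical confounder
err = rng.normal(0.0, 1.5, size=n)               # independent exposure error

def simulate_and_fit(x_true, w_obs):
    """Generate deaths from the error-free exposure, fit with the error-prone one."""
    rate = np.exp(beta0 + beta1 * x_true + gamma * conf)
    deaths = rng.poisson(1.0 * rate)             # population fixed at 1
    design = np.column_stack([np.ones(n), w_obs, conf])
    return fit_poisson(design, deaths)[1]        # coefficient of the exposure

est_classical = simulate_and_fit(x_true=pred, w_obs=pred + err)
est_berkson = simulate_and_fit(x_true=pred + err, w_obs=pred)

# Relative bias in percent, (estimate - truth) / truth * 100.
rel_bias_classical = 100.0 * (est_classical - beta1) / beta1
rel_bias_berkson = 100.0 * (est_berkson - beta1) / beta1
```

The error variance here is deliberately large relative to the exposure variance so that the attenuation under classical error is easy to see; it is far larger than the errors generated in the study, where the variance ratio of error-prone to error-free exposure stays between 0.97 and 1.03.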
For the choice of $f(X)$, we consider linear, quadratic, and soft-threshold models (the last as a smooth approximation to the hard-threshold model, in which exposures below the threshold value are safe, i.e., have no effect), three theoretical concentration–response models that have been seen in previous epidemiological studies or have been widely applied in toxicology.14,39 For parameters in $f(X)$, we consider all the combinations of values obtained from the prior literature or based on plausible concentration–response shapes. Detailed specifications are presented in Table 1. To fit within the space constraints, we present results of three representative specifications for each type of concentration–response model. Complete simulation results are available at https://doi.org/10.7910/DVN/QVNKPP.
Model | Specification, | Parameter values | ||
---|---|---|---|---|
Linear | 0, 0.005, 0.012, 0.019 | — | ||
Quadratic | 0.005, 0.012, 0.019 | , , , 0.0001, 0.0002, 0.0003 | ||
Soft-threshold | 0.2, 0.3, 0.4 | 22, 23, 24, 25, 26 | — |
Note: —, not applicable; $X_{z,t}$, error-free exposure for ZIP Code $z$ in year $t$.
Epidemiological Analysis
We investigate a total of 162 scenarios characterized by all the combinations of error type (classical or Berkson) and $f(X)$ [81 plausible concentration–response relationships specified by $f(X)$] and simulate 1,000 data sets under each scenario. To each data set, we fit a series of Poisson regressions to estimate the concentration–response relationship between mortality and error-prone exposure, with the error at different magnitudes ($\sigma$, $2\sigma$, or $3\sigma$). The error-prone exposure in the Poisson regression is modeled as a linear term, a quadratic polynomial, or a penalized cubic spline with up to 9 degrees of freedom,40 depending upon the shape of $f(X)$. In particular, for the linear concentration–response relationship, we model the error-prone exposure as linear to assess the magnitude of bias by comparing the averaged coefficients of 1,000 replicates with the parameter in $f(X)$. For quadratic and soft-threshold relationships, we model error-prone exposure with a quadratic polynomial or a penalized cubic spline and estimate the concentration–response curve by averaging 1,000 replicated responses at each exposure level to contrast with the shape of $f(X)$.
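The scenario count can be cross-checked against the parameter values listed in Table 1. The enumeration below is only one reading that reproduces the stated total, and it rests on assumptions that are ours: the three baseline values are taken from the rows of Table 2, the three quadratic-term values that are blank in Table 1 are assumed to be the negatives of the three that survive, and the soft-threshold model is assumed not to vary the baseline.

```python
from itertools import product
import math

# Baseline values as they appear in the rows of Table 2.
baselines = [math.log(8), math.log(12), math.log(20)]

# Linear model: baseline x slope.
linear = list(product(baselines, [0, 0.005, 0.012, 0.019]))

# Quadratic model: baseline x linear term x quadratic term.
# The three negative quadratic terms are an assumption (blank in Table 1).
quadratic = list(product(baselines,
                         [0.005, 0.012, 0.019],
                         [-0.0003, -0.0002, -0.0001, 0.0001, 0.0002, 0.0003]))

# Soft-threshold model: the two parameter lists given in Table 1,
# assuming the baseline is not varied.
soft_threshold = list(product([0.2, 0.3, 0.4], [22, 23, 24, 25, 26]))

n_relationships = len(linear) + len(quadratic) + len(soft_threshold)  # 12 + 54 + 15
n_scenarios = 2 * n_relationships   # two error types: classical and Berkson
```

Under these assumptions the grid yields 81 concentration–response relationships and, with two error types, the 162 scenarios stated in the text. Each scenario is then analyzed at every error magnitude.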
Sensitivity Analysis
Under the linear concentration–response model, we conduct a sensitivity analysis with respect to the range of spatial correlation among neighboring errors by adding an additional correlation term for a more distant buffer zone. Proceeding as described above, we also conduct a low-level analysis by restricting to ZIP Codes whose annual PM2.5 predictions never exceed the national standard41 during the study period, and a single-pollutant analysis by removing the concurrent air pollutants (warm-season O3 and NO2) from the generation of the mortality outcome and from the epidemiological analysis. Finally, we evaluate the biases caused by independent errors for all the scenarios.
R scripts are provided in “Rscripts.zip,” in the Supplemental Material, and are also available at https://github.com/yycome/EME_PM25_Mortality. Interested parties are encouraged to apply our codes to conduct replication research or propel new research.
Results
We include a total of 39,401 ZIP Codes of residence without missing values in each variable, covering 95% of all the ZIP Codes used by the U.S. Postal Service in the contiguous United States.38 Among these, 3,871 (9.8%) have monitoring sites and 14,961 (38.0%) have annual predictions always during the study period. For general ZIP Codes that have polygon representations, the average distance from the ZIP Code centroid to boundary is ; the 5th and 95th percentiles of the distance from the centroid to boundary are 1.2 and , respectively. In each replicate, the data set consists of 649,910 observations (i.e., 649,910 unique ZIP Code–year combinations). The annual predictions over all the ZIP Codes from 2000 to 2016 range from 0.1 to and approximately follow a normal distribution with the mean of and standard deviation of (Figure 2). The spatially correlated errors have slightly greater variability than independent errors (Figure 3).
Table 2 presents the relative bias between the assumed true effect of error-free exposure and the estimated coefficient of error-prone exposure obtained from epidemiological analysis under three representative specifications of the linear concentration–response model. Simulation results are consistent for each error type and magnitude. Complete results are available at https://doi.org/10.7910/DVN/QVNKPP. Specifically, the biases caused by classical errors are all negative (i.e., toward the null) and become larger as the magnitude of error increases, whereas Berkson errors result in much smaller biases that are either positive or negative and do not exceed 0.12% in the positive direction. The size of Berkson error–induced biases does not tend to change as we increase the magnitude of errors. The results remain consistent when the spatial correlation among neighboring errors is extended to a larger distance (Table S3). The direction of classical error–induced biases remains negative in the low-level and single-pollutant analyses; compared with those in the main analysis, the size of the classical error–induced biases increases in the low-level analysis and decreases in the single-pollutant analysis (Table S4). Consistent with the main results, the Berkson error–induced biases are either positive or negative and remain much smaller than the classical error–induced biases in the low-level and single-pollutant analyses. For the scenario in which we assume that PM2.5 has no effect on mortality ($\beta_1 = 0$), the bias occurs in either the positive or negative direction and is so small as to be negligible (Tables S5 and S6). Judging by the width of the confidence intervals, the size of error does not affect the inference on the fitted effects ($\hat{\beta}_1$) (Tables S7 and S8). Varying the baseline mortality risk ($\beta_0$) does not have any impact on the size or direction of bias.
Classical (%) | Berkson (%) | Classical (%) | Berkson (%) | Classical (%) | Berkson (%) | ||
---|---|---|---|---|---|---|---|
Log(8) | 0.019 | 0.02 | 0.08 | 0.00 | |||
Log(12) | 0.012 | ||||||
Log(20) | 0.005 | 0.12 | 0.12 |
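Note: Relative bias $= (\bar{\hat{\beta}}_1 - \beta_1)/\beta_1 \times 100\%$, where $\bar{\hat{\beta}}_1$ is the averaged coefficient of 1,000 replicates; the magnitude of exposure error is determined by $\sigma_{z,t}$, $2\sigma_{z,t}$, or $3\sigma_{z,t}$, where $\sigma_{z,t}$ is the estimated exposure uncertainty for ZIP Code $z$ in year $t$.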
Figure 4 illustrates discrepancies between the true and fitted concentration–response curves under representative specifications of the quadratic concentration–response model, taking either a concave ($\beta_2 < 0$) or a convex ($\beta_2 > 0$) upward shape. Complete simulation results are available at https://doi.org/10.7910/DVN/QVNKPP. The fitted curves are obtained by averaging 1,000 replicated responses at each exposure level. Overall, biases are small in size and become relatively larger as the exposure level increases. When error-prone exposure is modeled with a quadratic polynomial in epidemiological analysis (i.e., the concentration–response relationship is correctly specified), the classical error introduces negative bias, whereas Berkson error does not induce any bias. When error-prone exposure is modeled with a penalized cubic spline (i.e., the concentration–response relationship is misspecified), both types of errors induce negative biases. In each model specification, classical error produces larger bias than Berkson error. Again, varying baseline mortality ($\beta_0$) does not affect the size or direction of bias.
Figure 5 compares the true and fitted concentration–response curves under representative specifications of the soft-threshold concentration–response model with different levels of curvature, where the error-prone exposure is modeled with a penalized cubic spline in epidemiological analysis (i.e., the concentration–response relationship is misspecified). Complete simulation results are available at https://doi.org/10.7910/DVN/QVNKPP. Both types of errors induce little bias over the PM2.5 range that contains more than 99.99% of observations; at extremely high exposure levels, both types of errors induce negative bias. Biases get relatively smaller as the curvature increases. In each model specification, classical error induces larger bias than Berkson error.
We also tested biases caused by independent errors without accommodating the spatial correlations among neighboring errors. The biases show the same patterns as for the spatially correlated errors under each scenario but are smaller in size (Tables S9 and S10, Figures S3 and S4).
Discussion
In this simulation study, we use ambient PM2.5 predictions as proxy exposure measurements at individual residence to evaluate the direction and magnitude of bias caused by exposure measurement error in various plausible concentration–response relationships between long-term exposure to PM2.5 and mortality. We find that both classical and Berkson errors can bias the concentration–response curve toward the null, suggesting that the causal determination for long-term PM2.5 exposure and mortality drawn from studies using high-resolution ambient predictions is unlikely to be undermined given that the estimated effect is generally smaller than the truth. Further, all the biases are small in size even when using three times the predicted uncertainty to generate exposure errors, suggesting that epidemiological findings are relatively robust against the exposure measurement error from high-resolution ambient predictions. In our settings, because all residents are assigned to an average exposure level in each ZIP Code each year, the Berkson error may be considered as the dominant error source. Therefore, in practice, even smaller bias may be expected. In the low-level analysis, the finding of a greater downward bias in models restricted to people never exposed above the national standard provides assurance that health effects below the standard are not being overestimated. Finally, from our finding that a larger error is associated with larger bias, we infer that the use of ambient predictions with a finer spatial resolution will result in smaller bias, and that previous studies that used less accurate predictions are more likely biased downward. Indeed, a recent meta-regression found that studies that have cruder exposure estimates tended to have lower estimated effect sizes.42
The observed small magnitude of biases is consistent with Gryparis et al.,22 who found relatively little bias when using spatiotemporal predictions in health effect estimation with an average exposure metric of 9 months. The underlying reason is that the long-term average concentrations are relatively stable, which is consistent with Pinto et al.43 and Wang et al.,44 who observed low levels of variability of annual PM2.5 from monitoring sites. Given the reasonably good performance of our prediction model, the predicted and monitored annual PM2.5 levels are highly consistent, which leads to small generated errors. As a result, we obtain slight differences between error-prone and error-free exposures, as well as between their variability, with a variance ratio of the error-prone and error-free exposures ranging from 0.97 to 1.03. By allowing the variance ratio to take values between 0.5 and 2, Butland et al.2 observed up to 65% of relative bias caused by exposure error. However, their settings were at the daily level and used ambient predictions with lower accuracy, which may not reflect annual exposures and the higher accuracy of modern spatiotemporal models.45,46
Despite the small magnitude of bias, our findings confirm previously published results suggesting the negative direction of bias caused by classical error, including those by Butland et al.,24 Samoli et al.,47 Kioumourtzoglou et al.,23 and Goldman et al.18 Assuming a linear concentration–response relationship between PM2.5 and mortality, we find either positive or negative bias caused by Berkson error, which is consistent with the results of Bateson et al.,5 although the magnitude was even smaller or trivial. Further, we simultaneously consider the two important issues in air pollution epidemiology: exposure measurement error and misspecification of concentration–response relationship between exposure and outcome.48 When the nonlinear concentration–response relationship is misspecified, we find that both classical and Berkson errors produce negative bias, given that the nonlinearity has been adequately captured by the penalized cubic spline. In particular, for the soft-threshold models, the fitted penalized splines using error-prone exposures detect the thresholds with good accuracy and no upward bias, suggesting that exposure errors are unlikely to impede detection of the thresholds, if present. In line with Richmond-Bryant and Long,19 our findings suggest that the specific impact of exposure error is complex and depends on multiple factors, such as error type and magnitude, the concentration–response relationship between exposure and outcome, and the degree of misspecification. Because nature is rarely so simple as assumed in a quadratic or soft-threshold relationship, further investigation is clearly warranted.
The spatial correlation of the exposure error is an important concern in driving the unreliability of health effect estimates.19 In our study, the advanced ensemble approach used to predict air pollution concentrations demonstrates high accuracy compared with the monitored data, and the lack of bias and high cross-validated $R^2$ suggest that the prediction model is not overfitted. In this scenario, the potential for very strong correlations among neighboring errors is likely to be low. To accommodate spatial correlation, we incorporated spatial correlation estimated from neighboring monitoring sites, resulting in slightly greater variability of exposure error and larger bias in the health effect estimates, which is consistent with Goldman et al.,49 who identified negative bias related to the spatial correlation. In epidemiological studies, although systematic error may exist when spatial correlation is not adequately adjusted for, the health effect estimates are underestimated and are thus conservative.
We avoid the temporal correlation of daily differences between predicted and monitored levels by sampling them every other day and treating them as independent. Given the low levels of serial correlation, this simplified sampling approach is at least approximately correct. By comparison, the root mean square error (RMSE) of daily predictions for each year reported by Di et al.,27 on which the simulations are based, ranges between 2.404 and 3.385, which is relatively consistent with the average standard deviation of our sampled daily differences of 2.101. This consistency supports the validity of our uncertainty estimates and, in turn, the reliability of the conclusions drawn from the simulations. The RMSEs are slightly larger than our estimate because they measure the unconditional (overall) uncertainty across grids, whereas our estimate measures the conditional uncertainty within each grid.
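The distinction between unconditional and conditional uncertainty can be illustrated with a small sketch. It is a toy example under stated assumptions, not the procedure used in this study: the grid-specific mean differences, the within-grid standard deviation, the number of grids and days, and the function name `pooled_vs_within` are all hypothetical.

```python
import numpy as np

def pooled_vs_within(n_grids=500, n_days=180, sd_within=2.0, sd_between=1.0, seed=1):
    """Compare a pooled RMSE with the average within-grid standard deviation."""
    rng = np.random.default_rng(seed)

    # Hypothetical grid-specific mean difference (prediction minus monitor).
    grid_mean = rng.normal(0.0, sd_between, size=(n_grids, 1))
    # Daily differences scatter around each grid's own mean.
    diff = grid_mean + rng.normal(0.0, sd_within, size=(n_grids, n_days))

    # Unconditional uncertainty: RMSE pooled over all grids and days.
    pooled_rmse = np.sqrt(np.mean(diff**2))
    # Conditional uncertainty: standard deviation within each grid, then averaged.
    mean_within_sd = np.mean(np.std(diff, axis=1, ddof=1))
    return pooled_rmse, mean_within_sd

pooled_rmse, mean_within_sd = pooled_vs_within()
print(f"pooled RMSE: {pooled_rmse:.3f}, mean within-grid SD: {mean_within_sd:.3f}")
```

Because the pooled RMSE absorbs both the within-grid scatter and the variation of grid-level means, it exceeds the average within-grid standard deviation whenever grids differ in their mean error, which is the pattern described above.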
This study includes neighborhood-level covariates in its simulations, including control for other air pollutants that may be correlated with PM2.5 and its error. It does not include any individual characteristics such as sex, race/ethnicity, age, or socioeconomic status because we believe they are unlikely to be confounders in the analysis. In principle, ambient air pollution in neighborhoods would not be influenced by individual-level covariates. Rather, it is neighborhoods with lower levels of household income, higher smoking and obesity rates, and so on, that may also be neighborhoods with higher air pollution levels.50 This is because, for example, if a person with high income were to move to a low-income neighborhood, they would receive an exposure level associated with characteristics at the neighborhood level and not at the individual level. Hence, only neighborhood-level covariates are the confounders, which we have controlled for. Similar arguments apply to personal behavioral stressors (such as the use of tobacco, alcohol, or drugs; exercising; or transportation): we think that air pollution is spatially patterned in neighborhoods with different neighborhood-level behavioral characteristics, not by whether individuals have higher or lower exposures.51 Because the exposures are air pollution levels in the neighborhoods, it is the neighborhood characteristics that are the confounders.
The large geographical area covered and comprehensive set of scenarios tested are the major strengths of this study. Although relatively new in air pollution epidemiology, the soft-threshold concentration–response model introduces more flexibility by assuming that even if there is a threshold for a health effect, its level will have some distribution, making the soft-threshold model closer to reality than the hard-threshold model.52 In addition, although this study has a cohort design for the long-term exposure, the Poisson regression we posited is widely used for time-series analysis to investigate short-term exposures23; hence our findings may be generalized to other settings, although the magnitude and type of error may differ. Further, our ambient predictions have been used for a large number of studies, and thus our results may be directly applicable to them.14,26,53 Finally, although this study is motivated by PM2.5, our proposed approach is applicable to other pollutants and other areas dealing with contextual data, such as neighborhood socioeconomic status, walkability, noise, or natural environments. However, the results may differ because of the differences in exposure errors.
Our simulation framework and the scenarios discussed make room for new and different questions to be raised and for further research. First, the impact of exposure measurement error is influenced by many other factors, such as the measurement error of confounders, and whether and how the errors of exposure and confounders are related.24 Such issues are very specific to the nature of confounders and may be considered in future studies. Second, because air pollution and PM2.5 itself are complex mixtures of various components, it will be informative to consider multipollutant scenarios to identify which ones most bias the health effect estimates. However, given that the components are highly positively correlated,54 it is challenging to characterize the error size and structure of each component and to separate their independent impacts. It is also important for subsequent analyses to address the problem of effect transfer from a pollutant measured with larger error to one measured with less error.55 Third, although we consider many different scenarios for the concentration–response relationship between exposure and mortality, we take only two scenarios for the error type: 100% classical or 100% Berkson. It would be interesting for future studies to explore the error decomposition and evaluate the impact of mixture error, which is generally accepted to be more realistic.56 Finally, assessing the impacts of exposure measurement error by region or urbanicity is an important unanswered question given that less monitored areas likely have larger errors. In this work, our findings are likely conservative, in that we have given equal weighting to those more error-prone, but less populous, locations.
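One way a future analysis might explore mixture error is sketched below. This is a hedged toy example and not part of this study: it assumes a simple additive formulation in which a latent exposure is perturbed by a Berkson component to give the true exposure and by a classical component to give the observed exposure, and it uses an ordinary linear model. The function name `mixture_slope`, the parameter `frac_classical`, and all numeric values are hypothetical.

```python
import numpy as np

def mixture_slope(frac_classical, n=200_000, beta=0.5, sd_latent=2.0,
                  sd_err=1.0, seed=2):
    """Fitted slope when a fraction of the error variance is classical."""
    rng = np.random.default_rng(seed)
    # Split a fixed total error variance between the two error types.
    var_classical = frac_classical * sd_err**2
    var_berkson = (1.0 - frac_classical) * sd_err**2

    latent = rng.normal(10.0, sd_latent, n)
    # Berkson component: the truth scatters around the latent value.
    x_true = latent + rng.normal(0.0, np.sqrt(var_berkson), n)
    # Classical component: the observation scatters around the latent value.
    w_obs = latent + rng.normal(0.0, np.sqrt(var_classical), n)

    y = beta * x_true + rng.normal(0.0, 1.0, n)
    return np.polyfit(w_obs, y, 1)[0]  # slope of y regressed on the observed exposure

slopes = {f: mixture_slope(f) for f in (0.0, 0.5, 1.0)}
for frac, slope in slopes.items():
    print(f"classical fraction {frac:.1f}: fitted slope {slope:.3f}")
```

Under these assumptions the fitted slope moves from approximately the true value (pure Berkson) toward the fully attenuated value (pure classical) as the classical share of the error variance grows, which suggests the two pure scenarios may bracket the mixture case in a linear model. Whether that holds for nonlinear concentration–response curves would need to be checked directly.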
Exposure measurement error–induced bias is pervasive but commonly ignored in air pollution epidemiology. In practice, because hybrid spatiotemporal predictions have been used more often in recent years, our findings offer an opportunity to interpret whether the bias might or might not plausibly account for the relations observed in epidemiological studies, in particular whether the bias is upward. When drawing conclusions about general causation and policy implications, we hope more epidemiological studies will include a quantitative estimate of the effect of measurement error on the results,57 along with bias correction techniques such as those of Alexeeff et al.8 and Hart et al.,58 to treat measurement error more seriously. Now that several high-resolution, well-validated prediction models of ambient pollutant concentrations are publicly accessible,28,30,31 we hope these data sets will be more widely used.
Conclusions
Our simulations investigate the direction and magnitude of bias caused by exposure measurement error in the estimated concentration–response relationship between long-term exposure to PM2.5 and all-cause mortality across the contiguous United States, under an extensive range of scenarios. Using high-resolution ambient predictions as proxy exposure measurements, our results demonstrate that both classical and Berkson errors can bias the concentration–response curve toward the null, suggesting that the causal determination for long-term PM2.5 exposure and mortality drawn from studies using high-resolution ambient predictions is unlikely to be undermined given that the estimated effect is generally smaller than the truth. Further, the small magnitude of biases suggests that epidemiological findings are relatively robust against exposure measurement error from high-resolution ambient predictions. Finally, from our finding that a larger error is associated with larger bias, we infer that the use of ambient predictions with a finer spatial resolution will result in smaller bias.
Acknowledgments
This publication was made possible by the U.S. Environmental Protection Agency (EPA) grants RD-8358720 and RD-83587201-0 (both to J.D.S.). Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the U.S. EPA. Further, the U.S. EPA does not endorse the purchase of any commercial products or services mentioned in the publication. This publication was also made possible by National Institutes of Health grants R01ES032418-01 (to J.D.S.), R01AG074357-01 (to L.S.), and ES-000002 (to J.D.S.).
The grid-level fine particulate matter (PM2.5) data are available at https://doi.org/10.7927/0rvr-4538. The grid-level ozone (O3) data are available at https://doi.org/10.7927/a4mb-4t86. The grid-level nitrogen dioxide (NO2) data are available at https://doi.org/10.7927/f8eh-5864. The ZIP Code-level PM2.5, O3, and NO2 data and uncertainty data are available from the corresponding author on reasonable request. Other covariate data are publicly available with sources described in the manuscript. For the availability of data on the predictors used in the prediction model, see Di et al.27
Article Notes
*These authors contributed equally to this work.
All authors declare they have no actual or potential competing financial interest.
References
1.
Armstrong BK, Saracci R, White E. 1992. Principles of Exposure Measurement in Epidemiology. New York, NY: Oxford University Press.
2.
Butland BK, Samoli E, Atkinson RW, Barratt B, Katsouyanni K. 2019. Measurement error in a multi-level analysis of air pollution and health: a simulation study. Environ Health 18(1):13. https://pubmed.ncbi.nlm.nih.gov/30764837/, https://doi.org/10.1186/s12940-018-0432-8.
3.
Girguis MS, Li L, Lurmann F, Wu J, Urman R, Rappaport E, et al. 2019. Exposure measurement error in air pollution studies: a framework for assessing shared, multiplicative measurement error in ensemble learning estimates of nitrogen oxides. Environ Int 125:97–106. https://pubmed.ncbi.nlm.nih.gov/30711654/, https://doi.org/10.1016/j.envint.2018.12.025.
4.
Keller JP, Peng RD. 2019. Error in estimating area-level air pollution exposures for epidemiology. Environmetrics 30(8):e2573, https://doi.org/10.1002/env.2573.
5.
Bateson TF, Wright JM. 2010. Regression calibration for classical exposure measurement error in environmental epidemiology studies using multiple local surrogate exposures. Am J Epidemiol 172(3):344–352. https://pubmed.ncbi.nlm.nih.gov/20573838/, https://doi.org/10.1093/aje/kwq123.
6.
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. 2006. Measurement Error in Nonlinear Models: A Modern Perspective. 2nd ed. Hoboken, NJ: Taylor & Francis.
7.
Stefanski LA, Cook JR. 1995. Simulation-extrapolation: the measurement error jackknife. J Am Stat Assoc 90(432):1247–1256, https://doi.org/10.1080/01621459.1995.10476629.
8.
Alexeeff SE, Carroll RJ, Coull B. 2016. Spatial measurement error and correction by spatial SIMEX in linear regression models when using predicted air pollution exposures. Biostatistics 17(2):377–389. https://pubmed.ncbi.nlm.nih.gov/26621845/, https://doi.org/10.1093/biostatistics/kxv048.
9.
Weisskopf MG, Webster TF. 2017. Trade-offs of personal versus more proxy exposure measures in environmental epidemiology. Epidemiology 28(5):635–643. https://pubmed.ncbi.nlm.nih.gov/28520644/, https://doi.org/10.1097/EDE.0000000000000686.
10.
Dominici F, Peng RD, Bell ML, Pham L, McDermott A, Zeger SL, et al. 2006. Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. JAMA 295(10):1127–1134. https://pubmed.ncbi.nlm.nih.gov/16522832/, https://doi.org/10.1001/jama.295.10.1127.
11.
Di Q, Kloog I, Koutrakis P, Lyapustin A, Wang Y, Schwartz J. 2016. Assessing PM2.5 exposures with high spatiotemporal resolution across the continental United States. Environ Sci Technol 50(9):4712–4721. https://pubmed.ncbi.nlm.nih.gov/27023334/, https://doi.org/10.1021/acs.est.5b06121.
12.
Requia WJ, Di Q, Silvern R, Kelly JT, Koutrakis P, Mickley LJ, et al. 2020. An ensemble learning approach for estimating high spatiotemporal resolution of ground-level ozone in the contiguous United States. Environ Sci Technol 54(18):11037–11047. https://pubmed.ncbi.nlm.nih.gov/32808786/, https://doi.org/10.1021/acs.est.0c01791.
13.
Li L, Lurmann F, Habre R, Urman R, Rappaport E, Ritz B, et al. 2017. Constrained mixed-effect models with ensemble learning for prediction of nitrogen oxides concentrations at high spatiotemporal resolution. Environ Sci Technol 51(17):9920–9929. https://pubmed.ncbi.nlm.nih.gov/28727456/, https://doi.org/10.1021/acs.est.7b01864.
14.
Di Q, Wang Y, Zanobetti A, Wang Y, Koutrakis P, Choirat C, et al. 2017. Air pollution and mortality in the Medicare population. N Engl J Med 376(26):2513–2522. https://pubmed.ncbi.nlm.nih.gov/28657878/, https://doi.org/10.1056/NEJMoa1702747.
15.
Wei Y, Wang Y, Di Q, Choirat C, Wang Y, Koutrakis P, et al. 2019. Short term exposure to fine particulate matter and hospital admission risks and costs in the Medicare population: time stratified, case crossover study. BMJ 367:l6258. https://pubmed.ncbi.nlm.nih.gov/31776122/, https://doi.org/10.1136/bmj.l6258.
16.
Zeger SL, Thomas D, Dominici F, Samet JM, Schwartz J, Dockery D, et al. 2000. Exposure measurement error in time-series studies of air pollution: concepts and consequences. Environ Health Perspect 108(5):419–426. https://pubmed.ncbi.nlm.nih.gov/10811568/, https://doi.org/10.1289/ehp.00108419.
17.
Szpiro AA, Paciorek CJ, Sheppard L. 2011. Does more accurate exposure prediction necessarily improve health effect estimates? Epidemiology 22(5):680–685. https://pubmed.ncbi.nlm.nih.gov/21716114/, https://doi.org/10.1097/EDE.0b013e3182254cc6.
18.
Goldman GT, Mulholland JA, Russell AG, Strickland MJ, Klein M, Waller LA, et al. 2011. Impact of exposure measurement error in air pollution epidemiology: effect of error type in time-series studies. Environ Health 10:61. https://pubmed.ncbi.nlm.nih.gov/21696612/, https://doi.org/10.1186/1476-069X-10-61.
19.
Richmond-Bryant J, Long TC. 2020. Influence of exposure measurement errors on results from epidemiologic studies of different designs. J Expo Sci Environ Epidemiol 30(3):420–429. https://pubmed.ncbi.nlm.nih.gov/31477780/, https://doi.org/10.1038/s41370-019-0164-z.
20.
Szpiro AA, Paciorek CJ. 2013. Measurement error in two-stage analyses, with application to air pollution epidemiology. Environmetrics 24(8):501–517. https://pubmed.ncbi.nlm.nih.gov/24764691/, https://doi.org/10.1002/env.2233.
21.
Szpiro AA, Sheppard L, Lumley T. 2011. Efficient measurement error correction with spatially misaligned data. Biostatistics 12(4):610–623. https://pubmed.ncbi.nlm.nih.gov/21252080/, https://doi.org/10.1093/biostatistics/kxq083.
22.
Gryparis A, Paciorek CJ, Zeka A, Schwartz J, Coull BA. 2009. Measurement error caused by spatial misalignment in environmental epidemiology. Biostatistics 10(2):258–274. https://pubmed.ncbi.nlm.nih.gov/18927119/, https://doi.org/10.1093/biostatistics/kxn033.
23.
Kioumourtzoglou MA, Spiegelman D, Szpiro AA, Sheppard L, Kaufman JD, Yanosky JD, et al. 2014. Exposure measurement error in PM2.5 health effects studies: a pooled analysis of eight personal exposure validation studies. Environ Health 13(1):2. https://pubmed.ncbi.nlm.nih.gov/24410940/, https://doi.org/10.1186/1476-069X-13-2.
24.
Butland BK, Samoli E, Atkinson RW, Barratt B, Beevers SD, Kitwiroon N, et al. 2020. Comparing the performance of air pollution models for nitrogen dioxide and ozone in the context of a multilevel epidemiological analysis. Environ Epidemiol 4(3):e093. https://pubmed.ncbi.nlm.nih.gov/32656488/, https://doi.org/10.1097/EE9.0000000000000093.
25.
Wei Y, Wang Y, Wu X, Di Q, Shi L, Koutrakis P, et al. 2020. Causal effects of air pollution on mortality in Massachusetts. Am J Epidemiol 189(11):1316–1323. https://pubmed.ncbi.nlm.nih.gov/32558888/, https://doi.org/10.1093/aje/kwaa098.
26.
Wei Y, Yazdi MD, Di Q, Requia WJ, Dominici F, Zanobetti A, et al. 2021. Emulating causal dose-response relations between air pollutants and mortality in the Medicare population. Environ Health 20(1):53. https://pubmed.ncbi.nlm.nih.gov/33957920/, https://doi.org/10.1186/s12940-021-00742-x.
27.
Di Q, Amini H, Shi L, Kloog I, Silvern R, Kelly J, et al. 2019. An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution. Environ Int 130:104909. https://pubmed.ncbi.nlm.nih.gov/31272018/, https://doi.org/10.1016/j.envint.2019.104909.
28.
Di Q, Wei Y, Shtein A, Hultquist C, Xing X, Amini H, et al. 2021. Daily and Annual PM2.5 Concentrations for the Contiguous United States, 1-km Grids, v1 (2000–2016). Palisades, NY: NASA Socioeconomic Data and Applications Center. https://doi.org/10.7927/0rvr-4538 [accessed 22 July 2022].
29.
Di Q, Amini H, Shi L, Kloog I, Silvern RF, Kelly JT, et al. 2020. Assessing NO2 concentration and model uncertainty with high spatiotemporal resolution across the contiguous United States using ensemble model averaging. Environ Sci Technol 54(3):1372–1384. https://pubmed.ncbi.nlm.nih.gov/31851499/, https://doi.org/10.1021/acs.est.9b03358.
30.
Requia WJ, Wei Y, Shtein A, Hultquist C, Xing X, Di Q, et al. 2021. Daily 8-Hour Maximum and Annual O3 Concentrations for the Contiguous United States, 1-km Grids, v1 (2000–2016). Palisades, NY: NASA Socioeconomic Data and Applications Center. https://doi.org/10.7927/a4mb-4t86 [accessed 22 July 2022].
31.
Di Q, Wei Y, Shtein A, Hultquist C, Xing X, Amini H, et al. 2022. Daily and Annual NO2 Concentrations for the Contiguous United States, 1-km Grids, v1 (2000–2016). Palisades, NY: NASA Socioeconomic Data and Applications Center. https://doi.org/10.7927/f8eh-5864 [accessed 22 July 2022].
32.
Xia Y, Mitchell K, Ek M, Sheffield J, Cosgrove B, Wood E, et al. 2012. Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products. J Geophys Res 117(D3):D03109, https://doi.org/10.1029/2011JD016048.
33.
U.S. Census Bureau. 2000. American Community Survey. Washington, DC: U.S. Department of Commerce, Economics and Statistics Administration, U.S. Census Bureau.
34.
U.S. Census Bureau. 2010. American Community Survey 5-Year Data 2010. https://www.census.gov/data/developers/data-sets/acs-5year/2010.html.
35.
NRC (National Research Council). 2007. Using the American Community Survey: Benefits and Challenges. Washington, DC: National Academies Press.
36.
CDC (Centers for Disease Control and Prevention). 2013. CDC—BRFSS 2013 Survey Data and Documentation. https://www.cdc.gov/brfss/annual_data/annual_2013.html.
37.
Wennberg J, Cooper M. 1996. The Dartmouth Atlas of Health Care. Chicago, IL: American Hospital Publishing. https://data.dartmouthatlas.org/downloads/atlases/96Atlas.pdf.
38.
Esri. 2010. Esri Data & Maps 10. An Esri White Paper. Redlands, CA: Esri. https://www.esri.com/content/dam/esrisites/sitecore-archive/Files/Pdfs/library/whitepapers/pdfs/esri-data-and-maps.pdf [accessed 22 July 2022].
39.
Calabrese EJ, Baldwin LA. 2003. The hormetic dose–response model is more common than the threshold model in toxicology. Toxicol Sci 71(2):246–250. https://pubmed.ncbi.nlm.nih.gov/12563110/, https://doi.org/10.1093/toxsci/71.2.246.
40.
Wood SN. 2006. Generalized Additive Models: An Introduction with R. Boca Raton, FL: Chapman and Hall/CRC.
41.
U.S. EPA (U.S. Environmental Protection Agency). 2016. NAAQS Table. https://www.epa.gov/criteria-air-pollutants/naaqs-table [accessed 22 July 2022].
42.
Vodonos A, Awad YA, Schwartz J. 2018. The concentration-response between long-term PM2.5 exposure and mortality; a meta-regression approach. Environ Res 166:677–689. https://pubmed.ncbi.nlm.nih.gov/30077140/, https://doi.org/10.1016/j.envres.2018.06.021.
43.
Pinto JP, Lefohn AS, Shadwick DS. 2004. Spatial variability of PM2.5 in urban areas in the United States. J Air Waste Manag Assoc 54(4):440–449. https://pubmed.ncbi.nlm.nih.gov/15115373/, https://doi.org/10.1080/10473289.2004.10470919.
44.
Wang L, Xiong Q, Wu G, Gautam A, Jiang J, Liu S, et al. 2019. Spatio-temporal variation characteristics of PM2.5 in the Beijing–Tianjin–Hebei Region, China, from 2013 to 2018. Int J Environ Res Public Health 16(21):4276. https://pubmed.ncbi.nlm.nih.gov/31689921/, https://doi.org/10.3390/ijerph16214276.
45.
Hasheminassab S, Daher N, Saffari A, Wang D, Ostro BD, Sioutas C. 2014. Spatial and temporal variability of sources of ambient fine particulate matter (PM2.5) in California. Atmos Chem Phys 14(22):12085–12097, https://doi.org/10.5194/acp-14-12085-2014.
46.
Cheng B, Wang-Li L. 2019. Spatial and temporal variations of PM2.5 in North Carolina. Aerosol Air Qual Res 19(4):698–710, https://doi.org/10.4209/aaqr.2018.03.0111.
47.
Samoli E, Butland BK, Rodopoulou S, Atkinson RW, Barratt B, Beevers SD, et al. 2020. The impact of measurement error in modeled ambient particles exposures on health effect estimates in multilevel analysis: a simulation study. Environ Epidemiol 4(3):e094. https://pubmed.ncbi.nlm.nih.gov/32656489/, https://doi.org/10.1097/EE9.0000000000000094.
48.
Cefalu M, Dominici F. 2014. Does exposure prediction bias health-effect estimation? The relationship between confounding adjustment and exposure prediction. Epidemiology 25(4):583–590. https://pubmed.ncbi.nlm.nih.gov/24815302/, https://doi.org/10.1097/EDE.0000000000000099.
49.
Goldman GT, Mulholland JA, Russell AG, Gass K, Strickland MJ, Tolbert PE. 2012. Characterization of ambient air pollution measurement error in a time-series health study using a geostatistical simulation approach. Atmos Environ (1994) 57:101–108. https://pubmed.ncbi.nlm.nih.gov/23606805/, https://doi.org/10.1016/j.atmosenv.2012.04.045.
50.
Hajat A, Hsia C, O’Neill MS. 2015. Socioeconomic disparities and air pollution exposure: a global review. Curr Environ Health Rep 2(4):440–450. https://pubmed.ncbi.nlm.nih.gov/26381684/, https://doi.org/10.1007/s40572-015-0069-5.
51.
O’Lenick CR, Winquist A, Mulholland JA, Friberg MD, Chang HH, Kramer MR, et al. 2017. Assessment of neighbourhood-level socioeconomic status as a modifier of air pollution–asthma associations among children in Atlanta. J Epidemiol Commun Health 71(2):129–136. https://pubmed.ncbi.nlm.nih.gov/27422981/, https://doi.org/10.1136/jech-2015-206530.
52.
Gale P. 2003. Developing risk assessments of waterborne microbial contaminations. In: Handbook of Water and Wastewater Microbiology. Mara D, Horan NJ, eds. London, UK: Academic Press, 263–280.
53.
Danesh Yazdi M, Wang Y, Di Q, Wei Y, Requia WJ, Shi L, et al. 2021. Long-term association of air pollution and hospital admissions among Medicare participants using a doubly robust additive model. Circulation 143(16):1584–1596. https://pubmed.ncbi.nlm.nih.gov/33611922/, https://doi.org/10.1161/CIRCULATIONAHA.120.050252.
54.
Błaszczak B, Zioła N, Mathews B, Klejnowski K, Słaby K. 2020. The role of PM2.5 chemical composition and meteorology during high pollution periods at a suburban background station in southern Poland. Aerosol Air Qual Res 20(11):2433–2447, https://doi.org/10.4209/aaqr.2020.01.0013.
55.
Evangelopoulos D, Katsouyanni K, Schwartz J, Walton H. 2021. Quantifying the short-term effects of air pollution on health in the presence of exposure measurement error: a simulation study of multi-pollutant model results. Environ Health 20(1):94. https://pubmed.ncbi.nlm.nih.gov/34429109/, https://doi.org/10.1186/s12940-021-00757-4.
56.
Mallick B, Hoffman FO, Carroll RJ. 2002. Semiparametric regression modeling with mixtures of Berkson and classical error, with application to fallout from the Nevada test site. Biometrics 58(1):13–20. https://pubmed.ncbi.nlm.nih.gov/11890308/, https://doi.org/10.1111/j.0006-341x.2002.00013.x.
57.
Lash TL, Fox MP, MacLehose RF, Maldonado G, McCandless LC, Greenland S. 2014. Good practices for quantitative bias analysis. Int J Epidemiol 43(6):1969–1985. https://pubmed.ncbi.nlm.nih.gov/25080530/, https://doi.org/10.1093/ije/dyu149.
58.
Hart JE, Liao X, Hong B, Puett RC, Yanosky JD, Suh H, et al. 2015. The association of long-term exposure to PM2.5 on all-cause mortality in the Nurses’ Health Study and the impact of measurement-error correction. Environ Health 14(1):38. https://pubmed.ncbi.nlm.nih.gov/25926123/, https://doi.org/10.1186/s12940-015-0027-6.
License Information
EHP is an open-access journal published with support from the National Institute of Environmental Health Sciences, National Institutes of Health. All content is public domain unless otherwise noted.
History
Received: 25 September 2021
Revision received: 7 July 2022
Accepted: 8 July 2022
Published online: 29 July 2022