Examining the Shape of the Association between Low Levels of Fine Particulate Matter and Mortality across Three Cycles of the Canadian Census Health and Environment Cohort

Background: Ambient fine particulate air pollution with aerodynamic diameter ≤2.5 μm (PM2.5) is an important contributor to the global burden of disease. Information on the shape of the concentration–response relationship at low concentrations is critical for estimating this burden, setting air quality standards, and in benefits assessments. Objectives: We examined the concentration–response relationship between PM2.5 and nonaccidental mortality in three Canadian Census Health and Environment Cohorts (CanCHECs) based on the 1991, 1996, and 2001 census cycles linked to mobility and mortality data. Methods: Census respondents were linked with death records through 2016, resulting in 8.5 million adults, 150 million years of follow-up, and 1.5 million deaths. Using annual mailing address, we assigned time-varying contextual variables and 3-y moving-average ambient PM2.5 at a 1×1 km spatial resolution from 1988 to 2015. We ran Cox proportional hazards models for PM2.5 adjusted for eight subject-level indicators of socioeconomic status, seven contextual covariates, ozone, nitrogen dioxide, and combined oxidative potential. We used three statistical methods to examine the shape of the concentration–response relationship between PM2.5 and nonaccidental mortality. Results: The mean 3-y annual average estimate of PM2.5 exposure ranged from 6.7 to 8.0 μg/m3 over the three cohorts. We estimated a hazard ratio (HR) of 1.053 [95% confidence interval (CI): 1.041, 1.065] per 10-μg/m3 change in PM2.5 after pooling the three cohort-specific hazard ratios, with some variation between cohorts (1.041 for the 1991 and 1996 cohorts and 1.084 for the 2001 cohort). We observed a supralinear association in all three cohorts. The lower bound of the 95% CIs exceeded unity for all concentrations in the 1991 cohort, for concentrations above 2 μg/m3 in the 1996 cohort, and above 5 μg/m3 in the 2001 cohort. Discussion: In a very large population-based cohort with up to 25 y of follow-up, PM2.5 was associated with nonaccidental mortality at concentrations as low as 5 μg/m3. https://doi.org/10.1289/EHP5204


Introduction
Exposure to ambient fine particulate air pollution with aerodynamic diameter ≤2:5 lm (PM 2:5 ) consistently ranks among the leading risk factors for premature death and disease worldwide GBD 2017Risk Factors Collaborators 2018Lim et al. 2012). A number of studies supporting this work have found that the relationship between PM 2:5 concentrations and mortality risk (for various causes) was supralinear across the global range (Burnett et al. 2014;Pope et al. 2009Pope et al. , 2011Yin et al. 2017). In a detailed examination of the shape of the PM 2:5mortality association in 15 of the world's largest cohorts , 12 displayed a supralinear association. A supralinear concentration-response curve is characterized by a positively sloped curve of decreasing steepness, such that risk initially rises rapidly with a decreasing slope as concentrations increase. Studies that specifically characterize the shape of concentrationresponse relationships at low-PM 2:5 mass concentrations offer great value given the steady decline in PM 2:5 levels over recent decades in North America (ECCC 2017). Further, a substantial proportion of the global PM 2:5 disease burden is from relatively low level exposures (Apte et al. 2015). Canada is an ideal setting to conduct such analyses, given the availability of large, national cohorts with sufficient sample sizes and detailed exposure information at low PM 2:5 concentrations.
Canadian cohort studies have shown consistent positive associations between PM 2:5 and mortality from various causes at low PM 2:5 concentrations (i.e., annual concentrations generally below 20 lg=m 3 even in large urban areas) (Crouse et al. 2012(Crouse et al. , 2015Nasari et al. 2016;Pinault et al. 2016bPinault et al. , 2017Weichenthal et al. 2017). Crouse et al. (2012) used the 1991 Canadian Census Health and Environment Cohort (CanCHEC) to conduct the first nationwide cohort analysis and identified a hazard ratio (HR) for nonaccidental mortality of 1.07 [95% confidence interval (CI): 1.06, 1.08] per 10-lg=m 3 change in PM 2:5 among nonimmigrant adults. In a recent analysis of the 2001 CanCHEC, Pinault et al. (2017) reported a larger HR of 1.18 (95% CI: 1.15, 1.21) for PM 2:5 and nonaccidental mortality. While these studies made important contributions to the evidence base for mortality risks at low PM 2:5 levels, they also had several important limitations. For example, with the exception of Pinault et al. (2017), past studies used coarser-resolution PM 2:5 models (i.e., 10 × 10 km) to assign exposures to census respondents. Furthermore, most of the previous studies excluded immigrants, although this group represents nearly 20% of the Canadian population. Additionally, most of these studies had only 10 y of follow-up.
The present study specifically investigated the shape of the concentration-response function between PM 2:5 and nonaccidental mortality at low levels of exposure among Canadian adults. We examined data from the 1991, 1996, and 2001 CanCHECs with follow-up until 2016. We address a number of limitations of previous cohort studies in Canada by extending the period of follow-up to 25 y (i.e., for individuals in the 1991 cohort), including all but recent immigrants in the analysis, using annual 1 km 2 PM 2:5 estimates from 1988-2015, using time-varying contextual covariates over the duration of follow-up, and applying a validated marginalization index to represent four orthogonal dimensions of neighborhood-or community-level socioeconomic status. We examined the shape of associations at low levels of PM 2:5 exposure by applying restricted cubic splines (RCS) (Harrell 2015), monotonically increasing smoothing splines (MISS) (Pya and Wood 2015), and the Shape Constrained Health Impact Function (SCHIF) (Nasari et al. 2016).

Analytical Cohort
We created three new, separate analytical cohorts from the 1991, 1996, and 2001 CanCHECs. Briefly, the CanCHECs are populationbased, administrative data cohorts that link eligible census respondents (i.e., noninstitutional respondents to the mandatory Statistics Canada long-form census questionnaire that is distributed to 20% of all Canadian households) to their annual mailing address  and follow subjects for mortality. Information on a number of variables capturing the social and economic status of the subjects was available from the long-form census (Table 1).
The linkage was approved by Statistics Canada (linkage requests 037-2016 and 045-2015) and is governed by the Directive on Microdata Linkage (Statistics Canada 2017a). Eligible respondents were first linked probabilistically to tax records using sex, date of birth, postal code (PC), and spousal date of birth (if available). This initial linkage was necessary since linkage to the mortality database is based on the social insurance number (SIN), a unique personal identifier. The long form censuses did not capture the SIN, but they are available on tax records. The linkage rate to tax records near the time of cohort inception was approximately 80%, of which 99% were determined to be accurate matches (Christidis et al. 2018;Pinault et al. 2016a;Wilkins et al. 2008).
Mortality and PC history data were attached to the census-tax cohorts using Statistics Canada's Social Data Linkage Environment (SDLE) Derived Record Depository (DRD) (Statistics Canada 2017b), a dynamic relational database. About 99.8% of all deaths that occurred in Canada between 1991 and 2016 were linked to the DRD before being linked to eligible census respondents. From this linkage, we obtained death date and underlying cause of death if it occurred between census day and 31 December 2016. Mortality data were coded by underlying cause of death according to the International Classification of Diseases, 9th Revision, prior to 2000 (ICD-9;WHO 1977), and 10th Revision post-2000 (ICD-10; WHO 2016).
We enhanced the cohort with a number of data elements characterizing the environment in which each subject lived, using PC histories from tax records, of which the primary source was income tax filings (1981 to 2016). We assigned a representative point (latitude and longitude) to each PC (Statistics Canada 2017c). In large cities, PCs often correspond to a single block face, though in rural areas, they can range over much larger areas. Similarly, the point estimates of PCs are accurate within 0:2 km in urban centers and 5:6 km in rural areas (Khan et al. 2018). These point estimates were used to derive estimates of air pollution and location-based contextual risk factors.
We note that these three linked cohorts are newly created using an enhanced linkage environment (SDLE) and thus are not identical to the CanCHEC cohorts used in previous publications (Crouse et al. 2015;Pinault et al. 2017).

Outdoor Air Pollution Concentrations
We used annual ambient PM 2:5 surfaces as our main exposure of interest at a 0:01˚× 0:01˚resolution ( ∼ 1 km 2 ) over North America for 1981-2016 (Meng et al. 2019;van Donkelaar et al. 2015). PM 2:5 estimates for the years 1998-2012 were developed by relating satellite-based retrievals of total column aerosol optical depth to near-surface PM 2:5 concentrations using the geophysical relationship simulated by a chemical transport model (CTM). These estimates were constrained using ground-based monitoring from the National Air Pollution Surveillance (NAPS) program stations, along with other North America-based measurements, land-use information, and simulated composition in a geographically weighted regression (V4.NA.01; van Donkelaar et al. 2015). For years outside this period, we used PM 2:5 surfaces developed using a backcasting method (Meng et al. 2019) that applied observed annual trends in ground monitoring data for PM 2:5 and coarser size fractions to adjust pregridded PM 2:5 estimates backwards or forwards in time. We estimated a 3-y moving-average exposure window with 1-y lag for assigning PM 2:5 exposures for consistency with previous studies, as ambient PM 2:5 is regulated based on a 3-y time window in Canada (CCME 2012).
We assigned estimates of exposures to ambient ozone (O 3 ; as a May-September daily maximum 8-h average) and nitrogen dioxide (NO 2 ; annual) for inclusion in multipollutant models. Additionally, we estimated a measure of the combined oxidant capacity of O 3 and NO 2 , expressed as O x = 2=3 O 3 + 1=3 NO 2 ). We estimated a 3-y average with 1-y lag for each of O 3 , NO 2 , and O x for inclusion in the hazard models. Modeled O 3 surfaces at 21-km spatial resolution were developed by Environment and Climate Change Canada (ECCC) for 2002-2015 using chemical transport modeling informed by surface observations (Robichaud and Ménard 2014;Robichaud et al. 2016). Estimates of ambient NO 2 were based on a national landuse regression model (LUR) developed for 2006 (Hystad et al. 2011) with a spatial resolution of 100 m. The LUR estimates were built using satellite-derived NO 2 (with 10-km resolution), distances to highways and major roads, and roadway kernel density gradients as predictive variables. We temporally adjusted the O 3 and NO 2 models to obtain exposure estimates over our study period (i.e., 1988-2015). Our adjustment was based on observed trends in ground monitoring data for NO 2 and O 3 from the NAPS in Canada. For each of 24 census divisions (CDs) that had monitoring data available, we estimated yearly adjustment factors from the ratio of observed CD-average concentration in a specific year to the reference year(s) for which the original surfaces were estimated (i.e., 2006 for NO 2 and 2002-2015 average for O 3 ). We assigned adjustment factors for each PC from the closest CD.

Contextual Covariates
We assigned contextual risk factors describing neighborhood-level characteristics and geographic identifiers using residential PC and data from the closest census (every 5 y from 1991 through 2016). We included in our analysis the Canadian Marginalization Index (CAN-Marg), population size of home community or city, an indicator of urban form, and regional airshed to capture risk factors beyond those captured at the subject level. We assigned these four categories of contextual covariates to residential PCs linked to census geography for each census year.
CAN-Marg is a publicly available index of neighborhood marginalization in Canada that was developed by Matheson et al. (2012) using an analysis of the 2001 and 2006 long-form census cycles. CAN-Marg consists of four dimensions that aim to capture different aspects of marginalization: material deprivation, residential instability, ethnic concentration, and dependency. Following the methodology of Matheson et al. (2012), we developed CAN-Marg using the 1991 and 1996 censuses. We assigned CAN-Marg to PC locations and then created quintiles (based on the cohort distribution) of the continuous values in the four Can-MARG dimensions in order to account for any potentially nonlinear associations with mortality.
We used a variable to describe the population size of a subject's community   (Table 2). We categorized geographic locations into the following: census metropolitan areas (CMAs) or census agglomerations (CAs; Statistics Canada 2003) with a population exceeding 1.5 million; 500,000-1.49 million; 100,000-499,999; 30,000-99,999; or 10,000-29,999, as well as non-CMAs/CAs. We note that although non-CMA/CAs are always rural areas, CMAs cover both the urban core of a city and the urban-rural fringe, such that some rural locations fall within a CMA/CA. As such, this variable does not perfectly delineate subjects living in rural vs. urban settings.
To further differentiate between the kinds of built environments and neighborhoods within communities, we created an urban form variable following the methodology developed by Gordon and Janzen (2013). This measure of urban form is informed by population density and the most frequently reported mode of transportation (active or transit) in each census tract as reported on each census cycle. The categories of this variable include an active urban core, transit-reliant suburb, car-reliant suburb, exurban, and non-CMA/CA. We note that mode of commute was not reported on the 1991 census cycle and was derived from the 1996 census.
We included airshed as a geographic covariate in our analysis . Airsheds divide Canada into six regions (Western, Prairie, West Central, Southern Atlantic, East Central, and Northern) based on large-scale differences in air masses and meteorology. Airsheds can also be used to represent regional differences in mortality rates across Canada that remain uncaptured by other geographic covariates.

Exclusion of Person-Years of Follow-Up
PC history was not available for each person in every year of follow-up, either because they did not file a tax return or from gaps in administrative data. Any gaps in PCs that had the same PC prior to and after the gap were assigned that PC for all years of the gap. After this imputation, 87.8% of person-years had an available PC. We imputed an additional 2.1% of person-years of missing PCs if the bounding PCs shared the first two characters (Finès et al. 2017;Pinault et al. 2017), totaling 89.9% of personyears with a PC.
After imputation, person-years were excluded if they did not have an assigned PC. Further exclusions of person-years occurred due to: immigrated to Canada less than 10 y before survey date (9,364,400 person-years), age during follow-up period exceeded 89 y (7,357,200), could not be linked to air pollution values (17,814,400), could not be linked to 973,900), could not be linked to CMA/CA size (25,613,100), could not be linked to airshed (25,545,500), 3-y moving average being informed by only 1 y of exposure (20,056,400), and year after subject death (17,936,100). The above are not mutually exclusive numbers of exclusions. The total available person-years for analyses were 150,996,500 after all exclusions ( Figure S1).

Statistical Analysis
Our primary statistical model relating exposure to mortality was the Cox proportional hazards model (Cox 1972). Participants were at least 25 y of age at the beginning of each cohort, and the time axis was the year of follow-up until 2016. Person-years before a census year and after a subject's death year were excluded from analysis. Events were determined by year of death for nonaccidental causes. The Cox model baseline hazard function was stratified by age (5-y groups), sex, and immigrant status (yes or no). This latter strata variable was included since immigrants to Canada live longer, on average, than do Canadian-born citizens (Ng 2011). We excluded immigrants living in Canada for less than 10 y at cohort commencement due to the healthy immigrant effect (Ng 2011) and lack of knowledge of their historical air pollution exposures. Each subject was censored at 89 y of age, either at the start of each cohort or during follow-up, due to evidence suggesting an increased mismatch between home address and the tax return mailing address with increasing age (Bérard-Chagnon 2017). We postulate that relatives of elderly people were completing their tax returns. Each of the three CanCHEC cohorts (1991, 1996, and 2001) was examined separately. Estimates of the cohort-specific hazard ratios were then pooled to form a single summary hazard ratio. We also conducted a test for differences in the hazard ratios between cohorts (Cochran 1950).
We fit two covariate adjustment models for each cohort. The first was based on a directed acyclic graph (DAG; Figure S2) and consisted of all the geographically based predictors: CAN-Marg (four dimensions), airshed, urban form, and CMA/CA size. The second model, denoted as "Full," additionally included the subjectlevel predictors (income, education, occupational class, Indigenous status, visible minority status, employment status, and marital status), which are not a priori causes of outdoor PM 2:5 concentrations, but which may contribute to confounding owing to a chance imbalance across the PM 2:5 distribution.
We also conducted analysis by categories of: immigrant status (yes or no), sex (male or female), and age during follow-up (<65, 65-74, or ≥75 y) for each cohort separately, again pooling the cohort-specific hazard ratio estimates among the three cohorts. In addition, we examined the PM 2:5 association, adjusting for O 3 , NO 2 , or O x by cohort.

Shape of the Association between PM 2:5 and Mortality
The main purpose of the current study was to describe the association between PM 2:5 and mortality in a manner that can be used for risk and benefits assessment. The standard approach is the log-linear (LL) model that relates the logarithm of the hazard ratio to exposure in a linear manner: logHRðPM 2:5 Þ = bPM 2:5 .
Here, b represents a change in relative risk per unit change in concentration estimated using the Cox model. Nasari et al. (2016) developed the SCHIF in order to extend the LL model to nonlinear transformations of exposure, TðPM 2:5 Þ, with the form: SCHIFðPM 2:5 Þ = hTðPM 2:5 Þ. Nasari et al. (2016) proposed a specific family of transformations based on a sigmodal function that could accommodate a variety of shapes they suggested would be suitable for risk and benefits assessment. Here, h represents a change in risk per unit change in TðPM 2:5 Þ. The SCHIF can then be used in benefits assessment in a manner similar to the LL model after a suitable transformation of concentration.
The SCHIF approach has two major limitations: The first is in defining an appropriate number of transformations of a sigmodal function that can capture all shapes of interest; the second is that the method requires considerable computational capacity if the selected family is very large. This can be a serious limitation when cohort sizes are very large, such as with the CanCHECs.
Spline methods have also been proposed to characterize the shape. RCS with a few knots have been used (Beelen et al. 2014) in addition to smoothing splines (Di et al. 2017). However, the manner in which splines are presented by graphic representation of the mean predictions and uncertainty bounds over the concentration range limits their use in risk and benefits assessment, as these assessments typically require a differentiable algebraic function in addition to a quantitative estimate of uncertainty by concentration.
We developed and applied a new method that combines the flexibility of splines and the ease of use of the SCHIF in benefits assessment. Our method involves three steps. The first step is a data reduction step in which we fit a RCS with a large number of knots in order to characterize the shape of the concentrationresponse relationship in sufficient detail. RCS can easily be fit to large cohorts, as they only involve a series of transformations of concentration. Here, we have converted millions of person-years of data into a few hundred observations of RCS predictions over the observed concentration range. In step 2, we smooth the potential erratic predictions due to the large number of knots using a MISS, and in step 3, we fit our SCHIF function to the MISS predictions. In addition, we model the uncertainty in the spline fit as a cubic polynomial in concentration in a manner that assigns all uncertainty to the h parameter in the SCHIF model, but unlike the LL model, uncertainty can vary with concentration. We now have a differentiable algebraic function of both relative risk and its uncertainty by concentration. This approach also allows for visualization of the SCHIF as well as its representation of the underlying data (as summarized by the RCS).  10th, 14th, 18th, 22nd, 26th, 50th, 74th, 78th, 82nd, 86th, 90th, 94th, and 98th percentiles of the PM 2:5 person-year distribution. We selected a large number of knots covering both the lower and upper quartiles in order to capture a variety of desired shapes. From this first step, we obtain estimates of the logarithm of the RCS hazard ratio (logRCS) and the associated standard error at several hundred concentrations between the minimum and the 99th percentile of the exposure distribution. We do not include predictions above the 99th percentile, since RCS are linear beyond the highest knot concentration. This linear form can have some influence on the shape of the SCHIF throughout the concentration range, and especially over the higher concentrations, since the SCHIF is a single algebraic function. We also fixed the logRCS to zero at the minimum concentration, and its associated standard error was also set to zero.
In step 2, we smooth the potentially erratic logRCS predictions with a MISS in order to obtain predictions suitable to model with the SCHIF algebraic function, which itself is monotonically increasing (Pya and Wood 2015). The SCHIF hazard ratio function has the form: a logistic function in concentration.
Here, h, l, and s are unknown parameters to be estimated from the data, r is the range in the translated exposure, and z = PM 2:5 − minðPM 2:5 Þ such that logSCHIFð0Þ = 0. The function f ðzÞ can take two forms: f ðzÞ = z (linear) and f ðzÞ = logðz + 1Þ (log). We have constructed the SCHIF to be similar to the LL model, logLLðzÞ = bz, by writing: logSCHIFðzÞ = hTðzÞ, where TðzÞ = f ðzÞlðzÞ is a specific transformation of concentration.
The linear form f ðzÞ = z can model both linear and sublinear associations, while the log form f ðzÞ = logðz + 1Þ can model supralinear associations with mortality. Both forms can accommodate S-shaped functions through lðzÞ. Sets of values ðl,sÞ are selected that define the shape of lðzÞ. Larger values of l result in larger ranges of concentration for which a sublinear association is modeled at lower concentrations due to the property of the logistic function. Larger values of s generate shapes for lðzÞ with less curvature. By limiting the ranges for ðl,sÞ, we can limit the amount of curvature in the SCHIF. A linear regression model was constructed using each transformation as the single predictor and the MISS prediction as the response. Using the MISS predictions, we were then able to select a wide range of values of the parameters to examine a wide variety of shapes that is not possible by modeling the subjectlevel cohort data. We selected values of l ranging from 0 to r by integers, and s ranging from 0.1 to 1 by 0.1 increments. For each set of parameters and the two forms of f ðzÞ, we obtained an estimate of h and its standard error. We then created a single SCHIF curve by a weighted average of all the SCHIF curves examined, with weights determined by the fit of each curve on the MISS values. However, as the model averaged predictions at each concentration are themselves a potentially complicated function, these predictions can be summarized as a single algebraic function. Specifically, we fit a generalization of the SCHIF model to the mean SCHIF predicted curve over the concentration range. We added an additional parameter a to model the combination of the linear and log forms of f ðzÞ used in the fitting step. The function log z a + 1 À Á is nearly linear in z for large values of a. We collapse the product sr into a single parameter v to simplify the reporting of the parameter estimates. In the LL model, all uncertainty in the hazard ratio is assigned to the single unknown parameter, b. We aim to make a similar characterization of uncertainty in the SCHIF predictions, where all the uncertainty is ascribed to the parameter h. We do this by considering a model of the standard error in the RCS predictions. However, unlike the LL model, RCS standard errors can vary in a nonlinear manner with concentration. We therefore consider a model for the standard error as a function of concentration of the form: se RCS ðzÞ = se h ðzÞ × TðzÞ, with se h ðzÞ denoting our standard error model of h, dependent on concentration. We select a general model that can accommodate a variety of shapes such as a cubic polynomial with the form: se h ðzÞ = r 0 + r 1 z + r 2 z 2 + r 3 z 3 .
Finally, we construct pooled SCHIF models among the three cohorts in the following manner: Let v c ðzÞ be the variance of the logarithm of the SCHIF prediction, logSCHIF c ðzÞ, at concentration z for cohort c = 1,2,3. We construct a meta-analytic summary of the SCHIF predictions among the three cohorts as: In order to obtain an algebraic function for the pooled SCHIF, we used nonlinear regression to estimate the SCHIF parameters, with logSCHIF Pooled ðzÞ defining the data for the regression. We also modeled the standard error of the pooled SCHIF in a manner similar to that for each cohort separately. The variance of the pooled SCHIFs is a function of both the variance of each cohortspecific SCHIF prediction and the squared difference between the cohort-specific SCHIF predictions and the pooled SCHIF prediction. This latter term captures the uncertainty in both the shape and magnitude of the hazard ratio predictions among the three cohorts.

Results
Main Analysis PM 2:5 by cohort and covariate categories. Table 1 presents percentiles of the PM 2:5 distribution based on person-years for each of the three cohorts separately. Concentrations were highest for the 1991 cohort, moderate for the 1996 cohort, and lowest for the 2001 cohort. Concentration differences were well within 1 lg=m 3 between cohorts for median and lower percentiles, with greater differences for the higher percentiles, suggesting that greater declines in exposure were observed in locations with higher levels. The spatial distribution of PM 2:5 across Canada is presented for selected 3-y averages (Figure 1). Concentrations declined over time in the heavily populated areas of Southern Ontario and Quebec. Moderate concentrations were observed in the earlier time periods for Northern Canada and the Prairies. These levels declined through the 1990s but then increased during the latter part of our cohort follow-up period. Table 2 reports both the number of person-years and percentages among the categories of mortality predictors for each cohort separately, in addition to the mean and standard deviation of PM 2:5 assigned to each category. Males tended to be assigned higher concentrations than females in all three cohorts, although the difference was very small (<1 lg=m 3 ). There was a U-shaped pattern with age at cohort commencement for all three cohorts, with concentration declining with age up to the 55-to 64-y-old group and then increasing. Immigrants were consistently assigned higher concentrations than nonimmigrants; however, concentrations were similar over the length an immigrant subject lived in Canada. Subjects who defined themselves as visible minorities had higher assigned concentrations than those subjects who did not in the 1991 and 1996 cohorts. Subjects of Indigenous identity had lower concentrations. Married and common-law subjects had lower assigned exposures compared to other marital categories in all cohorts. Exposure monotonically increased with educational attainment in all cohorts. However, exposure monotonically declined with income. Employed subjects at the time of interview had higher exposures compared to those unemployed subjects. Exposure tended to decline over the occupational class categories moving from management/ professional to semi-and unskilled workers. Note that the "not in the labor force" and "not-applicable occupational class" categories had the highest exposures, possibly to due to older subjects who tended to have higher than average exposures. There was a tendency for exposure to increase over the quintiles of three of the CAN-Marg dimensions: residential instability, material deprivation, and ethnic concentration, with no clear trend for the fourth dimension, dependence. Outdoor concentrations increased with CMA/CA size and for the inner-city categories of urban form. Of the six airsheds, the East Central contained 58% of person-years and had the highest concentrations. Based on the associations between several geographic and subject based covariates, there is some potential that adjustment for these variables could influence the magnitude of our estimates of the PM 2:5 -mortality association.
Hazard ratio estimates. Table 3 reports the hazard ratio and 95% confidence limits per 10-lg=m 3 , for each cohort separately and pooled among the three cohorts by categories of immigrant status, age, and sex, for both the DAG and Full models. There was a tendency for the hazard ratio to be larger under the Full model compared to the DAG for the 1991 and 1996 cohorts, but smaller for the 2001 cohort. Consequently, there was less variation among the hazard ratios between cohorts under the Full compared to the DAG models. The Full model was a better predictor of mortality compared to the DAG model based on its much lower Akaike Information Criterion/ Schwarz's Bayesian Criterion values (see Table S1). We therefore focus our interpretation on the results using the Full model.
When all subjects were considered together, hazard ratio estimates were similar for the 1991 and 1996 cohorts (HR = 1:041), with a larger estimate observed for the 2001 cohort (HR = 1:084). The pooled cohort HR estimate was 1.053 (95% CI: 1.041, 1.065). Hazard ratio estimates for nonimmigrants were higher than for immigrants in the 1991 and 1996 cohorts, but lower in the 2001 cohort. Hazard ratio estimates for males were higher than for females in the 1991 and 1996 cohorts but lower in the 2001 cohort. Hazard ratio estimates declined with age in all three cohorts, however.
Hazard ratio estimates based on interquartile range changes in concentrations were larger for O x compared to O 3 , and lowest for NO 2 ( Table 3). The PM 2:5 HR estimate was moderately sensitive to adjustment for NO 2 , declining from 1.053 to 1.043 per 10-lg=m 3 , but very sensitive to adjustment for either O 3 , declining to 0.982, and O x , declining to 0.955.
Shape of PM 2:5 -mortality association. The shape of the association between PM 2:5 concentrations and mortality for the Full model is displayed in Figure 2 for each of the three cohorts separately and pooled among cohorts using the SCHIF. MISS predictions (dashed black line) and RCS predictions (dashed red line) over the concentration range are also displayed. A similar shape is observed in each cohort for the MISS, with a steep increase below 5 lg=m 3 followed by a much shallower increase for higher concentrations. The SCHIF predictions also display a supralinear association with concentration. Note that the SCHIF predictions display much less curvature than the MISS; a design feature of constraining the shape of the SCHIF. The SCHIF 95% CIs (grayshaped area) are clearly widest in the 2001 cohort, as it had the shortest follow-up time (15 y) and lowest concentrations (Table 1) compared to the 1991 and 1996 cohorts. The steepness of the increase in the HR below 5 lg=m 3 appears to dampen between the 1991 to 1996 to 2001 cohorts, with the SCHIF predictions of the MISS improving over these lower concentrations as the start date of the cohorts increased. The lower bound of the CIs exceeded unity for all concentrations for the 1991 cohort, above 2 lg=m 3 for the 1996 cohort, and above 5 lg=m 3 for the 2001 cohort.
We display the association between both h and its standard error and show that association varies with concentration by the ratio NðzÞ = h=se h ðzÞ in Figure 3. Here, NðzÞ represents the signal-to-noise ratio by concentration. This ratio increases with concentration for all three cohorts, exceeding the 1.96 value (dashed black line) for all concentrations in the 1991 cohort, above 2 lg=m 3 in the 1996 cohort, and above 5 lg=m 3 in the 2001 cohort. For the pooled SCHIF, the ratio increases for concentrations less than 7 lg=m 3 and then is relatively stable for higher concentrations. This pattern is due to the additional variation between the SCHIF values among the three cohorts at higher levels. The parameter estimates for both the SCHIF and its standard error are given in Table 4.

Discussion
The three CanCHEC cohorts included 8.5 million adults with 150 million person-years of follow-up and nearly 1.5 million deaths, Table 3. Hazard ratio (HR) estimates and 95% confidence intervals (CIs) for the association between PM 2:5 and nonaccidental mortality, as well as for copollutants (NO 2 , Ozone, oxidative potential), within the Canadian Health and Environment Cohorts (CanCHECs) from 1991, 1996, 2001, and pooled cohorts. Effect modification analyses by immigrant status, sex, and age, and multi-pollutant models are also provided. representing one of the largest population-based air pollution cohort analyses conducted to date. We found a HR of 1.053 (95% CI: 1.041, 1.065) per 10-lg=m 3 change in PM 2:5 after pooling the three cohort-specific hazard ratios. Hazard ratio estimates based on a LL model do not fully characterize the relationship between PM 2:5 exposure and mortality, as in each cohort, a supralinear association was observed (Figure 1). We found variation in the sensitivity of covariate model specification. The full model yielded larger hazard ratio estimates compared to the DAG model for both the 1991 and 1996 cohorts, with the opposite pattern observed in the 2001 cohort. It is not completely clear why such patterns occur, as the change in exposure among the categories of the covariates was similar in all three cohorts (Table 1), although the differences in exposure among the categories decreases with more recent cohort start dates due to generally declining concentrations over time.
We observed an increase in the hazard ratio for the immigrant population over time, starting with the weakest association for the 1991 cohort (HR = 1:006; 95% CI: 0.935, 1.081), a slightly stronger association in the 1996 cohort (HR = 1:027; 95% CI: 0.987, 1.068), with the strongest association in the 2001 cohort (HR = 1:109; 95% CI: 1.053, 1.167). We previously also observed no association in the 1991 cohort (Crouse et al. 2015) for immigrants to Canada. There also appeared to be little evidence  (1991, 1996, and 2001). Hazard ratio predictions based on pooling cohort-specific SCHIFs are also presented. Uncertainty bounds are displayed as gray shaded area. Tick marks on PM 2:5 axis represent the 15-RCS knot locations.
of an association for females in the 1991 and 1996 cohorts, with much stronger evidence for males. However, the association was similar for males and females in the 2001 cohort (Table 3).
The observation that the PM 2:5 hazard ratio can be partially explained by NO 2 and fully explained by O 3 also supports previously reported results (Crouse et al. 2015). As these gaseous pollutants may exhibit varying correlations with different components of PM 2:5 , these multipollutant results can provide insight into the source components of PM 2:5 that are most influential in the relationship between PM 2:5 mass and mortality. That is, the results of our models with NO 2 and O 3 may, in part, reflect differences in the mixture/composition of PM 2:5 across space and time. Earlier work has shown that the distribution of components, such as sulfate, organic matter, and black carbon, vary by total mass concentration in Canada . Using this spatial variation, Crouse et al. (2016) showed that inclusion of components in addition to total mass improved the prediction of mortality in the 1991 cohort. Mortality from ischemic heart disease has been shown to vary by source of PM 2:5 in the U.S.-based American Cancer Society (ACS) cohort (Thurston et al. 2016). Therefore, differences over time in the component distribution could explain some of the differences in our hazard ratio estimates between cohorts. However, the empirical evidence for such temporal changes in the component distribution is not supported by our recent modeling efforts , which suggest that the proportion of components is relatively stable over time, even as total mass concentrations have clearly declined.
We find a substantially lower hazard ratio estimate based on the LL model than in previous CanCHEC analyses. There are a number of possible explanations for these differences. First, this new version of the cohort includes an improved death and mobility linkage with updated methodology. Second, the period of follow-up in our analysis ranges from 15 to 25 y (up to 2016). Our analysis further differs from previous CanCHEC studies (e.g., Crouse et al. 2012;Pinault et al. 2017;Weichenthal et al. 2017) in our inclusion of immigrants in the analytical cohort. Immigrants generally have lower mortality rates than the Canadian-born population due to screening criteria for immigration into Canada (Newbold 2005;Ng 2011). Immigrants also tend to live in cities with higher PM 2:5 concentrations, leading to different risks of air pollution exposure. Our methods of PM 2:5 averaging at the PC level and imputation differed slightly from previous analyses. Specifically, we required a minimum of a twodigit PC (e.g., K1) to assign a PM 2:5 value based on a populationweighted average of the geographic area. Some previous analyses assigned the national average PM 2:5 value in absence of any PC information. We also removed all person-years that resulted in missing environmental (PM 2:5 , O 3 , or NO 2 ) or contextual covariates that may have informed previous analyses, while in previous analyses, we either imputed all missing person-years or included dummy variables for geographically based predictors. Finally, these analyses employ updated PM 2:5 estimates for the 2012-2016 period based on enhancements to the satellite retrievals (van Donkelaar et al. 2019) and backcasted estimates prior to 2000 that were not available previously. We do note, however, that the relationship between PM 2:5 and mortality, in both shape and magnitude, is similar to that previously reported for the 1991 (Crouse et al. 2015) and 2001  cohorts. This observation suggests that caution is required in interpreting the magnitude of an association solely based on the LL model without further consideration of the shape of the association.
This work supports observations of a supralinear concentration-response between PM 2:5 exposure and risk of nonaccidental mortality similar to that reported in Crouse et al. (2012Crouse et al. ( , 2015 for the 1991 cohort and Pinault et al. (2017) for the 2001 cohort in more limited analyses. Most of the previous major cohort studies on PM 2:5 did not examine the shape of the concentration-mortality association at the low levels that are observed in our cohort, in which the 25th percentile is 5:1 lg=m 3 , and the median is 6:9 lg=m 3 among all person-years of the three cohorts combined. For example, the lower end of the exposure distribution in the Medicare cohort was 6:2 lg=m 3 (minimum; Di et al. 2017), the National Health Interview Survey cohort (NHIS) was 7:6 lg=m 3 (minimum; Pope et al. 2018), and the ACS Cancer Prevention Study II cohort was 6:7 lg=m 3 (first percentile; Turner et al. 2016). Our mortality HR estimate of 1.053 (95% CI: 1.041, 1.065) is slightly below estimates of U.S. cohorts such as the Medicare cohort (HR = 1:073; 95% CI: 1.071, 1.075), NHIS cohort (HR = 1:056; 95% CI: 1.005, 1.110), and the ACS cohort (HR = 1:07; 95% CI: 1.06, 1.09). Our estimate for the 2001 cohort-the only CanCHEC cohort that did not require backcasted exposure datawas slightly higher than the U.S. cohorts at 1.084 (95% CI: 1.060, 1.108).

Strengths and Limitations
Our study has a number of strengths and limitations. These analyses include a large number of deaths (nearly 1.5 million); population representativeness; a number of subject-level mortality predictors, such as occupation, income, and education, that are  CanCHEC;1991, 1996. Dashed-dotted horizontal black line represents a ratio of 1.96. often not available in such large administrative-based cohorts (Di et al. 2017); and annual location information through linkage to income tax filings back to 1981. Our temporally resolved air pollution estimates could then be assigned to each year of followup. We also developed a method to combine the flexibility of splines to characterize the shape of the relationship between PM 2:5 and mortality with the usefulness of the SCHIF in risk and benefits assessments. This new approach is feasible with respect to computing resources for very large cohort studies. Despite these important strengths, the study includes a number of limitations. Specifically, the cohorts lack individual information on behavioral mortality risk factors such as smoking, diet, and obesity. We have previously examined the influence of further adjustment for these missing risk factors using statistical methods of indirect adjustment (Crouse et al. 2015;Erickson et al. 2019;Shin et al. 2014) and direct adjustment in a cohort based on annual health surveys in Canada where such information was available (Pinault et al. 2016b). For both of these approaches, we found that the PM 2:5 -mortality association remained positive. These observations indicate that it is unlikely that the positive associations we observed are entirely due to lack of inclusion of missing risk factors.
We captured information on social and economic status of each subject only at cohort inception. Thus, information on marital status, income, education, occupation, and employment may have changed over time, and we were not able to model any potential influence of such changes on the PM 2:5 -mortality association. The 1991 cohort may have been most influenced by this lack of temporal adjustment given its 25 y of follow-up.
We employed three very different exposure models for PM 2:5 , NO 2 , and O 3 . The PM 2:5 exposure model included remote sensing information coupled with CTM predictions. Land-use information informed any bias in the CTM predictions for the chemical components of total mass, but were not included as direct predictors of ground measurement data. The NO 2 model used both land-use and remote sensing information, while the O 3 model fused ground data with CTM predictions. The spatial resolution of the models was also different, with NO 2 having the finest resolution of 100 m and PM 2:5 at 1 km, while the O 3 model was at a resolution of 21 km. It was therefore difficult to directly compare the associations with mortality between pollutants and even more difficult with multiple-pollutant models subject to potential differential exposure error. However, the observed stronger associations with O 3 may be due to a lower exposure error in both the original surfaces and the temporal adjustments, since O 3 is known to be more spatially homogenous than PM 2:5 , while NO 2 is known to have the largest spatial variation. Additionally, although the quantitative estimates presented here reflect the large and diverse geographic area of Canada, they may not apply to places around the world with notably different sources and compositions of PM 2:5 , for example, in areas where a much higher proportion of the PM 2:5 is from dust and sand.
In summary, we found positive associations between PM 2:5 and mortality in all three cohorts, and a similar supralinear concentration-response relationship. Lower uncertainty bounds were elevated above unity for all concentrations in the 1991 cohort, above 2 lg=m 3 in the 1996 cohort, and above 5 lg=m 3 in the 2001 cohort, suggesting our confidence in identifying concentrations where there exists a positive response is declining as concentrations decline over time. Therefore, interest exists in examining subsequent CanCHEC cohorts as they become available with the administration of more recent long form censuses.