Air Pollution (Particulate Matter) Exposure and Associations with Depression, Anxiety, Bipolar, Psychosis and Suicide Risk: A Systematic Review and Meta-Analysis

Background: Particulate air pollution’s physical health effects are well known, but associations between particulate matter (PM) exposure and mental illness have not yet been established. However, there is increasing interest in emerging evidence supporting a possible etiological link. Objectives: This systematic review aims to provide a comprehensive overview and synthesis of the epidemiological literature to date by investigating quantitative associations between PM and multiple adverse mental health outcomes (depression, anxiety, bipolar disorder, psychosis, or suicide). Methods: We undertook a systematic review and meta-analysis. We searched Medline, PsycINFO, and EMBASE from January 1974 to September 2017 for English-language human observational studies reporting quantitative associations between exposure to PM <1.0 μm in aerodynamic diameter (ultrafine particles) and PM <2.5 and <10 μm in aerodynamic diameter (PM2.5 and PM10, respectively) and the above psychiatric outcomes. We extracted data, appraised study quality using a published quality assessment tool, summarized methodological approaches, and conducted meta-analyses where appropriate. Results: Of 1,826 citations identified, 22 met our overall inclusion criteria, and we included 9 in our primary meta-analyses. In our meta-analysis of associations between long-term (≥6 months) PM2.5 exposure and depression (n=5 studies), the pooled odds ratio was 1.102 per 10-μg/m3 PM2.5 increase (95% CI: 1.023, 1.189; I2=0.00%). Two of the included studies investigating associations between long-term PM2.5 exposure and anxiety also reported statistically significant positive associations, and we found a statistically significant association between short-term PM10 exposure and suicide in meta-analysis at a 0-2 d cumulative exposure lag.
Discussion: Our findings support the hypothesis of an association between long-term PM2.5 exposure and depression, as well as supporting hypotheses of possible associations between long-term PM2.5 exposure and anxiety and between short-term PM10 exposure and suicide. The limited literature and methodological challenges in this field, including heterogeneous outcome definitions, exposure assessment, and residual confounding, suggest further high-quality studies are warranted to investigate potentially causal associations between air pollution and poor mental health. https://doi.org/10.1289/EHP4595


Introduction
Mental illness is a major and fast-growing cause of morbidity worldwide (Vigo et al. 2016). At the same time, ambient air pollution causes approximately 3 million premature deaths globally per year (WHO 2016). Both problems disproportionately affect deprived groups (Fone and Dunstan 2006;Jerrett 2009) and can show marked urban-rural differences (Kovess-Masféty et al. 2005;Strosnider et al. 2017).
Air pollution contains many individual pollutants, including particulate matter (PM), gaseous pollutants, and metallic and organic compounds. In this systematic review, we have focused on PM (itself a complex, heterogeneous mixture) because it is responsible for the largest proportion of air pollution's physical health impacts (WHO 2016) and there is mounting evidence for tenable mechanisms by which it might affect risk of multiple mental health outcomes. Inflammation involving the central nervous system (CNS; neuroinflammation) has been implicated as having an important role in the pathophysiology of both depression (Liu et al. 2012) and psychosis (Barron et al. 2017). Hypothalamo-pituitary-adrenal (HPA) axis dysregulation has also been implicated in depression (Lopez-Duran et al. 2009). The plausibility of these as potential etiological pathways between PM exposure and such outcomes is backed up by evidence from animal and human studies into the pathophysiological effects of PM exposure. For example, PM exposure has been shown to be associated with markers of neuroinflammation such as glial activation and oxidative stress in humans (e.g., Block and Calderón-Garcidueñas 2009; Calderón-Garcidueñas et al. 2004; Block et al. 2012) and rodents (e.g., Levesque et al. 2011; Fonken et al. 2011); with changes in brain structure, as shown in humans (e.g., Calderón-Garcidueñas et al. 2008) and animals (e.g., Fonken et al. 2011); and with increased stress hormone (cortisol) production (e.g., Li et al. 2017).
There is also emerging evidence to suggest that PM may adversely affect cognitive development (Zhang et al. 2018), cognitive performance (Clifford et al. 2016; Freire et al. 2010), and dementia risk, as well as stress and psychological well-being (e.g., Mehta et al. 2015; Orru et al. 2016). Taken together, these findings point to potentially wide-ranging impacts of PM on brain health and functioning and may support the hypothesis of an association with clinically relevant mental health outcomes.
Prior to our systematic review, we searched the Cochrane Library, PROSPERO, and PubMed for previous or in-progress reviews of air pollution and mental health outcomes. Previous reviews have focused more specifically on cognitive outcomes (Guxens and Sunyer 2012; Tzivian et al. 2015) or pollutants other than PM (Lundberg 1996). Attademo et al. (2017) reviewed the association between psychosis and a wide range of pollutants. In addition to evidence for associations with exposure to heavy metals, they identified four studies investigating associations with PM exposure and concluded that it "appear[s] to play an influential role" in schizophrenia. However, several of the individual reports identified are exploratory in nature, and study quality was not assessed.
Our aim was to build on the work reported in the above reviews and provide a comprehensive overview of the epidemiological evidence, from human observational studies, for associations between PM exposure and depression, anxiety, bipolar disorder, psychosis, and suicide, using meta-analysis where appropriate. We also set out to appraise the quality of this evidence, highlighting key gaps and limitations in the current research. These aims are particularly relevant and timely given that some of the previously conducted reviews adopted a narrower scope regarding the outcomes and/or timeframes included and because additional studies have since been published.

Systematic Review and Selection Criteria
We carried out a systematic review and meta-analysis to assess the evidence from human observational studies for associations between PM and adverse mental health outcomes in adults. We excluded experimental or laboratory-based study designs given that these can offer only limited evidence of causal effects on clinically relevant mental health outcomes. We registered the protocol with PROSPERO (registration identification CRD42017074598). Protocol development and reporting were guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist and flow diagram (Moher et al. 2009).
Inclusion and exclusion criteria, grouped according to the population-exposure-comparator-outcome (PECO) framework (Morgan et al. 2018), were as follows. We required studies to meet all of the following criteria to be eligible for inclusion:
• Population: studies including adults (≥18 years of age).
• Exposures: studies of outdoor PM <2.5 μm in aerodynamic diameter (PM2.5), <10 μm in aerodynamic diameter (PM10), and <1 μm in aerodynamic diameter (ultrafine particles, or UFPs), collectively termed PM.
• Comparators: observational studies reporting comparative effect estimates (i.e., comparing risk between individuals exposed to different levels of PM or different exposures within individuals at different time points).
• Outcomes: studies reporting incident or prevalent depression, anxiety, bipolar disorder, or psychosis, including validated symptom-based measures using screening or diagnostic instruments, and clinician diagnoses (recorded or self-reported), emergency department (ED)/clinic attendance or hospitalization for psychiatric disorders, and attempted or completed suicide as outcomes.
• English-language full-text available.
We excluded studies meeting the following descriptions:
• Population: studies restricted to children and pregnant women or individuals with chronic diseases (although we considered studies which included such populations in subgroup analyses eligible).
• Exposures: studies using only a proxy measure of PM, such as distance from a major road, road density, or traffic intensity or solid air-suspended particles (rather than measurements of PM2.5, PM10, or UFP concentrations); studies investigating indoor or cigarette-related air pollution; and studies of occupational air pollution exposure (e.g., firefighters).
• Comparators: studies without comparison between higher and lower levels of PM exposure.
• Outcomes: studies without an assessed outcome meeting the above criteria (e.g., studies that used measures of stress/psychological distress and studies that used medication prescribing data as their only outcome).
We excluded studies using only unvalidated symptom-based measures for outcome assessment from the primary meta-analyses, although not from sensitivity analyses.

Search Terms and Sources
We searched the Medline, PsycINFO, and EMBASE databases from 1 January 1974 (the start date of the EMBASE database) to 20 September 2017 via the OVID platform. We mapped search terms to each index using keywords for air pollution, particulate matter, PM10, PM2.5, and ultrafine particles, psychiatric disorders, depression, anxiety, bipolar, psychosis, suicide, and related terms (see Table S1). We checked reference lists and review papers identified for additional relevant citations.

Study Selection
Two authors (I.B. and S.Z.) separately screened all retrieved citations' titles and abstracts, and reviewed all studies deemed potentially relevant in full. Selection decisions were made independently using the prespecified criteria above, tracked using a spreadsheet, and disagreements resolved through discussion.

Data Extraction
I.B. and S.Z. extracted data from eligible studies using a standardized pre-piloted template, covering the following information for each citation: study design, case definition, data source(s), study location, age range, sex (if only one), population at risk, number of cases/outcomes, follow-up time (where applicable), baseline PM concentration (in micrograms per cubic meter), exposure lags investigated, and a summary of overall findings [including direction and statistical significance of adjusted associations, such as risk, odds, or hazard ratios (RRs, ORs, and HRs, respectively)].
Where results were reported only at selected lags, only in stratified analyses, or only in graphical form, we requested unpublished results from authors by email. We also extracted covariates adjusted for in the main adjusted models of each study, which are listed in Table S2.

Studies of Short- and Long-Term Associations
Within the air pollution epidemiological literature more widely, there does not appear to be a consensus definition regarding the threshold distinguishing short- from long-term air pollution exposure. Short-term exposure generally refers to mean exposure over periods of hours, days, or weeks (which should also be measured immediately or only a short time prior to outcome assessment; i.e., at a short lag), often assessed through time-series or case-crossover study designs. Long-term exposure generally refers to mean exposure over a period of years and, in some studies, of several months, typically assessed through cohort, cross-sectional, or case-control study designs (e.g., WHO 2013; Cai et al. 2016). For this analysis, we grouped studies based on the duration of exposure assessment (short- and long-term). We defined studies investigating short-term exposure as those using measured or modeled exposure values for periods shorter than 6 months, as an intermediate threshold between the time periods typically considered short-term (days or weeks) and long-term (≥1 y). We also required that the exposure assessment period finish within the 6 months prior to outcome assessment, in line with other reviews (e.g., Cai et al. 2016). Long-term exposure was operationalized as measured or modeled mean exposure values covering a time period equal to or longer than 6 months. Short-term time-series, case-crossover, and hierarchical cluster analysis study designs used one or both of two types of exposure lag periods (cumulative and single lags); the term lag 0-2 (or equivalent) is used to denote the lagged cumulative exposure [i.e., the moving average of daily mean concentration on Day 0 (day of outcome), Day −1 (day before outcome), and Day −2 (two days before outcome), and so forth]. Lag 2 (or equivalent terms) denotes the single-day mean exposure concentration 2 d before the outcome.
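The distinction between single and cumulative lags can be illustrated with a minimal Python sketch. The function names and the daily PM10 series below are hypothetical and chosen only for illustration; they are not taken from any included study.

```python
from statistics import mean


def single_lag(daily_pm, outcome_day, lag):
    """Single lag: daily mean concentration `lag` days before the outcome day."""
    if outcome_day - lag < 0:
        raise ValueError("not enough exposure history for this lag")
    return daily_pm[outcome_day - lag]


def cumulative_lag(daily_pm, outcome_day, last_lag):
    """Cumulative lag 0-`last_lag`: moving average from Day -last_lag to Day 0."""
    if outcome_day - last_lag < 0:
        raise ValueError("not enough exposure history for this lag")
    window = daily_pm[outcome_day - last_lag : outcome_day + 1]
    return mean(window)


# Hypothetical daily mean PM10 concentrations (ug/m3); list index = calendar day.
daily_pm10 = [20.0, 35.0, 50.0, 41.0, 29.0]
outcome_day = 4  # Day 0 (day of outcome)

lag2 = single_lag(daily_pm10, outcome_day, 2)        # Day -2 only
lag0_2 = cumulative_lag(daily_pm10, outcome_day, 2)  # mean of Days -2, -1, and 0
```

Here `lag2` uses only the concentration two days before the outcome, whereas `lag0_2` averages that value with the two subsequent days, so the two exposure metrics overlap but are not interchangeable.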

Study Quality Assessment
We assessed study quality using the validated Effective Public Health Practice Project (EPHPP) Quality Assessment Tool for Quantitative Studies (MERST 2010; Armijo-Olivo et al. 2012; Thomas et al. 2004), which was developed by the McMaster Evidence Review and Synthesis Centre for evaluating public health research across heterogeneous study designs, including observational studies. Quality assessment scores were not used to restrict inclusion in meta-analysis but, rather, for the readers' reference. I.B. and S.Z. assigned quality ratings for each component according to the EPHPP Quality Assessment Tool and guidance within the associated EPHPP Tool Dictionary (available online at https://merst.ca/ephpp/ and https://merst.ca/wp-content/uploads/2018/02/qualilty-assessment-dictionary_2017.pdf, respectively). I.B. and S.Z. discussed individual component ratings to reach an agreed rating, and where disagreements arose, these studies were discussed with J.F.H. to determine final ratings. We assigned overall ratings according to the EPHPP guidance (those with one rating of poor were assigned an overall rating of fair, and those with two or more, an overall rating of poor).
We assessed selection bias, study design, confounders, blinding, data collection methods (outcome assessment), withdrawals and drop-outs (in longitudinal designs), and intervention/exposure integrity and analyses, and assigned an overall rating based on all but the last two. Table S3 provides a list of the questions within each component and summary descriptors for how each was allocated.
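The overall rating rule described above can be sketched in a few lines of Python. This is an illustrative reading of the rule rather than the EPHPP tool itself: the component keys are hypothetical names, and the "good" label for studies with no poor component ratings is an assumption, because the text states the rule only for one poor rating and for two or more.

```python
# Components assumed to contribute to the overall rating (hypothetical key names).
RATED_COMPONENTS = (
    "selection_bias",
    "study_design",
    "confounders",
    "blinding",
    "data_collection",
    "withdrawals",  # only applicable to longitudinal designs
)


def overall_rating(component_ratings):
    """Map component ratings ('good', 'fair', 'poor') to an overall rating."""
    n_poor = sum(
        1 for name in RATED_COMPONENTS if component_ratings.get(name) == "poor"
    )
    if n_poor >= 2:
        return "poor"
    if n_poor == 1:
        return "fair"
    return "good"  # assumption: label for studies with no poor component ratings


# Hypothetical cross-sectional study with a single poor component rating.
example = {
    "selection_bias": "fair",
    "study_design": "poor",
    "confounders": "good",
    "blinding": "fair",
    "data_collection": "good",
}
example_overall = overall_rating(example)
```

Because the rule counts only poor ratings, a study with one serious weakness and otherwise strong components receives the same overall rating as a study with one poor rating and otherwise fair components.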
We extended the list of study designs assigned a moderate rating from those in the EPHPP Tool Dictionary roadmap to include time-series analysis, case-crossover, and hierarchical cluster analysis, provided they used bidirectional referent period selection (as per Carracedo-Martínez et al. 2010). Cross-sectional studies of long-term exposure using an exposure period overlapping with or after outcome assessment were rated poor on study design, whereas cross-sectional studies of long-term exposure using PM values corresponding to a period prior to the outcome assessment period were rated moderate.
In assessing confounding risk, we did not assess the likelihood of differences by intervention (exposure) status given that PM exposure is continuous and we assumed all studies to be at risk of confounding. To assign a rating, we estimated the approximate proportion of relevant confounders controlled (through design or analysis) as follows. We deemed studies comparing risk between individuals to be vulnerable primarily to confounding by socioeconomic status/deprivation, lifestyle factors, health status, and urbanity and considered those that controlled for age, sex, urbanity, multiple relevant lifestyle factors, and for multiple indicators of socioeconomic status (e.g., education, income/household income, area-level measures of deprivation) to have controlled for >80% of (most) relevant confounders; the rating threshold termed good. However, other potentially important confounders (such as noise exposure and access to green space) may not have been controlled for; we judged their (combined) effect as likely to represent <20% of overall relevant confounders. Some adjustment for socioeconomic status and urbanity was required as a minimum for a (between-individual) study to be considered moderate quality with respect to confounding. We considered case-only designs, such as case-crossover analyses, which compare risk within each individual at different points in time, prone mainly to confounding by time-varying covariates (e.g., meteorological factors, seasonal variation, and day-of-the-week effects), and the risk of confounding was appraised accordingly.

Meta-Analysis and Standardization of Results
In addition to the eligibility criteria noted for inclusion in the systematic review, we conducted meta-analyses only when three or more eligible studies investigated associations between the same PM fraction and same outcome, analyzed dichotomous and continuous outcomes separately, and considered only results unstratified by season. We meta-analyzed the adjusted results of studies of associations with long-term PM exposure together, using the most similar time lags available in the primary meta-analysis, and without combining associations of PM exposure with incidence and prevalence in the same (primary) meta-analysis. We meta-analyzed studies of short-term exposure only at the same exposure lag, treating single and cumulative lags separately. Given the overlap between cumulative and single-day lags, we judged meta-regression not to be conceptually appropriate.
We did not use the overall EPHPP study quality ratings to determine eligibility for inclusion in meta-analyses; however, we required studies to have adjusted for a minimal set of confounders to be eligible for meta-analysis (primary or sensitivity analyses) and always used the results from the most comprehensively adjusted model. For studies of long-term exposure, we required adjustment for urbanicity and socioeconomic status or deprivation as a minimum for inclusion in meta-analysis. For studies of short-term exposure, we required adjustment for day of the week and temperature as a minimum.
We considered only results from whole-year analyses for inclusion, both because analyses stratified by season used heterogeneous seasonal cutoffs and because there is currently not sufficient evidence of seasonal effect modification of a putative association with PM to justify stratified analyses. For our primary meta-analysis, we required the measures of effect to be the same or to be convertible. This meant we did not meta-analyze studies reporting HRs alongside ORs or RRs. We assessed the impact of excluding these results in sensitivity analysis.
We conducted three primary meta-analyses: long-term PM2.5 and depression risk; long-term PM10 and depression risk; and short-term PM10 and suicide risk (at two lag periods of 0-1 and 0-2 d).
We used DerSimonian-Laird random-effects meta-analysis (DerSimonian and Laird 1986), using the metan function in STATA, because of the heterogeneity in study populations, outcome definitions, and prevalence. We produced forest and funnel plots, and conducted all statistical analyses using STATA (version 14.0; Stata Corporation). Assuming a log-linear exposure-response relationship, we standardized ORs, RRs, and HRs for a 10-μg/m3 PM increment (where expressed per 5-μg/m3 PM increment, for example) before meta-analysis. To standardize these results, we used the formula $\mathrm{OR}_{10} = e^{\ln(\mathrm{OR}_x) \times (10/x)}$ (substituting RR or HR as applicable), where $x$ is the increment of PM exposure (5 μg/m3 in this example) for which $\mathrm{OR}_x$ is presented in the original study (Cohen et al. 2004).
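The standardization and pooling steps can be illustrated with a short Python sketch. This is not the STATA metan implementation used in the review: it is a minimal DerSimonian-Laird calculation written for illustration, it assumes that each study's standard error can be recovered from a 95% CI that is symmetric on the log scale, and the three studies are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def standardize_to_10(ratio, increment):
    """Rescale a ratio (OR, RR, or HR) reported per `increment` ug/m3 to per 10 ug/m3."""
    return math.exp(math.log(ratio) * (10.0 / increment))


def log_se_from_ci(lower, upper):
    """Standard error of the log ratio, assuming a symmetric 95% CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


def dersimonian_laird(log_effects, std_errors):
    """Return (pooled log effect, its standard error, tau^2)."""
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical studies: (ratio, CI lower, CI upper, reported increment in ug/m3).
studies = [(1.05, 1.01, 1.09, 5.0), (1.12, 0.98, 1.28, 10.0), (1.03, 0.99, 1.07, 2.0)]

log_effects, std_errors = [], []
for ratio, lower, upper, inc in studies:
    log_effects.append(math.log(standardize_to_10(ratio, inc)))
    std_errors.append(
        log_se_from_ci(standardize_to_10(lower, inc), standardize_to_10(upper, inc))
    )

pooled, se, tau2 = dersimonian_laird(log_effects, std_errors)
pooled_or = math.exp(pooled)
ci = (math.exp(pooled - Z_95 * se), math.exp(pooled + Z_95 * se))
```

Rescaling is applied to the point estimate and both confidence limits before taking logs, so each study enters the pooled estimate on the same per-10-μg/m3 scale.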

Sensitivity Analyses
We used sensitivity meta-analysis to explore the effect of the following changes on the results of meta-analyses of associations between depression and long-term PM2.5 exposure (1a-c), and between depression and long-term PM10 exposure (2a-b):
• Including HRs in meta-analysis with ORs (i.e., assuming equivalence)
• Including studies excluded from the primary meta-analysis on the basis of the outcome definitions used, for PM2.5
• Including the same studies as the primary meta-analysis using alternative exposure time periods for studies which used more than one (PM2.5 only)
• Including the same studies as the primary meta-analysis but using the alternative exposure assessment model in studies using two models (PM10 only)
• Including the studies excluded from the primary meta-analysis on the basis of their outcome definitions, for PM10 (as per 1b).

Assessment of Variability and Publication Bias
We assessed heterogeneity between meta-analyzed studies using forest plots (given the relatively small number of studies included, this approach was considered more appropriate than Cochran's Q) and assessed the percentage of between-study variation due to heterogeneity rather than within-study error with the I2 statistic; we also calculated p-values for a chi-square test of heterogeneity. We analyzed funnel plots visually for evidence of publication bias and conducted Egger's regression test.
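For readers unfamiliar with these statistics, the following Python sketch shows how Cochran's Q, I2, and the intercept of Egger's regression are commonly computed from log effect estimates and their standard errors. It is a generic illustration, not the STATA routines used in the review; the input values are hypothetical, and p-values (which need chi-square and t distributions not available in the Python standard library) are deliberately omitted.

```python
import math


def cochran_q_and_i2(log_effects, std_errors):
    """Return (Q, degrees of freedom, I2 in percent) using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


def egger_intercept(log_effects, std_errors):
    """Regress the standardized effect (y/se) on precision (1/se) by ordinary least squares.

    Returns (intercept, standard error of the intercept). An intercept far from
    zero relative to its standard error suggests small-study effects.
    """
    n = len(log_effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]
    z = [y / se for y, se in zip(log_effects, std_errors)]
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")
    slope = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z)) / sxx
    intercept = mean_z - slope * mean_x
    resid_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_intercept = math.sqrt(resid_ss / (n - 2) * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical log odds ratios and standard errors for four studies.
example_effects = [0.08, 0.12, 0.05, 0.15]
example_ses = [0.04, 0.07, 0.03, 0.10]
q, df, i2 = cochran_q_and_i2(example_effects, example_ses)
intercept, se_intercept = egger_intercept(example_effects, example_ses)
```

When Q is smaller than its degrees of freedom, I2 is truncated at 0%, which is why an I2 of 0.00% indicates no detectable excess variation rather than proof that the studies are homogeneous.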

Illustrative Population Attributable Fraction Estimates
Where meta-analyses of associations with long-term exposure yielded statistically significant summary estimates, we estimated illustrative population attributable fractions (PAFs) for two separate exposure scenarios (UK cities and global). For this purpose, we assumed causality (while noting that this has not been proven), equivalence between ORs and RRs (the rare outcome assumption), and a log-linear exposure-response function (i.e., that relative risk remains constant per unit increase in exposure, at all concentrations). The latter assumption is frequently adopted in air pollution epidemiology in relation to physical health outcomes (e.g., Daniels et al. 2000). We estimated PAFs by assuming 100% exposure prevalence at population-weighted mean PM exposure levels, using published population-weighted exposure estimates for UK urban (EC 2017) and global (Brauer et al. 2016) populations (12.8 μg/m3 and 43.9 μg/m3, respectively). The counterfactual scenarios we used were those of population-weighted average exposure being reduced to the World Health Organization's (WHO) recommended limit for annual mean PM2.5 (10.0 μg/m3) for UK cities (Krzyzanowski and Cohen 2008), and to the European Union's (EU) less stringent recommended limit value (25.0 μg/m3) for the global scenario (EC 2015).
To calculate PAFs, we adapted the following formula used by the WHO (2016):

$$\mathrm{PAF} = \frac{\sum_i P_i(\mathrm{RR} - 1)}{\sum_i P_i(\mathrm{RR} - 1) + 1}$$

where $i$ is the level of PM2.5 in micrograms per cubic meter, $P_i$ is the percentage of the population exposed to that level of air pollution, and RR is the relative risk (compared with the counterfactual scenario; see below). Assuming uniform and ubiquitous exposure at a specified mean level, based on a population-weighted exposure estimate, this simplifies to

$$\mathrm{PAF} = \frac{1.00(\mathrm{RR} - 1)}{1.00(\mathrm{RR} - 1) + 1}.$$

For this purpose, we treated pooled ORs obtained from meta-analysis as equivalent to RRs, and estimated the RR under the current relative to counterfactual PM2.5 exposure scenarios outlined above as

$$\mathrm{RR} \approx \mathrm{OR}_{10}^{(\mathrm{PM}_{2.5,\mathrm{current}} - \mathrm{PM}_{2.5,\mathrm{counterfactual}})/10},$$

where $\mathrm{OR}_{10}$ denotes the OR per 10-μg/m3 increment in PM2.5 exposure used. Calculations are detailed in Table S4.
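The PAF calculation can be traced with a minimal Python sketch under the stated assumptions (causality, OR treated as RR, log-linear exposure-response, 100% exposure prevalence). The pooled OR and the exposure and counterfactual concentrations are those quoted in this paper; the function names are hypothetical, and the sketch is not a reproduction of Table S4.

```python
def relative_risk(or_per_10, current, counterfactual):
    """RR for current vs. counterfactual PM2.5 (ug/m3), assuming a log-linear response."""
    return or_per_10 ** ((current - counterfactual) / 10.0)


def paf_uniform(rr, exposed_fraction=1.0):
    """PAF when a fraction of the population is exposed at a single mean level."""
    excess = exposed_fraction * (rr - 1.0)
    return excess / (excess + 1.0)


POOLED_OR = 1.102  # pooled OR per 10-ug/m3 PM2.5 increment (long-term exposure, depression)

# UK urban scenario: 12.8 ug/m3 reduced to the 10.0 ug/m3 WHO limit.
paf_uk = paf_uniform(relative_risk(POOLED_OR, 12.8, 10.0))

# Global scenario: 43.9 ug/m3 reduced to the 25.0 ug/m3 EU limit value.
paf_global = paf_uniform(relative_risk(POOLED_OR, 43.9, 25.0))
```

With the exposed fraction fixed at 1.00, the expression reduces to (RR − 1)/RR, so the illustrative PAF depends only on the pooled OR and the size of the assumed exposure reduction.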

Results
The study selection process is summarized in Figure 1. We identified 25 eligible studies [from 22 citations, 1 of which included four data sets combined in one paper as a meta-analysis (Zijlema et al. 2016)] (Tables 1-3) from 1,826 citations screened; most investigated outcomes were related to depression, suicide, or ED/hospital attendance. Two studies (Power et al. 2015; Pun et al. 2017) included both short- and long-term exposure lags; 14 studies assessed associations with short-term exposure only, and 9 assessed associations with long-term exposure only. One study (Wang et al. 2014) categorized with studies of short-term PM exposure in our review also studied longer-term exposure; however, we excluded these analyses because they used road proximity rather than PM exposure.
Sample size ranged from 958 to 69,966 in studies including both cases and noncases (e.g., cohort and cross-sectional studies). Those adopting case-only study designs included between 1,546 and 118,602 individuals. Two studies (Kioumourtzoglou et al. 2017;Power et al. 2015) included only women, and 5 (Kioumourtzoglou et al. 2017;Power et al. 2015;Pun et al. 2017;Lim et al. 2012;Wang et al. 2014) included only older adults, with a range of age cutoffs. The most common study locations were Asia and North America; 14 were restricted to urban populations. Most studies identified were relatively recent, with only 8 published before 2015. Six of these used case-only study designs, whereas a higher proportion published since then adopted cross-sectional or cohort designs.
Tables 1-3 summarize the key data extracted from included studies, divided according to the exposure timeframe, and Figure 2 summarizes study quality, outcomes studied, and inclusion in meta-analyses. Associations (ORs/RRs/HRs) are presented per 10-μg/m3 increment unless otherwise specified. Table S2 details the covariates included by each study in their primary adjusted models (although it is worth noting that case-only designs also adjust, by design, for all time-invariant confounders; i.e., those for which an individual's exposure remains constant over the timeframes studied).

Study Quality
Study quality varied substantially (Figure 2). Overall, we rated quality more highly in studies of short- than long-term exposure, partly due to the case-only designs' advantages regarding confounder adjustment (given that many important confounders, such as individual socioeconomic status, physical activity, smoking status, and area-level deprivation, do not vary over short timeframes when individuals act as their own controls), along with their narrower epidemiological scope (given that they can only compare transient, acute effects rather than provide evidence about the etiology of more chronic exposure-outcome relationships). We rated many studies fair or poor regarding selection bias because few selection strategies ensured that participants were very likely to be representative of the target population, and some reported participation rates <60% at enrollment, as in Zijlema et al. 2016 (Substudies A and C; LifeLines and HUNT, respectively), or did not report this [e.g., Kim and Kim 2017; Zijlema et al. 2016 (Substudy B; KORA); Wang et al. 2014].
We rated four of the studies of long-term PM exposure [Zijlema et al. 2016 (Substudies A-C; LifeLines, KORA, and HUNT) and Vert et al. 2017] as poor overall due to two or more poor component ratings. It should be noted that, because the EPHPP guidance assigns a poor global rating only to studies with two or more poor component ratings, some studies were rated moderate overall despite potentially significant weaknesses in a specific area. For example, we rated both the KORA and FINRISK studies [Substudies B and D in the meta-analysis by Zijlema et al. (2016)] poor on study design because in both cases exposure assessment was based on models created using monitoring data collected significantly after clinical outcomes were assessed, although using concurrent address data.
Studies investigating associations between long-term PM exposure and mental health outcomes (which tend to compare exposure and risk differences between individuals) are primarily susceptible to confounding by time-invariant confounders that may co-vary with both air pollution and mental health. Some studies failed to control for confounders that we considered likely to be particularly important (e.g., socioeconomic status/ deprivation, urbanity), as in the HUNT subcohort in Zijlema et al. (2016), which included neither of these, or did so only partially (e.g., Kim and Kim 2017;Vert et al. 2017). By contrast, several other studies controlled for socioeconomic status reasonably comprehensively (e.g., Kim et al. 2016;Kioumourtzoglou et al. 2017;Lin et al. 2017a;Pun et al. 2017;Power et al. 2015). Some other potentially influential confounders were rarely included in analyses; for example, no included study adjusted for access to green space and only the substudies by Zijlema et al. (2016) included noise. However, the results of these substudies had very wide confidence intervals (CIs), in analyses with and without adjustment for noise exposure, which the authors noted could have been due to model overfitting and comparatively low exposure variability.
Although relatively robust to time-invariant confounders, case-only studies (which compare changes in risk associated with short-term exposure within the included individuals, who all experience the outcome of interest over the study period) remain susceptible to time-varying confounders such as temperature (Page et al. 2007; Peng et al. 2017) and seasonal and long-term trends (Carracedo-Martínez et al. 2010). These were generally well controlled (see Table S2 for covariates adjusted for), leading to most case-only studies being assigned a confounding rating of good. However, temporal variation in noise levels was not adjusted for by any studies of short-term exposure.

Studies of Associations with Long-Term PM Exposure (≥6-Month Exposure Period)
We identified 8 studies of associations with long-term exposure (Tables 1 and 2), including 1 meta-analysis (Zijlema et al. 2016) that comprised 4 previously unpublished primary studies, each of which we considered individually, giving 11 in total. Two of these (Pun et al. 2017; Power et al. 2015) also investigated associations between short-term exposure and one or more eligible mental health outcomes. Of these 11 studies, 9 investigated associations of any eligible outcome with long-term PM2.5 exposure and 6 with long-term PM10 exposure (these totals include 4 studies that investigated both long-term PM2.5 and PM10 exposure; Tables 1 and 2).

Depression
Two eligible cohort studies (Kim et al. 2016; Kioumourtzoglou et al. 2017) and seven cross-sectional studies, including four substudies from Zijlema et al.'s paper [Kim and Kim 2017; Lin et al. 2017a; Pun et al. 2017; Vert et al. 2017; Zijlema et al. 2016 (Substudies A-D)], looked at associations between long-term PM exposure and depression. Outcomes included depression incidence (Kim et al. 2016; Kioumourtzoglou et al. 2017) and prevalence, based on either a score on a screening instrument indicating moderate-to-severe depressive symptoms [Pun et al. 2017; Zijlema et al. 2016 (Substudies C and D)] or meeting diagnostic criteria on a validated diagnostic instrument [Zijlema et al. 2016 (Substudies A and B)]. Another study (Lin et al. 2017a) used the validated World Mental Health Survey version of the Composite International Diagnostic Interview (WMH-CIDI) and defined depression as any of the following: depressive symptoms within the past year, meeting diagnostic criteria for research (based on WMH-CIDI), and/or having received depression-related health care. This study partially involved self-report of diagnosis or health care utilization; however, we rated outcome assessment moderate because these self-reported outcome data were used alongside a valid and reliable instrument.
Of the seven cross-sectional studies, five [Lin et al. 2017a; Zijlema et al. 2016 (Substudies A, B, and D); Pun et al. 2017] met criteria for inclusion in the primary meta-analysis of associations between long-term PM2.5 exposure and depression. Results of the sensitivity meta-analyses are shown in Table 4, and the reasons for exclusion of other studies from the meta-analyses are detailed in Table S5.
Long-term PM2.5 and depression. In meta-analysis, the pooled OR for the association between long-term PM2.5 exposure and depression prevalence (Figure 3; n = 5 studies) was 1.102 per 10 μg/m3 (95% CI: 1.023, 1.189; p = 0.011), indicating that higher PM2.5 exposure is associated with higher odds of depression. The I2 statistic was 0.00%, suggesting low statistical heterogeneity among the included studies. Publication bias was not evident in the funnel plot (Figure 4A), and this was further supported by the results of Egger's regression test for small-study effects (H0 = no small-study effects; p = 0.311).
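As a consistency check on the reported estimate, the p-value can be approximately recovered from the pooled OR and its confidence interval. The Python sketch below assumes a Wald-type normal approximation with a 95% CI that is symmetric on the log scale; the function name is hypothetical and the calculation is illustrative, not the authors' procedure.

```python
import math


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Return (SE of log ratio, z statistic, two-sided p-value) from a ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p under the standard normal
    return se, z, p


# Pooled OR per 10 ug/m3 and 95% CI reported for long-term PM2.5 and depression.
se_log_or, z_stat, p_value = p_from_ratio_ci(1.102, 1.023, 1.189)
```

Under these assumptions the recovered p-value is close to the reported p = 0.011, which indicates that the point estimate, interval, and p-value are mutually consistent.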
We conducted a sensitivity meta-analysis (Table 4, Row 1a) in which we also included HRs for PM2.5 exposure from the two observational cohort studies, the first of which found statistically significant positive associations with incidence of recorded depression diagnoses (Kim et al. 2016) and another that did not find a statistically significant association with self-report of diagnosis (Kioumourtzoglou et al. 2017). This gave a similar pooled estimated effect size but a wider confidence interval (OR = 1.113; 95% CI: 1.003, 1.234; p = 0.043) and evidence of somewhat greater between-study heterogeneity; I2 was slightly increased at 20.4%. Another PM2.5 sensitivity meta-analysis (Table 4, Row 1b) showed that including a study (Vert et al. 2017) that used self-reported depression and included a select population who were screened at study enrollment to ensure that they did not have major depressive disorder (potentially biasing this study) resulted in a less precise, statistically nonsignificant pooled estimate and increased heterogeneity compared with the primary PM2.5 depression meta-analysis (OR = 1.259; 95% CI: 0.874, 1.813; p = 0.217; I2 = 45.8%). Vert et al. (2017) found a positive OR of 4.38 per 5-μg/m3 PM2.5 (95% CI: 1.70, 11.30). In a third sensitivity meta-analysis (

Lags refer to mean exposure over the specified period relative to the point of outcome assessment, such that single lag 2 denotes exposure 2 d prior to outcome assessment, whereas cumulative lag 0-2 denotes exposure on the day of outcome assessment as well as on the preceding 2 d, and so forth.

c Significance refers to statistical significance at the 95% level.

Thomas et al. 2004), which was developed for evaluating public health research across heterogeneous study designs. We extended the list of study designs assigned a fair rating beyond those listed in the EPHPP Tool Dictionary. We rated all time-series analyses, case-crossover, and hierarchical cluster analyses as fair quality, provided case-crossover studies used bidirectional referent period selection. We rated all cross-sectional studies of associations with long-term PM exposure that used measured or modeled PM values from a period prior to (and not overlapping with) outcome assessment as fair to reflect the relative strength of these studies compared with typical cross-sectional studies with simultaneous exposure and outcome assessment; we rated those using an exposure period overlapping with or after outcome assessment as poor for study quality. We assigned overall ratings according to the EPHPP guidance (those with one poor rating were assigned an overall rating of fair; those with two or more, an overall rating of poor). See Table S3 for further information on the allocation of quality ratings for individual quality components. In the middle set of columns, the outcomes studied (indicated via tick marks) refer to any outcome related to the specified mental health outcome or diagnosis, such as physician diagnosis, meeting a specified threshold on a diagnostic scale, measures of symptom severity, hospital or ED attendance, and in the case of suicide, suicide attempts, ideation, or suicide death. Inclusion in primary meta-analyses (Figures 3, 5, and 6 and funnel plots in Figure 4) is indicated via tick marks in the right-hand set of columns; the numbers and letters of the sensitivity meta-analyses indicated via an S correspond to those detailed in Table 4.

Note: ED, emergency department; L-T, long term (≥6 months) PM exposure (exposure assessment period ≥6 months); N/A, not applicable; PM, particulate matter; PM10, particulate matter of <10 μm in aerodynamic diameter; PM2.5, particulate matter of <2.5 μm in aerodynamic diameter; S-T, short term (<6 months) PM exposure (exposure assessment period <6 months).

Long-term PM10 and depression. As Figure 5 shows, the overall (pooled) result for the primary PM10 meta-analysis (n = 3 studies, all from Zijlema et al. 2016) was not statistically significant at the conventional 5% level and showed no clear association. The pooled OR we found was 0.891 (95% CI: 0.504, 1.577) per 10 μg/m3 (p = 0.692), and I2 was 0.00% (p for heterogeneity = 0.638), again representing low heterogeneity. The funnel plot for this meta-analysis is shown in Figure 4B. In Egger's regression test for small-study effects, p = 0.128, suggesting no clear evidence of publication bias [although given the p < 0.10 threshold recommended by Egger et al. (1997), it cannot be definitively excluded]. Zijlema et al. (2016) used two PM10 exposure models; we included results based on the European Study of Cohorts for Air Pollution Effects (ESCAPE) study (rather than EU-wide) land-use regression model exposure estimates in this primary PM10 meta-analysis because these were described as region-specific estimates based on intensive monitoring campaigns and local predictors (e.g., traffic counts), and we judged them as likely to be more accurate and to capture a greater degree of air pollution variability in the study location than the EU-wide model.
In a sensitivity analysis of the PM10 meta-analysis, we tested the impact of using EU-wide PM10 land-use regression model results for the two eligible Zijlema et al. (2016) substudies for which these were available (LifeLines/Substudy A, where the corresponding main model OR was higher, and statistically significant; and KORA/Substudy B, in which it was lower, than the corresponding ESCAPE model results). In a further sensitivity analysis, we tested the impact of including two studies that we had excluded from the primary meta-analysis due to the unvalidated outcome definitions used (Vert et al. 2017; Kim and Kim 2017) and the exclusion of individuals with diagnosed depression at baseline by Vert et al. (2017). Vert et al. (2017) found a large statistically significant association between PM10 and self-reported depression disorders, with a wide confidence interval (OR = 6.52; 95% CI: 1.82, 23.35), whereas Kim and Kim (2017) did not find a statistically significant association between PM10 and depressive symptoms (OR = 1.01; 95% CI: 0.98, 1.05). Table 4 (Rows 2a,b) shows results of these sensitivity meta-analyses. We obtained positive but statistically nonsignificant pooled estimates with wide confidence intervals and moderate between-study heterogeneity in both sensitivity analyses (I2 values of 59.4% and 57.1%, respectively).
The HUNT substudy from Zijlema et al. (2016; Substudy C) was excluded from the PM10-depression meta-analysis and sensitivity analyses because it lacked the requisite minimal confounder set of indicators of urbanicity and socioeconomic status/deprivation. The extended confounder model for this study was adjusted

Table 4. Results of sensitivity analyses (random effects meta-analyses) of associations between long-term PM2.5 or PM10 exposure (exposure assessment periods of ≥6 months duration) and odds of depression.

Anxiety
Two studies (Power et al. 2015; Pun et al. 2017) investigated anxiety using validated screening instruments, and both individually found statistically significant positive associations between PM2.5 exposure and risk of anxiety symptoms above a threshold that was considered clinically relevant. Another study (Vert et al. 2017) included self-reported anxiety as an outcome but was excluded from primary meta-analyses (see Table S5 for details). We therefore did not conduct meta-analysis because there were too few eligible studies. Pun et al. (2017) found generally stronger associations between PM2.5 and moderate-to-severe anxiety symptoms [operationalized as a Hospital Anxiety and Depression Score-A (HADS-A) score ≥8] than did Power et al. (2015) using a cutoff of 6 or more points on the less commonly used Crown-Crisp Index (CCI) phobic anxiety subscale, although both were significant at a range of cumulative lags. Pun et al. (2017) reported adjusted ORs of 1.39 (95% CI: 1.15, 1.69) and 1.34 (95% CI: 1.12, 1.61) per 5-μg/m3 PM2.5 increment over 1- and 4-y cumulative lags, respectively, whereas Power et al. (2015) reported adjusted ORs per 10-μg/m3 increment of PM2.5 of 1.15 (95% CI: 1.06, 1.25) for a 1-y cumulative lag and 1.09 (95% CI: 1.01, 1.18) for cumulative exposure over the 15 y before outcome assessment in 2004 (average PM2.5 at participants' recorded residential address from 1988 to 2003). The CCI subscale has been validated as differentiating anxiety and phobias from depression and healthy controls (Crown and Crisp 1966; Mavissakalian and Michelson 1981), and the cutoff adopted was based on previous evidence that this represents a clinically important threshold (Okereke et al. 2012; McGrath et al. 2004); however, CCI scores may measure a different form of anxiety from HADS-A.
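Because the two anxiety studies report ORs per different PM2.5 increments (5 versus 10 μg/m3), a like-for-like comparison requires putting them on a common scale. The sketch below shows one common way to do this, under the assumption of a log-linear exposure-response relationship, so that the log OR scales in proportion to the increment. The helper `rescale_ratio` is a hypothetical name introduced here for illustration; the inputs are the 1-y cumulative lag estimates quoted above for Pun et al. (2017).

```python
import math

def rescale_ratio(ratio, from_increment, to_increment):
    """Rescale an OR/RR/HR to a different exposure increment (log-linear assumption)."""
    return math.exp(math.log(ratio) * to_increment / from_increment)

# Pun et al. (2017), 1-y cumulative lag: OR 1.39 (95% CI: 1.15, 1.69) per 5 ug/m3.
per_10 = [rescale_ratio(value, 5.0, 10.0) for value in (1.39, 1.15, 1.69)]
print("OR per 10 ug/m3: %.2f (95%% CI: %.2f, %.2f)" % tuple(per_10))
# prints: OR per 10 ug/m3: 1.93 (95% CI: 1.32, 2.86)
```

On this common scale the Pun et al. estimate can be set beside the Power et al. (2015) 1-y estimate of 1.15 per 10 μg/m3, although the two studies used different anxiety instruments and the rescaling is only as good as the log-linearity assumption.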

Other Outcomes
We did not identify any eligible studies investigating associations between long-term PM2.5 or PM10 exposure and bipolar disorder (or manifestations thereof, including mania). We also did not find any studies of associations between long-term PM2.5 or PM10 exposure and psychosis or suicide that met our inclusion criteria.

Population Attributable Fraction Estimates
We estimated illustrative PAFs for two scenarios (UK cities and global) based on the pooled estimate from our meta-analysis of associations between long-term PM2.5 exposure and depression, assuming causality and a log-linear exposure-response function (calculations detailed in Table S4). However, it should be noted that the observed relationship may be due to bias and/or confounding rather than a causal association.
This analysis suggested that if urban UK population-weighted PM2.5 exposure (12.8 μg/m3 in 2014) were reduced to the WHO's recommended annual maximum (10.0 μg/m3) (Krzyzanowski and Cohen 2008), city-dwelling UK citizens' depression risk could, potentially, be reduced by approximately 2.5% (95% CI: 0.58, 4.34%). If global mean population-weighted PM2.5 exposure (43.9 μg/m3) were reduced to a population-weighted mean PM2.5 exposure of 25.0 μg/m3 (the current EU-recommended limit), depression risk could be reduced by up to 15.2% (95% CI: 3.85, 25.9%).

Figure 3. Forest plot of meta-analysis of associations between long-term (≥6 months) PM2.5 exposure and depression risk (n = 5 studies). Results of meta-analysis are shown as pooled effect estimates of the OR of depression per 10 μg/m3 (95% CIs). The dashed vertical line indicates the overall effect estimate derived from DerSimonian-Laird random effects meta-analysis (DerSimonian and Laird 1986), and the blue diamond indicates the 95% CI of the overall (pooled) effect estimate. The horizontal lines indicate the 95% CI around each study's central estimate for the adjusted OR (shown with a closed circle); arrowheads at the end of these lines indicate where the true location of the end of a line is not shown (for scale reasons) and the upper or lower 95% CI is farther from the central estimate, in the direction of the arrowhead. The percentage weights are weightings assigned to individual studies' results in the DerSimonian-Laird random effects meta-analysis, and the sizes of the shaded squares around each effect estimate are scaled according to these relative weightings. The p-value of 0.972 shown at the bottom left is derived from a test of the null hypothesis of no heterogeneity (Cochran's Q). Covariates adjusted for are detailed in Table S2. Note: CI, confidence interval; OR, odds ratio; PM2.5, particulate matter of <2.5 μm in aerodynamic diameter.
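The logic behind an illustrative PAF of this kind can be sketched as follows. This is not the calculation in Table S4, which is not reproduced here; it is a simplified version resting on three assumptions introduced for illustration: that the pooled OR approximates a relative risk, that the exposure-response function is log-linear, and that the whole population shifts from the current mean concentration to the target. Under those assumptions the fraction of risk removed is 1 − 1/RR for the concentration difference. The function name `paf_for_reduction` is hypothetical, and the values it yields are of the same order as, but need not match, the percentages reported above.

```python
import math

def paf_for_reduction(or_per_10, current, target):
    """Fraction of risk removed if mean exposure falls from current to target (ug/m3).

    Assumes a log-linear exposure-response function, treats the OR per
    10 ug/m3 as a relative risk, and shifts the whole population equally.
    """
    delta = current - target
    rr = math.exp(math.log(or_per_10) * delta / 10.0)  # RR for the concentration difference
    return 1.0 - 1.0 / rr

# Pooled OR and 95% CI from the primary long-term PM2.5 depression meta-analysis.
pooled, lower, upper = 1.102, 1.023, 1.189
for label, current, target in (("UK urban", 12.8, 10.0), ("Global", 43.9, 25.0)):
    estimates = [paf_for_reduction(v, current, target) for v in (pooled, lower, upper)]
    print(label, ["%.1f%%" % (100.0 * e) for e in estimates])
```

Propagating only the CI of the pooled OR, as done here, ignores uncertainty in the exposure estimates themselves, so the resulting interval is best read as a rough lower bound on the true uncertainty.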

Studies of Associations with Short-Term PM Exposure (Exposure Lags <6 Months)
Sixteen studies investigated short-term associations between PM and eligible psychiatric outcomes (Tables 2 and 3 list study citations and data extracted). Twelve used case-only designs (e.g., case-crossover and time-series analysis), which are well suited for studying acute effects. Eleven investigated short-term associations between any eligible mental health outcome and PM2.5, and 13 investigated associations of eligible outcomes with PM10 (six analyzed both); one (Wang et al. 2014) also investigated associations with UFP exposure.

Suicide (Attempted or Completed)
Six studies (Bakian et al. 2015; Casas et al. 2017; C Kim et al. 2010; Y Kim et al. 2015; Lin et al. 2016; Ng et al. 2016) investigated associations between short-term PM exposure and risk of completed suicide (using mortality register data), and one study investigated associations with suicide attempt or ideation (Szyszkowicz et al. 2010). The results of a majority of eligible studies investigating short-term associations with risk of completed suicide and PM10 exposure indicated a positive association, with a statistically significant pooled result from primary meta-analysis at lag 0-2 d.
Only two studies investigated associations between completed suicide and PM2.5 at any lags, so meta-analysis of associations between PM2.5 and completed suicide was not possible. We were only able to meta-analyze studies of associations between PM10 and completed suicide at lags 0-1 d (n = 3) and 0-2 d (n = 4) because fewer than three studies included all other time lags studied. All studies considered eligible for inclusion in these meta-analyses reported relative risks, except for Lin et al.

Figure 2; those indicated by tick marks in the right-hand column are included in primary meta-analyses. Funnel plots for sensitivity meta-analyses, the results of which are detailed in Table 4, are not included in this figure. The dark blue circles represent the central estimates for each included study or substudy's results; the dashed diagonal lines represent pseudo 95% confidence intervals, and the solid vertical lines represent the natural logarithm of the overall effect estimate. Note: L-T, long-term (≥6 months) PM exposure (exposure assessment period ≥6 months); lnOR, natural logarithm of the odds ratio; lnRR, natural logarithm of the relative risk (both presented per 10 μg/m3 increase in PM10 or PM2.5 exposure); SE, standard error; S-T, short-term (<6 months) PM exposure (exposure assessment period <6 months).
A majority of the associations with completed suicide reported by individual studies were positive at a range of short-term lags, particularly for PM10, although not all were statistically significant. Only the study by Ng et al. (2016) reported no consistent direction of association. Although most of these studies focused on lag periods of <1 week, Kim et al. (2015) also found statistically significant associations between short-term PM10 concentrations and the nationwide weekly suicide rate at 2-, 3-, and 4-week lags, including a 3.6% increase (95% CI: 1.5, 5.7%) per 37.8-μg/m3 PM10 increment at a 4-week (single week) lag, but they found no statistically significant association at either the 5- or 6-week time lags studied.
In addition, Szyszkowicz et al. (2010) defined all ED visits coded as suicide attempt or ideation as cases and observed positive but nonsignificant associations for PM10 and PM2.5. However, unstratified results were presented only graphically; we requested numerical results, but these were not received and so could not be included in the meta-analyses.

Depressive Episode Exacerbation and Depressive Symptom Severity
Seven studies investigated associations between short-term PM10 or PM2.5 exposure and outcomes such as ED attendance for depressive episodes (Cho et al. 2014; Szyszkowicz 2007; Szyszkowicz et al. 2009, 2016) or depressive symptoms (Lim et al. 2012; Pun et al. 2017; Wang et al. 2014). Short-term increases in PM exposure appear to be positively associated with depression-related ED attendance risk, whereas the evidence regarding associations between short-term PM exposure and depressive symptom severity is more mixed. We did not undertake meta-analysis of associations of short-term PM exposure with these outcomes due to the heterogeneity of the time lags analyzed and a lack of numeric reporting of unstratified results in two studies of ED attendances, meaning that, for each lag considered, there were fewer than three studies using the same exposure lags. Most of the above studies found positive associations with PM exposure at some of the exposure lags studied, with the exception of Wang et al. (2014). Cho et al. (2014) observed statistically significant positive associations of PM10 exposure with ED attendance for depression at five of the six lags studied, whereas two studies (Szyszkowicz 2007; Szyszkowicz et al. 2009) both reported statistically significant positive associations between PM10 and ED attendance for depression at some lags but were excluded from the meta-analysis (see Table S5 for details). Szyszkowicz et al. (2016) found whole-year associations between depression-related ED attendances and PM2.5 that were positive (ORs >1) at a majority of the single-day lags, from 0 to 8 d (8 and 7 of the 9 lags studied, for males and females, respectively). However, they were only statistically significant in males at lag 3 d (OR = 1.023 per 7.12 μg/m3; 95% CI: 1.001, 1.045); results unstratified by sex were not reported. Pun et al. (2017) found positive associations between PM2.5 exposure and symptoms indicative of depression [operationalized as Centre for Epidemiologic Studies Depression (CESD)-11 score ≥9] at cumulative 7- and 30-day lags, with ORs of 1.08 (95% CI: 1.00, 1.16) and 1.16 (95% CI: 1.05, 1.29) per 5-μg/m3 increase in PM2.5 exposure, respectively. Wang et al. (2014) reported a significant negative association of 0-14 d cumulatively lagged PM2.5 with the odds of a CESD Scale-Revised (CESD-R) score ≥16 (OR = 0.67; 95% CI: 0.46, 0.98 per 3.4-μg/m3 increase in PM2.5). Lim et al. (2012) reported an increase in an elderly population's mean score on the Korean Geriatric Depression Scale-Short Form (Bae and Cho 2004) of 17.0% (95% CI: 4.9, 30.5%) per 24.2-μg/m3 increase in 0-3 d PM10.

Figure 5. Forest plot of meta-analysis of associations between long-term (≥6 months) PM10 exposure and depression risk (n = 3 studies). Results of meta-analysis are shown as pooled effect estimates of the OR of depression per 10 μg/m3 (95% CIs). The dashed vertical line indicates the overall effect estimate derived from DerSimonian-Laird random effects meta-analysis, and the blue diamond indicates the 95% CI of the overall (pooled) effect estimate. The horizontal lines indicate the 95% CI around each study's central estimate for the adjusted OR (shown with a closed circle); the arrowhead at the end of the line for Zijlema et al. 2016, Substudy A (LifeLines) indicates that the true end of this line is not shown (for scale reasons) and the lower 95% CI is farther from the central estimate. The percentage weights are weightings assigned to individual studies' results in the DerSimonian-Laird random effects meta-analysis, and the sizes of the shaded squares around each effect estimate are scaled according to these relative weightings. The p-value of 0.638 shown at the bottom left is derived from a test of the null hypothesis of no heterogeneity (Cochran's Q). Covariates adjusted for by individual studies are detailed in Table S2. Note: CI, confidence interval; OR, odds ratio; PM10, particulate matter of <10 μm in aerodynamic diameter.

Other Outcomes
No studies specifically investigated associations between PM and bipolar disorder. Gao et al. (2017) found associations of borderline statistical significance between PM10 and hospital admissions for all mood disorders (combined) of a 0.36% increase per 10-μg/m3 increment (95% CI: 0.06, 0.66%) at a 5-d single lag and statistically significant positive associations between PM10 and hospital admissions for schizophrenia, which appeared strongest at a 0-6 d cumulative lag with a 1.38% increase per 10-μg/m3 increment (95% CI: 0.52, 2.23%).

Discussion
Our systematic review identified a statistically significant association between long-term PM2.5 exposure and depression risk in our primary meta-analysis, which remained in two of three sensitivity analyses, whereas our primary and sensitivity meta-analyses of long-term PM10 exposure and depression risk did not find any statistically significant association. Meta-analysis of studies reporting associations between short-term PM10 exposure and completed suicide found an RR of 1.02 (95% CI: 1.00, 1.03) at a cumulative 0-2 d lag; the association at a 0-1 d lag was not statistically significant.

Figure 6. Forest plot of meta-analyses of associations between short-term PM10 exposure and risk of completed suicide (relative risk per 10 μg/m3), at cumulative lags 0-1 and 0-2 d. Results of meta-analysis are shown as pooled effect estimates for RR of completed suicide per 10 μg/m3 PM10 (95% CIs). Lag 0-1 refers to the cumulatively lagged values (moving average) of concentrations across Day 0 (the day of the outcome event) and Day −1 (the previous day), whereas Lag 0-2 refers to the cumulative lagged values across Days 0, −1, and −2. The dashed lines indicate the overall effect estimates, separately for each cumulative lag, derived from DerSimonian-Laird random effects meta-analysis, and the diamond indicates the 95% CI of the overall (pooled) effect estimate. The horizontal lines indicate the 95% CI around each study's central estimate for the adjusted RR at this exposure time lag (shown with a closed circle); the arrowhead at the right end of the line for Bakian et al. (2015) at lag 0-2 d indicates that the true location of the upper 95% CI for this study is farther from the central estimate. The percentage weights are weightings of the individual studies in the DerSimonian-Laird random effects meta-analysis, and the sizes of the shaded squares around each effect estimate are scaled according to these relative weightings. The p-values at the bottom left are from a test of the null hypothesis of no heterogeneity (Cochran's Q). The covariates that each study adjusted for are detailed in Table S2. Note: CI, confidence interval; PM10, particulate matter of <10 μm in aerodynamic diameter; RR, relative risk.
The more limited evidence available for anxiety suggests a possible association with increased PM exposure at both shorter- and long-term exposure timescales, although only two eligible studies (Power et al. 2015; Pun et al. 2017) investigated this. The effect sizes observed in the studies of associations between long-term PM2.5 exposure and depression and anxiety identified by our review are similar in magnitude to those observed for some physical health consequences of chronic air pollution exposure. For example, a 10-μg/m3 increase in long-term average PM2.5 exposure appears to be associated with an approximately 10% increase in all-cause mortality (Brook et al. 2010).
We also observed an association between short-term PM10 exposure and suicide risk at a 0-2 d cumulative time lag in meta-analysis, which was statistically significant at the 95% level. There is also some evidence to suggest an association between short-term PM2.5 or PM10 exposure and depression-related ED visits from individual studies, with several identifying statistically significant associations with ED attendances or depressive symptoms at some of the exposure lags studied, although their statistical significance may be affected by multiple comparisons. The smaller effect sizes observed in relation to short-term exposure relative to the larger magnitude of associations with long-term exposure are consistent with the results of studies of PM's physical health effects. In the review by Pope and Dockery (2006) of meta-analyses and multicity studies, a 10-μg/m3 increase in mean daily PM2.5 exposure was found to increase daily cardiovascular mortality risk by approximately 1.0%.
We identified no studies of associations with bipolar disorder and only one eligible study (Gao et al. 2017) of associations between PM and psychosis risk. That study found associations of borderline statistical significance between short-term PM exposure and inpatient admissions for mood disorders or schizophrenia. We excluded two studies of associations between short-term PM concentrations and psychosis from this review; the study by Lary et al. (2015) of ED admissions for schizophrenia reported only the strength of correlations with specific schizophrenia diagnostic categories, rather than comparative effect estimates, whereas the study by Tong et al. (2016) operationalized psychosis diseases as International Classification of Diseases (ICD)-10 codes F00-F99 (organic mental disorders; i.e., including psychiatric disorders with known biological causes such as brain tumors), and psychosis morbidity was undefined. In relation to associations between long-term PM exposure and psychosis risk, the two studies identified used proximity to a major road as a proxy for PM exposure, which did not meet inclusion criteria; we therefore excluded these studies from our review. An exploratory study by Pedersen et al. (2004) identified a possible association with psychosis based on proximity to a major road, although a subsequent study by Pedersen and Mortensen (2006) did not replicate this finding.
The estimated PAFs presented for PM2.5 exposure and depression are intended to illustrate the disease burden that might be attributable to the association between long-term PM2.5 exposure and depression observed in our meta-analysis if this relationship were established as causal and independent of the influence of potential confounding factors. These PAF estimates illustrate that this burden could, potentially, be similar in magnitude to some of air pollution's physical health effects (e.g., Burnett et al. 2014). However, such causality has not been established, and our meta-analysis is subject to several limitations, so they should be interpreted as illustrative estimates of the potential attributable burden should this association be further substantiated in future, not as representing the actual burden.

Strengths and Limitations
A key strength of this review is the comprehensive, systematic approach, with the protocol specified a priori, and searches conducted by two authors. We also used standardized data extraction and quality assessment procedures.
Our review covers a broader range of mental health outcomes than most previous reviews in this area, and we identified a moderate number of studies for some outcomes, enabling meta-analyses. However, we were only able to undertake three meta-analyses, together covering two of the five eligible outcomes, owing to the small numbers of studies with comparable exposure-outcome combinations for other mental health outcomes and incomplete reporting in two studies (Szyszkowicz 2007; Szyszkowicz et al. 2009). Our review includes 20 studies (including the 4 within Zijlema et al. 2016) that have been published since the search by Tzivian et al. (2015) of the literature up to November 2013. One further study of associations with psychosis, published since the review by Attademo et al. (2017), was identified (Gao et al. 2017).
An important corollary of the typically small individual-level effect sizes of changes in PM exposure in relation to physical health outcomes (per 10-μg/m3 increment) is that very large sample sizes may be needed to detect such effects if the same applies for mental health. This is particularly relevant where adjusting for a large number of confounders and in settings where pollutant levels and variability are comparatively low, as in Europe or the United States. As a result, it is likely that some included studies were underpowered to detect a putative effect on the outcomes studied (particularly because power calculations require prior assumptions about likely effect size), making the use of meta-analysis particularly valuable.
The distinction made between short- and long-term exposures on the basis of a 6-month cutoff is somewhat arbitrary, but this did not significantly influence the meta-analytic results. Only one study (Power et al. 2015) used exposure assessment periods of between 2 and 11 months' duration (3 and 6 months), and this study could not be included in the meta-analysis in any case due to the small number of eligible studies of this outcome (anxiety).
We did not detect evidence of publication bias but recognize that our review was restricted to published research in English-language journals. We also acknowledge the limitations of visual assessment of funnel plots for publication bias and that the power of statistical tests for small-study effects such as Egger's test is low when analyzing only a small number of studies, as here. Heterogeneity in both methods and outcome definitions is reflected in our use of random-effects meta-analysis but may limit the generalizability of the results obtained, particularly given the limited number of studies we were able to include and the heterogeneity of study populations and settings.
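To make the small-study test mentioned above concrete, the following is a minimal sketch of the regression underlying Egger's test: each study's standardized effect (log effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name `egger_intercept` and the input values are hypothetical and are not data from the included studies. The sketch returns only the intercept and its standard error; a p-value would come from comparing their ratio with a t distribution on n − 2 degrees of freedom, which is omitted here to keep the example free of dependencies.

```python
import math

def egger_intercept(log_effects, std_errors):
    """Return (intercept, standard error) of Egger's regression.

    Regresses standardized effects (y / se) on precision (1 / se) by
    ordinary least squares. Needs at least three studies with differing
    standard errors.
    """
    n = len(log_effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                        # precision
    y = [yi / se for yi, se in zip(log_effects, std_errors)]   # standardized effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)                             # residual variance
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept

# Hypothetical log odds ratios and standard errors for five studies.
b0, se_b0 = egger_intercept([0.10, 0.08, 0.14, 0.20, 0.05],
                            [0.05, 0.08, 0.10, 0.20, 0.06])
print("intercept = %.3f, SE = %.3f" % (b0, se_b0))
```

With only a handful of studies the intercept's standard error is typically large, which is one way of seeing why the test has low power in small meta-analyses.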
The low I2 value obtained in our primary meta-analysis of the association between long-term PM2.5 and depression suggests that statistical heterogeneity of study results was not significant; this was the case in two of the three sensitivity meta-analyses conducted. This increases our confidence in the appropriateness of pooling these studies and in the result of this meta-analysis; however, it is subject to other limitations discussed below, including possible residual confounding. In addition, although more statistically powerful than Cochran's Q, I2 has low power and can be biased as a test for heterogeneity in small meta-analyses (von Hippel 2015). I2 was higher in Sensitivity Analysis 1b, which included a study (Vert et al. 2017) that we considered at high risk of bias due to the overly restrictive eligibility criteria of the wider cohort study within which this study was nested (excluding participants with major depressive disorder at study enrollment). The suicide meta-analysis is perhaps less likely to be affected by heterogeneous outcome definitions than our depression meta-analysis, although some outcome misclassification is possible. However, we observed greater statistical heterogeneity in the meta-analyses of PM10 and suicide risk (at 0-1 and 0-2 d) than with both meta-analyses of long-term PM exposure and depression, reducing our confidence in the results of this meta-analysis. We hypothesize that this may reflect differences between countries in the likelihood of outcome misclassification with respect to suicide; however, there are multiple possible explanations.
Ratings for several of the EPHPP quality criteria require a degree of judgment; for example, validity and reliability are not binary constructs, so the component scores allocated for data collection methods necessarily reflect the authors' subjective assessment. Estimating whether a study has controlled for less or more than 80% (or 60%) of relevant confounders is similarly challenging and inherently somewhat subjective, particularly given incomplete scientific knowledge about causal relationships between some potential confounders, PM, and the different outcomes assessed. We took the view that studies could meet this threshold without adjustment for noise exposure, but it is possible that future studies will show noise to be a more significant risk factor for outcomes such as depression, anxiety, or suicide than is currently recognized. Because we did not restrict inclusion in meta-analysis based on EPHPP ratings, the tool's arbitrary quantification of confounding does not affect our meta-analytic results. However, due to limitations in the EPHPP tool's process for determining overall ratings and the small numbers of studies identified, our ability to explore the effect of variation in study quality on the meta-analytic results was limited.
The EPHPP tool is also not specifically designed for quality assessment of environmental epidemiology studies and therefore does not facilitate in-depth critical appraisal of air pollution exposure assessment methods. Nevertheless, there does not appear to be a more robust quality assessment tool applicable to a range of study designs.
Our findings are also affected by the limitations of the included studies, and particularly by the risk of residual and unmeasured confounding. Even in generally well-controlled studies of long-term exposure, such as those by Lin et al. (2017a), Pun et al. (2017), and Kioumourtzoglou et al. (2017), some degree of residual confounding due to socioeconomic status is likely. Noise exposure, especially from traffic, is an important potential confounder, particularly for studies of long-term PM exposure, because it is moderately spatially correlated with air pollution (Allen et al. 2009) and has been linked to psychological effects (Tzivian et al. 2015), including through sleep disturbance (e.g., Basner and McGuire 2018) and annoyance (e.g., Guski et al. 2017). Among the studies we identified, noise was only adjusted for in the four studies meta-analyzed by Zijlema et al. (2016). We acknowledge that adjustment for such collinear factors is difficult and can produce unstable parameter estimates, which may at least partially underlie the significantly wider confidence intervals reported by Zijlema et al. (2016) than in other included studies with comparable sample sizes. There is also a possibility that noise might act as an effect modifier of the relationship between PM and some psychiatric outcomes; however, this was not investigated by any eligible studies.
Green space is another possible confounder of a relationship between PM and at least some mental health outcomes given that it can directly improve air quality (Escobedo et al. 2011) and may reduce depression risk and improve mental health overall (Gascon et al. 2015; Cohen-Cline et al. 2015); however, it was not simultaneously adjusted for by any of the studies we identified.
However, several case-only studies with good time-varying confounder adjustment (as well as good adjustment for time-invariant confounders, by design) identified statistically significant associations between short-term PM exposure and suicide (C Kim et al. 2010; Y Kim et al. 2015) and psychiatric admissions (Gao et al. 2017), providing a further line of evidence from a different study design, over shorter timescales, that PM exposure may alter psychological outcomes.
Another limitation of many included studies is that few adjusted for other air pollutants that are often spatiotemporally correlated with PM (and therefore inhaled alongside it) and which may also have physiological effects, including neuroinflammation. Most studies constructed only single-pollutant models, meaning that they could not exclude either confounding or effect modification by other pollutants such as sulfur, ozone, or nitrogen oxides. It was beyond the scope of our review to include all of the major air pollutants, although some studies that met our inclusion criteria considered exposure to multiple air pollutants and found statistically significant associations with ozone (e.g., Lim et al. 2012; Szyszkowicz et al. 2016).
Biases in routine data (for example, regarding underreporting of suicide, case ascertainment in emergency departments, or approximating outcome incidence with hospital attendance) may further limit some studies' generalizability. The accuracy of register-based mortality data is affected by cultural and health system-related factors, as Afshari (2017) discussed in relation to the study by Lin et al. (2016), in which underreporting of suicide appears pronounced. Nevertheless, routine data sets often represent the most complete data available, and although associations with registered suicides may not generalize fully to unregistered suicides, differential case ascertainment (with respect to PM exposure) appears unlikely.
Exposure misclassification is a common problem in air pollution epidemiology (e.g., Zeger et al. 2000), and a weakness of some studies included in our review is in the timing of exposure assessment. In particular, the KORA and FINRISK substudies (B and D) of Zijlema et al. (2016) assessed outcomes before the intensive monitoring campaigns used in developing the ESCAPE exposure model, which undermines the accuracy of exposure assessment, limits the strength of this evidence regarding a temporal relationship, and makes it unclear which exposure-response lag is under investigation.
Area-level mean concentrations were used to estimate individual exposure in all case-only studies and one cross-sectional study, which used values for 25 districts (Kim and Kim 2017). Most studies of long-term exposure comparing risk between individuals estimated individual exposure using land-use regression models to determine approximate concentrations at study participants' residential addresses as a surrogate for their long-term average exposure, at varying resolutions. High-resolution estimates are arguably less important in case-only studies investigating risk within individuals because spatial pollution gradients are typically correlated over time (Jerrett et al. 2005). Individuals' activity patterns also increase exposure misclassification; however, personal exposure monitoring remains prohibitively expensive for large-scale studies. Such exposure misclassification is more likely to be nondifferential with respect to psychiatric outcomes (and so bias results toward the null), necessitating larger samples to detect effects. In addition, area-level ambient air pollution exposure measures may introduce less risk of confounding by time-invariant factors such as socioeconomic status than personal exposure monitoring (see Weisskopf et al. 2015), potentially making this a less significant limitation than might otherwise be assumed. Susceptibility to mental health effects of PM may also vary between individuals by, for example, age (e.g., Casas et al. 2017; Gao et al. 2017), sex (Bakian et al. 2015; Gao et al. 2015), and socioeconomic status (Pun et al. 2017) and potentially by genotype (e.g., Morales et al. 2009; Vrijheid et al. 2012). Lin et al. (2017a) also identified between-country differences in exposure-response functions. Furthermore, physical comorbidities may confound and/or mediate the associations observed, for example, through inflammation (Berk et al. 2013; Dantzer et al. 2008) or psychological effects of physical symptoms (Voll-Aanerud et al. 2008).
Some included studies observed stronger associations among individuals with chronic illnesses (Cho et al. 2014; C Kim et al. 2010; KN Kim et al. 2016; Pun et al. 2017), although two (Kioumourtzoglou et al. 2017; Lim et al. 2012) investigated this but did not find a difference.
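The point made above, that nondifferential exposure misclassification tends to bias estimates toward the null, can be illustrated with a small simulation. This is a deliberately simplified sketch: it assumes a continuous outcome, a linear exposure-response relationship, and classical (additive, independent) measurement error, none of which is taken from the included studies. Berkson-type error, which can arise when an area-level mean is assigned to individuals, behaves differently and is not modeled here. All names and parameter values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(beta=0.5, error_sd=1.0, n=20000, seed=0):
    """Return (slope using true exposure, slope using mismeasured exposure)."""
    rng = random.Random(seed)
    # True exposure (standardized) and an outcome that depends on it.
    true_exposure = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outcome = [beta * x + rng.gauss(0.0, 1.0) for x in true_exposure]
    # Measured exposure = truth + error that is unrelated to the outcome.
    measured = [x + rng.gauss(0.0, error_sd) for x in true_exposure]
    return ols_slope(true_exposure, outcome), ols_slope(measured, outcome)


slope_true, slope_measured = simulate_attenuation()
```

Under these assumptions the slope estimated from the mismeasured exposure is expected to shrink by roughly the ratio of true exposure variance to total measured variance (about one half with the defaults above), which is why larger samples are needed to detect an effect of a given size.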

Potential Biological Mechanisms
There is good evidence from human and animal studies that PM exposure induces oxidative and nitrosative stress and systemic and neuroinflammation (Block and Calderón-Garcidueñas 2009; Levesque et al. 2011) as well as being directly neurotoxic and associated with structural brain changes (e.g., Block et al. 2012; Calderón-Garcidueñas et al. 2004, 2015). It has also been found to affect stress hormone production (e.g., Li et al. 2017). A range of biological mechanisms, with these being among the most likely candidates, may underlie a putative association between PM and mental health outcomes.
Neuroinflammation appears to play an important role in both depression (Liu et al. 2012; Dantzer et al. 2008) and psychosis (Barron et al. 2017), and it may be involved in both shorter-term and more chronic effects of air pollution on mental health. By contrast, neurotoxicity and changes in brain structure are, by their nature, more likely to be involved in chronic effects and may be particularly important in relation to exposures in early life and adolescence. Inflammation and oxidative and nitrosative stress have been hypothesized as potential mediating pathways for other risk factors for depression, including psychosocial stressors, physical inactivity, obesity, and lack of sleep, and there is evidence from administration of cytokine infusions and antidepressants' effects, for example, that the relationship goes beyond simple correlation (see Berk et al. 2013). Allen et al. (2014) found that early exposure of mice to concentrated UFPs resulted in changes in CNS neurotransmitter levels and glial activation, whereas Fonken et al. (2011) found that PM2.5-exposed mice displayed more depressive-like responses and higher levels of pro-inflammatory cytokines than mice exposed to filtered air. Calderón-Garcidueñas et al. (2008) found an increased prevalence of prefrontal white matter lesions on brain magnetic resonance imaging (MRI) scans among children from a highly polluted area of Mexico City compared with children from a rural area, mirroring comparative MRI findings in young dogs. On histological examination, dogs exposed to high pollution were also found to have ultrafine PM deposition, gliosis, and vascular pathology associated with neuroinflammation. These may indicate mechanisms relevant to mental health impacts; for example, Chai et al. (2011) described differences in prefrontal cortex functional connectivity in patients with schizophrenia and bipolar disorder compared with controls, whereas Frodl et al. (2010) found reduced prefrontal volume to be associated with depression duration.
Disruption of the HPA axis, which regulates the body's stress responses through the production of hormones such as cortisol, has emerged as a potentially important etiological factor in anxiety and depression (e.g., Zorn et al. 2017; Lopez-Duran et al. 2009). Supporting the possibility that PM exposure may act through this pathway, a robust association between indoor air pollution exposure and participants' serum cortisol levels was reported in the randomized double-blind crossover trial by Li et al. (2017) using indoor air purifiers.

Implications for Future Research and Policy
We suggest that a combination of the biological mechanisms discussed above is likely to be involved in mediating any effects of PM on mental health, if a causal association exists. However, at present, the precise pathways involved, and their relative importance across different timeframes and psychiatric conditions, are not well established. Further primary research (for example, through animal studies; quasi-experimental or, where they can be conducted ethically, experimental studies involving humans; and the integration of toxicological and epidemiological methods in the same studies) is also needed to help clarify the mechanisms involved. Future studies should also seek to build on current evidence by better addressing key potential environmental confounders such as traffic noise, access to green space, socioeconomic status, and chronic health conditions.
Mental health outcomes below a clinical threshold, such as stress and subjective well-being, were beyond the scope of our review but may have a comparable or even greater population health impact than diagnosable mental disorders. An innovative recent study by Zheng et al. (2019) analyzed 210 million geotagged tweets from the social media platform Sina Weibo; they found a statistically significant negative association between PM2.5 and the expressed happiness in the tweets' content. Moreover, this association persisted when analyzing only exogenous PM2.5 variation (pollution from neighboring cities), mitigating the risk of confounding by local factors such as noise or congestion. The use of such innovative study designs is likely to be crucial in answering as-yet unanswered questions about air pollution's mental health effects, including around causality.
The generalizability of our findings is restricted by methodological limitations in individual studies, methodological heterogeneity, the relatively small number of studies, and the possibility of publication bias. In view of these limitations and the potentially significant mental health risks that may be posed by PM exposure, particularly over the longer term, further high-quality studies are warranted to more fully investigate the exposure-response relationships and mechanisms involved. In particular, larger-scale longitudinal studies using validated outcome measures, representative samples, and improved adjustment for area-level confounders such as noise and green space are needed to further interrogate the nature and potential causality of the associations observed.
Our results appear to support the hypothesis of an association between PM exposure and multiple adverse mental health outcomes, most clearly depression, for which we observed a statistically significant positive association in meta-analysis (pooled OR = 1.10 per 10-μg/m3 PM2.5 increment; 95% CI: 1.02, 1.19). The possibility of causal associations between air pollution and adverse mental health outcomes raises equity concerns given that deprivation is known to adversely affect mental health (e.g., Fone and Dunstan 2006), whereas poorer areas often have more polluted air (Jerrett 2009). There are, nevertheless, valuable opportunities to reduce the burden of both mental and physical ill health and health inequalities simultaneously through well-designed policies to improve air quality, such as by promoting active travel and urban green spaces, which also have psychological benefits (Gascon et al. 2015; Woodcock et al. 2009).
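For readers who wish to compare the pooled estimate with studies reporting effects per a different exposure increment, the sketch below shows one common way to rescale an odds ratio and its confidence limits. It assumes a log-linear exposure-response relationship across the range considered, which is an assumption of the illustration rather than a finding of this review; the function name is hypothetical.

```python
import math


def rescale_odds_ratio(or_value, ci_low, ci_high, old_increment, new_increment):
    """Rescale an OR and its CI from one exposure increment to another.

    Assumes the log odds change linearly with exposure, so the log OR
    (and the log of each confidence limit) scales with the increment.
    """
    factor = new_increment / old_increment
    return tuple(
        math.exp(math.log(value) * factor)
        for value in (or_value, ci_low, ci_high)
    )


# Pooled estimate reported above (per 10 ug/m3), expressed per 5 ug/m3.
or_per_5, low_per_5, high_per_5 = rescale_odds_ratio(1.10, 1.02, 1.19, 10.0, 5.0)
```

Rescaling in this way changes only the units in which the association is expressed; it does not alter its statistical significance.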