Top 10 Research Priorities in Spatial Lifecourse Epidemiology

Summary: The International Initiative on Spatial Lifecourse Epidemiology (ISLE) convened its first International Symposium on Lifecourse Epidemiology and Spatial Science at the Lorentz Center in Leiden, Netherlands, 16–20 July 2018. Its aim was to further an emerging transdisciplinary field: Spatial Lifecourse Epidemiology. This field draws from a broad perspective of scientific disciplines including lifecourse epidemiology, environmental epidemiology, community health, spatial science, health geography, biostatistics, spatial statistics, environmental science, climate change, exposure science, health economics, evidence-based public health, and landscape ecology. The participants, spanning 30 institutions in 10 countries, sought to identify the key issues and research priorities in spatial lifecourse epidemiology. The results published here are a synthesis of the top 10 list that emerged out of the discussion by a panel of leading experts, reflecting a set of grand challenges for spatial lifecourse epidemiology in the coming years. https://doi.org/10.1289/EHP4868


Introduction
The rapid increase in noncommunicable diseases (NCDs) poses a significant and growing global public health threat. According to the WHO, NCDs will be the leading cause of deaths in all world regions by 2030 (WHO 2014). Prevention and control will require addressing the many multifactorial and interrelated drivers of NCDs, which are thus far insufficiently understood. The drivers of NCDs can manifest at multiple levels, from individual-level risk factors and behaviors to more upstream determinants such as area-level socioeconomic conditions and environmental exposures (Krieger 2001). For example, chronic exposure to physical environment factors such as temperature, air pollutants, and greenness influences health behaviors (e.g., physical activity) (Durand et al. 2011) and the risk of chronic health outcomes (e.g., obesity, cancer, cardiovascular diseases) (James et al. 2015a; Jia et al. 2019b) and mortality (Chen et al. 2013). Spatial science, which includes the use of increasingly available spatial and location-aware technologies such as geographic information systems (GIS), remote sensing systems (RS), and global positioning systems (GPS), can help to identify various factors that influence NCDs. Spatial science enables the assessment of exposures or their proxies over long time frames, which can yield valuable measures of the exposome, that is, the totality of an individual's environmental and lifestyle exposures over the life course (Wild 2005).
Spatial lifecourse epidemiology is an emerging area of science that seeks to investigate the life course effects of environmental and other spatial factors (e.g., spatial accessibility) on individual behaviors and health outcomes at high spatiotemporal resolution, accuracy, and precision (Jia 2019). Spatial lifecourse epidemiology is emerging as an enabling field for the exposome during an era of increased availability of geographic and epidemiologic data, transdisciplinary collaboration, and broad global investments in team science initiatives (Jia and Wang 2019). Scientists engaged in spatial lifecourse epidemiologic research seek to harness this opportunity and infrastructure to make progress toward critical strategic global health goals such as the United Nations' Sustainable Development Goals (e.g., good health and wellbeing, reduced inequalities, sustainable cities and communities) (United Nations 2015) while maximizing cross-disciplinary integration and innovation.
The International Initiative on Spatial Lifecourse Epidemiology (ISLE) convened its First International Symposium on Lifecourse Epidemiology and Spatial Science at the Lorentz Center in Leiden, Netherlands, 16-20 July 2018, to further the science of spatial lifecourse epidemiology. The symposium drew on perspectives from a wide range of disciplines including lifecourse epidemiology, environmental epidemiology, community health, spatial science, health geography, biostatistics, spatial statistics, environmental science, climate change, exposure science, health economics, and landscape ecology. The workshop was cosponsored by the Lorentz Center, Netherlands Organization for Scientific Research, and the Royal Netherlands Academy of Arts and Sciences. The participants, spanning 30 institutions in 10 countries, sought to identify key issues and research priorities in spatial lifecourse epidemiology, and all agreed to strive toward a common language and research agenda for understanding one another better to make true progress. The following list is a synthesis of the top 10 priorities (in random order) that emerged out of the discussion and represented the consensus of perspectives from leading scientists in multiple fields related to spatial lifecourse epidemiology.
1. Create life course spatial exposure metrics
2. Define and operationalize composite and cumulative exposure concepts
3. Improve personalized exposure assessment in prospective studies
4. Understand the role of residential self-selection
5. Tap into emerging Big Data streams to capture spatial exposure and behavior information
6. Facilitate the development and use of complex systems models
7. Increase transdisciplinary collaboration to capitalize on innovative data and methods
8. Examine and address health equity
9. Expand the scope and scale of research from local and regional to national and global
10. Safeguard privacy while ensuring research needs

Create Life Course Spatial Exposure Metrics
Incorporating spatial data into prospective cohort studies has led to many important findings in the field of epidemiology. The geocoding of participant addresses in prospective cohorts, such as the American Cancer Society Study and the Nurses' Health Studies in the United States (Hart et al. 2015; Krewski et al. 2000), has contributed to our understanding of how air pollution affects health, and these findings have driven national regulatory policies and decision making (Gilliland et al. 2017). As more cohorts incorporate spatial data, the workshop consensus was that we must develop toolkits that make it easier to geocode and merge spatial information into prospective cohort studies. Establishing geocoded residential addresses over decades of participants' lives and linking these geocoded data to temporally matched spatial data sets can enable researchers to examine the effects of exposures over the life course. In cases where residential addresses were not collected as part of the original study design, collecting retrospective residential address histories would enable researchers to reconstruct life course exposures to a range of factors distributed spatially. It should be noted that high-resolution spatial data are sometimes not available. For instance, coarse- (e.g., 1 km) and medium-resolution (e.g., 30-80 m) satellite data have been available only since the 1970s, and high-resolution data (<10 m), only since 1999 (Jia et al. 2019a). Data gaps may be filled by using system dynamics models to extrapolate the existing spatial data onward, but many challenges will remain.
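The linkage of residential histories to temporally matched spatial data described above can be illustrated with a minimal sketch. It assumes a hypothetical participant with two geocoded residences and a hypothetical table of annual mean exposure values per location; all names and numbers are invented for illustration and do not come from any cohort. The function weights each annual value by the number of days lived at that location in that year and reports the days for which no spatial data were available.

```python
from datetime import date

# Hypothetical residential history for one participant:
# (move-in date, move-out date, location identifier obtained from geocoding).
residences = [
    (date(1990, 1, 1), date(2000, 1, 1), "A"),
    (date(2000, 1, 1), date(2010, 1, 1), "B"),
]

# Hypothetical annual mean exposure (arbitrary units) by (location, year).
annual_exposure = {("A", year): 20.0 for year in range(1990, 2000)}
annual_exposure.update({("B", year): 10.0 for year in range(2000, 2010)})


def days_in_year(start, end, year):
    """Number of days of the interval [start, end) that fall in the given year."""
    lo = max(start, date(year, 1, 1))
    hi = min(end, date(year + 1, 1, 1))
    return max((hi - lo).days, 0)


def life_course_mean_exposure(residences, annual_exposure):
    """Day-weighted mean exposure over all residences.

    Returns (mean, covered_days, missing_days); the mean is None when no
    residence-year could be matched to a spatial data value.
    """
    weighted_sum = 0.0
    covered_days = 0
    missing_days = 0
    for start, end, location in residences:
        for year in range(start.year, end.year + 1):
            days = days_in_year(start, end, year)
            if days == 0:
                continue
            value = annual_exposure.get((location, year))
            if value is None:
                # No temporally matched spatial data for this residence-year.
                missing_days += days
                continue
            weighted_sum += value * days
            covered_days += days
    mean = weighted_sum / covered_days if covered_days else None
    return mean, covered_days, missing_days


mean_exposure, covered_days, missing_days = life_course_mean_exposure(
    residences, annual_exposure
)
print(mean_exposure, covered_days, missing_days)
```

Reporting the unmatched days separately, rather than silently dropping them, keeps the data gaps noted above visible in the resulting metric.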
Many countries, especially low- and middle-income countries with poor infrastructure, face difficulties in accurate geocoding, which is in large part due to a lack of a well-established nationwide address system. A nationwide address system can facilitate the conversion of questionnaire-derived addresses into geographic coordinates (x and y) in a GIS. Until geocoding is supported in these settings, researchers can integrate location-aware technologies (e.g., handheld GPS units, smartphone location services) in order to collect location data of study participants, especially when combined with locator data in questionnaire-based surveys (e.g., querying for the floor level of a building in questionnaires in densely populated areas where multiple households live in multi-family dwellings).
There are several excellent examples of large population-based birth cohorts, yet none to date have followed individuals for their entire life span. These efforts have almost exclusively been implemented in high-income countries where there is strong government funding and support. Even where large population-based studies are underway, retrospective data on exposures and other factors may be plagued by major errors due to recall bias. The workshop consensus was that building capacity for prospective data collection was vitally important and should be consistent with the emerging directions of health systems, including a shift in focus from patient health to population health, and from a focus on disease episodes to the full life course.

Define and Operationalize Composite and Cumulative Exposure Concepts
As the volume of data and the number of data sources continue to grow, researchers examining exposure to specific environmental characteristics face difficult choices about how to define exposure. Some of these choices include questions such as the following: What exposure measure to use? How to operationalize that measure? What area of exposure to choose? Defining exposure to, for instance, the food environment requires choices regarding the type of food retailers under study (e.g., restaurants, grocery stores, local food shops), the metrics used to quantify these retailers, and the definition of the area of interest. There are many options, and we know that these choices matter: The degree of exposure often differs greatly according to the definition and metrics used (Pinho et al. 2019). The use of varying definitions of exposure to the environment has even been posed as a potential explanation for the inconsistent findings in studies that, for instance, focus on environmental determinants of obesity (Mackenbach et al. 2014). To move the field forward it is essential to get a better grasp on how to best define and operationalize exposures (Kwan 2012; Openshaw 1984).
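How much the measured degree of exposure can depend on these definitional choices may be seen in a small sketch. It assumes a hypothetical home location and four hypothetical food retailers in planar coordinates (meters); the retailer types, buffer radii, and metric names are invented for illustration and are not taken from the cited studies.

```python
import math

# Hypothetical home location and food retailers in planar coordinates (meters).
home = (0.0, 0.0)
retailers = [
    ("grocery", 300.0, 300.0),
    ("restaurant", 100.0, 0.0),
    ("restaurant", 600.0, 700.0),
    ("grocery", 2000.0, 0.0),
]


def distance_m(a, b):
    """Straight-line distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def count_within(home, retailers, radius_m, kinds=None):
    """Count retailers of the selected kinds inside a circular buffer."""
    return sum(
        1
        for kind, x, y in retailers
        if (kinds is None or kind in kinds)
        and distance_m(home, (x, y)) <= radius_m
    )


def nearest_distance(home, retailers, kinds=None):
    """Distance to the nearest retailer of the selected kinds, or None."""
    distances = [
        distance_m(home, (x, y))
        for kind, x, y in retailers
        if kinds is None or kind in kinds
    ]
    return min(distances) if distances else None


# Four defensible operationalizations of "food environment exposure"
# for the same person and the same retailers.
metrics = {
    "all retailers within 500 m": count_within(home, retailers, 500.0),
    "all retailers within 1000 m": count_within(home, retailers, 1000.0),
    "groceries within 500 m": count_within(
        home, retailers, 500.0, kinds={"grocery"}
    ),
    "distance to nearest grocery (m)": nearest_distance(
        home, retailers, kinds={"grocery"}
    ),
}
for name, value in metrics.items():
    print(name, value)
```

Each of the four values is a legitimate answer to the question of how exposed this person is, which is why reporting the chosen retailer types, metric, and area definition is needed to compare findings across studies.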

Improve Personalized Exposure Assessment in Prospective Studies
The majority of studies examining associations between spatial factors and health assume that the area around a residential address is a reasonable proxy for exposure (James et al. 2015a). However, this assumption is widely acknowledged as invalid given that studies show that, for instance, only 6% of daily activities occur in the residential census tract, 21% in adjacent census tracts, and 73% take place in other parts of the city (Matthews et al. 2005). By examining only residential addresses, we miss a major piece of the puzzle: daily time-activity patterns, that is, the places and environments that we visit throughout the day (e.g., the workplace), are ignored (Perchoux et al. 2013).
Through questionnaires, web-based approaches, time-activity diaries, or GPS-based methods, we can establish time-activity patterns and develop spatial exposure metrics that capture an individual's exposure as they move through time and space. For example, assessing personalized exposure to environmental air pollution relies both on measurements of the ambient levels of air pollution from spatially resolved models and on tracking individuals' location, their activity patterns, and their behavior. Traditionally, the quantification of human exposure to air pollution has relied on static population distributions and pollutant concentrations from ground-based data (obtained at fixed air quality network sites), satellite-derived products, or both, usually combined with land use regressions to model air pollution levels across space. New developments in sensor and GPS technology, although not perfect in the sensitivity and accuracy, may enable monitoring of personal exposure to air pollutants directly while people move through their activity spaces and varying concentration fields (Steinle et al. 2013). Some cutting-edge spatial exposure assessment approaches can also be used to estimate finer-scale physical and social environmental exposures such as dispersion models (Özkaynak et al. 2013), chemistry transport models, hybrid models (e.g., dispersion chemistry transport models) (Hennig et al. 2016), Google Street View™ cars with air pollution sensors for air pollution exposure (Apte et al. 2017), and social network analysis for social environmental exposure (Fowler and Christakis 2008). In addition, measuring the dynamic aspects of the exposure history itself (e.g., duration or time-varying intensities of exposure) used to be difficult in risk modeling and lifecourse epidemiology (Vermeulen and Chadeau-Hyam 2012), but, according to the workshop consensus, is increasingly feasible in spatial lifecourse epidemiology and should be encouraged.
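The contrast between a residence-based proxy and a time-activity-based estimate can be sketched as follows. The example assumes a hypothetical one-day time-activity diary and hypothetical pollutant concentrations for each microenvironment; the places, hours, and values are invented for illustration.

```python
def time_weighted_concentration(visits, concentration):
    """Time-weighted mean concentration over a set of visits.

    visits: list of (place, hours spent); concentration: place -> value.
    """
    total_hours = sum(hours for _, hours in visits)
    if total_hours <= 0:
        raise ValueError("visits must cover a positive amount of time")
    weighted = sum(concentration[place] * hours for place, hours in visits)
    return weighted / total_hours


# Hypothetical diary for one day (hours sum to 24).
visits = [("home", 14.0), ("commute", 2.0), ("work", 8.0)]

# Hypothetical modeled concentrations (arbitrary units) at each place.
concentration = {"home": 10.0, "commute": 40.0, "work": 20.0}

personal_exposure = time_weighted_concentration(visits, concentration)
residential_proxy = concentration["home"]

print(personal_exposure, residential_proxy)
```

In this invented case the residential proxy understates the time-weighted estimate because the hours spent commuting and at work occur at higher concentrations; with different values the bias could run in the other direction.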

Understand the Role of Residential Self-Selection
A critical issue with research linking environmental characteristics to behavioral or health outcomes is that cause and effect are difficult to disentangle. For instance, if a researcher is interested in how the amount or quality of greenspace influences physical activity levels, cross-sectional analyses may not provide much insight. In a concept known as residential self-selection, it may well be that individuals who care more about health are more likely to spend time in or move to more healthy areas rather than the other way around (Diez Roux 2004). Although some have suggested that self-selection bias is of limited importance in some specific contexts (James et al. 2015b; Sallis et al. 2009), it needs to be further explored.
Some nonexperimental research designs used by health economists may help mitigate the residential self-selection issue. These include natural experiments, driven by government regulation; unexpected changes in industrial production, or catastrophic events, resulting in unexpected shocks to environmental quality; high-frequency short-term variations in environment, assuming residential self-selection in response to environmental changes occurs more slowly than health changes (e.g., exploring the effect of day-to-day change in air pollution); and within-family designs, controlling for observed and unobserved family characteristics that may confound the associations between environmental exposure and health status of siblings (X Zhang, personal communication) (Currie et al. 2014; Zhang et al. 2018). Analysis methods that take self-selection into account are also being explored, for instance, including variables in the models that may adjust for the likelihood that someone's choice to live somewhere is linked to the outcome or independent variable of interest (Mackenbach et al. 2018). Propensity score matching techniques have also been proposed to control for residential preference and nonrandom selection into specific neighborhoods (Root and Humphrey 2014).
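The logic of a within-family design can be shown with a toy calculation. The sketch below assumes four hypothetical siblings in two families, where the outcome rises by 2 units per unit of exposure and an unobserved family-level factor adds 0 (family f1) or 10 (family f2); the data are invented so that the higher-exposure family also has the higher family-level factor. It compares an ordinary least-squares slope with a slope computed after subtracting family means, which is one simple way to implement a sibling comparison and is not necessarily the estimator used in the cited studies.

```python
def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def within_family_slope(records):
    """Slope after removing family means from exposure and outcome.

    records: list of (family_id, exposure, outcome).
    """
    families = {}
    for family_id, exposure, outcome in records:
        families.setdefault(family_id, []).append((exposure, outcome))
    dx, dy = [], []
    for members in families.values():
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        for x, y in members:
            dx.append(x - mean_x)
            dy.append(y - mean_y)
    sxx = sum(d * d for d in dx)
    return sum(a * b for a, b in zip(dx, dy)) / sxx


# Hypothetical siblings: outcome = 2 * exposure + family effect (0 or 10).
records = [
    ("f1", 1.0, 2.0),
    ("f1", 2.0, 4.0),
    ("f2", 5.0, 20.0),
    ("f2", 6.0, 22.0),
]

naive = ols_slope([x for _, x, _ in records], [y for _, _, y in records])
within = within_family_slope(records)
print(naive, within)
```

In this constructed example the naive slope is inflated by the family-level factor, whereas the within-family slope recovers the value of 2 built into the data, because anything shared by siblings cancels when family means are subtracted.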

Tap into Emerging Big Data Streams to Capture Spatial Exposure and Behavior Information
The Big Data revolution in medical, environmental, and population registers; advances in personalized sensors; and vast, new data from social and other media can help us to better capture, and understand, the complex interactions between environmental stressors and human health. The workshop consensus was that Big Data was not simply the use of large data sets, but the critical and statistical synthesis and analysis of massive data sets to reveal value greater than the sum of the individual parts. The recent availability of Big Data from sources such as loyalty cards, smartphones, wearables, web-enabled devices, social media, and technology companies holds immense potential to gather spatially referenced data at unprecedented volumes and velocity. These data streams will move spatial lifecourse epidemiology forward as we capitalize on these data sets to create intensive longitudinal measures of exposures and behaviors.
The term Big Data has also been used to encompass the use of predictive data analytics and the computational analysis of extremely large, multi-source data sets to reveal patterns, trends, and associations. For example, Google Earth Engine is a new cloud-computing platform on which massive global-scale satellite data can be processed (Gorelick et al. 2017); machine learning algorithms can process satellite data as well as Google Street View™ imagery to derive important new indicators that may be predictive of health outcomes (Maharana and Nsoesie 2018) and reduce the dimensionality of the almost explosively rich data sets (e.g., selecting the best possible indicators of early life in the decomposition of inequality of health in old age); Light Detection and Ranging (LiDAR) scanning (an RS method that uses a pulsed laser to measure variable distances to the Earth) and other measures from driverless cars can generate precise, three-dimensional information of the built environment (Arayici 2007).
A common issue that has traditionally plagued environmental epidemiology is that the influence of an environmental characteristic on a health outcome might very well be small and subtle, especially in cross-sectional studies where modest effects are cumulative over time. Obviously, rough measures will not allow researchers to detect subtle effects. This is where advanced spatial and location-based technologies, such as RS, sensors, and smartphone apps, can step in and contribute to the detection of those subtle effects by providing fine-scale, frequently repeated measurements over time (Jia et al. 2019a). However, this issue will persist in the Big Data era, where larger data sets tend to detect statistically significant effects with small clinical relevance. So with larger, higher-dimensional data, according to the workshop consensus, researchers will need to exercise care when distinguishing environmental exposures that have larger (and therefore more meaningful) effects from those that have smaller effects.
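The point that larger data sets tend to yield statistically significant results for effects of little practical relevance can be made concrete with a short calculation. The sketch assumes a hypothetical comparison of two equally sized groups whose outcome means differ by 0.02 standard deviations and uses a two-sided z-test with a known, common standard deviation; the effect size and sample sizes are invented for illustration.

```python
import math


def two_sample_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value of a z-test for a difference in means.

    Assumes two groups of equal size with a known, common standard deviation.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical effect: a difference of 0.02 standard deviations.
effect = 0.02
sd = 1.0

p_small = two_sample_p_value(effect, sd, 1_000)
p_large = two_sample_p_value(effect, sd, 1_000_000)
standardized_effect = effect / sd  # identical in both scenarios

print(p_small, p_large, standardized_effect)
```

The standardized effect is the same in both scenarios; only the p-value changes with sample size. Reporting effect sizes with confidence intervals alongside p-values is one way to keep the distinction between statistical and practical relevance in view.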

Facilitate the Development and Use of Complex Systems Models
The putative pathway through which characteristics of the environment influence behavior and, ultimately, health outcomes is long, and many factors influence these relations along the way in a complex, adaptive, and interacting manner. Although complex systems thinking currently sees a revival in this field of research (Rutter et al. 2017), complex systems doing is not so evident. Complex system maps may guide analyses (Butland et al. 2007), but as complex as they may seem, even these maps do not (yet) do justice to the complexity of real life. In the past few decades, we learned that many environmental exposures do not have a linear relation with health behaviors or outcomes, and reductionist ways of analyzing relations often provide puzzling results. With the increasing amount of available data, new statistical methods have to be developed or adapted from other fields of research to cope with these data, for instance, by translating established (network and classification) methodology from molecular biostatistics (Swinburn et al. 2011). We need to identify better ways of analyzing complex exposures and forces that represent salient environmental characteristics (Lakerveld and Mackenbach 2017). The tools and methods to study molecular and genetic drivers of disease manifestation are far ahead of those currently used to analyze the more upstream environmental determinants of health behaviors and outcomes (Swinburn et al. 2011).

Increase Transdisciplinary Collaboration to Capitalize on Innovative Data and Methods
Workshop participants agreed that one key to a better understanding of the complex interactions between environmental exposure and human health was assembling, linking, and analyzing diverse, large data sets, through the development of appropriate algorithms, analytical frameworks, and new approaches to inference. Doing so will necessitate multidisciplinary team science, that is, collaborative efforts that leverage the strengths and expertise of professionals trained in different disciplines (Hall et al. 2012;Stokols et al. 2008). An example is a transdisciplinary partnership among public health, computer science, and data science: Public health researchers have substantive knowledge and standards for understanding relationships between exposures and disease, computer scientists have the tools and methods to adapt Big Data to this purpose, whereas (spatial) data scientists have the knowledge and skills to properly deal with (spatial) data and a substantive knowledge of how these data can be used to measure environment and human-environment interaction.
The members of our panel of experts all agree that, although much collaboration between spatial scientists and epidemiologists has been fruitful, we need to move these collaborations to the next generation of transdisciplinary work. For example, high-resolution satellite imagery (Jia and Stein 2017), as well as aerial and street-view photos, hold untapped information that could be linked to the geocoded addresses of individuals and used to derive novel metrics of exposure to physical environments through machine learning approaches (Maharana and Nsoesie 2018). Data scientists have the expertise and experience in handling these large and complex spatial data sets, and they can engage with epidemiologists to merge novel (spatial) metrics with epidemiologic cohort data. With better measures, spatial lifecourse epidemiology can further benefit from collaboration with scholars in other fields such as environmental and health economists who have a unique focus on causal inference and may help strengthen causal evaluations of environmental exposure.
Nevertheless, tracking an individual's daily movements cannot reveal everything. There is much farther to go in terms of capturing individual day-to-day lifestyle choices and preferences that may impact health (Jia et al. 2019b). Such insights may be achieved using so-called Thick Data, which are often comprised of a complex range of data originating from primary and secondary research approaches (e.g., surveys, focus groups, videos). Thick data are capable of, for instance, capturing the influence of regular and irregular social and cultural activities that may affect behaviors and the health status of populations (Latzko-Toth et al. 2017). Transdisciplinary collaboration in improving exposure measurement will be both necessary and invigorating.

Examine and Address Health Equity
Recent analyses have demonstrated astounding differences in life expectancy within cities for neighborhoods that are only miles apart (Robert Wood Johnson Foundation 2015). The idea that one's residential location (e.g., postal code) may determine health speaks to the fundamental importance of spatial factors in driving health outcomes. Spatial lifecourse epidemiology can identify specific factors and sorting mechanisms that predict health disparities, such as socioeconomic deprivation and heterogeneity in levels of environmental stressors (e.g., air, water, soil, noise pollution), as well as isolate environmental aspects that may decrease disparities, such as greenspace. The principles and methods in environmental justice may be useful for advancing this area (Brulle and Pellow 2006; Meentemeyer et al. 2012), and it will be important to understand how populations vary in their vulnerability to environmental stressors, particularly those that relate to differences in coping abilities and behaviors.
Some further questions that the spatial lifecourse epidemiology toolset might address include how greenspace can help to reduce health disparities; how policy solutions to health disparities might be identified from spatial data (e.g., systematically characterizing gaps in parks and quality of greenspace among disadvantaged neighborhoods); how best practices with regard to built environments can be implemented in low-income settings; and how the physical, social, and policy environments interact (Mitchell and Popham 2008; Reis et al. 2016).

Expand the Scope and Scale of Research from Local and Regional to National and Global
The overwhelming majority of spatial epidemiologic studies have taken place in North America and Europe. For instance, a recent review of the literature on green space and health showed that only 1 of 66 reviewed studies took place in Africa (James et al. 2015a). Another review of the applications of GIS in obesity research showed that only 4 of 121 reviewed studies were conducted in emerging and developing economies. Rapid population growth, uncontrolled urbanization, and profound socioeconomic inequalities in these settings are often associated with mounting environmental concerns. Environmental stressors (e.g., air pollution) are often at higher levels in those regions than in North American and European countries (Akimoto 2003), especially in urban areas. This knowledge may help better model the dose-response relationship due to larger variations in environmental exposures. Industrialization and economic development will intensify environmental health risks, making it even more critical to examine and understand how spatial factors, such as air pollution, noise, and lack of greenspace, impact chronic disease risk and health outcomes.
Expanding the scale of environmental health studies, from local regions to nations and the entire world, will also help address the issue of environmental spillover effects (e.g., pollution in one country may affect neighboring countries).
The members of our panel of experts all concur that the increased development of GIS data sets in emerging and developing economies, complemented by satellite data derived from the global coverage of high spatiotemporal-resolution earth observation satellites, will likely enable more spatial epidemiologic research in these regions. Advanced data analysis approaches, including machine learning approaches, could be fruitfully applied to such challenges and could inform the creation of standardized environmental metrics in places where ground-level exposure measures or sensor surveillance networks are underdeveloped either because of their cost or for lack of physical access to the regions. These powerful supporting technologies can enable researchers to scale up some computationally expensive efforts (e.g., high-resolution image processing) from local to global scales and, combined with the increasing use of smartphone location services to collect individual-level data in many emerging and developing countries, can enable more spatial lifecourse epidemiologic research (Jia and Wang 2019).

Safeguard Privacy While Ensuring Research Needs
The past 20 years have brought about dramatic technological progress in terms of location-aware smartphones, wearables, and data analytics. Although that progress has produced new and important ways to improve our understanding of how spatial factors affect human health, it has also raised concerns about privacy and confidentiality in the use of these data. People are increasingly under surveillance, and concerns of being constantly tracked and observed are greater than ever. Public closed circuit television (CCTV) cameras, smartphones, and loyalty cards, for example, record movements and habits often without the knowledge or consent of individuals (Rengel 2013; Wigan and Clarke 2013; Jia 2018). Although new technical possibilities and opportunities are immense, there are serious concerns about data interoperability across these platforms and about making these data available for scientific use (e.g., loss of privacy, identification of certain behaviors, issues with obtaining insurance). These issues have to be addressed before analytical and predictive research work can be commenced. Many of the methods and techniques used for spatial life course studies require data on an individual's activities, habits, and behaviors over long periods of time. Collection of these data may be an infringement of personal space and privacy, and it will be a challenge to keep ethical standards in step with these data collection, storage, and sharing activities. This has also been an emerging issue in satellite data due to increasingly identifiable objects and environment features on very high-resolution images, especially on street-view photos. Our panel of experts all agree that the development of ethical and technological standards and guidelines for collection and anonymization (and/or safeguarding) of individual spatial and behavioral data and the implementation of secure data handling and storage are imperative.
Further complicating the use of these data are concerns about data ownership. Personal data are valuable, and the companies collecting these data as part of their business model are often not initially willing to share them with researchers. For example, Fitbit®, Google Maps™, MyFitnessPal®, and Strava® all record location, movement, and/or health data. Use of these sources will require public-private partnerships and data use agreements across platforms and research sites. Although current privacy regulations are essential and needed to protect the privacy of study populations, strict rules in many countries often prohibit data owners from using or sharing locational information of study populations or linking this information to external (e.g., GIS) data. For instance, in the United States, the Health Insurance Portability and Accountability Act (HIPAA) regulates that the postal codes or geographic coordinates of a participant's home or work location fall under the category of protected health information (Nosowsky and Giordano 2006). The European Union has also recently introduced the General Data Protection Regulation, which fundamentally impacts how personalized data are to be handled (De Hert and Papakonstantinou 2016). Methods for using these data with the consent of individuals will be necessary. In short, there are many advantages to personal exposure measurement made possible through spatial lifecourse epidemiology. At the same time, however, we must weigh the privacy and confidentiality concerns of participants. The implication is that there is an urgent need for researchers using these data to acknowledge the immense responsibility and public trust that come with access to these data. Researchers must be held to high standards and take personal responsibility to address privacy concerns and data confidentiality to ensure that these data are always used ethically.

Conclusions
This article presents 10 key issues that confront the emerging field of spatial lifecourse epidemiology. They represent the consensus of perspectives from leading scientists in multiple fields related to spatial lifecourse epidemiology. In an era with rapidly growing environmental exposure, increasing volumes of health data, and rising attendant demands for modeling complex interactions and systems, it will be essential for scientists to launch transdisciplinary research programs built on team science traditions that integrate a wide array of intelligence, methods, and data to yield a better understanding of the etiology of human diseases. This top 10 list is intended to serve as a start of a discussion on future research priorities in spatial lifecourse epidemiology while stimulating an open and critical debate on the philosophical and methodological foundations of this emerging field.