Use of Remotely Sensed Data to Evaluate the Relationship between Living Environment and Blood Pressure

Background: Urbanization has been correlated with hypertension (HTN) in developing countries undergoing rapid economic and environmental transitions.

Objectives: We examined the relationships among living environment (urban, suburban, and rural), day/night land surface temperatures (LST), and blood pressure in selected regions from the REasons for Geographic and Racial Differences in Stroke (REGARDS) cohort. Also, the linking of data on blood pressure from REGARDS with National Aeronautics and Space Administration (NASA) science data is relevant to NASA's strategic goals and missions, particularly as a primary focus of the agency's Applied Sciences Program.

Methods: REGARDS is a national cohort of 30,228 people from the 48 contiguous United States with self-reported and measured blood pressure levels. Four metropolitan regions (Philadelphia, PA; Atlanta, GA; Minneapolis, MN; and Chicago, IL) with varying geographic and health characteristics were selected for study. Satellite remotely sensed data were used to characterize the LST and land cover/land use (LCLU) environment for each area. We developed a method for characterizing participants as living in urban, suburban, or rural living environments, using the LCLU data. These data were compiled on a 1-km grid for each region and linked with the REGARDS data via an algorithm using geocoding information.

Results: REGARDS participants in urban areas have higher systolic and diastolic blood pressure than do those in suburban or rural areas, and also a higher incidence of HTN. In univariate models, living environment is associated with HTN, but after adjustment for known HTN risk factors, the relationship was no longer present.

Conclusion: Further study regarding the relationship between HTN and living environment should focus on additional environmental characteristics, such as air pollution. The living environment classification method using remotely sensed data has the potential to facilitate additional research linking environmental variables to public health concerns.

volume 117 | number 12 | December 2009 • Environmental Health Perspectives • Research

Hypertension (HTN) is a risk factor for heart disease, stroke, other cardiovascular diseases, and renal disease and has been identified as the second leading cause of diseases worldwide (Ezzati et al. 2002). It has been estimated that 26.4% of the global adult population is hypertensive, with 333 million hypertensive individuals in developed countries and 972 million hypertensive individuals in developing countries (Kearney et al. 2005). In developed countries, 10.9% of disability-adjusted life years lost has been attributed to HTN. Cardiovascular health issues relating to urbanization are of particular concern because in this century the world has experienced an unprecedented urban growth, with > 50% of the world's population residing in cities and megacities (Godfrey and Julien 2005). Urbanization has been correlated with HTN in developing countries undergoing rapid economic and environmental transitions, such as China, India, and many African countries (Kusuma and Das 2008; Opie and Seedat 2005). The United States, a large, ethnically diverse, and relatively wealthy country, has a vastly different distribution of HTN risk factors compared with both developing countries and most other developed countries, but urban-rural differences remain an under-researched issue (Gillum 1996; Gillum et al. 2004; Obisesan et al. 2000). In the United States and worldwide, there remains the question of how much the urban environment contributes as an independent risk factor for blood pressure differences, and how much is attributable to a variety of environmental, lifestyle, and demographic correlates of urbanization (Ala et al. 2004; Nirmala 2001; Sobngwi et al. 2004). In particular, race and ethnicity are often involved in urban-rural blood pressure differences (Appel et al. 2002; Ruixing et al. 2006).
Many studies have used remotely sensed data for land cover/land use (LCLU) classification of urban areas (Jacquin et al. 2006; Kampouraki et al. 2006; Lu and Weng 2005; Lu et al. 2008; Xu and Gong 2007). However, very few studies in the United States have evaluated how these LCLU classifications and living environments affect human health. In this study we explore the relationship between urban, suburban, and rural land classifications and selected correlates with blood pressure among participants of the large, well-characterized African-American and white cohort from the REasons for Geographic and Racial Differences in Stroke (REGARDS) study. This innovative study used remote sensing data to apply LCLU techniques to classify the geocoded REGARDS participants. LCLU urban classification has been used extensively in environmental studies; however, its application to public health research is rare and represents an innovative opportunity to explore this and potentially many other public health issues.
We examined the relationship between living environment (defined as rural, suburban, and urban) and day/night (maximum and minimum) land surface temperature (LST), and blood pressure in persons from the REGARDS cohort living in selected regions of the United States. We hypothesized that elevated blood pressure may be a function of living environment. In the past, urbanization classification was based on U.S. Department of Agriculture rural-urban continuum codes. Our study used remote sensing spatial data to classify rural, suburban, and urban areas to examine differences in living environment and blood pressure measurements. Our first objective was to examine the relationship between rural, suburban, and urban residents and blood pressure [measured as systolic blood pressure (SBP), diastolic blood pressure (DBP), and HTN] to determine whether higher blood pressure is associated with living environment after adjusting for known risk factors. The second objective was to examine differences in LST for each category of living environment for the 2 weeks before the blood pressure measurement to validate expected temperature variations between the living environment classes and to determine if higher LST is associated with higher blood pressure.

Materials and Methods
Four geographically and climatologically distinct U.S. regions that include major urban centers and that have varying stroke mortality rates were selected for this study. These regions are a southeastern region centered on Atlanta, Georgia; a northeastern region centered on Philadelphia, Pennsylvania; and two Midwestern regions centered on Chicago, Illinois, and Minneapolis, Minnesota. Stroke mortality is lowest among residents of Minnesota, followed by residents of Pennsylvania, then residents of Illinois, and highest among residents of Georgia (Casper et al. 2003). General land use configurations are similar; however, Philadelphia and Chicago have higher density and more compact urban areas, whereas Atlanta and Minneapolis have more fragmented urban development patterns. Significant differences in vegetation and local climatology also exist among the regions. The regions are approximately 200 km × 200 km, which allowed for significant populations in rural, suburban, and urban locations to be evaluated, as shown in Figure 1.
The study is an urban to subregional scale investigation that uses 1-km data products from NASA (National Aeronautics and Space Administration) Moderate-Resolution Imaging Spectroradiometer (MODIS) for day and night LST and 30-m LCLU information from the 2001 National Land Cover Data (NLCD2001). These data have been compiled on a 1-km grid for the four regions selected for analysis. The environmental data have been matched with the REGARDS data via an algorithm that uses the spatial location of the participants' residences.
Description of REGARDS. The REGARDS study is a national population-based cohort study that recruited 30,228 participants ≥ 45 years of age, with 45% male, 55% female (goal was 50% male, 50% female), and 42% African American, 58% white (goal was 50% African American, 50% white). The national distribution of the REGARDS cohort is shown in Figure 1. Recruitment of the cohort began in January 2003 and was completed in October 2007. Twenty-one percent of the cohort was recruited from the "buckle" of the stroke belt (coastal plain region of North Carolina, South Carolina, and Georgia), 35% from the stroke belt states (remainder of North Carolina, South Carolina, and Georgia, plus Alabama, Mississippi, Tennessee, Arkansas, and Louisiana), and the remaining 44% from the other 40 contiguous states (goal was 20% buckle, 30% belt, 50% remainder of nation). The methods have been published elsewhere (Howard et al. 2005). Participants were selected from commercially available lists of residents, using a combination of mail notification and telephone contact. Baseline data, including demographics, cardiovascular risk factor history, and others, were collected via computer-assisted telephone interview, at which time verbal consent was obtained. After completion of the baseline interview, an in-home visit was performed to collect physical measurements, including blood pressure, a blood sample, and a urine sample; at this time signed informed consent was obtained. Follow-up phone contact is made at 6-month intervals to assess occurrence of stroke and cardiac events. The study was approved and monitored by the institutional review boards at all participating institutions.
Blood pressure was determined during the in-home visit as the average of two seated measurements. HTN was defined as SBP > 140 mmHg, DBP > 90 mmHg, or self-reported use of antihypertensive medications. Age, sex, race, income, and education were determined by self-report, whereas weight and height were evaluated during the in-home visit. REGARDS participants were geocoded using SAS/GIS (geographic information system) geocoding software (version 9.1; SAS Institute Inc., Cary, NC), based on the street address of the participant, and were then matched by census tract to community-level poverty, expressed as the proportion of the census tract below the poverty level (0-5%, 5-10%, 10-25%, or > 25%). SAS/GIS generates a score describing the likelihood that the provided longitude and latitude are matched identically to the address, and 88% of the participants included in these analyses had a score of ≥ 80%.
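The blood pressure definitions above can be illustrated with a minimal Python sketch. The function name `classify_htn` and its argument names are hypothetical and are not taken from the REGARDS protocol; the thresholds are the ones stated in the text.

```python
from statistics import mean


def classify_htn(sbp_readings, dbp_readings, on_bp_medication):
    """Return (mean SBP, mean DBP, hypertensive flag) for one participant.

    sbp_readings / dbp_readings: the seated measurements in mmHg.
    on_bp_medication: self-reported antihypertensive medication use.
    """
    sbp = mean(sbp_readings)  # average of the seated SBP measurements
    dbp = mean(dbp_readings)  # average of the seated DBP measurements
    # Thresholds as stated in the text: SBP > 140 mmHg, DBP > 90 mmHg,
    # or self-reported use of antihypertensive medication.
    hypertensive = bool(sbp > 140 or dbp > 90 or on_bp_medication)
    return sbp, dbp, hypertensive
```

For example, readings of 138 and 146 mmHg systolic average to 142 mmHg, which exceeds the stated threshold regardless of medication use.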
Landsat and MODIS remotely sensed data. The Landsat satellite program has continuously gathered information on changes in Earth's landscape since the 1970s (NASA 2009). Since 1999, the Landsat 7 satellite has been collecting visible and infrared data on Earth's surface at spatial resolutions from 15 to 60 m. These data were used to develop the NLCD2001 used in this project as a baseline from which to determine living environments. The NLCD2001 product at 30-m spatial resolution represents land cover based on Landsat 7 data from 1999-2003 and provides 16 LCLU classes (an additional four classes are available in Alaska only and another nine classes in coastal areas). All NLCD2001 products were generated according to protocols outlined in Homer et al. (2004) using 66 mapping zones for the conterminous United States and 23 zones in Alaska. Formal accuracy assessment of the NLCD2001 products is planned in the near future based on the design outlined in Stehman et al. (2008). However, users can gain initial feedback on product accuracy from the cross-validation estimate of product accuracy provided from the algorithms employed in NLCD2001 modeling (Homer et al. 2007). Cross-validation accuracy of the land cover product was weighted by class occurrence in each mapping zone. Accuracy estimates across mapping zones ranged from 70% to 98%, with an overall average accuracy across all zones of 83.9% (Homer et al. 2007).
The MODIS instrument is carried by the Aqua satellite, which views most of the earth's surface twice daily, at nominally 0130 hours and 1330 hours local time. The instrument provides a validated LST product in Kelvin on a 1-km grid for a pair of daytime and nighttime observations. Global observations are available from 2000 to the present. This product is the result of the generalized split-window LST algorithm developed by Wan and Dozier (1996). Comparisons between data from this LST product and in situ values in 47 clear-sky cases, which were made in a study by Wan (2008), indicated that the accuracy of this product is better than 1 K in 39 of 47 cases. The root mean squared difference was determined to be < 0.7 K.
Method for urban, suburban, and rural delineations. A methodology was developed using the NLCD2001 LCLU grids over the study area to delineate rural, suburban, and urban zones and resample to a 1-km grid. The NLCD2001 classification contains four developed classes: developed high intensity, developed medium intensity, developed low intensity, and developed open space. The developed high-intensity class is consistent with urban living near the central business districts and other highly urbanized land use areas containing a mixture of commercial, industrial, and residential land uses. Condominiums, apartment complexes, and row houses are typical living environments in the developed high-intensity class. Conversely, the developed open-space class commonly includes very low density suburban environments with large-lot single-family housing units along with parks and golf courses. The developed low-intensity and medium-intensity classes are similar in that both contain single-family housing units in conjunction with low to moderate levels of urban development, with the main difference being the average size of the housing lots.
Based on the land use characteristics of the developed classes, a remapping scheme was developed to map the developed high-intensity and developed medium-intensity classes to the urban living environment class and the developed low-intensity and developed open-space classes to the suburban living environment class. All other NLCD2001 classes, such as forest and agriculture classes, are included in the rural living environment, as demonstrated in Figure 2.
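The remapping scheme can be written as a simple lookup. The sketch below is illustrative only and is not the authors' GIS workflow; it assumes the standard NLCD 2001 legend codes for the four developed classes (21 to 24), and the names `NLCD_DEVELOPED`, `living_environment`, and `remap_grid` are hypothetical.

```python
URBAN, SUBURBAN, RURAL = "urban", "suburban", "rural"

# NLCD 2001 legend codes for the four developed classes (assumed here;
# the text names the classes but does not list their numeric codes).
NLCD_DEVELOPED = {
    21: SUBURBAN,  # developed, open space
    22: SUBURBAN,  # developed, low intensity
    23: URBAN,     # developed, medium intensity
    24: URBAN,     # developed, high intensity
}


def living_environment(nlcd_code):
    """Map one NLCD class code to a living environment class.

    Every non-developed class (forest, agriculture, water, ...) is rural.
    """
    return NLCD_DEVELOPED.get(nlcd_code, RURAL)


def remap_grid(grid):
    """Apply the remapping to a 2-D grid (list of rows) of NLCD codes."""
    return [[living_environment(code) for code in row] for row in grid]
```

A row such as `[24, 23, 41]` (high intensity, medium intensity, deciduous forest) would map to urban, urban, rural.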

Dominant class algorithm role in methodology.
Resampling is the process of assigning, interpolating, or extrapolating new cell values when transforming raster data, or images, to a new coordinate space or cell size (i.e., spatial resolution). To be consistent with the MODIS LST data spatial resolution and because people spend most of their time outside a 30 m × 30 m area, we resampled the raw NLCD2001 data in this study from 30 m to 1 km in order to evaluate the LCLU and living environment effects on SBP, DBP, and HTN. We also analyzed data at a 3-km scale to study the effect of scale on such potential relationships. We developed a resampling algorithm that uses the raw data set (30-m NLCD2001), calculates the areas of all the LCLU classes within each coarse 1-km or 3-km grid cell (filter window), and assigns the dominant LCLU class to that coarse grid cell in a GIS, as demonstrated in Figure 2 for the city of Atlanta.
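A dominant-class resampling of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the GIS implementation used in the study: it assumes the coarse cell is an integer multiple of the fine cell (which is not exactly true for 30 m and 1 km, so real data would need area weighting or a prior regridding), and the tie-breaking rule is an arbitrary assumption. The function name `resample_dominant` is hypothetical.

```python
from collections import Counter


def resample_dominant(grid, factor):
    """Aggregate a fine class grid to a coarse grid by dominant class.

    grid:   2-D list of class labels on equal-area fine cells.
    factor: number of fine cells per coarse cell along each axis.
    Each coarse cell receives the class covering the most fine cells
    inside its window; edge windows may be smaller than factor x factor.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, n_rows, factor):
        coarse_row = []
        for c0 in range(0, n_cols, factor):
            counts = Counter(
                grid[r][c]
                for r in range(r0, min(r0 + factor, n_rows))
                for c in range(c0, min(c0 + factor, n_cols))
            )
            # Fine cells have equal area, so the largest count is the
            # largest area. Ties go to the class encountered first
            # (an arbitrary choice made for this sketch).
            coarse_row.append(counts.most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse
```

Because minority classes inside a window are discarded, a procedure like this tends to smooth the map as the window grows.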
LST data processing. To determine temperature effects on SBP, DBP, and HTN, LST data from the MODIS sensor aboard NASA's Aqua satellite were obtained for the four focus cities for the period 2003-2007. Both day (~ 1330 hours) and night (~ 0130 hours) data were obtained. The MODIS LST product is produced only for land surfaces determined by an algorithm to be cloud-free. Therefore, data are missing for many days and locations.
Data linkage method. After delineating the living environments as rural, suburban, and urban (Figure 3), and in order to assess the living environment effects on SBP, DBP, and HTN, the REGARDS participants located within those study areas were spatially linked to the living environment categories in a GIS. Given the geographic coordinates of the REGARDS participants' residences, those participants were assigned to the living environment category of the grid cell within which they reside. That linkage process was done at three different spatial resolutions (30 m, 1 km, and 3 km) to study the effect of scale on that assessment.
Day and night LST data were processed separately in conjunction with geographic data on the residence locations of the REGARDS participants in the four cities. LST data were extracted for the grid cell for which the centroid was closest to the residence location of each subject. This was done for each day and night of the 5-year study period, creating tables of LST linked to the list of REGARDS participants. These tables provide the daytime and nighttime LST observations, when available, for each day and each subject, facilitating temporal analysis of these data together with the blood pressure data.
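The nearest-centroid lookup described above might look like the following sketch. The distance formula (an equirectangular approximation), the dictionary layout of `centroids`, and the function name `nearest_cell` are assumptions made for illustration; the text does not specify how distance was computed in the GIS.

```python
import math

EARTH_RADIUS_KM = 6371.0


def nearest_cell(lat, lon, centroids):
    """Return the id of the grid cell whose centroid is closest.

    lat, lon:  residence coordinates in decimal degrees.
    centroids: dict mapping cell id -> (lat, lon) of the cell centroid.
    """
    def distance_km(centroid):
        c_lat, c_lon = centroid
        # Equirectangular approximation: adequate for ranking distances
        # within a region of roughly 200 km x 200 km.
        x = math.radians(c_lon - lon) * math.cos(math.radians((c_lat + lat) / 2))
        y = math.radians(c_lat - lat)
        return EARTH_RADIUS_KM * math.hypot(x, y)

    return min(centroids, key=lambda cell_id: distance_km(centroids[cell_id]))
```

Once each participant has a cell id, the daily day and night LST values for that cell can be joined to the participant record by cell id and date.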
Statistical analysis. Descriptive statistics are expressed overall and by living environment (rural, suburban, or urban). We used analysis of variance or chi-square tests of association to determine differences in baseline characteristics of the population by living environment classification. We used linear regression to determine whether an association exists between each of SBP and DBP, and living environment. Initially, a univariate model was run, followed by a model adjusted only for race, then a model adjusted for race (African American or white), sex (male or female), age, body mass index (BMI), income (self-reported household income), education (less than high school, high school degree, some college, college degree or greater), city of residence, and community-level poverty (based on census tract). The risk factors were selected based on previously published reports linking them to HTN. City of residence was included in the model to account for characteristics particular to the city, which may confound the relationship between living environment and SBP and DBP. The relationship between the presence of HTN and living environment classification was modeled using logistic regression, and modeling progressed in the same fashion as for both SBP and DBP (univariate, race adjusted, and multivariable adjusted). An additional model was run assessing the effect of including average LST over the 2 weeks before the in-home visit on the multivariable model. We used 2-week averages of available data to allow more observations to be included in the analysis, because LST was frequently missing (as described above).
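The 2-week LST average over available observations can be sketched as follows. The exact window boundaries are an assumption: this sketch uses the 14 days preceding the visit and excludes the visit day itself, which the text does not specify. The function name `mean_lst_before_visit` and the dictionary layout are hypothetical.

```python
from datetime import date, timedelta


def mean_lst_before_visit(lst_by_day, visit_day, window_days=14):
    """Mean of the available LST values in the days before a visit.

    lst_by_day: dict mapping datetime.date -> LST value; days with no
                cloud-free observation are absent or mapped to None.
    Returns None when the window contains no observation at all.
    """
    values = []
    for offset in range(1, window_days + 1):
        value = lst_by_day.get(visit_day - timedelta(days=offset))
        if value is not None:  # skip days with missing (cloudy) data
            values.append(value)
    return sum(values) / len(values) if values else None
```

Averaging only the available days keeps a participant in the analysis whenever at least one cloud-free observation falls in the window, at the cost of averages that rest on different numbers of days.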
Data were analyzed using a 1-km scale for the living environment classes, but all analyses were repeated using classifications based on both the 30-m scale and the 3-km scale. Further, comparisons were made between the proportions of subjects residing in a specific living environment class, depending on the scale used to determine the classification, to assess how the scale would influence the living environment.

Classification and resampling of living environments.
To evaluate the effect of the scale or spatial resolution on characterizing the living environments of the REGARDS participants and later on their potential relationship with SBP, DBP, and HTN, we first calculated the percentage of areal coverage of each living environment for the four study areas and at the three spatial resolutions. As shown in Figure 4A-C, the dominant living environment over the spatial coverage of each study area was rural, with percentages > 80% at all spatial resolutions including the 30-m raw resolution. The Minneapolis study area has the highest rural and lowest urban and suburban spatial coverage. The rural living environment domination increases in all study areas as the spatial resolution decreases from 30 m down to 1 km and 3 km and the LCLU map becomes smoother.
We also determined the distribution of the REGARDS participants within each of the four study areas for the three spatial scales. As shown in Figure 4D-F, the living environments for most (> 55%) of the REGARDS participants in Atlanta and Minneapolis were characterized as suburban at all scales. On the other hand, the living environments for most of the REGARDS participants in Chicago and Philadelphia were characterized as urban. We calculated Moran's I (spatial autocorrelation statistic) for the 1-km living environment research data set. In Atlanta and Minneapolis the results reflect the spatial designs or structures of those cities that tend to be more scattered (Moran's I = 0.642 and 0.636, respectively), and in Chicago and Philadelphia, those that tend to be more clustered (Moran's I = 0.706 and 0.869, respectively). In all cases when moving from urban or suburban classes, the percentage of participants in rural living environments increased as the spatial resolution decreased. Minneapolis was the only case in which, when moving from urban areas, the percentage of participants in suburban living environments increased as the spatial resolution decreased, due to the increased scatter of the urban classes compared with the other study areas.

Table 1 shows how the participants would be reclassified moving from one scale to another. Between 1 km and 3 km, there is little change in the classification of the participants: 81% of the participants remain in the same classification. The most divergent changes observed were from rural to urban (0.15%) and urban to rural (1.5%), neither of which occurred frequently. The other changes, from either rural or urban to suburban, or from suburban to either rural or urban, are more plausible and account for the remaining 18% of the changes. In examining the change in classification when moving from a 1-km scale to a 30-m scale, there is more variation in the reclassification of the participants. Only 63% of the participants remain in the same classification at 30 m as they were for 1 km. Twelve percent change from rural to suburban, whereas 10% change from suburban to urban, and 8% change from urban to suburban. The more divergent changes, from rural to urban (2%) and from urban to rural (0.43%), happen more frequently than between the 1-km and 3-km scale, but still not often.
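A reclassification table of the kind summarized in Table 1 is a cross-tabulation of each participant's class at two scales. The sketch below shows one way to compute it; the function names `reclassification_table` and `percent_unchanged` are hypothetical, and the example labels are invented rather than drawn from REGARDS.

```python
from collections import Counter


def reclassification_table(class_at_a, class_at_b):
    """Percent of participants in each (class at scale A, class at scale B) pair.

    Both arguments are equal-length sequences with one label per
    participant, in the same participant order.
    """
    pairs = Counter(zip(class_at_a, class_at_b))
    n = len(class_at_a)
    return {pair: 100.0 * count / n for pair, count in pairs.items()}


def percent_unchanged(table):
    """Sum the diagonal: participants with the same class at both scales."""
    return sum(pct for (a, b), pct in table.items() if a == b)
```

With five invented participants labelled `["urban", "urban", "suburban", "rural", "rural"]` at one scale and `["urban", "suburban", "suburban", "rural", "suburban"]` at another, three of the five keep their class, so the unchanged share is 60%.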
LST data analysis. We averaged daytime and nighttime LST data for the grids covering each of the four cities by month for each of the three living environments: rural, suburban, and urban. This analysis, performed at the 1-km scale, demonstrated that grid cells classified as urban were warmest, and rural grid cells were the coolest, as shown in Figure 5 for 1 August 2004 in Atlanta as an example. The largest difference was between urban and rural living environments, where the mean difference, averaged over the entire 5-year period and for all cities, was 2.2°C during the day (~ 1330 hours) and 1.3°C at night (~ 0130 hours). Results were similar among cities; the largest urban-suburban difference was in Philadelphia, and the largest suburban-rural difference was in Chicago. These results are consistent with work previously performed using aircraft remotely sensed data collected in May 1997 for the Atlanta metropolitan area that showed warmer temperatures in the central business district compared with midtown residential areas (Quattrochi et al. 2000). By revealing an "urban heat island effect" as illustrated in Figure 5, these results provide a first-order validation of the land use-based living environment classification.
Linked data analysis. Table 2 presents the distribution of the baseline characteristics of the population, both overall and by living environment classification, based on the 1-km scale. Most of the population resides in suburban areas (52%), with almost a third in urban areas (32%), and the remaining 16% in rural areas. Those in urban areas had higher SBP and DBP than did those in suburban or rural areas, and also a higher incidence of HTN. Urban dwellers were also slightly older, more likely to be female, and more likely to be African American. The proportion of urban residents with a college degree or higher was lower than for rural or suburban areas. Residents of urban areas were more likely to have lower income, and a higher proportion of an urban census tract lived below the poverty line.

Table 3 presents the results of the univariate and multivariable modeling for SBP, DBP, and presence/absence of HTN. In the univariate model, those in urban areas had higher SBP and DBP and were more likely to be hypertensive compared with those in suburban or rural areas (all p < 0.0001). However, after adjustment for race, only SBP was significantly higher among those in urban than in rural or suburban areas (p = 0.0021), and after multivariable adjustment, even SBP was no longer significantly different across living environment classes. Adding average LST from the 2 weeks before the in-home visit to the model did not influence the results (data not shown). These results remain consistent, regardless of the scale used for the classification algorithm.

Discussion
We found that among occupants of the four selected cities in the REGARDS study, those residing in urban areas had the highest blood pressure (both systolic and diastolic) and were more likely to have HTN compared with their counterparts in rural or suburban dwellings. However, adjustment for race and further adjustment for other known cardiovascular risk factors attenuated this association, such that blood pressure and HTN no longer differed significantly across living environment classes. This indicates that the observed differences in blood pressure by living environment classification are likely attributable to the distribution of race and other cardiovascular risk factors across the classes.
The development of a methodology to delineate LCLU classes into rural, suburban, and urban regions should benefit future research relating to the impact of urbanization on public health. Landsat and MODIS LCLU data are available for all areas of the United States and most areas of the world. Standard GIS software and tools used herein should be readily replicable for use in other applications. This methodology, in conjunction with remote sensing data, offers the potential to characterize physical environment features for comparison with public health data to determine correlations in multiple areas of interest.
The interpretation of results should take into account several study limitations. Although the four metropolitan study areas are diverse, the limited geographic scope did not include any Rocky Mountain, desert, or West Coast areas, so extrapolation of findings to these regions will be more difficult to substantiate. The study also considered only one public health concern, blood pressure, whereas living environment may contribute to a variety of other public health issues. Finally, the REGARDS data included only one visit during which blood pressure was measured, making temporal analysis of living environment influences unfeasible.

Future work. The REGARDS database offers a unique and valuable opportunity to perform additional research to investigate correlations between environmental conditions, HTN, and strokes. With additional temporal data points, further evaluation of blood pressure levels and/or stroke events and correlations with either living environment or other environmental variables such as temperature or humidity could be evaluated. Further studies in geographic areas unique to this study are desirable and would make the results more robust and potentially useful for environmental public health tracking and possibly for establishing public policy (National Research Council 2007).
The National Research Council's Earth Science and Applications from Space: National Imperatives for the Next Decade and Beyond (National Research Council 2007) encourages continued research to firmly establish the predictive relationships between remotely sensed environmental data and patterns of environmentally related health effects. Additional exploration of the uses of remotely sensed data to provide environmental data for linkage to various types of public health data is needed to gain more understanding of the potential for remotely sensed data to benefit public health research.