Toward Consistent Methodology to Quantify Populations in Proximity to Oil and Gas Development: A National Spatial Analysis and Review

Background: Higher risk of exposure to environmental health hazards near oil and gas wells has spurred interest in quantifying populations that live in proximity to oil and gas development. The available studies on this topic lack consistent methodology and ignore aspects of oil and gas development of value to public health–relevant assessment and decision-making. Objectives: We aim to present a methodological framework for oil and gas development proximity studies grounded in an understanding of hydrocarbon geology and development techniques. Methods: We geospatially overlay locations of active oil and gas wells in the conterminous United States and Census data to estimate the population living in proximity to hydrocarbon development at the national and state levels. We compare our methods and findings with existing proximity studies. Results: Nationally, we estimate that 17.6 million people live within 1,600m (∼1 mi) of at least one active oil and/or gas well. Three of the eight studies overestimate populations at risk from actively producing oil and gas wells by including wells without evidence of production or drilling completion and/or using inappropriate population allocation methods. The remaining five studies, by omitting conventional wells in regions dominated by historical conventional development, significantly underestimate populations at risk. Conclusions: The well inventory guidelines we present provide an improved methodology for hydrocarbon proximity studies by acknowledging the importance of both conventional and unconventional well counts as well as the relative exposure risks associated with different primary production categories (e.g., oil, wet gas, dry gas) and developmental stages of wells. https://doi.org/10.1289/EHP1535


Introduction
Background
A number of studies indicate that there may be negative health outcomes associated with living in close proximity to oil and gas development. Degraded air quality; surface water, groundwater, and soil contamination; and elevated noise and light pollution are exposure pathways that contribute to potential human health impacts (Hays et al. 2017; Shonkoff et al. 2014). Studies have identified multiple symptoms reported by residents living with oil and gas infrastructure in their communities, including respiratory symptoms, such as nose, eye, and throat irritation; headaches; and fatigue, among others (Macey et al. 2014; Rabinowitz et al. 2015; Steinzor et al. 2013; Tustin et al. 2017). One study has pointed to increased hospitalization rates for multiple medical categories, including cardiology, neurology, and oncology (Jemielita et al. 2015). Increased asthma incidence and severity have also been reported in Pennsylvania. Preliminary epidemiological studies that use distance from oil and gas development as the exposure metric have found positive associations with adverse birth outcomes, including preterm birth, lower birth weight, and small for gestational age (Stacy et al. 2015), as well as neural tube defects and congenital heart defects. McKenzie et al. (2017) also identified increased incidence of childhood hematologic cancer among children who live in close proximity to oil and gas development compared to those who live farther away. Although findings in the public health literature on oil and gas development are sometimes inconsistent and studies often lack the designs to arrive at causal claims, the body of literature indicates that proximity to oil and gas development is associated with adverse health risks and impacts.

Previous Population Proximity Studies
Public concern and the public health scientific literature to date have spurred interest in quantitative assessments of populations potentially at increased risk of health impacts from living in close proximity to oil and gas development. Four peer-reviewed studies were published in the last 2 y: two reporting population counts (Meng 2015; Slonecker and Milheim 2015), and three reporting demographic subgroups (Clough and Bell 2016; Ogneva-Himmelberger and Huang 2015; Slonecker and Milheim 2015). Three additional studies were identified in the gray literature (Earthworks et al. 2016; Ridlington et al. 2015; Srebotnjak and Rotkin-Ellman 2014). The earliest study we could identify was published in The Wall Street Journal (Gold and McGinty 2013). This early study has substantial methodological flaws, but is included in our review because it was the first published attempt to quantify populations near oil and gas wells.

Conventional and Unconventional Well Types
Of the eight proximity studies published, five focus their analyses explicitly on unconventional wells. This is in part due to the increased public and academic interest in the impacts of the rapid expansion of unconventional oil and gas development over the past decade. The recent increase in unconventional oil, gas, and other hydrocarbon production is enabled by recent technological advances consisting primarily of the pairing of directional well drilling and high-volume hydraulic fracturing in shale formations (Ratner and Tiemann 2015).
While public controversies have largely focused on human health impacts of unconventional gas development, it is important to recognize that both conventional and unconventional oil and gas development involve emissions of hazardous air pollutants and other harmful air emissions that can present risks to human health (Dusseault and Jackson 2014; Field et al. 2014; Jackson et al. 2014; Pekney et al. 2014; Shires et al. 2009; Zammerilli et al. 2014). Detectable levels of harmful pollutants, including particulate matter, nitrogen oxides, ozone, volatile organic compounds (VOCs), carbon monoxide, and in some locations, hydrogen sulfide, are commonly reported on and near hydrocarbon well sites and areas of associated infrastructure (Gilman et al. 2013; Jackson et al. 2014; Moore et al. 2014; Zammerilli et al. 2014). These air emissions are present as a result of normal well pad activities, such as venting, flaring, transportation activities, and the running of equipment such as drill rigs, dehydrators, separators, and compressors (Gilman et al. 2013; Goetz et al. 2015; Jackson et al. 2014; Moore et al. 2014; Zammerilli et al. 2014). Additional releases from storage tanks and fugitive emissions from wellheads, pipelines, and related infrastructure are also common (Field et al. 2014; Jackson et al. 2014; Moore et al. 2014; Warneke et al. 2014). These emission sources are characteristic of both conventional and unconventional well sites and are not explicitly tied to wells that have been hydraulically fractured or directionally drilled. In summary, while the relative magnitude of health hazards across different types of oil and gas development remains a current topic of research, many of the same hazards are shared across all of them.
Where unconventional wells differ from conventional wells is in the relative scale of operations and spatial intensity. Unconventional wells are associated with deeper geological zone targets, the common use of long lateral wellbores, and multiwell pads that require longer cumulative drilling time per well pad compared with conventional wells (Field et al. 2014; Kargbo et al. 2010; Manda et al. 2014). Conversely, conventional wells are typically shallower, vertical, or near-vertical wellbore configurations, and developed as individual wells per pad rather than clusters of multiple wells. The increases in production caused by multiwell pads and horizontal wellbores, coupled with the massive increase in emitting infrastructure per well pad and corresponding increases in fugitive and process vent emissions that are a function of production throughputs, cause unconventional wells to have spatially concentrated unconventional atmospheric pollutant loads (Field et al. 2014; Omara et al. 2016; Skone et al. 2014). Also, due to the continuous nature of unconventional formations compared with the relatively discrete nature of conventional formations, unconventional development often spreads over much larger geographic areas, and thus, the cumulative burden over a geographic region may be higher than with conventional development.
Hydraulic fracturing, a method used for well stimulation, is also dependent on scale. Water quantities used for hydraulic fracturing in unconventional development are orders of magnitude higher than in conventional development (U.S. EPA 2016), and the proportion of chemical mass to water mass stays relatively constant over conventional and unconventional development (U.S. EPA 2016). Therefore, chemical use by mass is generally higher in unconventional development. Unconventional wells are often refractured every few years, repeating the opportunity for potential chemical releases and elevated levels of VOCs, including benzene and toluene (Jackson et al. 2014; Lee et al. 2011; Moore et al. 2014; U.S. EPA 2014a; Warneke et al. 2014; Zammerilli et al. 2014).
On the other hand, the higher risk of aging conventional well infrastructure and associated well site equipment in combination with the sheer numbers of conventional wells can potentially overwhelm the higher per pad emissions associated with unconventional development (Omara et al. 2016). For instance, while mean methane emissions in the Marcellus shale per unconventional well pad may be, on average, up to 23 times higher than conventional single-well sites, conventional well site emissions may dominate regional emissions from oil and gas development in this area (Omara et al. 2016). Like the Marcellus shale region, conventional wells make up a significant percentage of overall well composition in many parts of the U.S. (U.S. EIA 2009). Nonmethane VOCs are often correlated with methane emissions; thus, the same trend likely follows for other volatiles (Pétron et al. 2012;U.S. EPA 2014b). This research suggests that, depending on the well density and composition of the local well inventory, aggregate air emissions from conventional wells may be higher than that of unconventional wells. Comparative assessments of conventional and unconventional development relative to human populations and the effects of each are currently lacking in the literature.

Primary Production and Well Status
The primary production category of the hydrocarbon (oil, wet gas, dry gas) and well status (permitted, under construction, drilled and completed but not producing, producing, abandoned, plugged) are other aspects of hydrocarbon development that are often ignored in the existing literature, but are critical to assessments of risks to human populations living near oil and gas development. Primary production types have varying chemical compositions and, therefore, different air pollutant emission profiles and implications for potential exposures and human health impacts (Field et al. 2014;Goetz et al. 2015;Macey et al. 2014;Roy et al. 2014;Warneke et al. 2014;Zammerilli et al. 2014). For example, crude oil generally contains various proportions of single-bond hydrocarbons (alkanes), aromatic rings (e.g., benzene, toluene, o-xylene), and naphthenes (e.g., cyclohexane, cyclopentane), depending on the maturity and depth of the resource (Wang et al. 2003). Wet gas contains various mixtures of natural gas liquids (ethane, propane, butane, isobutane, and pentane), light crude oil, and methane gas (Field et al. 2014;Zammerilli et al. 2014). Produced dry gas, on the other hand, is typically ≥95% methane with relatively small percentages of other volatile organic compounds and other air toxins (Field et al. 2014;Zammerilli et al. 2014). The relative risks of air pollutant exposures across these production categories can vary greatly and hold implications for assessments of health impacts of interest. Population health and exposure studies should account for this variability by including and parsing out primary production categories in their well inventories.
Well status values can also correspond with substantially different air pollutant concentrations. Permitted unspudded wells would likely have relatively minor emissions associated with earth-moving activities during site preparation. The drilling stage is relatively short term, but can release substantially higher emissions per unit time than production (Brown et al. 2015;Colborn et al. 2014;Field et al. 2014). Finally, a well status designation of active without production data does not provide proof of active hydrocarbon production. Proximal population analyses should explicitly differentiate the stages of well development and levels of risks associated with each stage.

Objective
We are not aware of any analysis to date that provides a defensible comprehensive well inventory and quantitative assessment of population counts in proximity to actively producing or recently drilled and completed (i.e., confirmed active) oil and gas wells at the national scale. Our analysis fills this gap and provides a methodological template for additional studies to build upon. We calculate population counts in proximity to at least one confirmed active hydrocarbon production well across a series of buffer distances that are most relevant to air pollutant exposure (100 m to 2,000 m, or ∼0.06 mi to ∼1.24 mi). We differentiate between primary production type (oil, wet gas, dry gas), well status (recently drilled, producing), and geographic boundaries (national, state). Finally, we nest this analysis in a review of published studies to compare, contrast, and contextualize our findings.

Oil and Gas Inventory
We obtained oil and gas well attribute and locational data primarily from DrillingInfo (http://info.drillinginfo.com/), a private sector company that supports the oil and gas industry and independent researchers by maintaining a national database for oil and gas wells collated from state regulatory records (Environmental Research Group 2013; Hughes 2014). We limited our analysis to wells in the conterminous United States that either produced or were drilled in 2014, which we consider confirmed active. We categorized a well as producing if it had a recorded last production date in 2014 or later, as long as it had recorded first production before 2015, to exclude wells that first produced after our 2014 base year. We categorized a well as recently drilled if it a) had a completion date in 2014 or later, as long as it did not have a spud date after 2014, to exclude wells that were spudded after 2014; or b) had no recorded completion date, but first produced oil or gas in 2014. We kept plugged and abandoned wells in the database if they were either completed or producing in 2014, regardless of later plugging and abandonment.
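The status rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' actual workflow: the `WellRecord` fields are hypothetical names (they are not DrillingInfo column names), and dates are reduced to calendar years for simplicity.

```python
from dataclasses import dataclass
from typing import Optional

BASE_YEAR = 2014  # base year of the inventory, per the text


@dataclass
class WellRecord:
    """Hypothetical per-well record; field names are assumptions."""
    first_prod_year: Optional[int] = None
    last_prod_year: Optional[int] = None
    completion_year: Optional[int] = None
    spud_year: Optional[int] = None


def is_producing(w: WellRecord) -> bool:
    # Last production in 2014 or later AND first production before 2015.
    return (
        w.last_prod_year is not None
        and w.last_prod_year >= BASE_YEAR
        and w.first_prod_year is not None
        and w.first_prod_year <= BASE_YEAR
    )


def is_recently_drilled(w: WellRecord) -> bool:
    if w.completion_year is not None:
        # Rule (a): completed in 2014 or later, and not spudded after 2014.
        # A missing spud date is treated as "no spud date after 2014".
        spud_ok = w.spud_year is None or w.spud_year <= BASE_YEAR
        return w.completion_year >= BASE_YEAR and spud_ok
    # Rule (b): no completion date, but first production in 2014.
    return w.first_prod_year == BASE_YEAR


def is_confirmed_active(w: WellRecord) -> bool:
    # A well counts as confirmed active if either status applies.
    return is_producing(w) or is_recently_drilled(w)
```

Under these assumptions, a well can carry both statuses at once (for example, one that first produced in 2014), which is consistent with the later note that population counts across categories may sum to more than the total.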
The DrillingInfo database was missing data for certain time frames. Base year (2014) production data were missing for five states: Tennessee and Kentucky records have data from 2012; Oregon, New York, and West Virginia records have data from 2013. We obtained and incorporated 2014 production data from state regulatory databases for New York and West Virginia (NYDEC 2014;WVDEP 2015). More recent data were unavailable for the remaining states; therefore, we used the 2012 and 2013 production data as a proxy for 2014 production in Tennessee, Kentucky, and Oregon.
Base year drilling data were also missing or incomplete for multiple states. We did not classify any wells in Tennessee and Oregon as recently drilled due to lack of data. Drilling data for Kentucky, West Virginia, and New York ended months before the end of the year and most likely were incomplete. More complete data were unavailable from other sources.
Based on a 2013-2014 national well count compilation prepared by the Independent Petroleum Association of America (IPAA 2014), we determined that DrillingInfo lacks well data for Illinois and Indiana. We were unable to obtain well data from other sources, and, as a result, these states were excluded. Well data for 30 states remained.
We defined well type (conventional or unconventional) by geology and, when data were available, technology. Geologic constraints on well type were based on U.S. Energy Information Administration (EIA) listings of continuous or emerging resources (U.S. EIA 2011, 2014). We categorized any play or formation labeled by the EIA as tight gas, shale gas, or shale oil as an unconventional geological formation.
In our oil and gas database, we queried the records by the DrillingInfo "Reservoir" field for the EIA list of unconventional formations. If the reservoir field for a well contained the name of an unconventional formation per the EIA list, and, in addition, was in close geographic proximity to the unconventional formation (same state), then we categorized it as unconventional. We also classified wells as unconventional if the well was drilled horizontally as indicated by the "DrillType" field in the DrillingInfo database, or if the "Reservoir" field contained the keywords "unconventional," "shale," or "*sh*" (a variation of the keyword "shale") as long as they coincided with the correct geographic location. This approach was corroborated by assessment of other reputable databases, including U.S. Geological Survey (USGS) reports and state geological survey resources, where reservoir names matched shales, for example, "Devonian Sh" in West Virginia (Nuttall 2012) and "Sunbury Sh," also in West Virginia (Pepper et al. 1954). We categorized all remaining wells as conventional. We tested our results based on an understanding of the history of commercial-scale, high-volume hydraulic fracturing and horizontal drilling, as well as known locations of coalbed methane development, resulting in a range of counts (see Supplemental Material).
For each well, we determined primary production classification (oil, wet gas, dry gas) based on gas-liquid ratio (GLR) over a well's lifetime. The Environmental Research Group separated oil wells from gas wells at a GLR cutoff of 12,500 standard cubic feet per barrel (scf/bbl) (Environmental Research Group 2013; U.S. EPA 2014a). Consistent with this methodology, we classified any well with a GLR equal or greater than this threshold as a gas well, and any well with a lower GLR as an oil well. For gas wells, any well with a cumulative liquids value of 0 bbl was classified as dry, and all else as wet. There were 4,956 wells that had evidence of drilling or production in 2014, yet had no production quantities listed, which we categorized with an unknown well type.
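As a minimal sketch of the classification just described, the function below applies the 12,500 scf/bbl cutoff to lifetime cumulative production. The function and argument names are hypothetical, and treating a well with zero gas and zero liquids as "unknown" is an assumption made here for illustration; the text specifies "unknown" only for wells with no production quantities listed.

```python
from typing import Optional

GLR_CUTOFF_SCF_PER_BBL = 12_500  # oil/gas cutoff cited in the text


def classify_primary_production(
    cum_gas_scf: Optional[float],
    cum_liquids_bbl: Optional[float],
) -> str:
    """Return 'oil', 'wet gas', 'dry gas', or 'unknown' for one well."""
    # No production quantities listed: well type cannot be determined.
    if cum_gas_scf is None or cum_liquids_bbl is None:
        return "unknown"
    # Assumption: all-zero production carries no information either.
    if cum_gas_scf == 0 and cum_liquids_bbl == 0:
        return "unknown"
    # Gas with zero cumulative liquids is dry gas (GLR is unbounded).
    if cum_liquids_bbl == 0:
        return "dry gas"
    glr = cum_gas_scf / cum_liquids_bbl
    # GLR at or above the cutoff is a gas well; with liquids present, wet.
    if glr >= GLR_CUTOFF_SCF_PER_BBL:
        return "wet gas"
    return "oil"
```

For example, a well with 25,000,000 scf of gas and 1,000 bbl of liquids has a GLR of 25,000 scf/bbl and would be classified as wet gas under this sketch.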
Using ArcGIS (version 10.3; ESRI, Inc.), we created buffers around each well location in the model data set of 100; 400; 800; 1,000; 1,600; and 2,000 m. Well buffers for each distance were cut to the boundaries of each state, with all states including any cross-border buffer overlap that may have occurred due to well proximity to state boundaries. We merged well buffers around each individual well into a single polygon per buffer distance and state to account for overlapping buffers within states due to higher well densities, in order to avoid duplication in population allocation.
We removed 4 wells with reported negative production values, 6,848 wells lacking spatial coordinates, and 145 wells with coordinates that plotted outside of their listed state. There were 808,485 confirmed active wells that remained in our final oil and gas model data set.

Population Data set by Demographic
We obtained location and demographic information for populations from the U.S. Census Bureau. We downloaded age, race, and ethnicity data from the 2010 Decennial Census at the block level (U.S. Census Bureau 2011a) to determine population counts for the following variables: total population, Hispanic, minority, non-Hispanic minority, 5 y and younger, under 18 y, and 75 y and older. Minority represents the entire non-white population. The U.S. Census Hispanic designation is independent of race, and Hispanic individuals can identify as white or non-white. Non-Hispanic minority represents the non-white population that does not identify as Hispanic.

Spatial Analysis
We intersected the Census block polygons with each of the six buffers by state, and then allocated block-level counts to areas within each buffer polygon by calculating the percentage of each Census block's area that falls within each aggregated buffer polygon and applying these percentages to the block's population counts. We summed the calculated population counts over each buffer distance and over each oil and gas variable of interest.
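The area-weighted allocation described above reduces to simple arithmetic once the intersection areas are known. The sketch below assumes those areas have already been computed in a GIS and are supplied as plain numbers; the function names and the tuple layout are hypothetical, and the approach implicitly assumes population is spread uniformly within each block.

```python
from typing import Iterable, Tuple


def apportion_population(
    block_pop: float, block_area: float, area_in_buffer: float
) -> float:
    """Allocate a block's population by the share of its area inside a buffer."""
    if block_area <= 0:
        raise ValueError("block_area must be positive")
    # Clamp to [0, 1] to guard against small geometric rounding errors.
    fraction = min(max(area_in_buffer / block_area, 0.0), 1.0)
    return block_pop * fraction


def buffer_population(blocks: Iterable[Tuple[float, float, float]]) -> float:
    """Sum apportioned counts over (population, block_area, area_in_buffer)."""
    return sum(apportion_population(p, a, i) for p, a, i in blocks)


# Three illustrative blocks: one-quarter inside, fully inside, fully outside.
example_blocks = [
    (100, 2.0, 0.5),
    (40, 1.0, 1.0),
    (60, 3.0, 0.0),
]
example_total = buffer_population(example_blocks)
```

Because the buffers are first merged into a single polygon per distance and state, each block contributes at most once per buffer distance, which is what prevents double counting in areas with dense, overlapping well buffers.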

National-Level Results and Literature Review
Our well inventory includes 808,485 oil and gas wells across 30 states that are confirmed to be actively producing or newly drilled as of 2014 (Figure 1). Conventional wells make up 86.8% to 89.4% of the national well count (702,057 to 722,469 wells) with wells classified as unconventional making up the remainder (86,016 to 106,428 wells). The range of well counts reflects uncertainty in the well type classification method (see "Methods" section and Supplemental Material). The ratio of conventional to unconventional wells is not equally distributed across states, with ratios varying from 100% conventional to more than 75% unconventional (Figure 2). Oil wells make up 40% (323,580) of the national well count, and Western states generally have larger proportions of oil wells compared to Eastern states (Figure 2). Wet and dry natural gas wells account for 15% (122,432) and 44% (357,517) of the total well count, respectively.
Across the conterminous United States, we estimate that 17.6 million people, or roughly 6% of the conterminous U.S. population [308.7 million, 2010 U.S. Census data (2011a)], live within 1,600 m (∼1 mi) of one or more confirmed active oil or gas wells (Table 1). Of these, we estimate that 45%, 31%, and 55% live in proximity to one or more oil, wet gas, or dry gas wells, respectively. Table 2 presents national population counts by well data category. Note that it is common to have different types of wells (primary production classification, status) in close proximity to one another; therefore, population counts within the same grouping may sum to greater than the total population. For results across all buffer distances, demographics, and oil and gas variables assessed, see Tables S1-S3.
We are aware of only two other national-level analyses that quantify the proximal population to oil and gas development at the national scale (Earthworks et al. 2016;Gold and McGinty 2013). Table 3 compares methodology and results across the reviewed studies and the current study. Gold and McGinty (2013) report 15.3 million people across 11 top producing states living within ∼ 1,600 m (1 mi) of what the authors designate as an unconventional well. This is over three times (382.5%) the population we estimate within 1,600 m of an unconventional well for the same states (4.0 million), which is slightly lower than our count over all states (4.7 million), but within range of what we estimate for conventional and unconventional wells combined nationally (17.6 million). Gold and McGinty (2013) designate a well as unconventional based on the reported spud date (i.e., commencement of well construction) or date of first production, with all wells drilled or producing after 1999 assumed to be unconventional. Based on our well data, 72.8% of the U.S. onshore wells completed or showing first production after 1999 are conventional wells, with the minority remainder being unconventional. Therefore, the Gold and McGinty (2013) count is more appropriately interpreted as the count of people living in proximity to at least one conventional and/or unconventional well. However, the post-1999 subset constitutes just 48.7% of the confirmed active wells nationally. In other words, our national estimate of persons living within 1,600 m of a conventional or unconventional well accounts for twice as many confirmed active conventional well locations. Accounting for these additional wells would substantially increase the Gold and McGinty (2013) population estimates. Gold and McGinty's (2013) use of complete apportionment to allocate population data to buffer boundaries (i.e., apportionment that assumes that any block group that partially intersects a buffer boundary is  (Table 3). 
The threat map study also includes ancillary infrastructure associated with oil and gas development (i.e., compressor stations and natural gas processing plants), which are not accounted for in our inventory. This complicates a direct comparison of population counts. For national security reasons, geographic locations of natural gas processing facilities are reported as the approximate centroid of the zip code in which they are located (U.S. EIA 2015). Average population density for all processing plant locations is 192 people per square mile (U.S. Census Bureau 2011a; U.S. EIA 2015). Based on these population densities and a total of 521 natural gas processing plants nationwide, we estimate that roughly 1.5% of the total population living within ∼800 m of an active oil and gas facility as estimated by Earthworks et al. (2016) is geographically associated with natural gas processing plant locations. Based on EPA data (U.S. EPA 2015), we estimate that there are 27,500 gas compressors nationally. Of these, 42% are part of gathering systems that are co-located with production wells, 24% are co-located with natural gas processing plants, and the remaining 34% are located within the natural gas transmission and storage sector. Transmission sector compressors may account for at least part of the small difference between our results and those reported in the threat maps. However, the threat map study bases its well inventory on active status as reported in state databases without confirming drilling or production status. Potential inclusion of active permits for wells not yet drilled and wells not in production may also explain the difference in population counts.

State-Level Results and Literature Review
Texas, Ohio, California, Oklahoma, and Pennsylvania all have one million or more people living within 1,600 m (∼1 mi) of a well (Figure 3). Texas has the greatest number of people in proximity to active wells, with 4.5 million people living within the 1-mi buffer distance. Eight states have greater than 10% of their population living within a mile of an active well: Arkansas, Kansas, Louisiana, Ohio, Oklahoma, Pennsylvania, Texas, and West Virginia (Figure 3). In fact, Ohio has just under a quarter of its population (24.3%), and West Virginia and Oklahoma have just short of half of their populations (49.6% and 46.7%, respectively), living within this 1-mi buffer distance. Population counts by state for the 400-, 800-, and 1,600-m buffers are provided in the Supplemental Material.
Previous state-level assessments include five Pennsylvania studies (Clough and Bell 2016;Meng 2015;Ogneva-Himmelberger and Huang 2015;Ridlington et al. 2015;Slonecker and Milheim 2015) and one California study (Srebotnjak and Rotkin-Ellman 2014). Each either omits or includes wells in their inventories that can substantially impact the estimate of populations at risk from oil and gas development. Only two of these report results directly comparable to those of the current study (Ridlington et al. 2015;Srebotnjak and Rotkin-Ellman 2014). Srebotnjak and Rotkin-Ellman (2014) based their population count for the state of California on a well inventory that included new permitted wells, which were not yet drilled, and wells reported in state records as active. As with the threat map study previously discussed, this study did not confirm production activity for active wells. The authors report a total of 84,434 wells (7,177 new wells and 77,257 active) and 5.4 million people living within 1,600 m of a well. Our analysis, which excludes wells not yet drilled and older wells that are no longer producing (approximately 58,000 wells in California), estimates the population of Californians living within 1,600 m of a well at 2.1 million people (Table 3), or 61% lower than that reported by Srebotnjak and Rotkin-Ellman (2014). Given the dominance of active wells in the Srebotnjak and Rotkin-Ellman (2014) well inventory and the development history of the area, a substantial portion of the 3.3 million people who make up the difference are likely co-located near an older well that is no longer producing, but not plugged. Older, nonproducing wells do likely pose increased atmospheric risks for nearby populations compared to sites never developed, as discussed in the "Discussion" section, but the scale of potential emissions differ from those of actively producing wells, and therefore, these counts should be separated.
The omission of conventional wells can also have a large impact on population counts. Due to the relatively large numbers of conventional wells in many oil- and gas-producing states, this omission can potentially overwhelm the effect of including wells not drilled or actively producing. Ridlington et al. (2015) limit their scope to Marcellus shale gas wells (unconventional) as designated in Pennsylvania state well records for wells permitted between January 2007 and May 2015. Wells in the inventory may or may not have been drilled or currently be in production. The population assessment from Ridlington et al. (2015) is limited to counts within vulnerable age demographics, i.e., counts of children and the elderly. The authors estimate that 25,000 children under the age of five and 41,000 adults age 75 and older live within ∼1,600 m (1 mi) of an unconventional well. Our youth demographics consist of children age 5 y or under, or one additional year of age. Assuming an equal distribution across all years of age for children, which is consistent with 2010 Pennsylvania U.S. Census data (U.S. Census Bureau 2011b), the year 5 age bracket is expected to add another 4,100 children to the Ridlington et al. (2015) count for a total of 29,100 children age 5 and under at risk. Our estimate of 20,200 children age 5 and under living within 1,600 m of an unconventional well in Pennsylvania is 30% lower. Similarly, our estimate of the elderly population within 1,600 m of an unconventional well in Pennsylvania is 34% lower (27,000 persons age 75 or older; Table 3). Moreover, inclusion of conventional wells increases the estimates of children and elderly living within 1,600 m of one or more wells to 102,000 and 342,000, respectively, which are increases of roughly 400% and 1,200% over counts from unconventional wells only. This exemplifies the possibility of dramatically underestimating populations potentially at risk when omitting conventional wells from these types of studies.
Both Ridlington et al. (2015) and Srebotnjak and Rotkin-Ellman (2014) include wells not drilled and wells without confirmed production status in their well inventories. This results in a moderate overestimate of populations in proximity to California conventional and unconventional wells (Srebotnjak and Rotkin-Ellman 2014). However, Ridlington et al. (2015) omit the very large population of actively producing conventional wells. This omission overwhelms any potential inflation of population counts caused by inclusion of wells not yet in existence, resulting in a gross underestimate of the population potentially at risk in Pennsylvania.
Other published studies are not directly comparable to our results (Clough and Bell 2016; Meng 2015; Ogneva-Himmelberger and Huang 2015; Slonecker and Milheim 2015), but common methodological issues highlighted in this review do provide guidance in assessing these studies. Meng (2015), Slonecker and Milheim (2015), and Clough and Bell (2016) focus on Pennsylvania, whereas Ogneva-Himmelberger and Huang (2015) focus on the Marcellus shale-producing region, which includes parts of Pennsylvania, Ohio, and West Virginia. All of these, with the exception of Slonecker and Milheim (2015), limit their well inventories to unconventional wells only. Like Pennsylvania, in which 92% of wells are conventional, Ohio and West Virginia are dominated by conventional wells, at 92% and 88%, respectively. As illustrated in our review of Ridlington et al. (2015), the effect of excluding conventional wells in Pennsylvania can be substantial. Ogneva-Himmelberger and Huang (2015) and Clough and Bell (2016) assess subsets of the general population relevant to social justice questions. Depending on the distribution of the subset populations in relation to conventional wells, the effect of omitting conventional wells may differ from the dramatic effect noted for the Ridlington et al. (2015) estimates. Still, it is clear that each of these studies has underestimated the populations at possible risk, potentially significantly.

Discussion of Buffer Distances
The buffers used in this analysis are designed to encompass populations within various proximities to oil and gas development and associated emissions, with the assumption that exposure to emissions will be highest within the 100-m buffer and will decrease through the remaining 400-, 800-, 1,000-, 1,600-, and 2,000-m buffers as distance from development increases (Meng and Ashby 2014). At this time, there is no single distance or set of distances from oil and gas wells that is accepted across the scientific community as conveying health consequences or lack thereof to adjacent human populations. This is demonstrated by the wide range of buffer distances used in previously published studies, from approximately 400 m (0.25 mi) up to 5,000 m (Meng 2015; Ogneva-Himmelberger and Huang 2015; Ridlington et al. 2015; Slonecker and Milheim 2015; Srebotnjak and Rotkin-Ellman 2014), as well as by the setbacks enacted at various regulatory levels around homes, schools, churches, and other locations where people congregate (Fry 2013; Macey et al. 2014; Richardson et al. 2013). Across 31 states with either existing or potential shale gas production, 20 have restrictions in place for well siting, with setbacks ranging from 30.5 m to 305 m from the wellbore and a mean setback of 94 m (Richardson et al. 2013). However, a review of the recent literature suggests that current regulatory setbacks may be inadequate to protect local populations from adverse health effects. Steinzor et al. (2013) found that self-reported health-related symptoms were most prevalent for community members living within 457 m (1,500 ft) of a natural gas facility. McKenzie et al. (2012) found greater hazard for cancer and noncancer health endpoints in residents living up to ∼800 m (0.5 mi) from unconventional gas development compared to residents located farther from a well site. Rabinowitz et al. (2015) report higher counts of reported health symptoms per study participant in residents living up to 1,000 m from an unconventional well, compared with those living >2,000 m from a well.
The findings in the oil and gas epidemiological literature are corroborated by atmospheric dilution data for conserved pollutants. For example, a U.S. EPA report on dilution of conserved toxic air contaminants found that the dilution at 800 m (0.5 mi) from the source of the emission was on the order of 0.1 mg/m³ per g/s (U.S. EPA 1992). Going out to 2,000 m (6,562 ft) would increase this dilution to 0.015 mg/m³ per g/s, and at 3,000 m (9,843 ft), the dilution would be an estimated 0.007 mg/m³ per g/s. For benzene, there is increased risk of adverse health effects at a concentration of 0.1 mg/m³ (1 ppb) (CalEPA 2016). As such, it is not clear that distances of 2,000 m to 3,000 m (6,562 ft to 9,843 ft) from the source can always be considered safe. However, beyond 3,000 m (9,843 ft), where, all else being equal, concentrations fall more than two orders of magnitude relative to the 0.5-mi radius, there is likely to be a sufficient margin of safety for a given point source emission (Shonkoff and Gautier 2015).
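The dilution figures quoted above can be turned into a short worked calculation. The sketch below is illustrative only: it assumes that concentration scales linearly with emission rate (concentration = emission rate × dilution factor), and the 1 g/s source strength is a hypothetical value chosen for convenience, not a measured emission rate. The function and variable names are ours.

```python
# Dilution factors quoted in the text (U.S. EPA 1992), in mg/m^3 per g/s,
# keyed by distance from the source in meters.
DILUTION = {800: 0.1, 2000: 0.015, 3000: 0.007}

# Benzene concentration of concern as quoted in the text (CalEPA 2016), mg/m^3.
BENZENE_THRESHOLD = 0.1


def concentration(emission_rate_g_s, distance_m):
    """Concentration (mg/m^3), assuming linear scaling with emission rate."""
    return emission_rate_g_s * DILUTION[distance_m]


def rate_to_reach(threshold_mg_m3, distance_m):
    """Emission rate (g/s) at which the concentration equals the threshold."""
    return threshold_mg_m3 / DILUTION[distance_m]


# Hypothetical 1 g/s source: concentration at each distance, and the
# emission rate that would bring that distance up to the quoted threshold.
for d in sorted(DILUTION):
    print(d, concentration(1.0, d), round(rate_to_reach(BENZENE_THRESHOLD, d), 1))
```

Under these assumptions, a source roughly 6.7 times stronger than the one that reaches the quoted benzene value at 800 m would reach it at 2,000 m, and one roughly 14.3 times stronger would reach it at 3,000 m, which is one way to read the statement that 2,000 m to 3,000 m cannot always be considered safe.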
Additionally, there is the added burden of regional hydrocarbon development emission sources, such as increased heavy vehicle traffic, or compressors, pipelines, and processing plants located outside of well sites (Brown et al. 2015; Jackson et al. 2014; Moore et al. 2014; Pekney et al. 2014; Warneke et al. 2014).
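The nested buffers discussed in this section can be illustrated with a minimal sketch. It is not the GIS overlay used in this study: it assumes planar coordinates in meters, treats each population unit as a single point (for example, a block centroid), and uses made-up wells and population values. Real analyses need a projected coordinate system and a defensible population allocation method.

```python
import math

# Buffer radii (m) used in this analysis.
BUFFERS_M = [100, 400, 800, 1000, 1600, 2000]


def nearest_well_distance(point, wells):
    """Planar distance (m) from a point to its nearest well."""
    return min(math.dist(point, w) for w in wells)


def population_by_buffer(blocks, wells, buffers=BUFFERS_M):
    """Cumulative population within each buffer radius of at least one well.

    blocks: list of ((x, y), population), coordinates in meters.
    wells:  list of (x, y), coordinates in meters.
    """
    totals = {r: 0 for r in buffers}
    for point, pop in blocks:
        d = nearest_well_distance(point, wells)
        for r in buffers:
            if d <= r:  # buffers are nested, so counts are cumulative
                totals[r] += pop
    return totals


# Hypothetical wells and population points, for illustration only.
wells = [(0.0, 0.0), (5000.0, 0.0)]
blocks = [
    ((50.0, 0.0), 10),       # 50 m from a well
    ((0.0, 900.0), 200),     # 900 m from a well
    ((5000.0, 1500.0), 40),  # 1,500 m from a well
    ((2500.0, 0.0), 1000),   # 2,500 m from both wells, outside all buffers
]
print(population_by_buffer(blocks, wells))
# {100: 10, 400: 10, 800: 10, 1000: 210, 1600: 250, 2000: 250}
```

Using the distance to the nearest well ensures that a person near several wells is counted once per buffer, which matches the "at least one active well" framing of the population estimates.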

Note: The column "This study" reports population counts from our well inventory, using the limitations criteria specified in the column "Well/facility inventory limits." We limit population counts to those within proximity to unconventional wells when applicable. We did not include ancillary infrastructure. All other criteria used by other studies, such as the inclusion of permitted wells and inactive wells, are excluded from the population counts we provided, for comparison purposes.

Future Research
Research published in the past 3 y has shed new light on emissions from abandoned and plugged wells. We defined abandoned wells as wells that are not currently producing but have not been plugged yet, compared with plugged wells that have been plugged, regardless of whether plugging techniques were up to current regulation. Abandoned and orphaned wells are not always adequately tracked in state databases, and these wells were excluded with the assumption that they may have some air pollutant emissions, but that these emissions account for a small fraction of regional oil and gas emissions. Methane sampling around abandoned and plugged wells in Pennsylvania indicated that the majority of emissions in the study area were limited to a small subset of sampled wells labeled as high emitters (Kang et al. 2014, 2016). The most recent Pennsylvania numbers estimate that abandoned well emissions make up approximately 5-8% of all annual anthropogenic methane emissions in Pennsylvania (Kang et al. 2016). Townsend-Small et al. (2016) tested gas emissions from abandoned and plugged wells in active production areas of Wyoming, Colorado, Utah, and Ohio. They too found that emissions from abandoned wells were significantly higher than those from plugged wells. Unlike Kang et al. (2014, 2016), Townsend-Small et al. (2016) estimated that abandoned wells contribute <1% of regional methane emissions in their study areas.
However, emissions in Ohio were estimated to be significantly higher than emissions in Wyoming, Colorado, and Utah. Higher emissions in Pennsylvania and Ohio may be due to the long history of oil and gas development in these states compared with the relatively short oil and gas history in the sampled Western states. It is generally accepted that wells plugged before the enactment of modern plugging regulations are more likely to develop larger gas leaks over time (Dilmore et al. 2015; Kang et al. 2015, 2016; King and Valencia 2014). However, Boothroyd et al. (2016) measured gas emissions at wells plugged in accordance with current regulatory standards in the United Kingdom and found that 30% of sampled wells had significantly elevated methane concentrations relative to control samples. This aligns with suggestions made by Miyazaki (2009) and suggests that even wells plugged to current regulatory standards can develop substantial leaks. The relative importance of such leaks to local emission concentrations and exposure, however, is still uncertain. More research is needed to better quantify the proportional contribution of plugged well emissions to degraded air quality.
An important aspect of human health outcomes not yet addressed at this point in our assessment is the effect of well density on atmospheric public health hazards (Meng 2015; Ogneva-Himmelberger and Huang 2015). As with well type and primary production category, the density of wells in or near where people live, work, and play can influence human health impacts because of the spatial intensity of emission sources, which may contribute to more elevated concentrations of health-damaging air pollutants and other potential exposures.
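One simple way a future analysis might express well density is as the number of wells within a fixed radius of a location, optionally normalized by the area of that circle. The sketch below is a hypothetical illustration under that assumption, with planar coordinates in meters and made-up well locations. It is not a metric used in this study, and other density definitions (for example, kernel density surfaces) are equally plausible.

```python
import math


def wells_within(point, wells, radius_m):
    """Number of wells within radius_m (planar meters) of a point."""
    return sum(1 for w in wells if math.dist(point, w) <= radius_m)


def well_density_per_km2(point, wells, radius_m):
    """Wells per square kilometer inside a circle of radius_m around a point."""
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return wells_within(point, wells, radius_m) / area_km2


# Hypothetical well locations around a residence at the origin.
wells = [(0.0, 0.0), (300.0, 400.0), (1200.0, 0.0), (3000.0, 0.0)]
home = (0.0, 0.0)

print(wells_within(home, wells, 1600))                    # 3
print(round(well_density_per_km2(home, wells, 1600), 3))  # 0.373
```

A count of this kind distinguishes a residence near a single well from one surrounded by many, which a binary inside/outside buffer test cannot do.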

Limitations
Lack of availability and missing data values contributed to limitations in this analysis. Errors and incomplete records in well data may have caused wells to be incorrectly categorized as not yet drilled or not producing and, therefore, wrongly excluded from the analysis. Spatial coordinates, date information, and up-to-date well data were missing in some cases, and well data for Illinois and Indiana were unavailable. These missing data may have caused underestimates in population counts.
Additionally, lack of data may have resulted in misclassification of well types and errors in apportionment of population counts. Unavailability of oil production quantities led us to base our oil/gas cutoff on GLR instead of the more commonly used gas-oil ratio, which may have caused misclassification of well primary production type in some cases. We took a conservative approach by categorizing a gas well with any hydrocarbon liquids as wet gas, which may also have caused misclassification error. Our methods of categorizing wells as conventional or unconventional by target formation, presence of shale, and drilling orientation may have overestimated unconventional wells by including some coalbed methane wells. For further detail on this issue, see the Supplemental Material.
Another potential limitation was our exclusion of abandoned wells, plugged wells, and an assessment of well density. Uncovering more information on these issues is a topic for future research.

Conclusions
This analysis is the first national proximity analysis to date that examines well data by primary production (oil, wet gas, dry gas) and status (recently drilled, producing). We estimate that 17.6 million people live within 1,600 m of a confirmed active oil or gas well in the conterminous United States. We provide a literature review of previous studies and assess the strengths and weaknesses of prior research efforts. Of the eight studies we examined, three overestimate proximal populations potentially at risk by including nonproducing or not yet drilled wells and/or by using inappropriate population allocation methods. Five studies underestimate populations at potential risk by excluding conventional wells and/or oil wells.
The well inventory and proximity analysis guidelines discussed above are applicable to any study that makes use of these data and techniques. Specifically, when the driving force behind a hydrocarbon proximity study is to assess potential risks to human health, it is important to:
• Include both conventional and unconventional well locations.
• Differentiate the relative exposure risks from different primary production categories (e.g., oil, wet gas, dry gas).
• Confirm the well status from its drilling and/or production history.
Well density is also an important metric in assessing risk to human health and should be accounted for in future proximal population studies. Future analyses can build upon the methodology presented here to construct well inventories and population proximity evaluations grounded in a solid understanding of the oil and gas lifecycle and the varied exposure pathways associated with different hydrocarbon production types and stages. These future analyses will be critical to move forward future areas of research and health protective regulations and policies that potentially include minimum surface setbacks, environmental and health equity considerations, and air pollutant emission exposure assessments that employ more than distance metrics as the exposure variable of interest.
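The three inventory guidelines above can be sketched as a simple filter over well records. The record layout is hypothetical: the field names (`well_type`, `category`, `recently_drilled`, `producing`) are assumptions made for illustration and do not correspond to any particular state database schema, and the sample records are invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Well:
    well_id: str
    well_type: str          # "conventional" or "unconventional"
    category: str           # "oil", "wet gas", or "dry gas"
    recently_drilled: bool  # drilling completion confirmed in the records
    producing: bool         # production confirmed in the records


def is_confirmed_active(well):
    """Guideline 3: keep only wells with drilling or production evidence."""
    return well.recently_drilled or well.producing


def build_inventory(wells):
    """Guideline 1: keep both well types; drop only unconfirmed wells."""
    return [w for w in wells if is_confirmed_active(w)]


def count_by_type_and_category(inventory):
    """Guideline 2: keep primary production categories separate."""
    return Counter((w.well_type, w.category) for w in inventory)


# Invented records, for illustration only.
records = [
    Well("A", "conventional", "oil", False, True),
    Well("B", "unconventional", "dry gas", True, False),
    Well("C", "unconventional", "wet gas", False, False),  # permitted, not drilled
    Well("D", "conventional", "dry gas", False, False),    # no production evidence
]
inventory = build_inventory(records)
print([w.well_id for w in inventory])  # ['A', 'B']
print(count_by_type_and_category(inventory))
```

Keeping the well type and production category on every record, rather than filtering on them, lets the same inventory support both all-well and unconventional-only population counts.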