An empirical comparison of lead exposure pathway models.

Structural equation modeling is a statistical method for partitioning the variance in a set of interrelated multivariate outcomes into that which is due to direct, indirect, and covariate (exogenous) effects. Despite this model's flexibility to handle different experimental designs, postulation of a causal chain among the endogenous variables and the points of influence of the covariates is required. This has motivated the researchers at the University of Cincinnati Department of Environmental Health to be guided by a theoretical model for movement of lead from distal sources (exterior soil or dust and paint lead) to proximal sources (interior dust lead) and then finally to biologic outcomes (handwipe and blood lead). The question of whether a single structural equation model built from proximity arguments can be applied to diverse populations observed in different communities with varying lead amounts, sources, and bioavailabilities is addressed in this article. This reanalysis involved data from 1855 children less than 72 months of age enrolled in 11 studies performed over approximately 15 years. Data from children residing near former ore-processing sites were included in this reanalysis. A single model adequately fit the data from these 11 studies; however, the model needs to be flexible to include pathways that are not frequently observed. As expected, the more proximal sources of interior dust lead and handwipe lead were the most important predictors of blood lead; soil lead often had a number of indirect influences. A limited number of covariates were also isolated as usually affecting the endogenous lead variables. The blood lead levels surveyed at the ore-processing sites were comparable to and actually somewhat lower than those reported in the Third National Health and Nutrition Examination Survey. Lessened bioavailability of the lead at certain of these sites is a probable reason for this finding.

558-0525. Fax: (513) 558-4838. E-mail: paul.succop@uc.edu
Abbreviations used: GM, geometric mean; mg/cm2, milligram per square centimeter; µg, microgram; µg/m2, microgram per square meter; µg/dl, microgram per deciliter; NHANES III, the Third National Health and Nutrition Examination Survey; ppm, parts per million.
outcomes between that which is caused by a direct influence (i.e., not mediated by other variables in the model) and that which is related through one or more indirect pathways (i.e., mediated by at least one other variable in the model). The structural equation approach allows researchers to postulate richer and more realistic models of environmental exposure, as not every possible predictor must be specified to have a direct influence on the final outcome (e.g., a measure of internal exposure such as children's measured blood lead).
These models are specified in a manner similar to ordinary regression models. However, because an entire causal chain of events or outcomes must be specified, additional thought must be given to the order in which these events take place (as reflected in changes/differences in the measured variables). In the models fit by researchers at the University of Cincinnati, the postulated chain of events is that of movement of lead from paint and soil to dust on floors (and/or other surfaces), to children's hands (and/or toys), and finally to their blood (Figure 1). The model shown in Figure 1 is a theoretical model that has served to guide the various statistical analyses. The direction of causality, of necessity, is imposed on the model's pathways, much as independent variables are chosen to be predictors of dependent variables, which are the outcome variables in either a structural equation or a multiple regression analysis. Choice of the direction of causality may be made in terms of one or more considerations, including timing of events (antecedents are expected to predict future outcomes), proximity (the model is built from distal to proximal sources or biologic outcomes), and amount of contamination (a sampled medium with larger amounts of lead might be expected to contaminate a sampled medium that demonstrated a lesser amount of lead). The cross-sectional model shown in Figure 1 was built primarily on proximity arguments. Site-specific differences (e.g., in the bioavailability of lead, the general condition of the housing stock, the amount of groundcover, or the type of soil) also have introduced variability into the relative contributions that each of these sources makes to predicting a child's blood lead.
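The causal chain described above can be illustrated as a set of ordinary regressions, one per endogenous variable. The following is a minimal sketch, not the authors' software or fitted model: the data are simulated, the variable names and path coefficients (0.6, 0.5, 0.4, 0.1) are hypothetical, and each equation is fit separately by least squares, which simplifies full structural equation estimation. It shows how an indirect effect of soil lead on blood lead can be obtained as the product of the coefficients along the soil, dust, hand, blood path.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical log-scale data following the chain soil -> dust -> hand -> blood.
# All coefficients below are made up for illustration only.
soil = rng.normal(6.0, 1.0, n)
dust = 1.0 + 0.6 * soil + rng.normal(0.0, 0.5, n)
hand = 0.2 + 0.5 * dust + rng.normal(0.0, 0.5, n)
blood = 0.1 + 0.4 * hand + 0.1 * soil + rng.normal(0.0, 0.3, n)


def ols(y, *predictors):
    """Return [intercept, slope_1, slope_2, ...] from a least squares fit."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# One regression per endogenous variable, ordered from distal to proximal.
_, soil_to_dust = ols(dust, soil)
_, dust_to_hand = ols(hand, dust)
_, hand_to_blood, soil_to_blood = ols(blood, hand, soil)

direct = soil_to_blood                                  # not mediated
indirect = soil_to_dust * dust_to_hand * hand_to_blood  # mediated by dust and hands
total = direct + indirect

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

In this toy chain the indirect effect is the only route by which dust influences blood lead, which mirrors the idea that not every predictor needs a direct pathway to the final outcome.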
In addition to the pathways between the lead sources and the outcome of blood lead, covariates (i.e., other measurements made at the site that may modify the associations shown in Figure 1) must be taken into account. Covariates that frequently are associated with children's blood lead levels include the family's socioeconomic status (income and/or education), the child's age, and the child's hand-to-mouth activity. House age and condition often are significant predictors of the lead source levels, in particular, the paint lead loading (measured in amount of lead per unit area, e.g., mg/cm2), or soil lead and dust lead loadings (measured in amount of lead per unit area, e.g., µg/m2) and concentrations (characterized as parts per million). Proportion of soil area with groundcover affects exterior dust lead levels, as well as soil lead levels in some studies. The covariates that are statistically significant are often not the same for each community, as the effects of these variables are more likely to be site- or population-specific than the endogenous pathways shown in Figure 1. The question that is addressed in this article is the degree of similarity in the results documented from a number of different sites in which community lead studies have been undertaken. The more similar the results, the more appropriate that a single model be used for scientific and regulatory purposes and to estimate the dose-response relationships in different communities.
Bingham Creek, Utah, 15 miles south of Salt Lake City, is heavily contaminated with mill tailings that were transported downstream from the copper and lead mines located 10 to 15 miles to the west of the most heavily contaminated area. Transport of tailings down the creek ceased in the mid-1930s. Recently, new middle income subdivisions have been built on lead-contaminated soils along a 10-mile stretch of the Bingham Creek flood plain (1).
Butte, Montana, population 39,000, is the site of extensive deep and open-pit copper mining. Some open-pit mining continues today. Smelters ceased operation by about 1910. There are extensive lead-containing mine waste dumps and tailings ponds throughout the community. Lead in paint and drinking water pose additional concerns (2).
The Cincinnati Lead in Children Study is a longitudinal study of the effects of lead on child development in a cohort of low-income families living in the inner city. Most housing, which was built in the late 19th century, is poorly maintained and contains large quantities of lead-based paint. Extensive environmental data on this cohort were collected between 1981 and 1986 (3,4). A cross-sectional sample of children that maximized the amount of nonmissing data and minimized the lag time between the measurement of the children's blood lead and the measurement of the lead associated with their residences was chosen for the purpose of this reanalysis.
The Cincinnati soil lead abatement project focused on lead exposure reduction in families residing in inner city apartment buildings that had been renovated and contained little or no lead-based paint. Primary sources of exposure were high levels of lead in street dust and soil in vacant lots and playgrounds (5-7). Only preintervention data collected for this study were used in this reanalysis.
Leadville, Colorado, population 3800, was home to very extensive lead mining, milling, and smelting operations. Most mines and smelters closed by 1970. One mine and mill remain in operation. Lead-based paint is a major source of exposure in over half the housing (8).
Magna, Utah, is located just south of a 6000-acre tailings pond and downwind from a very large active copper smelter about 15 miles west of Salt Lake City. Residential soils in a 15-square-block, low-income neighborhood closest to the smelter were evaluated and found to have acceptably low levels of lead. However, lead-based residential paint posed a considerable concern (9).
Midvale, Utah, population 12,000, is located about 12 miles south of Salt Lake City. It was home to a zinc/lead mill and smelter. The smelter closed in the mid-1950s and the mill ceased operation in the early 1970s. Residential soils contained high levels of zinc and lead. Homes built prior to 1950 contain high levels of lead in paint (10,11).
Palmerton, Pennsylvania, population 5400, is located about 100 miles northwest of Philadelphia. A large zinc smelter with plant sites on each side of the town operated for over 75 years in this valley. Residential soils contain high levels of lead and cadmium (12).
Sandy, Utah, a historic community about 15 miles south of Salt Lake City, was the site of several lead smelters at the turn of the century. Residential soils in a 15-square-block area around these former smelters contain lead and arsenic. Housing is old but well maintained (13).
Telluride, Colorado, population 1200, was a silver and lead mining and milling community at the turn of the century.
Trail, British Columbia, population 15,000, is located on the Columbia River due north of Spokane, Washington, and is home to one of the few remaining active lead and zinc smelters in North America. High lead emissions contaminated much of the residential area in the immediate vicinity of the plant. New technology has greatly reduced emissions but residual lead dust remains a problem (16).
Data collection in these studies was performed between 1981 and 1995, with most of the studies being performed since 1989. Environmental and blood collection procedures were standardized and all studies incorporated a rigorous quality control plan. Community geometric mean (GM) blood lead levels ranged between 2.6 µg/dl in Bingham Creek, Utah, and 12.9 µg/dl in the Cincinnati Lead in Children Study. Handwipe lead was highest in Trail, British Columbia (GM = 9.0 µg), and lowest in Bingham Creek (GM = 2.0 µg); however, handwipes were not collected in four of the studies. A complete comparison of lead exposures at the various sites is shown in Table 1. A summary of the number of elevated blood leads at various ore-processing sites is shown in Table 2. This table also includes data collected in the town of Aspen, Colorado, a site at which a small study was performed in 1996 (17). From this table, it can be seen that the blood lead levels at the ore-processing sites do not differ appreciably from those of children nationally, according to the Third National Health and Nutrition Examination Survey (NHANES III) National Blood Lead Survey (18), and actually are somewhat lower than the national average. Lessened bioavailability of the lead at certain of these sites is a probable reason for this finding.
The generic model for pathways from lead sources to children's blood leads (Figure 1) was applied to data from each of the 11 studies as closely as possible. Not all data were collected at all sites (e.g., handwipe lead), and the questionnaires used for collecting much of the covariate information did not become completely standardized until the later field studies undertaken in the western United States. Thus, some variables were not measured at certain sites.
Table footnotes: no data exists. aTabled values are geometric means except for "When housing was constructed," whose values are arithmetic mean year of construction, where available, or the modal epoch of construction if not (i.e., 1890-1940); and "Child's age," "Mouthing," "Residency length," and "Socioeconomic status," whose tabled values are arithmetic means. bPossible range = 10 (low score) to 40 (high score); constructed from 10 items each measured on a 4-point scale. cPossible range = 8 (low score) to 66 (high score).
In the analyses of each community's data, a strategy of backward elimination of insignificant pathways was employed. An initial model was specified, similar to that shown in Figure 1, that included all possible endogenous pathways and covariates and interaction variables postulated to affect one or more variables in the causal chain. The effects of insignificant interactions, i.e., cross-products formed by multiplying two or more explanatory variables in a given equation, were removed before the simple main effects from which they were formed. Insignificant effects were removed until all the remaining predictors in each structural equation were significant. At this stage, a single forward inclusion step was performed that allowed previously removed variables the opportunity to re-enter the structural equation. A final model was obtained when all effects in the model were significant and all excluded predictors were insignificant, as based on a test for inclusion in the final model. In all of these analyses, blood lead and environmental lead data were transformed to their natural logarithm to normalize the statistical distributions. An alpha level of 0.05 was used to judge statistical significance for each postulated pathway in each study.
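The selection strategy described above can be sketched for a single structural equation. This is an illustrative sketch under stated assumptions, not the authors' procedure or software: the data and variable names are simulated and hypothetical, interaction terms are omitted, and p-values use a normal approximation to the t distribution, which is reasonable only for large samples. Only the alpha level of 0.05 and the log transformation come from the text.

```python
import math
import numpy as np

ALPHA = 0.05  # significance level stated in the text


def p_values(X, y):
    """Two-sided p-values for each column of X in an OLS fit with intercept.

    Uses a normal approximation to the t distribution (an assumption
    made here to keep the sketch dependency-free).
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = resid @ resid / (n - A.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    z = np.abs(coef / se)[1:]  # skip the intercept
    return np.array([math.erfc(v / math.sqrt(2.0)) for v in z])


def select(data, y, names):
    """Backward elimination followed by a single forward inclusion step."""
    kept = list(names)
    # Backward elimination: drop the least significant predictor
    # until every remaining predictor is significant.
    while kept:
        p = p_values(np.column_stack([data[k] for k in kept]), y)
        worst = int(np.argmax(p))
        if p[worst] < ALPHA:
            break
        kept.pop(worst)
    # Forward step: each removed predictor gets one chance to re-enter.
    for name in names:
        if name in kept:
            continue
        trial = kept + [name]
        p = p_values(np.column_stack([data[k] for k in trial]), y)
        if p[-1] < ALPHA:
            kept = trial
    return kept


# Hypothetical data: variables are already on the natural log scale where relevant.
rng = np.random.default_rng(1)
n = 400
data = {
    "ln_floor_dust": rng.normal(5.0, 1.0, n),
    "ln_soil": rng.normal(6.0, 1.0, n),
    "child_age": rng.normal(36.0, 12.0, n),
    "unrelated": rng.normal(0.0, 1.0, n),
}
ln_blood = (0.5 + 0.4 * data["ln_floor_dust"] + 0.2 * data["ln_soil"]
            + rng.normal(0.0, 0.5, n))

selected = select(data, ln_blood, list(data))
print(selected)
```

Because each test is run at the 0.05 level, a predictor with no true effect is still retained occasionally, which is one reason infrequently observed pathways need to be checked site by site.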

Results
A summary of the significant pathways among the endogenous variables from structural equation models fit separately to each site's data is shown in Table 3. Exterior paint lead loading was found to be a significant predictor of soil lead concentration in 3 of 7 studies (43%) in which this pathway was tested. The concentration of entry dust lead was predicted significantly in 3 of 8 studies (37.5%) by exterior paint lead loading, in 1 of 8 studies (12.5%) by interior paint lead loading, and in 6 of 7 (86%) studies by soil lead concentration. Concentration of floor dust lead was significantly related to exterior paint lead loading in 2 of 9 studies (22%), to interior paint lead loading in 4 of 10 studies (40%), to soil lead concentration in 6 of 8 studies (75%), and to entry dust lead concentration in 7 of 10 studies (70%). Handwipe lead was predicted by interior paint lead loading in 2 of 6 studies (33%), by soil lead concentration in 2 of 3 studies (67%), by entry dust lead concentration in 1 of 6 studies (17%), and by floor dust lead concentration in 5 of 6 studies (83%). Finally, exterior paint lead loading was found to be a significant factor in explaining the variance in blood lead in 2 of 9 studies (22%). Interior paint lead loading contributed significantly in 1 of 10 studies (10%), soil lead concentration in 3 of 8 studies (37.5%), entry dust lead concentration in 1 of 10 studies (10%), floor dust lead concentration in 7 of 11 studies (64%), and handwipe lead in 4 of 6 studies (67%). These results are also shown in Figure 2, which depicts the relative importance of each of the postulated pathways by arrows of various widths (no arrow is shown for a pathway that was never found to be significant). The results shown in Table 3 and Figure 2 address the extent to which the various pathways postulated to exist by the theoretical structural equation model can be generalized to other sites. 
The strength of the individual pathways to affect blood lead can be demonstrated by estimating the simple regression relationship between the geometric mean blood lead and the geometric mean environmental lead, as observed from the data collected within each study. The results of this approach are summarized in Table 4. The strongest relationships were observed between blood lead and interior floor dust lead loading (R2 = 0.96, after excluding the Cincinnati, Ohio, soil lead abatement project and Trail, British Columbia, which were outliers) or handwipe lead (R2 = 0.90); the weakest were between blood lead and exterior or interior paint lead (R2 = 0.07). The regression models also predict that increasing exposure to lead is associated with elevated blood lead levels. If these models indeed reflect causal relationships, then diminishing the amount of lead in the environment will result in a lesser lead burden for children who, in the future, reside in these environments. The model for estimating blood lead from handwipe lead predicts that children whose handwipe lead differs by about 9 µg (from 10 to 1 µg) have a reduced blood lead of about 14 µg/dl (from approximately 15 to 1 µg/dl). Somewhat smaller reductions in blood lead would be expected for differences in dust lead or soil lead concentrations of 1000 ppm, from 1100 to 100 ppm (ranging between 6.5 µg/dl for dust and 2.2 µg/dl for soil); for differences in floor dust lead loadings of about 1000 µg/m2, from 1100 to 100 µg/m2 (approximately a 9.0 µg/dl decrease); or for a difference in paint lead of about 2.5 mg/cm2, from 3.0 to 0.5 mg/cm2 (ranging between 2.1 pg/dl for interior paint lead and 1.3 pg/dl for exterior paint lead).
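The site-level regression on geometric means, of the form ln(blood lead) = intercept + slope x ln(environmental lead), can be illustrated as follows. This is a minimal sketch with made-up numbers, not the Table 4 data: the four hypothetical site means are chosen to lie exactly on blood = 2 x hand^0.5 so that the fit is easy to check by hand, and the function name `predicted_decline` is introduced here for illustration.

```python
import numpy as np

# Hypothetical site-level geometric means (NOT the study data).
hand_gm = np.array([1.0, 4.0, 9.0, 16.0])   # handwipe lead, ug
blood_gm = np.array([2.0, 4.0, 6.0, 8.0])   # blood lead, ug/dl

# Fit ln(blood) = intercept + slope * ln(hand).
log_x, log_y = np.log(hand_gm), np.log(blood_gm)
slope, intercept = np.polyfit(log_x, log_y, 1)

# R^2 on the log scale.
fitted = intercept + slope * log_x
r2 = 1.0 - np.sum((log_y - fitted) ** 2) / np.sum((log_y - log_y.mean()) ** 2)


def predicted_decline(intercept, slope, high, low):
    """Difference in predicted blood lead between two exposure levels."""
    predict = lambda x: np.exp(intercept + slope * np.log(x))
    return float(predict(high) - predict(low))


decline = predicted_decline(intercept, slope, high=10.0, low=1.0)
print(f"slope={slope:.3f} R2={r2:.3f} decline={decline:.2f} ug/dl")
```

On the log scale the predicted decline depends on where in the exposure range the change occurs, not only on its size, so the same 9 µg difference would imply a different decline at higher handwipe levels.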
For comparison, a linear regression model was estimated for predicting each site's arithmetic mean blood lead from the arithmetic mean environmental lead. These results are shown in Table 5. The fit of the linear model to the data was generally somewhat poorer, with R2s varying between 0.02 for exterior paint lead and 0.90 for interior dust lead loading. The linear model also predicts a smaller decline in blood leads over the stated ranges. For example, the expected change in blood lead because of a change in interior dust loading from 1100 to 100 µg/m2 is less than 1 µg/dl given the specification of a linear model.
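One way to see why a linear specification can predict a smaller decline at the low end of the exposure range is to fit both functional forms to the same concave relationship. The sketch below uses hypothetical values lying on blood = 0.2 x dust^0.5, not the Table 4 or Table 5 data, and for simplicity fits both models to the same site summaries rather than distinguishing geometric from arithmetic means as the article does.

```python
import numpy as np

# Hypothetical site summaries (NOT the study data), lying on blood = 0.2 * sqrt(dust).
dust = np.array([100.0, 400.0, 900.0, 1600.0, 2500.0])  # dust lead loading, ug/m2
blood = np.array([2.0, 4.0, 6.0, 8.0, 10.0])            # blood lead, ug/dl

HIGH, LOW = 1100.0, 100.0  # exposure range used in the text

# Linear model: blood = intercept + slope * dust.
lin_slope, lin_intercept = np.polyfit(dust, blood, 1)
decline_linear = lin_slope * (HIGH - LOW)

# Log-log model: ln(blood) = intercept + slope * ln(dust).
log_slope, log_intercept = np.polyfit(np.log(dust), np.log(blood), 1)
decline_log = np.exp(log_intercept) * (HIGH ** log_slope - LOW ** log_slope)

print(f"linear decline:  {decline_linear:.2f} ug/dl")
print(f"log-log decline: {decline_log:.2f} ug/dl")
```

The straight line averages the steep low-exposure response with the flatter high-exposure response, so its single slope understates the change near the bottom of the range.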

Discussion
Despite variability among the sites in terms of lead bioavailability, type of site, and population sampled, origin of the lead currently found in the environment, and the exact study instruments used, a great deal of commonality is found among the results in these 11 studies. With some exceptions, soil lead and paint lead were found to be indirect influences on blood lead, usually operating through dust lead or through dust on the children's hands. Soil lead had numerous important indirect influences in these studies. However, these studies confirmed that the proximal influences of handwipe lead and interior floor dust lead tended to be the most important direct contributors to the blood lead of children under 6 years of age. This analysis confirms that enough commonality exists among the studies and the sampled populations that a single structural equation model is probably adequate. However, the model needs to be flexible in order to account for pathways that are infrequently observed to participate in the causal chain to elevated blood lead. For example, exterior paint, interior paint, soil lead, and entry dust lead directly affected blood lead in the studies done in Utah, where the bioavailability of the lead in the environment may have been low or the lead in the proximal environment did not tend to break down into smaller particles (and thus become part of normal house dust). Infrequently observed effects (representing either endogenous lead source variables or the covariates of lead exposure) must be ruled out on a site-by-site basis. The effects of covariates are likely to be site- or population-specific. A list of the covariates that have been most frequently observed to affect each endogenous variable in the structural equation model's causal chain is provided in Table 6. The individual reports provide the estimates of the effects on blood lead and environmental lead of each of the covariates studied at the various sites.
A simple regression model performed on the various studies' blood lead and environmental lead geometric means indicated that reducing handwipe lead and dust lead levels may result in significant declines in the blood lead of children who come to reside in these environments in the future. Reducing a child's handwipe lead level from 10 to 1 µg results in an expected decrease in a young future occupant's blood lead of about 14 µg/dl. It must be pointed out that these models are not corrected for covariates or the other measured lead sources and therefore are most useful in demonstrating the relative expected declines, not the absolute declines that might be obtained in reducing a single source of lead. Lead in the environment presents a multimedia multisource problem that is unlikely to be resolved by abating only one source or medium in a community. A comparable linear model for each site's arithmetic means provided a somewhat poorer fit to the data and also predicted a smaller decline in low-level blood leads.
Environmental Health Perspectives * Vol 106, Supplement 6 * December 1998
Table 4 footnotes: aThese are simple relationships unadjusted for covariates. bPredicted decline in blood lead for a reduction in hand lead from 10 to 1 µg; dust lead loading from 1100 to 100 µg/m2; dust lead or soil lead concentration from 1100 to 100 ppm; or paint lead loading from 3.0 to 0.5 mg/cm2, as calculated from the fitted linear regression equation: ln(blood lead) = intercept + slope x ln(environmental lead). cExcluding the Trail and Cincinnati soil project studies, which appear to be outliers. The exposure in these two studies appears to be primarily from exterior dust lead.
Table 5 footnotes: aThese are simple relationships unadjusted for covariates. bPredicted decline in blood lead for a reduction in hand lead from 10 to 1 µg; dust lead loading from 1100 to 100 µg/m2; dust lead or soil lead concentration from 1100 to 100 ppm; or paint lead loading from 3.0 to 0.5 mg/cm2, as calculated from the fitted linear regression equation: blood lead = intercept + slope x environmental lead. cExcluding the Trail and Cincinnati soil project studies, which appear to be outliers. The exposure in these two studies appears to be primarily from exterior dust lead.
In summary, the modeling technique used to fit the data collected in these studies provides an empirical method for predicting community lead exposure levels and the inherent variability about these levels. The accuracy of these predictions is crucial, not only because of scientific interest and medical concern, but also because such expectations often serve as the basis for regulation of lead levels in the environment.

Conclusions
The statistical technique of structural equation analysis is an extremely flexible method that may be used to model environmental source data linkages with human population exposure. Although a good deal of variability has been observed in lead exposure data collected at various former ore-processing and urban sites, the structural equation models tailored to each locale were similar to one another. These results occurred despite low levels of lead in blood at the ore-processing sites and obvious source and population differences. In conclusion, the technique of structural equation modeling is a powerful tool for helping resolve the complicated interactions between contamination in the environment and human uptake of these pollutants.