The role of genetic factors in autoimmune disease: implications for environmental research.

Studies in both humans and in animal models of specific disorders suggest that polymorphisms of multiple genes are involved in conferring either a predisposition to or protection from autoimmune diseases. Genes encoding polymorphic proteins that regulate immune responses or the rates and extent of metabolism of certain chemical structures have been the focus of much of the research regarding genetic susceptibility. We examine the type and strength of evidence concerning genetic factors and disease etiology, drawing examples from a number of autoimmune diseases. Twin studies of rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), type I diabetes, and multiple sclerosis (MS) indicate that disease concordance in monozygotic twins is 4 or more times higher than in dizygotic twins. Strong familial associations (odds ratio ranging from 5-10) are seen in studies of MS, type I diabetes, Graves disease, discoid lupus, and SLE. Familial association studies have also reported an increased risk of several systemic autoimmune diseases among relatives of patients with a systemic autoimmune disease. This association may reflect a common etiologic pathway with shared genetic or environmental influences among these diseases. Recent genomewide searches in RA, SLE, and MS provide evidence for multiple susceptibility genes involving major histocompatibility complex (MHC) and non-MHC loci; there is also evidence that many autoimmune diseases share a common set of susceptibility genes. The multifactorial nature of the genetic risk factors and the low penetrance of disease underscore the potential influence of environmental factors and gene-environment interactions on the etiology of autoimmune diseases.

http://ehpnet1.niehs.nih.gov/docs/1999/suppl-5/693-700cooper/abstract.html
Autoimmune diseases include a wide variety of conditions with differing clinical presentations, natural histories, and treatment options. A common underlying feature of both organ-specific and systemic forms of these diseases is that the immune system's ability to respond appropriately to self-tissues is altered, resulting in the production of B- and T-cell responses directed against self-antigens (autoimmunity). The mechanism through which autoimmunity progresses to produce pathology (an autoimmune disease) is not understood. Studies in both humans and in animal models of specific disorders suggest that polymorphisms of multiple genes are involved in conferring either a predisposition to or protection from autoimmune diseases (1,2). It is important to note that these are common genetic polymorphisms present in 5% or more of the population rather than rare disease-causing mutations such as those involved in cystic fibrosis or galactosemia. This suggests that these genes may have served some selective advantage during human evolution. The multifactorial nature of the genetic risk factors and the low penetrance of disease underscore the potential influence of environmental factors on etiology.
Recent reviews of genetic aspects of specific autoimmune diseases have been published (3-7). In this summary, we examine the type and strength of evidence concerning genetic factors and disease etiology, drawing examples from systemic (e.g., rheumatoid arthritis [RA], systemic lupus erythematosus [SLE]) and organ-specific (e.g., type I diabetes, multiple sclerosis [MS]) diseases. Our focus is on evidence from studies in humans, and a particular emphasis is placed on issues relating to gene-environment interaction and issues affecting the interpretation of evidence and design of future studies.

Potential Genetic Influences on Autoimmune Diseases
Evolutionary forces have resulted in the development in mammals of a complex array of genes beyond those encoding monomorphic proteins involved in basic metabolic processes found in prokaryotes and simpler vertebrates. Among the evolutionarily recent genes are those that encode either polymorphic proteins that regulate immune responses (immunogenetic loci) or the rates and extent of metabolism of certain chemical structures (pharmacogenetic loci). It is thought that environmental exposures, primarily in the forms of infections and toxic agents, have shaped the types and functions of this diverse array of genes. Presumably similar evolutionary forces have resulted in different distributions of polymorphisms in different ethnic groups. This creates significant challenges to the proper design of population-based studies requiring appropriately matched control groups.

Immune Regulation Genes (Immunogenetic Loci)
Immunogenetic loci encode the major histocompatibility complex (MHC) class I and II proteins, as well as complement components, immunoglobulins, cytokines/chemokines and their receptors, transporters associated with antigen processing genes, T-cell receptor genes, and minor histocompatibility markers. The MHC genes are located on chromosome 6 in humans, and the class I (A, B, C) and II (DR, DQ, DP) genes are highly polymorphic (Figure 1). Class I and II molecules (human leukocyte antigens [HLA]) comprise a light chain and a heavy chain that combine to form a peptide-binding site; the bound peptide is then presented to T-cell receptors. Differences in amino acid sequence can produce differences in the shape of the binding site and thus differences in binding affinity. Some of the alleles of the MHC (e.g., A1-B8-DR3) are in strong linkage disequilibrium (8). Linkage disequilibrium arises when alleles of different genes occur together more frequently than would be expected if random assortment were taking place during meiosis; the shorter the distance between genes on a chromosome, the greater the chance that linkage will occur. The class III MHC includes molecules involved in antigen recognition (heat-shock proteins), inflammatory responses (the complement proteins), and macrophage activation (tumor necrosis factor [TNF]) (9). Other cytokines that are not encoded by the MHC play important roles in stimulating T cells and B cells (e.g., interleukin [IL]-2, IL-6, IL-12, interferons) and therefore could be involved in autoimmune responses. Genetic variability in the structure of immunoglobulins (immunoglobulin allotypes) and T-cell receptors can also influence immune responsiveness to self-antigens and foreign antigens. Prolactin may also have important immune-modulating influences affecting the risk of autoimmune disease (10). The prolactin gene is located close to the MHC region of chromosome 6, and Brennan et al. (11) recently reported associations between genetic markers close to the prolactin gene in SLE patients who also had DRB1*0301 and in RA patients who had DRB1*0401. Thus, linkage disequilibrium may occur between the class I, class II, and class III genes of the MHC and also between the MHC genes and other nearby genes that are not directly involved in immune regulation. Linkage disequilibrium between TNF-α and DRB may explain the conflicting results from studies of TNF-α polymorphisms, DRB alleles, and either the incidence or clinical presentation of RA (12-15).

Figure 1. Genes of the human major histocompatibility complex, located on the short arm of chromosome 6. Regions labeled in the figure: class II (HLA), class III (non-HLA), and class I (HLA). Abbreviations: HSP, heat shock protein; PRL, prolactin; TAP, transporters associated with antigen processing; TNF, tumor necrosis factor. Class I (A, B, C) and class II (DR, DQ, DP) genes, shaded in grey, comprise the human leukocyte antigens. These molecules can be expressed on most cells, bind to peptides, and present the bound peptide to T-cell receptors. Other immune-regulating genes are shaded white: TAP, complement components (C4B, C4A, Bf, C2), HSP, TNF. Genes with nonimmunologic functions are shaded in black: collagen (COL11A2), 21-hydroxylase (CYP21 and CYP21P), and PRL.
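The definition of linkage disequilibrium given above (alleles of different genes occurring together more often than expected under random assortment) can be expressed numerically. The following is a minimal sketch, not a method taken from the studies cited here: the function name and the example frequencies are hypothetical, and it uses the standard coefficients D, D', and r² for two biallelic loci.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> dict:
    """Return D, D' and r^2 for two biallelic loci.

    p_ab : frequency of the haplotype carrying both allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    All inputs are hypothetical illustration values, not data from the text.
    """
    for p in (p_ab, p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")
    if not 0.0 < p_a < 1.0 or not 0.0 < p_b < 1.0:
        raise ValueError("both loci must be polymorphic")

    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two alleles assorted independently.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest value it could take for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two alleles.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return {"D": d, "D_prime": d_prime, "r2": r2}


# Hypothetical example: two alleles, each at 10% frequency, found together on 8% of haplotypes.
result = linkage_disequilibrium(p_ab=0.08, p_a=0.10, p_b=0.10)
```

A value of D near zero corresponds to random assortment; strong positive values, as in this made-up example, are the situation in which the effect of one allele cannot easily be separated from that of its neighbor.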

Metabolism Genes
The metabolism of drugs, chemicals, and dietary constituents can require several different steps involving oxidation (sometimes referred to as phase I) and conjugation of oxygenated (electrophilic) intermediaries into hydrophilic compounds that are more easily excreted (phase II). Oxidation enzymes include cytochrome P450 enzymes (e.g., CYP1A1, encoding aryl hydrocarbon hydroxylase, and CYP2D6, encoding debrisoquine hydroxylase), myeloperoxidase, alcohol dehydrogenase, and aldehyde dehydrogenase. Glutathione S-transferase, epoxide hydrolase, sulfotransferase, and N-acetyltransferase (NAT) can act as phase II enzymes (16-18). The liver is the primary site of metabolism of drugs and other compounds, but additional steps can occur in the bladder, lung, colon, and other tissues. One of the isoforms of NAT, NAT-1, is present in leukocytes (19), and myeloperoxidase is present in neutrophils (20). The toxicologic or carcinogenic activity of the metabolites along a pathway varies; conjugated compounds are generally but not always less reactive. Polymorphisms in many of these enzyme-encoding genes have been reported. These polymorphisms produce relatively slow and fast metabolism phenotypes, which lead to differences in exposure to the parent compound and to specific metabolites. Polymorphisms in receptor genes such as the aromatic hydrocarbon receptor gene may also influence metabolic activity (17).
Much of the work with respect to metabolism has focused on drug-induced lupus. Drug-induced lupus shares some of the clinical and autoantibody features of SLE but differs in other respects. These syndromes are unintended outcomes of many commonly used drugs such as procainamide, hydralazine, isoniazid, and penicillamine (21,22). One important aspect of drug-induced lupus is that the condition most often resolves after the medication is discontinued (23). N-Acetyltransferase activity has been associated with the development of drug-induced lupus, with slow acetylation conferring higher risk for developing specific autoantibodies or other features of this condition (24,25).
Studies of NAT activity in idiopathic (non-drug-induced) SLE have found little evidence of an association (26-29). These studies used a phenotypic assessment of NAT activity based on a dapsone challenge rather than polymerase chain reaction-based techniques that can identify the genotype for each of two NAT isoforms (NAT-1 and NAT-2). Only one of these studies also assessed exposure to aromatic amines (from dark hair dyes and from smoking) (29).
The cytochrome P450 enzyme system is involved in the conversion of cholesterol into the various metabolites of testosterone and estradiol (30,31). The microsomal enzyme aromatase converts androgens to estrogens (androstenedione to estrone and testosterone to estradiol) and is encoded by CYP19 (Figure 2). The C2-hydroxylation and C16α-hydroxylation of estrone involve other P450-mediated pathways (CYP1A2, possibly CYP3A4), and the C16α-hydroxylated compounds have greater estrogenic potential than the catechol metabolites (31,32).
It is important to note that there are multiple steps in most metabolic processes that involve different enzymes and different genes. Determining the overall significance of variation of one enzyme in a system requires consideration of all the steps, particularly with respect to possible rate-limiting steps, in the pathway. There may be factors that affect variation within pathways (inducibility of enzymes), and there also may be competing pathways involving different enzymes. Thus, metabolism of exogenous and endogenous compounds is an important potential source of variability in risk for autoimmune diseases, but its full importance is not yet understood.

Evidence for Genetic Factors in the Etiology of Autoimmune Diseases

Twin Studies
Studies of concordance in monozygotic (MZ) and dizygotic (DZ) twins provide one line of evidence concerning the contribution of genetics to the onset, presentation, or severity of disease. Table 1 summarizes twin studies for several autoimmune diseases (36-52). For each, disease concordance in MZ twins is much higher than in DZ twins. MZ twins are genetically identical, but DZ twins share on average only 50% of their genes; thus, the greater disease concordance in MZ twins suggests a strong influence of genetic factors on disease susceptibility. It should be noted, however, that stochastic events and environmental exposures alter the immune system and its response over a lifetime, so that MZ twins are not identical for long in terms of their specific immunocyte distributions and receptors. The concordance in MZ twins is higher for type I diabetes (mean pairwise concordance across studies, 30.1%) than for SLE, MS, Graves disease, or RA. This pattern may reflect a greater role of genetic susceptibility in early-onset compared with later-onset diseases. However, twins are more likely to share environmental exposures (e.g., diet, infectious diseases) as children than as adults. The higher concordance among DZ twins for type I diabetes (6.8% for pairwise concordance) than for later-onset diseases (< 3.5%) may also reflect the influence of shared environment.
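For readers less familiar with concordance measures, the following minimal sketch shows how a pairwise concordance rate is computed from counts of twin pairs, together with the probandwise rate that some twin studies report instead. The function names and the pair counts in the test values are hypothetical; the only figures taken from the text are the type I diabetes pairwise concordances (30.1% in MZ and 6.8% in DZ twins).

```python
def pairwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Proportion of pairs with at least one affected twin in which both twins are affected."""
    total = concordant_pairs + discordant_pairs
    if total <= 0:
        raise ValueError("at least one affected pair is required")
    return concordant_pairs / total


def probandwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Probability that the co-twin of an affected twin is also affected.

    Assumes complete ascertainment, so that both members of a concordant
    pair count as probands (an assumption of this sketch).
    """
    denominator = 2 * concordant_pairs + discordant_pairs
    if denominator <= 0:
        raise ValueError("at least one affected pair is required")
    return 2 * concordant_pairs / denominator


# Ratio of MZ to DZ pairwise concordance for type I diabetes,
# using the two percentages quoted in the text.
mz_dz_ratio = 30.1 / 6.8
```

With the figures quoted above, the MZ:DZ ratio for type I diabetes is roughly 4.4, in line with the statement in the abstract that MZ concordance is four or more times the DZ concordance.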

Familial Association Studies
Several studies have compared disease incidence or prevalence among relatives of patients with autoimmune diseases to the disease frequency among relatives of a selected control group or to estimates from the general population. Table 2 summarizes data from studies of first-degree relatives (i.e., parents, siblings, and children) (53-69). MS shows a strong familial association. Although the risk of MS occurring in first-degree relatives of MS patients is low (< 5%), it is much higher than that in the general population (< 0.5%). Strong associations (odds ratios ranging from 5 to 10) are also seen in studies of type I diabetes, Graves disease, discoid lupus, and SLE. The familial association with RA is weaker (odds ratio < 2) in two of the studies that validated the RA diagnosis of relatives by physical examination or medical record review. The validation of diagnosis is important, as the false-positive reporting of a history of RA may be high (> 50%) for both self-reports (70) and proxy reports (71).
Several studies also examined the familial association with other diseases (both autoimmune and nonautoimmune diseases) (Table 2). There is some evidence for a weak association (odds ratio approximately 2.0) between type I diabetes and a history of type II diabetes in first-degree relatives. It may be difficult to correctly classify diabetes type on the basis of limited questionnaire data, however, so this association may reflect misclassification. Although no familial association with other autoimmune diseases was reported in a study of MS patients (56), four studies of relatives of patients with a systemic autoimmune disease [...] (68). The other examined family history of autoimmune diseases among patients with multiple myeloma; the odds ratio for any reported autoimmune disease (e.g., RA, SLE, and pernicious anemia) was 3.0 (95% CI, 1.3-7.1) (72). These observations raise the possibility of common pathogenic mechanisms involving cancer and autoimmune diseases, such as dysregulation of apoptosis and detoxification pathways.
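The odds ratios and confidence intervals quoted in this section come from two-by-two comparisons of family history in patients and controls. As an illustration only, the sketch below computes an odds ratio with an approximate 95% confidence interval by the logit (Woolf) method; the cited studies may have used other estimators, and the counts shown are hypothetical rather than taken from Table 2.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and approximate confidence interval from a 2x2 table.

    a : cases with a positive family history
    b : cases without a positive family history
    c : controls with a positive family history
    d : controls without a positive family history
    z : normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("the logit method needs all four cells to be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 cases and 10 of 100 controls report an affected relative.
estimate, lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

In this made-up example the point estimate is 2.25 but the interval includes 1.0, which illustrates why weak familial associations based on small numbers are difficult to interpret.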

Gene Association Studies
Gene association studies compare the frequency of a specified genetic marker (measured through either phenotypic assays or genotyping) in patients and in a control group. Much of the work with respect to autoimmune diseases has focused on the MHC genes. One complication in the design and interpretation of these studies, however, is the degree of ethnic variability in the prevalence of specific MHC alleles. Ethnicity in this context refers not to broad racial groups but rather to much smaller groups defined by specific historical, migration, and sociocultural patterns. This is particularly problematic in geographic areas that have been the destination of significant immigration. Another complication is the degree of linkage disequilibrium between the genes of the MHC, which may obscure the identification of the effects of specific genes, particularly in early studies that relied on serologic measures of antigens (e.g., DR2 or DR3). Examples of gene association studies in SLE that address ethnicity are shown in Table 3 (73-78). The studies by Schur et al. (75) and Goldstein and Sengar (76) analyzed ethnic groups within broad racial categories (e.g., French Canadian and non-French Canadian). Both reported evidence of different associations among different ethnic groups, although the small sample size in the Goldstein and Sengar study resulted in variable estimates that make it difficult to definitively interpret the observed differences.
The selection of controls in population-based gene association studies is very important, as it may be difficult to adequately account for genetic admixture of the population. Alternative designs such as gene association analyses using case-parent triads avoid this problem and do not require assumptions about type of inheritance or disease penetrance (8,79) that are needed in other analytic approaches.

Footnotes to Table 3: (a) [...] twice the number of patients. (b) Allele frequency, based on phenotypic measurement; total number used in calculations is total number of haplotypes. (c) Allotype frequency, based on phenotypic measurement; total number used in calculations is number of patients. (d) Serologic measurement of DR3; total number used in calculations is number of patients. (e) DR3(17) specificity based on analysis of restriction fragment length polymorphisms; total number used in calculations is number of patients. (f) DR3*0301 allele frequency; total number used in calculations is number of patients. (g) Frequency of TNF-α-238, the G-to-A substitution at position -238 of the promoter region of TNF-α, which results in the TNF-A variant. (h) Frequency of TNF-α-308, the G-to-A substitution at position -308 of the promoter region of TNF-α, which results in the TNF-2 variant.

Pedigree Studies: Segregation Analysis, Linkage Analysis, and Genome Scans
Segregation analysis is the first step in identifying the relation between an individual's genotype and the resulting phenotype (80). Using appropriate statistical methods, one compares the inheritance of the disease within families with that expected under specific models. The models may evaluate a) whether there is a single major gene responsible for the autoimmune disease, b) whether the susceptibility to the disease is controlled by many genes (polygenic inheritance), and c) the environmental transmission model. The model that is most compatible with the observed family data is adopted. Identification of a major gene does not mean that it is the only gene responsible for the disease; rather, its effect is large enough to be discernible from those of the other genes implicated in the etiology of the disease.

Recently, more powerful methods of segregation analysis, called complex segregation analyses, have been developed (81). These can be applied to both quantitative and qualitative traits and can elucidate complex patterns of genetic/environmental transmission. An early segregation analysis involving 18 selected kindreds suggested that autoimmunity is controlled by a single autosomal dominant gene (82). The postulated major autoimmune gene has not been mapped, but in two studies of familial patterns of autoimmune diseases, linkage to HLA or genetic markers of human immunoglobulin gamma or kappa chain (GM and KM) allotypes was excluded (82-84). Recent investigations are more consistent with the belief that autoimmunity is polygenic (1). Development of specific autoantibodies or an autoimmune disease may depend on the epistatic interactions of autoimmunity-predisposing genes and environmental factors.
Linkage implies cosegregation of alleles at two different loci. Linkage of a marker locus and a disease provides much stronger evidence for a substantial genetic component in the etiology of the disease than that provided by segregation analysis. It is important to remember that the association analysis discussed in the previous section specifies a relationship between an allele and a disease, whereas linkage denotes a close physical localization of a marker locus and the putative locus for the disease. Loci are linked but not their alleles (unless there is linkage disequilibrium/allelic association). In other words, the marker allele segregating with the disease may be different in different families.
Different analytical strategies have been developed that take advantage of information provided by genetic markers in families of individuals affected by autoimmune diseases. These linkage approaches include the log-odds (LOD) score method, the affected-sib-pair method, the affected-pedigree-member method, the variance component method, and linkage disequilibrium-based approaches (80,85-87). Each of these has intrinsic advantages and disadvantages, and the choice best suited for a given study depends upon many factors, including the epidemiology of the disorder investigated, the state of knowledge about the nature and frequencies of genetic risk factors and linked loci, and the resources available. All of these approaches are efficient in defining genetic linkages for well-defined monogenic traits when large numbers of individuals are available for analyses. They have limitations, however, when applied to rare, complex disorders in which multiple genes or gene-environment interactions are likely to play pathogenic roles. It has been suggested that family-based association studies (e.g., case-parent triad designs) employing a large number of candidate genes are more powerful than other approaches in dissecting the genetic contribution to such disorders (88).
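To make the log-odds score method concrete, the sketch below computes a two-point LOD score for the simplest textbook case, in which every meiosis is fully informative and phase is known. This is an illustrative assumption; real pedigree analyses of the kind cited here require dedicated linkage software and a specified disease model. The function names and the counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10[ L(theta) / L(0.5) ], where
    L(theta) = theta**recombinants * (1 - theta)**(meioses - recombinants)
    and theta = 0.5 corresponds to no linkage.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and the number of meioses")
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    nonrecombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def best_theta(recombinants: int, meioses: int) -> float:
    """Recombination fraction on a 0.01-0.50 grid that maximizes the LOD score."""
    grid = [i / 100 for i in range(1, 51)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))


# Hypothetical data: 1 recombinant among 10 informative meioses.
peak_theta = best_theta(1, 10)
peak_lod = lod_score(1, 10, peak_theta)
```

With these made-up counts the score peaks at a recombination fraction of 0.10 with a LOD of about 1.6, below the conventional threshold of 3 for declaring linkage, which illustrates why large numbers of informative families are needed.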
Until recently, most linkage and association analyses in autoimmune diseases studied candidate genes coding for HLA, the T-cell receptor, GM and KM allotypes, the complement components, and other relevant proteins. Now, with the availability of very polymorphic microsatellite markers scattered throughout the genome, one can search the entire genome for autoimmunity genes without knowledge of their mode of inheritance or function. Genomewide searches in RA, SLE, and MS provide evidence for multiple susceptibility genes involving MHC and non-MHC loci (89-93). A recent review compared the linkage results from 23 published genomewide scans of human and animal-model autoimmune or immune-mediated diseases (1). This review found that approximately 65% of the human positive linkages map nonrandomly into 18 distinct clusters, and these susceptibility loci overlap with those from animal models. These nonrandom clusterings suggest that many autoimmune diseases share a common set of susceptibility genes, reminiscent of findings from earlier studies involving kindreds with multiple autoimmune diseases (82,83).

Studies of Gene-Environment Interactions in the Etiology of Autoimmune Diseases
There are several examples of environmental exposures that are involved in the etiology of specific autoimmune diseases. These include lupus induced by medications (e.g., hydralazine, procainamide) (94-97), toxic oil disease and contaminated rapeseed oil (98), eosinophilia-myalgia syndrome and L-tryptophan (99-101), and Lyme disease and the spirochete Borrelia burgdorferi (102,103). Several studies have examined immunogenetic susceptibility factors that may influence the development or severity of disease among exposed individuals (Table 4). As noted in the discussion of gene association studies, the issues of linkage disequilibrium and control selection make it difficult to interpret many of these studies. (In most studies, little information was provided concerning the source of controls other than that they were normal or healthy.) An exception is the recent study by Ruberti et al. (103) examining specific DP, DQ, and DR alleles in relation to the development of long-term arthritis in Lyme disease. This is similar to the approach used by Richeldi et al. (104) in the analysis of genetic risk factors for chronic beryllium disease, an immune-mediated inflammatory lung disease caused by occupational exposure to beryllium. In their analysis of specific DP alleles in patients and controls, an association with alleles coding for glutamate in position 69 of the DP-β1 chain was seen among workers with high and with low levels of beryllium exposure (105). This example illustrates the need for genetic studies based on functional analyses of allelic variation rather than on antigenic phenotype. Ou et al. (106) recently proposed a classification scheme for DR alleles based on function. Ottman (107) described a framework for studies of gene-environment interaction that could address different types of relations between genotype and environmental exposures. This approach to conceptualization, implementation, and analysis could contribute to future studies of gene-environment interactions in autoimmune diseases.
There are many difficulties in studies of the role of genetic factors in autoimmune diseases. The polymorphisms in the genes we have discussed are relatively common: 45% of Caucasians may have the slow acetylation NAT genotype (16), and 10-25% of the population may carry higher risk MHC genotypes (for example, see Table 3). However, the prevalence of any of the specific autoimmune diseases is very low (approximately 1 per 100 for RA and MS, 1 per 1,000 for SLE). In this type of low-penetrance situation, it is necessary to consider multigene and gene-environment interactions.
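The low-penetrance argument can be illustrated with simple arithmetic. By Bayes' rule, the probability of disease among carriers of a genotype equals the carrier fraction among patients multiplied by the disease prevalence and divided by the genotype frequency, so the ratio of prevalence to genotype frequency is an upper bound on penetrance. The sketch below is only an illustration: the function name is hypothetical, and pairing a particular genotype frequency with a particular disease is done for arithmetic purposes, not as a claim that the genotype is associated with that disease.

```python
def max_penetrance(prevalence: float, genotype_frequency: float,
                   carrier_fraction_among_cases: float = 1.0) -> float:
    """Probability of disease among carriers, P(disease | genotype).

    With carrier_fraction_among_cases left at 1.0 (every patient carries
    the genotype), the result is the largest penetrance compatible with
    the stated prevalence and genotype frequency.
    """
    if not 0.0 < genotype_frequency <= 1.0:
        raise ValueError("genotype_frequency must be in (0, 1]")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be in [0, 1]")
    if not 0.0 <= carrier_fraction_among_cases <= 1.0:
        raise ValueError("carrier_fraction_among_cases must be in [0, 1]")
    return carrier_fraction_among_cases * prevalence / genotype_frequency


# Figures quoted in the text: a genotype carried by 45% of the population
# and a disease with a prevalence of 1 per 1,000.
bound_common_genotype = max_penetrance(prevalence=0.001, genotype_frequency=0.45)

# A genotype carried by 10% of the population and a disease with a prevalence of 1 per 100.
bound_mhc_genotype = max_penetrance(prevalence=0.01, genotype_frequency=0.10)
```

Even under the most favorable assumption, the first bound is about 0.2% and the second is 10%, so most carriers of a common susceptibility genotype never develop the disease, which is the reason additional genes and environmental exposures must be considered.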
Within a clinically defined autoimmune disease, there may be several different etiologic pathways. There may also be common etiologies between different autoimmune diseases (Figure 3). The same environment (or constellation of environmental exposures) operating on different genetic profiles may result in different physiologic responses and clinical conditions. It is also possible that the same clinical condition could result from different environmental exposures operating on either the same or different genetic backgrounds. Thus, the idea that a specific exposure (i.e., either a genetic or an environmental risk factor) will lead to a specific response is not necessarily true.
There has recently been a great deal of interest in methodologic issues concerning the study of gene-environment interactions involving case-control, case-only, and family-based designs (108,109). Power and sample-size estimates for various designs have been published (110-113). Examples of this approach have involved studies of smoking, NAT, and bladder cancer (114); alcohol, alcohol dehydrogenase, and oral cancer (115); and maternal smoking, transforming growth factor alpha, and cleft palate (116). An important assumption in these designs is that environment is independent of genotype, that is, exposure level or opportunity is not influenced by the genetic factor being studied. This assumption is not always true and must be assessed within the context of any proposed study (117).
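The role of the independence assumption can be seen in the case-only design, in which the cross-product ratio of genotype by exposure among patients alone is used to estimate multiplicative interaction. The sketch below is a simplified illustration with hypothetical function names and parameter values, not an analysis from the cited studies: it builds the expected distribution of cases when genotype and exposure are independent in the source population and shows that the case-only odds ratio then recovers the interaction factor.

```python
def case_only_interaction_or(g1_e1: float, g1_e0: float,
                             g0_e1: float, g0_e0: float) -> float:
    """Cross-product ratio of genotype by exposure among cases only.

    g1_e1 : cases with the genotype and the exposure
    g1_e0 : cases with the genotype, unexposed
    g0_e1 : cases without the genotype, exposed
    g0_e0 : cases with neither
    """
    if min(g1_e1, g1_e0, g0_e1, g0_e0) <= 0:
        raise ValueError("all four cells must be positive")
    return (g1_e1 * g0_e0) / (g1_e0 * g0_e1)


def expected_case_fractions(p_g: float, p_e: float, rr_g: float,
                            rr_e: float, rr_interaction: float) -> dict:
    """Expected share of cases in each genotype-by-exposure cell.

    Assumes genotype (frequency p_g) and exposure (frequency p_e) are
    independent in the source population, and that risks multiply:
    rr_g for genotype alone, rr_e for exposure alone, and
    rr_g * rr_e * rr_interaction when both are present.
    """
    weights = {
        (1, 1): p_g * p_e * rr_g * rr_e * rr_interaction,
        (1, 0): p_g * (1 - p_e) * rr_g,
        (0, 1): (1 - p_g) * p_e * rr_e,
        (0, 0): (1 - p_g) * (1 - p_e),
    }
    total = sum(weights.values())
    return {cell: weight / total for cell, weight in weights.items()}


# Hypothetical values: genotype frequency 0.45, exposure frequency 0.30,
# main-effect relative risks 1.5 and 2.0, interaction factor 3.0.
cells = expected_case_fractions(0.45, 0.30, 1.5, 2.0, 3.0)
interaction_estimate = case_only_interaction_or(
    cells[(1, 1)], cells[(1, 0)], cells[(0, 1)], cells[(0, 0)]
)
```

Under independence the estimate equals the interaction factor of 3.0. If carriers of the genotype were more or less likely to be exposed, the same ratio would also reflect that genotype-exposure association, which is why the assumption has to be checked for each proposed study.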
The past two decades have brought considerable understanding of the role of genetics in autoimmune diseases. Progress in this field has depended upon the development of more refined measurement tools, from serologic or other phenotypic assessments to DNA sequencing. Within the context of autoimmune diseases, the environment side of gene-environment interactions has received significantly less attention. Thus, to fully understand the complex etiology of these diseases, it is important to develop and apply appropriately refined measures for environmental exposures in study designs that allow examination of gene-environment interactions.