Teratological research using in vitro systems. V. Nonmammalian model systems.

In this review of alternative tests to whole-animal rodent studies, the use of sub-mammalian and sub-vertebrate systems is investigated. The history, methodology, known limitations, end points, dose response, and requirements of virus, hydra, planarian, cricket, fish, amphibia, Drosophila, and chicken embryo systems are discussed.


Introduction
Recently viruses have been considered as one of a number of simple, rapid, and quantitative biological screens for teratogens (1). This screen, like several other model systems, is still being developed, and has a limited history. Valerianov et al. (2) examined five pesticides by induction of the prophage of the lysogenic bacteria Bacillus anthracis STI-I (No. 2) and stated that the system was convenient for the testing of teratogenic action. Heinemann (3) suggested the use of prophage induction in lysogenic bacteria as a method for detecting potential teratogenic agents. Keller and Smith (1) reported that the test, when applied to 51 teratogenic and nonteratogenic compounds at multiple dose levels, predicted 86% of the teratogens (36 of 42) and eight of the nine nonteratogens.
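The counts reported for Keller and Smith can be restated as simple rates. The short Python sketch below recomputes them from the numbers in the text; the labels `sensitivity`, `specificity`, and `accuracy` are terminology assumed here for illustration and do not come from the original report.

```python
# Counts as given in the text for Keller and Smith (1).
teratogens_detected, teratogens_total = 36, 42
nonteratogens_cleared, nonteratogens_total = 8, 9

# Fraction of known teratogens that the assay flagged.
sensitivity = teratogens_detected / teratogens_total
# Fraction of known nonteratogens that the assay left unflagged.
specificity = nonteratogens_cleared / nonteratogens_total
# Fraction of all 51 compounds classified correctly.
accuracy = (teratogens_detected + nonteratogens_cleared) / (
    teratogens_total + nonteratogens_total
)

print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"accuracy:    {accuracy:.1%}")
```

On these counts the assay identified about 86% of the teratogens and about 89% of the nonteratogens, or 44 of the 51 compounds overall.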

Methodology
The virus assay is based on the ability of a primate-derived cell culture to support infection by vaccinia virus. Methods and materials have been described by Keller and Smith (1). Vaccinia-infected cell monolayers are exposed to the test compound in serial dilutions for 24 hr. The cells are collected, centrifuged, and serially diluted, and aliquots from each dilution are used to infect cell monolayers. The monolayers are covered with medium and agar and are then incubated at 37°C in a carbon dioxide incubator for 36 to 48 hr. The resultant plaques are visualized by staining the monolayers. The endpoint is the number of active progeny virions released from an infected cell that has been treated with a teratogenic agent. Control cell cultures consist of untreated but infected cells, based on the rationale that the virus will undergo reproduction only if allowed to infect a cell that is already actively proliferating. Since the virions take over the cell's biochemical machinery, the number of virions produced is a very sensitive indicator of the cell's health. Any teratogen that inhibits cell proliferation or disturbs cell metabolism in subtle ways that may not be immediately visible will cause a change in the virions produced.
*Division of Toxicology, Food and Drug Administration, Washington, DC 20204.
The specific sequence of steps that the virus undergoes may be considered as a developmental pathway. Each stage of viral development has been well described by Moss (4). The end point measured reflects direct interference with the virion's ability to uncoat itself in the cell's cytoplasm, translate its message, replicate its DNA, synthesize its protein core or lipoprotein membrane, and complete its morphogenetic pattern. The endpoint thus tests two specific parts of embryogenesis: the proliferative and metabolic state of the host cell, and the cell's ability to support a specific genetic and molecular pathway that may be analogous to differentiation.
Vaccinia and the mammalian cells used in the developmental phase of the study by Keller and Smith were both obtained from primate sources. If the assay were widely used, commercial sources would likely be available.
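The plaque counts obtained from the serially diluted samples are conventionally converted to a virus titer in plaque-forming units (PFU) per millilitre. The sketch below shows that standard conversion; the function name, the plated volume, and the example counts are hypothetical and are not taken from Keller and Smith (1).

```python
def titer_pfu_per_ml(plaque_count: int, dilution: float, volume_ml: float) -> float:
    """Convert a plaque count to a titer in PFU/mL.

    dilution  -- fraction of the original sample in the plated aliquot
                 (for example 1e-5 for a 1:100,000 dilution)
    volume_ml -- volume of diluted sample applied to the monolayer
    """
    if plaque_count < 0 or dilution <= 0 or volume_ml <= 0:
        raise ValueError("count must be >= 0; dilution and volume must be > 0")
    return plaque_count / (dilution * volume_ml)


# Hypothetical example: 42 plaques from 0.1 mL of a 1e-5 dilution.
treated = titer_pfu_per_ml(42, 1e-5, 0.1)
print(f"{treated:.2e} PFU/mL")
```

Comparing the titer of a treated culture with that of the untreated, infected control gives the change in progeny virions that the assay uses as its end point.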

Critical Review
This assay should be considered for use as a first screen after further evaluation. Both teratogenic and nonteratogenic compounds need to be evaluated to validate the test.
Types of Compounds That Can Be Tested. At present, there is insufficient information about the virus assay to determine if the types of compounds that can be tested are limited. The assay has not been tested for mixtures of chemicals.
End Points and Dose Response. A dose-response relationship does not appear possible, as the RD50 is used (50% inhibition/stimulation of the number of virions when compared to a nontreated, infected culture). Death of the cell is the only end point evaluated. At present, there is no way to predict stage-specific responses, nor is there any evidence concerning the repair of teratogen-induced anomalies. Both the cell and the virion have their own endogenous activation system, and the virions take over the cell's system.
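The text defines the RD50 only as the point of 50% inhibition or stimulation relative to an untreated, infected culture. One way such a value might be estimated from a dilution series is sketched below. The log-linear interpolation between bracketing concentrations is an assumption of this sketch, not a procedure attributed to Keller and Smith (1); the function names and example values are hypothetical, and only the inhibition case is handled.

```python
import math


def percent_of_control(treated_yield: float, control_yield: float) -> float:
    """Virus yield of a treated culture as a percentage of the control yield."""
    return 100.0 * treated_yield / control_yield


def rd50_inhibition(concentrations, yields, control_yield):
    """Interpolate the concentration giving 50% of control yield.

    Returns None when no pair of adjacent concentrations brackets 50%.
    """
    pairs = sorted(zip(concentrations, yields))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(pairs, pairs[1:]):
        p_lo = percent_of_control(y_lo, control_yield)
        p_hi = percent_of_control(y_hi, control_yield)
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            # Position of the 50% point between the two bracketing doses.
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical yields (arbitrary units) at two concentrations.
print(rd50_inhibition([1.0, 100.0], [80.0, 20.0], control_yield=100.0))
```

A stimulation RD50 would need the mirror-image test (yield rising through 150% of control, for instance), which is left out here because the source does not specify how stimulation is scored.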
According to Keller and Smith (1), 33 of 42 teratogens inhibited the virus, and eight of nine nonteratogens had no effect on virus proliferation. Information is not sufficient to determine whether the results are reproducible.
Time, Personnel, and Cost Requirements. Although the system is described as simple, a considerable amount of equipment is necessary, and there could be major expenses in setting up the laboratory. In brief, a tissue culture laboratory must be set up including at least the following items: sterile hood, carbon dioxide incubator, inverted microscope, gas tanks and regulators, -80°C freezer (to store extra cells in case of contamination), and appropriate medium, filters, tubes, and plates. Presumably, the assay can be performed by a technician trained in cell culture and familiar with aseptic technique.
The assay time for one compound is 3 days. System contamination is not considered a problem; if contamination occurs, the system is sterilized and backup cells are taken from the freezer. Because vaccinia virus has no known host, it can be studied in a laboratory doing normal tissue culture work.

Hydra

Introduction
Adult hydra and artificial embryos composed of randomly reaggregated cells of dissociated hydra have been proposed by Johnson and colleagues for the detection of teratogenic compounds. The first results from the assay were published in 1980 (5). In the few years since then, the original technique has been modified, the protocol and procedures have been defined, and the routine dose-response relationship has been refined (6)(7)(8). A technique similar to that of Johnson and colleagues was published by Kudia (9).
Many genera and species of hydroids live in fresh or salt water. They share with planarians a common capability for total body regeneration from parts. In addition, they can regenerate a whole body from dissociated cells that have been randomly reassociated into pellets (10).
According to Johnson and Gabel (7), when the currently accepted mammalian tests are performed, two basic factors are explored in the analysis of the results:
1. The relationship of the doses required to affect the adult and the developing embryo. Testing protocols for mammalian teratology tests, such as those proposed by the Interagency Regulatory Liaison Group, currently require that at least three dose levels be tested. The highest dose should be slightly toxic to the dam, the lowest dose should have no effect on either the dam or developing fetus, and the middle dose level should be intermediate between these two levels. Substances that affect the developing fetus (D) at or very near the adult (A) toxic dose level are not considered hazardous to the fetus, even though they can produce toxicity, runts, and/or terata. The ratio of A to D is a measure of the developmental toxicity hazard of the compound being tested. The greater the ratio of A to D, the greater is the likelihood that the test compound will disrupt the developing fetus without harming the adult. For example, the A/D ratio is 60 for thalidomide and less than 1 for benzene. This low ratio for benzene indicates that the dam is affected by benzene at lower doses than those necessary to affect the developing fetus.
2. The determination of the no-observed-effect level. The no-observed-effect level is the starting point for cross-species extrapolation, designed to ultimately determine a safe exposure level for humans.
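The ratio of adult to developmental toxic dose described in the first factor above is a single division, sketched here in Python. The function name and the doses for the two unnamed compounds are hypothetical and chosen only to illustrate a high and a low ratio; they are not measured values from the cited studies.

```python
def ad_ratio(adult_toxic_dose: float, developmental_toxic_dose: float) -> float:
    """Ratio of the lowest adult-toxic dose (A) to the lowest
    developmentally toxic dose (D), both in the same units."""
    if adult_toxic_dose <= 0 or developmental_toxic_dose <= 0:
        raise ValueError("doses must be positive")
    return adult_toxic_dose / developmental_toxic_dose


# Hypothetical doses in mg/kg for two unnamed compounds.
compound_x = ad_ratio(adult_toxic_dose=600.0, developmental_toxic_dose=10.0)
compound_y = ad_ratio(adult_toxic_dose=50.0, developmental_toxic_dose=100.0)

for name, ratio in (("compound X", compound_x), ("compound Y", compound_y)):
    # A ratio above 1 means the embryo is affected at a lower dose than the adult.
    verdict = "embryo more sensitive" if ratio > 1 else "adult at least as sensitive"
    print(f"{name}: A/D = {ratio:g} ({verdict})")
```

A ratio well above 1, as for compound X, corresponds to the situation the text describes for thalidomide; a ratio below 1, as for compound Y, corresponds to the situation described for benzene.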
According to Johnson et al. (8) and Kudia (9), Hydra attenuata is the hydroid commonly used for testing. Readily available, this species can be maintained easily in fresh water in a laboratory. It is generally 2 to 25 mm in height, and it is fed freshly hatched Artemia salina nauplii. Vegetative reproduction by budding is the most common means of reproduction. The population doubles approximately every 4 days under optimum temperature conditions (11). Adult hydra are readily dissociable into their component cells, which can then be packed into pellets of randomly aggregated cells. If left undisturbed, each pellet develops into an adult hydra in approximately 90 hr. During the course of development, the artificial embryos undergo the same phenomena as do mammalian embryos: proliferation and regression of cells, induction and response, creation of organ fields, induction of competence, spatial orientation, pattern formation, membrane changes, directional migrations, metabolic changes, cellular morphogenesis, histogenesis, organogenesis, and functional ontogeny (7). Some developmental events, such as membrane changes and directional cellular migrations, are achieved by previously differentiated cells of the adult hydra, while other responses, such as organ field formation, occur through the action of previously undifferentiated or interstitial cells (12).

Methodology
Treatment procedures are described by Johnson et al. (8). Adult polyps are placed in 5-mL glass wells containing the test compound and an antibiotic in the defined hydra medium. The test protocol consists of four tests: two range-finding tests and two confirmatory tests. The range-finding tests are used to determine the minimal concentration necessary to produce the endpoint toxic response. Each test compound starts with an evaluation in which intact adult animals are chronically exposed at whole-log concentrations ranging from 10^-3 to 10^3 mg/L.
The whole-log concentration which produces the irreversible endpoint effect is thereby determined. This concentration is tested by a second study of the same levels. After the lowest effective whole-log concentration has been confirmed, this concentration is divided into tenths, and the experiment is repeated at these concentrations. The end point of the test has been determined to be the "tulip" stage of the polyp; it is the stage which immediately precedes the death of the polyp. "Tulips" removed to normal media seldom recover.
In the assay for teratogenic compounds as described by Johnson et al. (8), one polyp per log concentration is tested in the first experiment, and one polyp per one-tenth log concentration is tested in the third experiment. The second experiment, which confirms the results of the first range-finding study, uses two polyps each to test the next higher whole log, the lowest effective whole log, and the next lower whole log. One polyp is used as a control. The fourth experiment, which confirms the study using the one-tenth log concentrations, uses three polyps each to test 2/10 log dose higher, 1/10 log dose higher, the effective 1/10 log dose, 1/10 log dose lower, and 2/10 log dose lower; one polyp is maintained as a control. No statistical procedures have been applied because each test group is so small.
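The four-experiment design can be laid out as concentration lists and animal counts. The sketch below is one reading of the protocol, with two assumptions stated plainly: the whole-log series is taken to run from 10^-3 to 10^3 mg/L, and "divided into tenths" is taken to mean ten equal linear steps up to the lowest effective whole-log concentration. All function names are hypothetical.

```python
def whole_log_series(low_exp: int = -3, high_exp: int = 3):
    """Whole-log concentrations in mg/L (assumed range 10^-3 to 10^3)."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


def tenths_series(effective_whole_log: float):
    """Ten equal linear steps up to the effective concentration.

    Assumption: this is what the text means by dividing the lowest
    effective whole-log concentration into tenths.
    """
    return [effective_whole_log * k / 10 for k in range(1, 11)]


def confirmatory_polyps(polyps_per_level: int, levels: int, controls: int = 1) -> int:
    """Animals needed for a confirmatory experiment, control included."""
    return polyps_per_level * levels + controls


print(whole_log_series())
print(tenths_series(100.0))
# Second experiment: two polyps at each of three whole-log levels plus a control.
print(confirmatory_polyps(2, 3))
# Fourth experiment: three polyps at each of five tenth-log levels plus a control.
print(confirmatory_polyps(3, 5))
```

The counts for the two confirmatory experiments, 7 and 16 animals, follow directly from the text and show why no statistical procedure has been applied to groups of this size.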
Extensive, detailed instructions for the derivation of the artificial embryos from the adult polyps are given by Johnson et al. (8). Approximately 300 adult hydra are dissociated into cells and fragments by repeated pipetting, and the fragments are allowed to settle out before the supernatant is transferred to a centrifuge tube. After centrifugation, the "pellets" of dissociated cells are expelled into 5-mL test wells, each containing medium and the test compound. Each well contains one of a series of log concentrations from 10^3 to 10^-3 mg/L, as described above for the adult tests. The following tests are carried out by the same procedures described for adults.
The reaggregated embryos are observed at 4, 18, 26, 42, and 90 hr; each time corresponds to a specific stage in the development of an untreated embryo. The endpoint of the test is the dissolution of the embryo, but the manner of dissolution differs among compounds.
After the adults and the embryos have been tested, the A/D dose ratio is calculated. Following this calculation, the toxic and teratogenic doses for rats and mice are computed and compared with the A/D ratio of hydra. Depending on the species and route of administration, the ratio may vary between hydra and mammals, but it is generally in the same range.

Critical Review
Thus far, 38 compounds have been tested by the hydra assay in the laboratory of Johnson and colleagues. The first 24 compounds included a diverse group of nonnutritive sweeteners, alkaloids, metallic salts, hair dye products, solvents, and vitamins, for which published reports were available and in which the low end of a dose-response curve had been determined (7). Mammalian toxicity data are not available for all of the second group of 14 glycols and glycol ethers (13). Within the laboratory in which they were carried out, the hydra assays have been duplicated readily (8). Nine additional compounds were tested by Kudia (9). These are acetaminophen, Agent Orange, carbon tetrachloride, 2,4-D, formaldehyde, lead dioxide, lithium chloride, salicylic acid, and 2,4,5-T.
Types of Compounds That Can Be Tested. According to Kudia (9), the hydra assay is capable of evaluating any compound except those which contain copper, because copper interferes with protein synthesis in the hydra. According to Johnson et al. (8), test substances that normally occur in a liquid state, as well as compounds soluble in water, can be readily tested. The assay's ability to test water-insoluble agents is limited (14,15). According to Johnson et al. (8), prior information concerning test compounds is not needed, except for information on solubility and stability. These criteria, then, make the assay available for the testing of a complex mixture of unknown composition. The results of the hydra assay may provide for a ranking of substances according to their hazard potential. Because of its single end point, the assay should not be considered for an estimation of risk of any compound.
End Points and Dose Responses. A dose-response line is sought in which death is the single end point. Statistical procedures cannot be applied to this assay in its present form because one or a very small number of animals is tested. If the number of animals per group could be increased, routine statistical analyses could be applied, but the cost of each assay would increase. The amount of increase is not known.
There is no indication that the assay provides any information on the mechanism of the insult, nor is it designed to test any stages other than the toxic end point. The developing stages of the hydra embryo that are monitored have no direct correlation with other developing organisms. Within the assay, there is no procedure for the evaluation of the possible repair of teratogen-induced anomalies. It would be possible to make biochemical evaluations of the developing system, but because of the differences in metabolism between hydra and mammals, any attempt to extrapolate the results to mammalian development would be suspect.
Time, Personnel, and Cost Requirements. The time required to complete one assay is approximately 4 weeks. The cost of setting up a laboratory for routine testing using hydra is modest. If the technique of Johnson and colleagues is to be used, a licensing agreement with Thomas Jefferson University must be obtained, and a charge per assay is to be paid to the University. The most expensive equipment needed is a centrifuge to separate the dissociated cells. Two large water tanks are needed for holding several thousand stock animals. Equipment for monitoring and analyzing contaminants in water must be available; this equipment may already be available if the laboratory is conducting other Good Laboratory Practices investigations.
The assays can be performed in a room without special environmental conditions. Contamination of the water by bacteria or fungi is a minimal problem. Contamination is introduced by brine shrimp and can be overcome by centrifuging animals in the contaminated medium and placing them in new medium.

Planarians

Introduction
Free-living flatworms, common invertebrates with regenerative capability, have been proposed as a screen for potential mammalian teratogens. The suggestion is based partly on the fact that although planarians such as Dugesia dorotocephala and D. tigrina are only one five-millionth the mass of an adult human, the ratio of brain to body weight approximates that of a rat (16).
Best and Morita (16) have advocated the use of the planarian as a screen for compounds with teratogenic potential. Other more general studies of the toxicity of specific compounds have also been made, e.g., studies of aflatoxin by Llewellyn (17).
Planarians such as Dugesia sp. are inexpensive, readily available, and easy to maintain in the laboratory. The regenerative process is well known (16), as is their reproductive cycle. Planarians can reproduce either sexually or asexually. In sexual reproduction, the animals are hermaphroditic. After ovaries and testes develop, sexually mature animals copulate with other sexually mature animals and lay cocoons containing several eggs that hatch into small planarians. In asexual reproduction, the animal lengthens and anterior and posterior portions of the animal tear apart. The anterior portion regenerates a new tail and the posterior portion regenerates a new head. Similar regeneration follows surgical fragmentation. Aberrant and delayed morphogenesis and variations in the normal regenerative process are studied in tests for the effects of specified compounds when the animals are used as a screen for teratogenic compounds. The process of differentiation in planarians is considered to be similar to embryologic development. The various cell types needed for planarian regeneration are formed by differentiation of undifferentiated stem cells called neoblasts. Complete planarians can regenerate from neoblasts (18). Surgically decapitated animals can regenerate a new head in 2 to 3 weeks.
When tested, planarians exhibit both lethality and sublethal responses such as visible lesions and tumors, head resorption, changes in the incidence of fissioning, changes in behavior, and aberrant morphogenesis. A variety of compounds have been tested for acute toxicity with planarians, including alkaloids, drugs, metals, pesticides, toxins, petroleum derivatives, ethanol, and complex toxicant mixtures (16). In terms of sensitivity and type of response, the results of the acute toxicity tests showed values that fell within the range of variability of mammalian values, with the exception of arsenate (16). Teratologic responses have been observed in two types of tests: exposure to the compound after surgical dissection of the animal, and exposure of intact animals to see if they develop and respond abnormally.

Methodology
The methodology for using the planarians as an assay for teratogens is described by Best and Morita (16). The animals can be kept in a laboratory, and they will reproduce asexually. However, the stock should be invigorated by the addition of a new stock every few years. Two species of planarians have been studied previously, D. tigrina and D. dorotocephala. Both species have regenerative capabilities, and both react to toxic compounds. Before dissection, the animals are starved to lessen the chances of infection from the intestinal contents. To provide greater consistency in the cutting of sections, they are anesthetized by placing them on a piece of wet filter paper on a Petri dish half filled with ice and covered with aluminum foil. The predetermined cuts are made under a dissecting microscope with two fine needles. The pieces are placed in solutions of the test compound and are observed for delayed development, aberrant development, and lethality. According to Best et al. (19), chlordane has produced head aberrations, and head resorption was seen after methylmercury exposure. According to Ansevin and Wimberly (20), when the pieces were allowed to regenerate in water before being placed in actinomycin-D, they were more resistant to treatment by the test compound; however, the generality of this phenomenon has not been validated.

Critical Review
Types of Compounds That Can Be Studied. Water-soluble compounds are the easiest to test, but water solubility is not a requirement. Water-insoluble compounds have been tested by using emulsions in water or an egg yolk carrier (16). Complex chemicals have been tested, such as shale oil process water.
Because of their lack of skeletal myoneural endplates (their muscle fibers and myoneural junctions resemble smooth muscles), the animals are insensitive to curare-like agents such as succinylcholine (16).
End Points and Dose Responses. For teratogenicity testing, the major end point is head regeneration. For example, grossly visible abnormalities in head regeneration of decapitated planarians were produced by concentrations of 0.1 and 0.2 ppm methylmercury (21). The planarians exhibited a dose-response relationship. Tumors and teratogenic effects were induced in planarians after exposure to the carcinogens dimethylbenzanthracene, benzopyrene, and benzanthracene. The effects included a supplementary eye, a second head growing from the caudal end of the animal, and an extra head growing out of a tumor mass (22).
Regeneration in planaria is certainly a reproducible phenomenon. However, strains appear to differ. For example, Ansevin and Wimberly (20) reported different responses to the same compound when animals from an old and a new culture were compared. As with other in vitro systems, a series of validation tests in different laboratories with known teratogens is needed to determine this animal's usefulness as a screen.
Enough animals are tested per dose level so that standard statistical tests such as chi-square can be applied. In the range-finding study which precedes the actual study, 20 animals are usually tested per concentration. In the actual test, 10 geometrically graduated dose levels are tested with 40 to 60 animals per dose level.
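As an illustration of the kind of chi-square comparison that group sizes of 40 to 60 animals allow, the sketch below computes a Pearson chi-square statistic for a 2 x 2 table of abnormal versus normal regeneration in a treated and a control group. The counts are hypothetical, no continuity correction is applied, and the choice of a 2 x 2 layout is an assumption of this example, not a procedure prescribed by Best and Morita (16).

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction).

    Table layout:
                    abnormal   normal
        treated        a         b
        control        c         d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must contain at least one animal")
    return n * (a * d - b * c) ** 2 / denominator


# Critical value for 1 degree of freedom at the 0.05 significance level.
CRITICAL_VALUE_DF1_P05 = 3.841

# Hypothetical counts: 15 of 50 treated and 2 of 50 control animals abnormal.
statistic = chi_square_2x2(15, 35, 2, 48)
print(f"chi-square = {statistic:.3f}")
print("significant at 0.05" if statistic > CRITICAL_VALUE_DF1_P05 else "not significant")
```

With ten dose levels, the same comparison could be repeated for each level against the control, or a single test for trend could be used; the source does not say which approach is preferred.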
The animals can be used in mechanistic studies such as mechanisms of repair. These animals have their own activation systems, and the effects of extrinsic enzyme (S-9) preparations are not known.
Time, Personnel, and Cost Requirements. Performing a range-finding and regular exposure study in which exposure lasts 2 weeks takes approximately 1 month. If additional histological or behavioral results are needed, time for these studies must be added.
A laboratory to perform assays using planaria can be set up at moderate costs. A room with air conditioning is needed, as well as dissecting microscopes, standard glassware, and containers for holding planaria (glass or enamel pans). With regard to a source of the animals, D. tigrina is available commercially, or the species may be collected from wild stock. The animals are fed raw beef liver, or Daphnia if available. The tests can be performed by technicians with proper training.

Crickets

Introduction
The use of the embryo of the cricket, Acheta domesticus, as an invertebrate screen for the detection of teratogenic compounds was first described in a series of papers in 1980 and 1981 by Walton and colleagues (23)(24)(25)(26). The assay was developed to study the toxicity of chemical components of coal-derived synthetic fuels on insect eggs. The cricket was selected because it is a terrestrial insect representative of those inhabiting leaf litter and soil that might be exposed to synthetic fuel toxicants.
The first cricket studies used acridine, a compound known to be present in coal and coal by-products. Acridine toxicity to Selenastrum capricornutum and Daphnia magna was already known.

Methodology
The assay appears to be simple and relatively inexpensive, as described by Walton (27). Crickets are commercially available, or they can be raised in the laboratory. Enough cricket eggs are obtained by depriving adults of suitable oviposition sites in the rearing cage. Under these conditions, crickets retain eggs in the ovarioles until a moist substrate is present. Females maintain sperm in the spermatheca, and the eggs are fertilized as they leave the body during oviposition.
Two versions of the test have been performed. In one version, the eggs develop in a sand substrate contaminated with compound. This method most closely resembles the natural route of exposures for these animals. In the second version, the compound is applied topically to the eggs, which develop in a Petri dish. Any compound that can be administered through a syringe can be tested by the second method.
Range finding is performed to determine five concentrations that give a survival rate between 0 and 100%. Each concentration is replicated three times in 1 day. The entire experiment is duplicated on a second day to verify the results of the first day.
In the first version, the female is introduced into a beaker containing a layer of damp, contaminated sand, where she lays approximately 300 eggs. The female is removed after 24 hr and the eggs are incubated. Three replicates are run, for a total of approximately 900 eggs per concentration.
In the second version, the eggs are laid in damp, clean sand, removed from the sand, and treated topically by syringe under a dissecting microscope. This version has the advantage of requiring fewer eggs; approximately 20 eggs per Petri dish are treated, and three replicates are run, for a total of 60 eggs. Both solvent and untreated controls are included in each experiment. The eggs are incubated at 31°C. Two end points are examined: teratogenicity and teratogenicity plus embryotoxicity. In the first instance, the eggs are observed on day 7 for the presence of a greater than normal number of compound eyes. The compound eyes appear as red spots in the developing egg when observed under a dissecting microscope. In the second instance, the complete test is performed until the nymphs emerge, normally on days 10 to 14. At emergence, the nymphs are observed for terata. Terata described have included extra eyes, antennae, and heads, as well as appendages distally duplicated. Treated eggs may also take longer to develop.
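The replicate design of the topical version (three dishes of about 20 eggs per concentration) lends itself to a simple pooled summary of the two end points. The sketch below is illustrative only: the record layout, the function name, the counts, and the use of eggs treated as the denominator for both percentages are assumptions of this example, not details given by Walton (27).

```python
def summarize_concentration(replicates):
    """Pool replicate dishes for one concentration.

    Each replicate is a tuple:
        (eggs_treated, nymphs_emerged, eggs_with_extra_eyes_day7)
    Returns (total_eggs, percent_emerged, percent_extra_eyes).
    """
    total_eggs = sum(r[0] for r in replicates)
    if total_eggs == 0:
        raise ValueError("no eggs recorded")
    emerged = sum(r[1] for r in replicates)
    extra_eyes = sum(r[2] for r in replicates)
    return (
        total_eggs,
        100.0 * emerged / total_eggs,
        100.0 * extra_eyes / total_eggs,
    )


# Hypothetical counts for three Petri dishes at one concentration.
dishes = [(20, 15, 2), (20, 16, 3), (20, 14, 1)]
total, pct_emerged, pct_extra_eyes = summarize_concentration(dishes)
print(f"{total} eggs: {pct_emerged:.1f}% emerged, {pct_extra_eyes:.1f}% with extra eyes")
```

The same summary would be computed for the solvent and untreated controls so that treated groups can be compared against both.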
Walton (27) showed that treatment of eggs 0 to 8 hr of age produces the highest mortality, and survival increases with increasing age up to 32 to 40 hr of age. Treatment after 40 hr of age produces no difference in survival. Treatment of 0 to 32 hr of age that increased mortality also produced embryos with extra compound eyes, whereas treatment after this period produced no extra compound eyes. Analysis of stained embryos by Walton (27) showed that maximum vulnerability preceded germ-band formation (36 hr). In insects, the germ-band is formed by a fusion of two clusters of cells (lateral anlagen) that coalesce at one pole of the egg.

Critical Review
The assay is simple, quantitative, inexpensive, and relatively rapid, and permits testing a diversity of compounds. It needs to be validated against a number of known mammalian teratogens.
Types of Compounds That Can Be Tested. Water-soluble compounds can be tested, as well as hydrophobic compounds, and any liquids that can be administered by syringe. Mixtures of unknown compounds have been tested, including contaminated soil. Solid compounds are well mixed with the sand, and the sand is moistened with water. Most of the compounds examined thus far have been coal-derivative and synthetic compounds (28). Mammalian teratogens need to be tested to evaluate the assay for routine use. Reproducibility of results within the same testing laboratory has been very good, but there is also a need for interlaboratory comparison of results.
End Points and Dose Responses. As described previously, there are two end points, teratogenicity and teratogenicity plus embryotoxicity. The first end point is quantitative and easily observed in the intact egg at 7 days. For the second end point, the nymphs are observed for development aberrations under a dissecting microscope at the time of emergence.
In studies of impure acridine, a dose-response relationship was seen in the number of embryos with abnormal eyes. The mechanism of the insult has not been determined, but there is a possibility that comparative studies could be made with eggs of a different species such as the field cricket, Gryllus rubens. In tests with substances causing multiple compound eyes in A. domesticus, normal numbers of eyes were seen in G. rubens, but the nymphs lacked as many as five legs (27). Thus far, only eye abnormalities have been determined, and the window appears to be narrow. According to Walton (27), the most vulnerable stage is from 0 to approximately 36 hr of age.
Cricket embryos have their own endogenous metabolic system. Both nymphs and adults have been reported to metabolize xenobiotics (27). It is not known what effect exogenous S-9 preparations would have on the eggs.
Time, Personnel, and Cost Requirements. According to Walton (27), if only the first end point is sought, i.e., the difference in the number of compound eyes, it can be obtained in approximately 7 days. A complete test requires approximately 14 days.
One person can perform a test, apparently with minimal training. Manual dexterity and normal eyesight are required to do the assay.
The cost of setting up a laboratory is modest. One or more dissecting microscopes, an electrobalance for weighing compounds, sand, beakers, metal mesh screening, Petri dishes, filter paper, and an incubator to maintain the eggs at 31°C are needed. A place to maintain a cricket colony is also needed; this consists of a cage in a room with controlled temperature and photoperiod. A commercial source of crickets is available. The largest expense in this test appears to be the salary of the operator.

Fish

Introduction
The use of fish embryos as a screen for teratogenic compounds is relatively recent. It appears to have developed from studies of aquatic contaminants by Birge and colleagues in the 1970s (29)(30)(31)(32)(33), and from a reproduction test on zebra fish by Kihlstrom et al. (34) and Streisinger (35). Fish embryos have been advocated for use in prescreening environmental contaminants for teratogenic potential (36).

Methodology
In the method advocated by Birge and colleagues, the assays can be performed in either of two ways, by a static system in which the eggs are placed in deep Petri dishes or similar dishes for testing, or by a flow-through system in which the water is continually being circulated and monitored. The eggs are observed to hatching; hatchability is measured, as well as larval mortality and teratogenicity. Observations continue for 4 days posthatching; this time allows observation of delayed hatching due to interference by the compound tested. In the hatched fish, the time permits the sensitive early larval period to be monitored.
Fertilized eggs obtained from spawning fish or a hatchery are placed in deep Petri dishes or similar vessels that contain water or a solution of the test compound. Approximately 100 eggs are placed in each dish. The water or treatment solution is changed every 12 hr. If a flow-through system is used, the water is constantly circulated and monitored, as described by Birge and Black (30). Fathead minnows, northern trout, largemouth bass, goldfish, and channel catfish have been tested by Birge and colleagues (30)(31)(32)(33).
Japanese medaka eggs have also been used for the bioassay for aflatoxin B1 (37). The medaka is an oviparous freshwater killifish indigenous to Japan, Taiwan, and southeastern Asia. The eggs have exceptional optical clarity and are regularly available commercially (38). A short-term screening test for reproduction toxicity using the zebra fish has also been proposed by Landner et al. (39).

Critical Review
Types of Compounds That Can Be Studied. The eggs need water for survival and growth; hence, water-soluble compounds can certainly be tested as well as other environmental contaminants, including methylmercury, mercury, copper, cadmium, arsenic, zinc, and selenium (36).
End Points and Dose Responses. The end points measured are hatchability, larval mortality, and teratogenicity. Survival of eggs can be quantified and expressed in terms of concentration of the compound. In the atrazine tests (36), five dose levels were tested; dose relationship was evident in rainbow trout and was seen in channel catfish except for one dose level. Terata are sometimes measured in terms of survivability, based on the assumption that the terata observed preclude survival. In the study by Landner et al. (39), adult zebra fish were exposed to the test compound prior to, but not during spawning. The impairment of reproductive success was measured by an alteration in the hatching rate of eggs and by the survival and stress tolerance of embryos and larvae. In a study of medaka eggs, development was retarded in proportion to the increasing concentration of aflatoxin (37). The numbers of embryos that showed teratogenic changes were also correlated with increasing compound dose. Prominent changes were observed in the circulatory system, the optical system, the swim bladder, and the spleen. Appendage development was also affected: the tail was shorter, the pectoral fins were blunt, and there was evidence of somite degeneration. Common terata described by Birge and Black (30) included several degrees of Siamese twinning, immovable lower jaw, anomalous or absent eyes, truncated upper jaw, defective vertebral column, absent or reduced fins, and retarded yolk sac absorption.
Because of the numbers of eggs tested per dose level and the series of doses tested, routine statistical tests can be employed. The reproducibility of the tests in other laboratories has not been determined.
No information on mechanisms is provided. The window of information provided appears narrow and does not include information on stage specificity. The specific terata do not appear to be directly relatable to mammalian responses at the present time.
Time, Personnel, and Cost Requirements. The equipment to conduct testing by the static system should be less expensive than that for flow-through testing. Expenses for the static system include one or more fish tanks, appropriate glassware and oxygenation equipment, and equipment for testing hardness, pH, solute content, etc., of the water. The flow-through system requires several pumps, as well as an exposure chamber. If cold-water species are being tested, an environmental room should be available, where the constant cold temperature can be maintained. For warm-water species, regular room temperature should be sufficient. Details of the flow-through system are described by Birge and Black (30).
In a normal test, the eggs are tested from early fertilization to 4 days posthatching. Cold-water species, such as trout, hatch in approximately 24 days; thus a test can be completed in 28 days. Warm-water species, such as channel catfish, hatch in 5 to 6 days; therefore, in these species, a study can be completed in 10 days. Other species of fish hatch in a time intermediate between the two species.
A few technicians should be sufficient to handle the assay. They need laboratory experience in handling embryos. The cost for testing one compound by the flow-through system is expected to be more than for testing by the static system. A problem that has been only partly resolved is the availability of the eggs on a year-round basis. A constant source of eggs is essential if the test is to be accepted as a routine assay. Most of the species investigated, such as rainbow trout, goldfish, largemouth bass, and channel catfish, have a definite spawning season. However, the fathead minnow spawns throughout the year, and eggs can be obtained at any time. The zebra fish also spawns regularly throughout the year, and it develops rapidly (40).

Amphibians

Introduction
Embryos of amphibians such as toads and frogs have been used by Birge and colleagues (30,31,35,41,42) since the 1970s in assays similar to those used for testing fish embryos. Thus far, several species of frogs and toads have been tested, including Fowler's toad (Bufo fowleri), leopard frog (Rana pipiens), narrow-mouthed toad (Gastrophryne carolinensis), pig frog (Rana grylio), red-spotted toad (Bufo punctatus), and southern gray tree frog (Hyla chrysoscelis). More recently, Dumont and colleagues (43,44) have developed a teratogenicity screen called the Frog Embryo Teratogenesis Assay: Xenopus (FETAX), which uses the early embryos of frogs.

Methodology
In the fundamental techniques as described by Birge et al. (36), fertilized eggs were placed in large Petri dishes or similar dishes, approximately 100 eggs per dish. The compound to be tested was dissolved in water; solutions were renewed at 12-hr intervals. Both pH and hardness of the water should be measured, and moderate aeration, as well as 19 to 22°C temperature, should be provided. Teratogenesis was determined at hatching as the percent of survivors affected by gross, debilitating anomalies. Times to hatching are 5 to 6 days for the leopard frog and 3 to 4 days for the other species. The types of gross anomalies observed are expected to preclude survival under natural conditions (45). Terata were counted as mortalities in Birge's studies (36). The end point measured was the median lethal concentration (LC50); it was determined at 4 days posthatching to allow for delayed hatching due to interference by the compound tested and to allow for the sensitive early larval period. The LC50 was determined by using log probit analysis.
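The log probit estimate of the LC50 mentioned above can be illustrated with a short calculation. The sketch below is a simplification: it fits an unweighted least-squares line to probit-transformed mortality against log10 concentration, rather than the weighted or maximum-likelihood fit used in formal probit analysis, and the concentrations and counts are hypothetical. Following the convention described for Birge's studies, embryos with gross terata would be counted among the dead.

```python
from math import log10
from statistics import NormalDist, mean


def lc50_log_probit(concentrations, n_exposed, n_dead):
    """Estimate the LC50 from an unweighted probit-versus-log10(dose) line.

    Dose levels with 0% or 100% mortality are skipped because their
    probits are infinite. This is an illustrative simplification.
    """
    xs, ys = [], []
    for conc, exposed, dead in zip(concentrations, n_exposed, n_dead):
        proportion = dead / exposed
        if 0.0 < proportion < 1.0:
            xs.append(log10(conc))
            ys.append(NormalDist().inv_cdf(proportion))  # probit on the z scale
    if len(xs) < 2:
        raise ValueError("need at least two dose levels with partial mortality")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # 50% mortality corresponds to a probit of zero on the z scale.
    return 10 ** (-intercept / slope)


# Hypothetical data: 100 eggs per dish at three concentrations (arbitrary units).
estimate = lc50_log_probit([1.0, 10.0, 100.0], [100, 100, 100], [20, 50, 80])
print(round(estimate, 3))
```

A production analysis would also report confidence limits for the LC50, which this sketch omits.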
If one species of frog were to be used as a year-round teratogenicity screen, a year-round source of viable eggs would be needed. As listed by Birge and Black (30), pig frog eggs are available March through June; red-spotted toad eggs, April through August; southern gray tree frog eggs, May through June; narrow-mouthed toad and squirrel tree frog eggs, May through August; and leopard frog eggs, December through May. No eggs were available from August to December, according to Birge and Black (30).
In the FETAX assay by Dumont et al. (43,44), early embryos of Xenopus were exposed to ranges of compound concentration for 4 days. The embryos were examined for growth, development, abnormalities, motility, and pigmentation. The end points measured were LC50 (concentration lethal to 50% of the embryos) and EC50 (concentration producing 50% terata). The ratio of LC50 to EC50 was calculated to produce the teratogenic index (TI), a possible measure of risk assessment.
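The teratogenic index defined above is a simple ratio, sketched here for concreteness. The function name and the example concentrations are hypothetical; no decision threshold is assumed, because none is stated in the text.

```python
def teratogenic_index(lc50, ec50):
    """Return TI = LC50 / EC50, with both values in the same concentration units."""
    if lc50 <= 0 or ec50 <= 0:
        raise ValueError("LC50 and EC50 must be positive")
    return lc50 / ec50


# Hypothetical example: lethality at 100 mg/L, terata at 20 mg/L.
ti = teratogenic_index(100.0, 20.0)
print(ti)
```

A larger TI means that malformations appear at concentrations well below those that kill the embryos, which is why the ratio has been proposed for ranking compounds.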

Critical Review
In proposing a bioassay system using amphibian embryos, one of the problems that must be overcome is that of a bias concerning the use of amphibians to predict mammalian effects.

Types of Compounds That Can Be Studied. In Birge's studies, water-soluble compounds have been tested. Reproducibility of the tests has not been determined, as most of the screening work has been done in a single laboratory. Other similar studies have been made for toxicity determination using frog eggs or tadpoles (46); 34 known teratogens and six nonteratogens have been examined by the FETAX system (47). Of the 34 known teratogens, three gave false negative results. The positive or negative response was determined by the TI described above. The compounds tested included folic acid antagonists, vitamin A, ethanol, saccharin, and aspirin (44). Several concentrations of creek water containing discharges from abandoned lead and zinc mines were tested by the FETAX system (48). Teratogenicity was observed in Xenopus given heavy metals. In this study, the FETAX system was also able to detect negatives, that is, the Xenopus embryos were able to develop normally in sample water from the portion of the creek that supports life.
End Points and Dose Responses. In Birge's studies, the end points measured are hatchability, larval mortality, and teratogenicity. Survival of eggs can be quantified and expressed in terms of concentration of the compound. The numbers of dose levels tested and the numbers of eggs tested per dose level are similar to those of fish assays, and the terata produced are the same as those seen in fish. In using amphibian eggs as test organisms, it is particularly important to be aware of differing sensitivity to toxicants. Sensitivity may vary with hatching time, egg yolk volume, and development stages; for example, in testing different species, it is important to regulate the exposure period to cover the same embryonic and postembryonic stages. An A/D ratio is not applicable to this system, as the adults are not tested.
Favorable attributes of the FETAX system include the large numbers of embryos tested and the ease of detection of the endpoints (lethality and abnormalities, as described above). According to Dumont et al. (44) and Dumont and Epler (47), developmental defects were observed that were similar to those observed in mammals. The compounds also showed a linear dose response and decreased growth of the offspring. Based on the study by Dawson et al. (48), the FETAX system is also excellent for testing mixtures in field analysis.
The statistics need to be developed for this assay. Depending on size and maturity of the animals, egg production per female may vary in different species: for pig frogs, 6,000 to 10,000 eggs; red-spotted toads, 2,000 to 4,000; southern gray tree frogs, 100 to 500; narrow-mouthed toads, 1,000 to 3,000; squirrel tree frogs, 80 to 150; and leopard frogs, 2,000 to 4,000. As stated previously, availability of the eggs is seasonal, and none may be available from August to December (30).
Time, Personnel, and Cost Requirements. The in vivo part of a study using amphibian eggs can be completed in approximately 2 weeks. The animals are observed posthatching; times to hatching are 5 to 6 days for the leopard frog and 3 to 4 days for the other species.
Setting up a laboratory is the same for amphibians as for fish. Either a static system or a flow-through system can be used, with a corresponding increase in cost. The technical personnel required are also the same. As with fish, the technical personnel need to do very careful dissections and should be biologists with an appropriate background in vertebrate anatomy.

Drosophila

Introduction
Although fruit flies have been used in studies of mutagenicity for many years, the use of the fruit fly, Drosophila melanogaster, as a possible teratogenic screen is of recent origin. The first studies were performed in the early 1970s on vinblastine (49), thymidine and deoxycytidine (50), and 5-bromodeoxyuridine (51). Seven additional compounds were studied; the results have been described by Schuler et al. (52).

Methodology
Directions for the test are given by Schuler et al. (52). In the test, five male and five female adult fruit flies of the standard wild type are anesthetized and placed in vials containing distilled water and instant Drosophila medium. The chemicals to be tested are added to the distilled water, and the medium provides nourishment for the developing larvae. The vials are maintained at a temperature of 25°C and 60% relative humidity. The flies are allowed to mate and females deposit eggs for 6 days. From the time the larvae hatch to the time the adults emerge is approximately 9 or 10 days at 25°C. A range-finding test is performed to find the maximum tolerated dose (MTD), that is, the dose that reduces the number of adults. Following larval exposure to the MTD, the adults are examined within 16 hr of emergence from the puparium.
Scoring of abnormalities and the procedure for observation are clearly defined. At least 200 flies per dose level are scored. The flies and the individual body parts are examined for shape, size, color, body part alignment, excess tissue growth, extra parts, and missing parts.

Critical Review
At present, the utility of the assay as a screen for teratogenic compounds needs to be evaluated by testing known teratogens and nonteratogens.
Types of Compounds That Can Be Studied. Thus far, only water-soluble or ethanol-soluble chemicals have been tested successfully. Apparently, unknown chemicals or mixtures of chemicals can be tested only if they are water- or ethanol-soluble. As a low-level screen, the system might be useful in selecting positively reacting compounds for mammalian testing. Thus far, there is no evidence that the assay has been used for prioritizing compounds for testing.
End Points and Dose Responses. The embryos have yielded different abnormalities in response to various test compounds. Four types of bristle abnormalities, three types of leg abnormalities, and irregular shapes of the head, thorax, and abdomen have been seen. Multiple dose levels of dimethyl sulfoxide and several ethylene glycol derivatives were given, and there was evidence of a dose relationship, according to developmental work referred to by Schuler et al. (52). These investigators reported dose relationships in wing notches after administration of ethylene glycol monoether. Results after administration of dimethyl sulfoxide or sodium heparin are less distinct. The compounds are not tested on adult flies; hence an A/D ratio is not computed. Of the compounds that have been tested, none are on the list of candidate compounds for in vitro test validation (53).
Data from treated and control groups can be analyzed statistically by the chi-square test, according to Schuler et al. (52). The reproducibility of the results has not been determined.
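A chi-square comparison of treated and control groups, as mentioned above, can be sketched for the simplest case of a 2 x 2 table (abnormal versus normal flies in one treated and one control group). The counts below are hypothetical, the function name is illustrative, and no continuity correction is applied; the p-value uses the exact relation between the chi-square distribution with one degree of freedom and the complementary error function.

```python
from math import erfc, sqrt


def chi_square_2x2(treated_abnormal, treated_normal, control_abnormal, control_normal):
    """Pearson chi-square statistic and p-value (1 df, no continuity correction)."""
    observed = [
        [treated_abnormal, treated_normal],
        [control_abnormal, control_normal],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one fly")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 200 flies scored per group, as the protocol requires at least.
chi2, p_value = chi_square_2x2(30, 170, 10, 190)
print(round(chi2, 3), p_value < 0.01)
```

With several dose levels the same statistic extends to a larger table with more degrees of freedom, which this sketch does not cover.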
The Drosophila test presents a possibility of testing for retardation of development and emergence; malformations such as bristle defects, wing notches, and haltere defects; and death if flies fail to emerge from treated eggs. At the present time, only water-soluble or ethanol-soluble compounds have been tested. However, because of the information available on the genetics of the flies, it may be possible to produce strains of flies that are sensitive to specific compounds or specific structures within compounds.
Because of the way the compound is administered (i.e., in the medium), stage-specific response is not feasible, nor is it possible to determine the repair of induced anomalies. As the flies develop, they could be evaluated for specific biochemical activities. The system has its own endogenous activation system, and it is not known what effect exogenous preparations would have on the system.
Time, Personnel, and Cost Requirements. The equipment needed for the Drosophila assay is neither expensive nor unusual. An incubator is necessary, as well as one or more binocular dissecting microscopes. A sufficient supply of Drosophila medium, yeast, and glassware is needed also.
According to Schuler et al. (52), technical proficiency in performing the assay can be obtained in approximately 2 weeks, but the degree of objectivity and the visual skills vary with different individuals. Some dexterity is required for the fine manipulations required during the systematic examination for external morphological anomalies. Because of the observable differences between treated and control vials, e.g., in the reduced numbers and delayed emergence of the flies, it is not possible to observe the effects "blind" unless a second person examines the flies after removal from the vials.

Chick Embryo

Introduction
The avian egg and developing embryo have been used to test for toxic and other effects for a longer period of time than any of the other assays considered thus far. Dareste (54) referred to studies on chick embryos, but Ancel's work (55) appears to have been the first report of malformations induced in the chick embryo with chemical agents. Karnofsky, one of the pioneers of the field, made a systematic study of the toxic and teratogenic effects on chick development of various substances, particularly metals (56,57). He selected the fourth day of incubation for injecting test compounds in the yolk. This day was chosen so that developmental defects could be induced that were compatible with embryo survival on day 18. Recently, Luepke (58) has elaborated on a variation of the method called the Hen's Egg Test.

Methodology
The procedure currently in use for chick embryo tests was described by McLaughlin et al. (59) and Verrett et al. (60) and can be summarized as follows: In its simplest terms, the assay involves the selection of eggs, injection, and incubation. Eggs of White Leghorn chickens are readily available, and hatchability of the eggs and reproducibility of the results are very high under laboratory conditions. Eggs to be injected are first candled to determine their viability, and defective eggs are discarded. The location of the air cell is outlined with a pencil. A restrictive range of egg weights is also applied to provide uniformity. Before injection, the eggs are randomized to avoid a series of infertile eggs in any group.
Range finding for the compound is achieved by testing two or more concentrations with 10 eggs per level. On the basis of the range-finding study, 20 or more eggs are injected with the appropriate amount of the compound. If the chemical is nontoxic, the experiment is repeated with the minimum number of eggs that will give a reliable and reproducible value for hatchability. If the compound is toxic, additional eggs are injected to determine the specific effects of the compound. Apparently, no set number of eggs is tested; rather, the number varies from fewer than 100 to several hundred eggs.
The compounds are injected at volumes up to 0.1 mL in solvents such as water, propylene glycol, corn oil, peanut oil, and dextrose solutions. Contamination of the eggs should be avoided by working in a sterile room, if possible. A hole is drilled in the center of the surface of the egg over the air cell. Fine particles of shell are removed with an aspirator to prevent the needle from carrying them into the yolk. Immediately before injection, each egg is shaken with a quick twist of the wrist to allow the germinal disk to float free in the egg, and the test compound is injected with a hypodermic needle inserted horizontally through the air cell into the yolk. After injection, the shell hole is covered by a small piece of tape.
The injected eggs are incubated and candled on the fifth day of incubation and each day thereafter. Clear eggs and dead embryos are removed for examination. If the hatchability of the eggs is being tested, the fertile eggs are transferred to a hatcher on the 17th day and kept at 37°C until they hatch. Embryos and hatched chicks are examined for structural abnormalities and signs of toxicity such as edema and hemorrhages. Skeletons and viscera are also examined. Percentage of mortality for each dose level and percentage of chicks with one or more abnormalities are usually determined. Statistical analysis can be performed by routine chi-square test or more complex tests such as those described by Khera and Lyon (61).
The fourth day's procedure, described by Gebhardt (62), can be summarized as follows: At 96 hr of incubation, a hole is bored in the blunt end of viable eggs, a needle is inserted into the yolk through the hole, and 0.1 mL of the test compound is injected. The hole is sealed with paraffin, and the egg is incubated for 14 to 18 days.
Bacterial contamination rarely occurs because the chick egg contains bactericidal substances (63). Embryos older than 24 hr usually survive treatment. Sometimes it is not possible to use this route, e.g., for testing some fat-soluble compounds (64). The assay is therefore best suited for the testing of water-soluble or water-stable compounds. One exception to this rule is that compounds soluble in some nontoxic solvents, such as propylene glycol, can be tested. Propylene glycol is not teratogenic when injected into the yolk (65).
A similar testing procedure for aflatoxin B1 has been used for many years by the Association of Official Analytical Chemists (66). In this standardized procedure, toxicity level is sought, as well as teratogenic effects such as short feet, edema, and hemorrhages.
Injections can also be made in the subgerminal cavity (67). When trypan blue was administered in the subgerminal cavity, Beaudoin and Wilson (68) obtained a higher percentage of malformations than when the dye was introduced in the yolk. However, the treatment was also more toxic to the control group, and more malformed fetuses were seen in this group.
Compounds can also be injected via the allantois. The technique is commonly used in virology and has been used to establish the teratogenicity for viruses in chick embryos (69,70). From the third day of development, compounds can be injected via the amnion (62), but this method of injection is also toxic to control animals treated with saline (71,72). The compounds can also be injected via the air chamber (73) or intravenously. Intravenous injection is possible only after the 9th day (74) because of the size of the blood vessels before then. By this time, however, the sensitive period for the induction of malformations has passed.
The problems involved in injecting water-insoluble compounds into chick embryos surfaced with injection of thalidomide shortly after the discovery of limb malformations in rabbits. Various solvents were tried, such as carboxymethyl cellulose, dioxane, and glycerol. Some of the malformations obtained were the same as those induced by injecting other insoluble compounds, such as sand, alumina powder, clay, or glass. The World Health Organization Teratology Committee (75) recommended that three dose levels be used. The highest dose should be toxic but not lethal to all embryos; the lowest dose should be in the range used for clinical purposes; and the third dose should be intermediate in effects between the first two. Difficulties in interpretation can arise when the highest dose is teratogenic only at very high levels, such as greater than LD10. In teratological screening tests in chick embryos, it is very important to maintain physiological conditions that are nearly normal.
Perhaps the most complete system for the testing of compounds has been devised by Jelinek (76,77), Marhan et al. (78,79), and Vesely et al. (80). This system is called the Chick Embryotoxicity Screening Test (CHEST). The procedure requires 126 embryos and 5 mg of test compound. Subgerminal (day 2) and intra-amniotic (days 3 and 4) injections are administered. A dose-response curve is plotted, and the embryotoxicity effect level is determined at the point of intersection of the dose-response curve and level of nonspecific effects.
In addition to the chick embryo, other avian embryos have been used, such as quail and duck embryos. Khera and Lyon (61) evaluated 30 duck eggs per group for pesticide toxicity. The procedures for handling duck eggs are approximately the same as for chick eggs. Lansdown et al. (81) have also used the Japanese quail, Coturnix coturnix japonica, to test compounds for reproductive effects. The quail is smaller than the chicken, hatches in 16 to 17 days (compared with 21 days for the chicken), and requires only 6 to 8 weeks to mature. Under suitable laboratory conditions, quail lay eggs throughout the year. The eggs are approximately the size of pigeon eggs, but they are more variable than pigeon eggs in shape and color. Quail eggs also show variable fertility, and they cannot be candled to determine viability because they are nearly opaque. The size of the eggs also necessitates the use of extremely small instruments for any type of injection.

Critical Review
Types of Compounds That Can Be Studied. The solubility of the compound may be a determining factor in the method of injection. For example, if yolk injection is used, the compound should be soluble in water, propylene glycol, corn oil, peanut oil, or dextrose solution, all of which have been tested. If the test compound is administered via the air sac, solubility is not a limiting factor. The assay can be used to test combinations of unknown compounds, e.g., extracts from potatoes infected with Phytophthora infestans. Like other screening assays, the positive results can be used as the basis for further evaluation of the compound in mammals. However, because of the sensitivity of the system, a positive response in the chick system does not necessarily mean that the compound is teratogenic in mammals. The results of the chick embryo test, when applicable, will allow the extrapolation of the mechanism to mammals, i.e., whether retardation, malformation, or death is the end point.
Recently, the CHEST system was used to estimate the embryotoxic potential of 130 compounds, including industrial and agricultural chemicals, analgesics, antiarrhythmic drugs, antibiotics, antihistamines, antiparasitic and chemotherapeutic drugs, cytostatics, hormones, vitamins, hypnotics, psychotropic drugs, and food additives and contaminants (82). The substances were diluted in sterile redistilled water, 30% ethanol, 10% dimethyl sulfoxide, sunflower oil, or suspended in 1% carboxymethyl cellulose.
End Points and Dose Responses. The normal incidence of abnormalities in the species must be known. Once this is determined, the number of embryos needed to determine the probability level can be calculated and routine statistical procedures can be employed. To determine the optimum teratogenic dose, three dose levels are recommended: a high dose that is toxic but not lethal; a low dose within the clinical range that has no effect; and an intermediate dose.
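The calculation of the number of embryos from the normal incidence of abnormalities can be illustrated with the standard normal-approximation formula for comparing two proportions. This is a sketch under stated assumptions, not the procedure used in the cited studies: the background rate, the treated rate to be detected, the two-sided significance level, and the power are all hypothetical inputs chosen by the investigator.

```python
from math import ceil, sqrt
from statistics import NormalDist


def embryos_per_group(background_rate, treated_rate, alpha=0.05, power=0.80):
    """Approximate embryos needed per group to detect a change in malformation rate.

    Uses the usual normal approximation for a two-sided comparison of two
    independent proportions with equal group sizes.
    """
    for rate in (background_rate, treated_rate):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must lie strictly between 0 and 1")
    if background_rate == treated_rate:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (background_rate + treated_rate) / 2
    pooled_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    separate_term = z_beta * sqrt(
        background_rate * (1 - background_rate) + treated_rate * (1 - treated_rate)
    )
    difference = treated_rate - background_rate
    return ceil((pooled_term + separate_term) ** 2 / difference ** 2)


# Hypothetical example: 5% spontaneous abnormalities, 20% expected after treatment.
n_large_effect = embryos_per_group(0.05, 0.20)
n_small_effect = embryos_per_group(0.05, 0.10)
print(n_large_effect, n_small_effect)
```

The smaller the increase over the spontaneous rate that must be detected, the more embryos are needed per dose level, which is why the background incidence has to be established first.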
The chick embryo test has been used in several laboratories for many years, and the results do not indicate any gross discrepancies. As stated above, the test as described by Jelinek (77) requires 126 embryos and 5 mg of test compound; the traditional test requires up to 100 eggs per dose level. The time for chick development is 21 days. If skeletal and visceral observations are required, additional time is necessary for fixing and staining the embryos.
An A/D ratio is not applicable to this method of study, as no adults are tested. Instead, the emphasis is on the provision of dose-response and specific stage response. The chick embryo test, and especially the testing devised by Jelinek (77), have high predictive value for pregnant laboratory mammals. The test does not, however, permit the evaluation of the possible repair of teratogen-induced anomalies.
Once the embryology is known for a species, it is possible to stop the development at a predefined time and test the biochemical reaction, or to test for reaction after administering antimetabolites. It should be remembered that the system has its own metabolic activation system, which is different from that of mammals. The drug-metabolizing capacity of chick embryos is evident from day 2 onward (77).
The traditional chick embryo procedures have been known for several years, but the new procedures advocated by Jelinek need to be elaborated further and tested with both teratogenic and nonteratogenic compounds.
Time, Personnel, and Cost Requirements. Setting up a laboratory for chick embryo testing can be expensive. Several incubators are required, as well as staining equipment, candling equipment, glycerin, and appropriate glassware. If posthatching studies are done, one or more rooms are needed to observe the chicks. Because of the amount of work involved, several technicians are required for a study that includes posthatching observations.
A supplier of good fertile eggs should be found. The eggs should not have been treated with pesticides or any medication that could affect tests. The feed for the chicks also should be pesticide-free.