Comparable measures of cognitive function in human infants and laboratory animals to identify environmental health risks to children.

The importance of including neurodevelopmental end points in environmental studies is clear. A validated measure of cognitive function in human infants that also has a homologous or parallel test in laboratory animal studies will provide a valuable approach for large-scale studies. Such a comparable test will allow researchers to observe the effect of environmental neurotoxicants in animals and relate those findings to humans. In this article, we present the results of a review of post-1990, peer-reviewed literature and current research examining measures of cognitive function that can be applied to both human infants (0-12 months old) and laboratory animals. We begin with a discussion of the definition of cognitive function and important considerations in cross-species research. We then describe identified comparable measures, providing a description of the test in human infants and animal subjects. Available information on test reliability, validity, and population norms, as well as test limitations and constraints, is also presented.

Impairment of cognitive function is a recognized primary outcome of exposure to developmental neurotoxicants, such as lead, methyl mercury, polychlorinated biphenyls (PCBs), and other chemicals. Efficient inclusion of this end point in environmental studies will rely on a validated measure of cognitive function in human infants that has a parallel test in laboratory animal studies. The identification of a comparable measure of cognitive function in human infant and animal studies will facilitate toxicology studies designed to evaluate mechanistic and dose-response aspects of effects observed in human infants.
In this article, we present the results of a review of post-1990, peer-reviewed literature examining measures of cognitive function that can be applied to both human infants (0-12 months old) and laboratory animals.
"Cognition" is vaguely defined as "the act or process of knowing, including both awareness and judgement" (Merriam-Webster On-Line: The Language Center 2003). Hence, it is important to define cognitive function in the context in which it is used. For this article, we define "cognitive function" as encompassing learning, memory, and attention processes (Cory-Slechta et al. 2001). "Learning" is classically defined as a relatively permanent behavior change as a result of practice or experience. When an infant or young animal responds in an adaptive way to a stimulus, learning (or information processing) has occurred (Fagen and Ohr 2001). "Memory" is then defined as the persistence of a learned behavior over time (U.S. EPA 1998). "Attention" refers to a global behavioral construct that includes numerous response classes such as impulsivity, sensitivity to delay, activity level, sustained attention, and ability to manage delay of reward (Bushnell 1998; Bushnell and Rice 1999; Cory-Slechta et al. 2001). In infants, attention research has focused on four areas of visual attention: alertness, spatial orienting, attention to object features, and endogenous, or internally directed, attentional functions (e.g., attention span, perseverance, and distractibility; Colombo 2001).

Cross-Species Developmental Neurotoxicity
The adverse effects of developmental exposure to neurotoxicants on various cognitive functions can be assessed in both humans and animals. However, the degree to which specific assessment techniques are comparable across species can vary dramatically. The 1990 Workshop on Qualitative and Quantitative Comparability of Human and Animal Developmental Neurotoxicity (Stanton and Spear 1990), sponsored by the U.S. Environmental Protection Agency and the National Institute on Drug Abuse, proposed four criteria for evaluation of such animal models: a) Developmental profiles of functional capacity should resemble those found in humans; b) conceptual or operational similarities should exist between behavioral measures of those capacities in developing humans and animals; c) developmental profiles of neurobiologic changes should resemble those found in humans, particularly those that underlie the functional capacity in question; and d) treatments that alter neural or behavioral maturation in humans should cause similar alterations in the animal model.
Over the past decade, neurotoxicologists have directed considerable effort toward modeling human cognitive function in animals and applying animal cognitive function tests to humans (Adams et al. 2000; Anderson 2000; Paule 2001; Rice and Barone 2000). Examples include the following: a) The operant test battery (OTB) from the U.S. Food and Drug Administration's National Center for Toxicological Research (NCTR), used with laboratory rhesus monkeys, has been successfully applied to assessments in 6-year-olds (Paule et al. 1999a, 1999b; Slikker et al. 2000). Performance of children on money reinforcement (nickels) operant tests of motivation, color and position discrimination, learning, short-term memory, and time estimation was compared with standardized IQ (intelligence quotient) tests. Many tests in the OTB have also been adapted for use in rats (Mayorga et al. 2000a, 2000b). b) The Wisconsin General Testing Apparatus, radial-arm maze, and the Morris search apparatus, used to test cognitive function in nonhuman primates or rodents, have been successfully adapted for tests of toddlers and preschool children (Overman 1990; Overman and Bachevalier 2001; Overman et al. 1996a, 1996b). c) The well-studied Computer-Assisted Neurotoxicology Assessment Battery, developed for older children and adults, has also been applied to animal models (Fray and Robbins 1996).
The models presented above have not been applied in infants. Similar applications of tests in animals to the study of human infants present obvious obstacles. Human infants lack language, display poorly developed motor skills, and undergo a prolonged period of infancy.
Nevertheless, a wide body of research in developmental psychology shows that infants, even newborns, learn, remember, and focus attention (Nelson and Luciana 2001). For risk assessment, the neurobehavioral assessment of infants presents two challenges (Bellinger 2002). First, the highly dynamic nature of early neurodevelopment presents a moving target, making it difficult to interpret apparent performance deficits in the absence of a baseline measure. Second, normal change over time is expected and must be distinguished from a deviation that may be triggered by neurotoxicant exposure.
Research in human infants has focused mainly on simple forms of learning such as habituation and classical conditioning, where the young infant's behavior is changed as a function of specific experience, and through which the memory store of the aging child is altered over successive life events (Lipsitt 1990). Operant learning tasks, in which the infant or animal must manipulate a specific part of their environment to receive a reinforcer, are possible only when the infant acquires sufficient motor skills for the task and thus are often limited to age 6 months and older.

Comparable Measures of Cognitive Function
Ultimately, neurobehavioral toxicologists seek a sensitive homologous or parallel test in human infants and laboratory animals that can distinguish normal subjects from those that have had an exposure to a neurotoxicant. Although tests of cognitive function can be performed in a variety of species and age groups, this review is limited to studies in rodents, nonhuman primates, and human infants (0-12 months old). Table 1 presents an overview of tests described here, identified as either homologous or parallel for each species that has been studied. Homologous tests are those for which the same procedure is followed in humans and the animal species. Parallel tests are those that are conducted in a different manner in humans and the animal species, but for which it is believed the same cognitive function is being measured. Table 2 summarizes information for each of the tests.

Eye-Blink Conditioning
Eye-blink conditioning (EBC) is a model system for studying neural correlates of learning and memory (Sears and Steinmetz 2000; Stanton and Freeman 1994; Woodruff-Pak and Steinmetz 2000a, 2000b). Data collected from humans and animals (monkeys, rabbits, rats, cats, mice) show similar patterns of acquisition, retention, and extinction of EBC. The neural systems and structures involved in EBC have been analyzed through studies employing stimulation, lesion, and pharmacologic methods. These data have consistently demonstrated that the brain networks used in EBC are virtually identical across vertebrate species, including humans, monkeys, rabbits, rats, cats, and mice. EBC can be used in the same way for comparison studies across the life span. EBC can distinguish between normative groups and populations with impaired learning or memory, such as between normal and autistic children or between normal aging and Alzheimer disease.
The EBC procedure involves pairing a conditioned stimulus (CS; typically a pure tone) and an unconditioned stimulus (US; typically a brief air puff to the eyelid area). The EBC task can be varied by changing the length of the trace or complexity of the conditioning stimuli, or by methods such as discrimination reversal conditioning (Sears and Steinmetz 2000). There is evidence that delay EBC (when the CS and US overlap and coterminate) can be acquired and retained independently of the forebrain and independently of awareness, whereas trace EBC [which occurs when a short empty interval called the interstimulus interval (ISI) separates the CS and US] cannot (Manns and Clark 2002). In delay EBC, the memory trace is localized in the cerebellum, although the hippocampus is also engaged in the acquisition of a conditioned eye-blink response. Trace EBC depends critically on the cerebellum, but also on the hippocampus if the trace interval is sufficiently long (Kishimoto et al. 2001).
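The timing rules that separate delay from trace EBC can be sketched in a few lines. This is a minimal illustration only: the `Trial` record, the function names, and the millisecond timings used in the usage note are hypothetical and are not taken from any published EBC system. A blink is counted as a conditioned response here if it begins after tone onset but before the air puff, which is one simple reading of the definition given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """Stimulus timing for one paired CS-US trial (all times in ms; hypothetical record)."""
    cs_on: float   # tone (CS) onset
    cs_off: float  # tone (CS) offset
    us_on: float   # air puff (US) onset
    us_off: float  # air puff (US) offset


def paradigm(t: Trial) -> str:
    """Label a trial as 'delay', 'trace', or 'other' from its timing alone."""
    if t.cs_off < t.us_on:
        # An empty interval separates CS offset from US onset.
        return "trace"
    if t.cs_on < t.us_on and t.cs_off == t.us_off:
        # CS starts first, overlaps the US, and the two end together.
        return "delay"
    return "other"


def is_conditioned_response(t: Trial, blink_on: Optional[float]) -> bool:
    """Assumed scoring rule: the blink starts after CS onset and before US onset."""
    return blink_on is not None and t.cs_on < blink_on < t.us_on
```

For example, `paradigm(Trial(0, 750, 650, 750))` is labeled a delay trial, and `paradigm(Trial(0, 250, 750, 850))` a trace trial with a 500-ms empty interval.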
Infant model. Although EBC has been well studied in adults, considerably less work has been done in human infants and children. The developmental aspects of the conditioned response have not been systematically studied using either a cross-sectional or a longitudinal approach (Sears and Steinmetz 2000). From limited published data on normal infants and children, Sears and Steinmetz (2000) described the developmental process. Between infancy and early childhood, the acquisition rate for the conditioned eye-blink response dramatically increases from 28% at 1 month to levels near 80% at 5 months, and near 70% for 4- to 6-year-olds. These conditioning rates are similar to rates seen in adults, although the optimal ISI required for conditioning varies from adult protocols. In 5-month-old infants, a delay of 650 msec produces more robust conditioning than do intervals of either 250 or 1,200 msec (Ivkovich et al. 2002).
In the first use of this procedure, 61.5% of 4- and 5-month-olds did not yield reliable data either because they failed to achieve the criterion number of trials (30 tone-air puff trials) or because of technical or procedural problems (Ivkovich et al. 2000). In a later study (Ivkovich et al. 2002), the attrition rate was reduced to 34%. The investigators have now published EBC data on more than 100 healthy, full-term 4- and 5-month-olds. In addition, data collected from 14 premature infants (28-31 weeks) using simple delay EBC have been submitted for publication (Herbert et al. In press).

Table 2. Summary of information for each test.

Test: Classical eye-blink conditioning (EBC)
Description: Pavlovian conditioning procedure involves pairing a conditional stimulus (CS; typically a pure tone) and an unconditional stimulus (US; typically a brief air puff to the eyelid area). The air puff elicits a reflexive eye blink and, after repeated conditioning trials, the response comes to be evoked by the tone CS before or in the absence of the air puff US (Stanton and Freeman 1994). Variations: delay EBC, CS and US overlap and coterminate; trace EBC, an ISI separates the CS and the US.
Equipment and requirements: Two rooms: one for parents and infant preparation, one for task. Standardized visual display of brightly colored objects. Soft band to secure the infant's head. Flexible plastic tube to deliver air puff to right eye. Two small 7-ohm speakers to deliver tone CS (1 kHz, 80 dB). Background music. Two cameras to video the infant's head. Signal box with counter and indicator lights for tone and air puff. EMG recording equipment. Custom-built EBC system: control presentation of stimuli and amplify EMG records. Experienced technicians. Approximately 45 min in 4-5-month-olds.

Test: Visual habituation/novelty preference; visual recognition memory
Description (paired comparison): The infant is presented with a single or two identical targets for a period of familiarization. The familiar target is then paired with a novel one. The extra time spent looking at the novel target implies recognition memory. Nine or 10 comparisons are usually used in a session.
Equipment and requirements: Targets: abstract patterns and shapes (Colombo 1993), or a combination of faces and abstract patterns (Rose et al. 2001b). A three-sided, curtained enclosure with a pivoting stage for presentation of paired stimulus targets. Peephole located midway between the two stimuli for observation of infant corneal reflections of stimulus patterns. Computer for recording looks and looking time and controlling the timing of trials (Rose et al. 2001a).
Description (habituation assessment): Each trial is either fixed by the experimenter or determined by how long the infant keeps looking at a stimulus. Measures: Look duration (longest look and mean look), time spent off-target (pauses and exposure time), attention time changes (shifts of gaze between paired targets).
Equipment and requirements: As above. In infant-control procedure, the computer creates the stimuli (animated pictures of animals), with the observer pressing a mouse button when the infant looks at the stimulus and releasing it when the infant looks away.
Description (visual recognition memory tasks): Novelty scores (amount of time directed at novel target divided by time looking at both targets) are assessed, in addition to above measures.
Equipment and requirements: As above.
Description (Fagan Test of Infant Intelligence, FTII): A standardized paired comparison test of visual novelty preference, with 10 simultaneous presentations of one familiar and one novel stimulus. A novelty preference score is calculated as the average percentage of time spent fixating the 10 novel pictures.
Equipment and requirements: Targets: paired people faces (infants, women, men). FTII is portable and can be conducted in infant's home.
Description (disengagement fixation): After a fixation duration pretest, infants are presented with a series of eight trials designed to measure latency of shifting fixation toward a peripheral target under conditions in which the central target either remains present ("competition" condition) or is removed from the display ("noncompetition" condition).
Equipment and requirements: Darkened room. Car seat. Screen for stimuli presentation, 75 cm from infant. Stimuli (achromatic geometric patterns and color photograph of a female face). Mounted camera to monitor infant's gaze movements. Adjacent room: observer codes direction and duration of infant's fixations using pushbuttons interfaced with a microcomputer. Experimental trials are also analyzed off-line frame by frame.
Description (span task): Infants are presented with up to four items in succession and then tested for recognition by successively pairing each item with a novel one. Novelty scores are calculated as above.
Equipment and requirements: Infant seated on parent's or caretaker's lap at a black table. Tester, shielded from infant's view, to present stimuli. Stimuli: colorful, attractive 3-D objects. Draped screen on a black tray for presenting stimuli. Infant's looks monitored and recorded via a peephole in screen to provide the number and duration of looks for each trial.

Test: A-not-B; delay tolerance A-not-B
Description: The subject (infant or monkey) watches as a reward (toy for infants) is hidden to the left or right in one of two identical hiding locations (A or B). A few seconds later, the subject is encouraged to find the hidden treat. The reward for correct reaching is the toy (or treat). After successful retrieval of the toy (or treat) from location A on two consecutive trials, it is hidden in location B with the subject watching. Measures: A-not-B, correct vs. incorrect location reached on the reversal trial (location B); delay-tolerance A-not-B, length of longest delay the subject can tolerate and still succeed in retrieving the treat on reversal trials (Diamond 2001a).
Procedural variations: location of ultimate hand motion in hiding sequence, distance between hiding locations, distribution of reaches on warmup trials, differences in covers of background surface, presence of distraction during delay, room illumination, and criterion for determining whether reach is correct (Diamond 2001c; Noland 2001).
Limitations: The task requires infant's active participation, unlike assessments that measure looking time. The infant must search for the target on dozens of trials and remain motivated even after repeated failures. The task cannot be automated, so problems are associated with tester-subject interaction.

Test: Transparent barrier detour (object retrieval)
Description: Toy (treat) is placed in box within easy reach of subject. There is a strong pull to reach straight for the toy through the side one is looking through, which must be inhibited when subject is looking through closed side of box.
Equipment and requirements: Small clear box in which to place toy or treat, open on one side.

Test: Mobile/train conjugate reinforcement
Description: Infants at 3 and 6 months of age are conditioned to move an overhead crib mobile by kicking one of their feet (mobile conjugate reinforcement). At 9 and 12 months, infants are conditioned to activate a musical train and a bank of 10 lights with a lever press response. At each age, 15-min conditioning sessions are conducted in a series of home visits separated by 24 hr. After conditioning sessions, infants are tested after increasing delays (1, 7, or 14 days later) until they exhibit no retention for 2 successive weeks.
Equipment and requirements: Stimulus: treat or toy. Mobile or musical train with lighted press response box.
Limitations: Test is labor intensive. Significant respondent burden. Infant motivational factors also impact on test. The task cannot be automated, so problems associated with tester-subject interaction must be addressed.
Description (contour detection and closure detection): Using the mobile described above, the infant learns to kick to move the mobile. After two learning sessions on 2 consecutive days, one or more visual characteristics (contour and closure) of the mobile are altered for some infants and not for others. On the third test day, recognition and discrimination of the test mobile are assessed using kick rate in the presence of the training mobile (old) or a novel mobile (new) relative to a baseline acquired for that infant before learning the task.
Equipment and requirements: As above.

Test: Delayed nonmatching to sample (DNMS)
Description: A sample object is presented. A delay follows, and then the familiar object is presented alongside a novel object. The correct choice is to select the novel object.
Equipment and requirements: Object (toy or treat).

Test: Means-end problem solving
Description: At 7-8 months, task involves placing a cloth in reach of child and placing toy at the far end of cloth. To retrieve the toy, infant pulls the cloth (one-step problem solving). At 9 months, infants watch while toy is placed on end of cloth and then hidden under a cover. Infant has to first pull cloth to retrieve cover and then remove cover to find toy (two intermediate steps). At 10 months, infants must remove barrier to grasp cloth, pull cloth to retrieve cover, and search under cover to find toy (three intermediate steps). For each task, infants receive several trials to solve problem. Score is based on criteria for evidence of intention to retrieve the hidden toy (Willatts and Forsyth 2000).
Equipment and requirements: Cloth to lay on tabletop. Toy. Cover to hide toy.

Test: Event-related potentials (ERPs)
Description: Evaluation of a synchronized portion of the QEEG, time-locked to the onset of some event in the infant's environment.
Limitations: The procedure has significant constraints, including problems of between-subject variability in placement of electrodes on the scalp, choice of reference electrode location, and muscle and other forms of artifacts (Marshall and Fox 2001).

Test: Operant discrimination (object features and spatial mapping discrimination)
Description: Visual/spatial displays are presented to the right and left of midline. Looking to a "correct" dimension (color, form, or spatial position) produces synchronous auditory reinforcement. Measures retention of correct dimension.
Equipment and requirements: Displays, e.g., red circle, green square. Auditory reinforcement: music.
Limitations: Tasks are not standardized for use to detect deficits in brain development or functioning.

Test: Testing scales
Description (Bayley Scales of Infant Development II): Individually administered instrument composed of two main subscales: mental scale, 178 items that assess mental ability (memory, habituation, problem solving, ability to vocalize, language and social skills); motor scale, 111 items that assess motor ability (rolling, crawling and creeping, sitting, standing, walking, running, jumping). All items arranged in order of developmental difficulty. Specification provided for specific sets of items to administer to a child depending on chronological age (Bayley 1993).
Description (Early Childhood Longitudinal Study reduced-item Bayley, ECLS-B): 9-month-olds, approximately 25 min to administer. A reduced-item set developed that can be administered in less time and produce reliable, valid scores equivalent to the full set (West and Andreassen 2002). Items have been selected for their operational ease and psychometric properties. Multiple items can be scored from one administration, and, in the motor specialty, several items can be scored from observation.

Animal model. A rodent model for studying development of EBC is well established (Woodruff-Pak and Steinmetz 2000b). The emergence of EBC occurs gradually between 17 and 24 days of age in the rat. Disruption of cerebellar development by administering an antiproliferative agent, neonatal alcohol exposure, or early cerebellar or hippocampal aspirations interferes with development of normal EBC (Ivkovich and Stanton 2001;Stanton 2000;Stanton and Goodlett 1998).
Classical EBC represents a promising test of cognitive function with a well-studied homologous laboratory animal counterpart. Additional data are needed on population norms for infants and on the predictive validity or correlation of EBC deviations from established norms in infancy with later childhood and adult cognitive function assessments. Approaches to increasing subject retention rates between conditioning sessions and refinement of procedures to achieve higher success rates on criterion trials in each conditioning session will further strengthen this method.

Visual Habituation/Novelty Preference Tasks and Visual Recognition Memory Tasks
Tasks based on habituation/novelty and visual recognition memory (also called paired comparison) paradigms have been used widely to assess information processing and attention in infants and monkeys (Sirois and Mareschal 2002). Habituation occurs when attention decreases to repeated presentation of the same stimulus; novelty preference occurs when attention increases at the later presentation of a new stimulus. Infants and animals have a preference for novelty. Habituation and novelty preference are interpreted as reflecting the subject's processing of stimulus information (Colombo 1993). Whereas the habituation/novelty paradigm focuses on the developmental course and speed with which attention wanes to a repeated stimulus, the visual recognition memory paradigm is concerned chiefly with visual recognition memory as reflected in differential responsiveness to familiar and novel stimuli. Such responsiveness is assessed after an initial exposure to the familiar stimulus, which is considerably briefer than that afforded in the habituation paradigm (Rose et al. 2001b).
Paired-comparison task (look duration, shift rate, novelty score)-infant model. In this task, the infant is presented with a target for a period of familiarization. When the familiar target is paired with a novel one, infants typically spend more time looking at the novel target, implying recognition memory. The examiner records the number of looks and looking time (Rose et al. 2001a).
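The novelty score used in paired-comparison work is the time directed at the novel target divided by the time spent looking at both targets. A minimal sketch of that arithmetic follows; the function names and the looking times in the usage note are hypothetical, and averaging per-pairing scores into a percentage is shown as one plausible way to summarize a session, not as the scoring rule of any particular instrument.

```python
def novelty_score(novel_s: float, familiar_s: float) -> float:
    """Proportion of total looking time directed at the novel target."""
    total = novel_s + familiar_s
    if total <= 0:
        raise ValueError("no looking time recorded for this pairing")
    return novel_s / total


def mean_novelty_percent(pairs) -> float:
    """Average the per-pairing novelty scores and express the result as a percentage.

    `pairs` is a sequence of (novel_seconds, familiar_seconds) tuples,
    one tuple per paired comparison in the session.
    """
    scores = [novelty_score(n, f) for n, f in pairs]
    return 100.0 * sum(scores) / len(scores)
```

With two targets, a score of 0.5 corresponds to no preference; for instance, 6 sec on the novel target and 4 sec on the familiar one gives `novelty_score(6.0, 4.0)` of 0.6.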
Habituation assessment-infant model. In habituation studies, each trial is either fixed by the experimenter or determined by how long the infant keeps looking at a stimulus. The length of the intertrial interval may also be varied. Which aspect to use as a predictor of risk has been the focus of considerable debate (Colombo 1993; Fagen and Ohr 2001). A large body of evidence indicates that look duration is related to performance, such that infants with shorter looks process information faster and more efficiently than do infants with longer looks (Colombo 1993; Rose et al. 2001a). In addition, short lookers tend to process global properties before local properties, much like adults do, whereas long lookers tend to focus initially on local aspects of the stimuli. Of course, there is no way to know whether equal look durations reflect equivalent depths of concentration, what is being encoded, or how rapidly it is being encoded (Rovee-Collier and Barr 2002).
The infant-control procedure represents an important evolution in visual habituation procedures (Lavoie and Desrochers 2002). In this procedure, a trial begins when the infant looks at the stimulus and ends when the infant looks away. In a study of the short-term reliability of this test, a number of habituation measures and the reaction-to-novelty response were shown to be reliable and valid.
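Because each look in the infant-control procedure is bounded by the infant's own behavior, look-duration measures can be derived directly from an observer's on/off record. The sketch below assumes a hypothetical event log in which the observer registers a "press" when the infant looks at the stimulus and a "release" when the infant looks away; the event format and function names are illustrative only.

```python
def look_durations(events):
    """Convert a chronological list of (time_s, kind) events into look durations.

    kind is "press" (infant starts looking) or "release" (infant looks away).
    Unmatched or repeated events are ignored.
    """
    looks = []
    start = None
    for time_s, kind in events:
        if kind == "press" and start is None:
            start = time_s
        elif kind == "release" and start is not None:
            looks.append(time_s - start)
            start = None
    return looks


def summarize_looks(looks):
    """Return the number of looks, the longest look, and the mean look (seconds)."""
    if not looks:
        raise ValueError("no completed looks in this record")
    return {
        "n_looks": len(looks),
        "longest": max(looks),
        "mean": sum(looks) / len(looks),
    }
```

A record of two looks lasting 4 sec and 2 sec would yield a longest look of 4 sec and a mean look of 3 sec.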
Visual recognition memory assessment-infant model. There is substantial evidence that poorer performance on tests of visual recognition memory and slower habituation are associated with "risk" for cognitive delay. Among the groups studied are infants with Down syndrome and those with prenatal exposure to chemical teratogens, malnourishment, and prematurity (Rose and Orlian 2001). For example, in a recent longitudinal study of full-term and preterm (birth weight < 1,750 g) infants seen at 5, 7, and 12 months, full-term infants had shorter look durations, faster shift rates, less off-task behavior, and higher novelty scores than did preterms (Rose et al. 2001a).
Overall, mean predictive correlations are comparable for both habituation and visual recognition memory and tend to be approximately r = 0.45 (Rose and Orlian 2001). A prospective longitudinal study (n = 109) followed high-risk preterms and a socioeconomically matched group of full-terms annually through 6 years of age (Rose et al. 1992) and at age 11 (Rose and Feldman 1995). Visual recognition memory at 7 months and a 1-year crossmodal transfer (test of infant feeling object without seeing and then identifying it visually) each predicted Bayley scores at 2 years and IQ at 3, 4, 5, 6, and 11 years. Correlations of infancy scores with the various outcomes were similar for both groups and ranged from 0.37 to 0.65. Visual recognition memory and crossmodal transfer also correlated with speed of information processing, memory, and verbal and spatial abilities at 11 years of age (Rose et al. 1997).

The Fagan Test of Infant Intelligence (FTII)-infant model.
The FTII is a standardized paired-comparison test developed in the 1980s for the early assessment of infant intelligence using the fixation preference principle (Fagan 1990a, 1990b; Fagan and Singer 1983). It has since been used to detect delayed mental development in infants subsequent to environmental exposure to neurotoxic chemicals (Darvill et al. 2000; Jacobson et al. 1985, 1996; Simmer 2000; Winneke et al. 1998). The test is constructed for use at four gestational ages, 67, 69, 79, and 92 weeks, corresponding to 27, 29, 39, and 52 weeks postnatal age. Reviews of the predictive validity of the FTII report correlations with later tests of intelligence at 36 months of age ranging from 0.31 to 0.61 (Fagan 1990a, 1990b; Fagan and Detterman 1992). The instrument also correctly predicted more than 80% of infants who were later identified as mildly to severely retarded. FTII test results in the first year of life predict intellectual performance (Stanford-Binet Intelligence Scale IV; Thorndike et al. 1986) at 8 years of age (Smith et al. 2002). However, there are questions regarding the strength of predictive validity of the FTII in nonrisk samples and variability in correlations depending upon the infant's age at testing (Andersson 1996). Andersson (1996) found low predictive correlations (0.21) in a longitudinal study on a random sample of 100 boys and 96 girls assessed on the Fagan test at 7 and 9 months and then again at 5 years. Furthermore, retest reliabilities at 2-week intervals for two observers in a small nonrisk sample of children at 7 months of age were found to be zero or even slightly negative (Winneke et al. 1998). In addition, recent research has questioned whether recognition memory is what is being measured in tests of this type (Colombo 1993). Other cognitive factors that could affect the FTII and related tests include sensory or perceptual visual discrimination, or speed of visual processing.
Premature infants do less well at 6 months and 12 months of age than do full-term infants (Rose 1983).
Disengagement fixation task-infant model. This task was designed to study whether individual and developmental differences in look duration are linked to development of neural attention systems that control the ability to disengage visual fixation (Frick et al. 1999). Look duration has been correlated with disengagement latency; longer-looking infants are slower than shorter-looking infants to shift fixation to a peripheral target on competition trials, but not on noncompetition trials. This task has been used only in a research setting examining the development of the neural attention systems that control the ability to inhibit visual attention.
Span task-infant model. The span task, based on visual recognition memory and paired comparisons, is designed to assess the amount of information infants can hold in short-term memory. Novelty scores provide a measure of performance on each task and an overall index of capacity (Rose et al. 2001b). Thus far, only one human study and no animal studies have used this task.
Visual habituation/novelty preference tasks and visual recognition memory tasks-animal models. In animals, the closest parallel tasks have been studied in monkeys. In the visual recognition memory test, adapted from human infant tasks described above, novel visual stimuli are paired with familiar stimuli and looking time for both is recorded. There are striking similarities between macaque monkeys and human infants in the development of visual recognition memory and other adaptations of paired-comparison tasks and in the effects of risks on cognition (Burbacher and Grant 2000). Monkey infants, like human infants, show deficits associated with severe birth trauma, exposure to teratogens, and low birth weight (Gunderson et al. 1987). Deficits in visual recognition memory have been documented after exposure to methyl mercury (Gunderson et al. 1986), ethanol (Gunderson et al. 1987), and methanol (Burbacher and Grant 2000).
This task cannot be applied directly to rodents because it relies on vision as the primary sensory modality. Rat visual systems are relatively weak, and their "direction of gaze" is represented better by input from the auditory, tactile, or olfactory modalities (Bushnell 1998). Some have compared the novelty object proximity tasks in rats with the human infant paired-comparison tasks (Anderson 2000). This task measures the tendency of rats to explore an unfamiliar object placed within an open field. The limitations of applying tasks such as the novelty object proximity tasks and observational methods to studies of head gaze novelty preference in human infants and monkeys are reviewed by Bushnell (1998).

The A-not-B Task and the Delay-Tolerance A-not-B Tasks
Piaget's A-not-B task is widely used to study infant cognitive development (Diamond 2001a). Under the name "delayed response," the almost identical task is used in rhesus monkeys to study the functions of the dorsolateral prefrontal cortex (Goldman-Rakic 1987). Subjects must "hold in mind" for a few seconds where a treat (or toy) is hidden and, over trials, must update their mental record to record where the treat was hidden last. Subjects are rewarded for reaching correctly, hence reinforcing the response. This task requires an aspect of working memory (holding the information in mind) plus inhibition of the natural tendency to repeat a positively reinforced response on reverse trials.
Infant model. By roughly 7.5-8 months of age, human infants correctly reach the first hiding location with delays as long as 2-3 sec (Diamond 2001a, 2001b). When the reward is hidden at location B, infants make a mistake (called the A-not-B error) by going back again to the A hiding place. Between 7.5 and 12 months, infants show increasing improvements in their performance of the delayed-response A-not-B task. For example, each month they can withstand delays approximately 2 sec longer. By 12 months of age, delays of 10 sec or longer are needed to see the A-not-B error (Diamond 2001c).
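The developmental figures for the A-not-B task can be checked with simple arithmetic. The sketch below assumes a strictly linear gain of 2 sec per month, anchored at the lower end of the reported ranges (a 2-sec delay at 7.5 months). The linear form, the anchor values, and the function name are illustrative assumptions, not a model proposed in the studies cited.

```python
def expected_delay_s(age_months, start_age=7.5, start_delay_s=2.0,
                     gain_per_month_s=2.0):
    """Delay (sec) an infant is expected to tolerate at a given age.

    Assumes a linear gain from an anchor age; all defaults are
    illustrative values read off the ranges quoted in the text.
    """
    if age_months < start_age:
        raise ValueError("age is below the anchor age of the sketch")
    return start_delay_s + gain_per_month_s * (age_months - start_age)


for age in (7.5, 9.0, 10.5, 12.0):
    print(age, expected_delay_s(age))
```

Under these assumptions the tolerated delay at 12 months is 2 + 2 × 4.5 = 11 sec, which agrees with the observation that delays of 10 sec or longer are needed to elicit the error at that age.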
Various adaptations of the procedures have been studied. In a longitudinal study of 13 infants, Bell and Fox (1992) rated infants' performance on an ordinal scale. Infants proficient at reversal trials on a given day received a score corresponding to that level of delay. Investigators have also developed a looking version of the task in which the eye gaze, not the reach, is the criterion evaluated. No differences in the performance of more than 100 infants on the delay-tolerance A-not-B tasks with an eye-gaze response, compared with the reaching response, have been documented (Bell and Adams 1999).
Interobserver agreement ratings on the A-not-B task are reported in the range of 85-95%, with higher ratings where videotape is used (Bell and Adams 1999). Differences in task performance have been reported between normal control infants and infants with Down syndrome, autistic children, and cocaine-exposed infants (Noland 2001). There have been no demonstrations of predictive validity of the A-not-B task as a measure of individual difference (Noland 2001), although infants with phenylketonuria have been followed for 4 years with continued impaired performance on tests of frontal lobe functioning (Diamond 2001b).
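Agreement figures of this kind are usually simple percent agreement between two coders scoring the same trials. The sketch below shows that calculation under this assumption; the cited studies may have used a different index, and the coder labels and trial data are hypothetical.

```python
def percent_agreement(observer_a, observer_b):
    """Percentage of trials on which two observers recorded the same response."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("observers must score the same non-empty set of trials")
    matches = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * matches / len(observer_a)


# Hypothetical reach locations ("A" or "B") coded for the same 10 trials.
live_coder = ["A", "A", "B", "A", "B", "B", "A", "B", "A", "A"]
video_coder = ["A", "A", "B", "A", "B", "A", "A", "B", "A", "A"]
agreement = percent_agreement(live_coder, video_coder)
print(agreement)
```

Percent agreement does not correct for chance agreement, so a chance-corrected statistic may be preferable when one response dominates.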
Although there is a wealth of study and debate on establishing the cause of the response perseveration seen in 8- to 12-month-olds in these tasks (Ahmed and Ruffman 1998; Carey and Xu 2001; Diamond 2001a, 2001b; Diedrich et al. 2001), a standardized procedure for the tasks has not been developed (Diamond 2001c; Noland 2001).
Animal model. Infant rhesus monkeys improve on these same tasks (more quickly reaching the hiding locations, withstanding longer delays) during the equivalent age period, 1.5-4 months (Diamond 1991). In monkeys, an adaptation of this task, the object concept test, has been used to study in utero exposure to methyl mercury, lead, and methanol (Burbacher and Grant 2000).

The Transparent Barrier Detour Task (Object Retrieval Task)-Infant and Animal Models
Like the delay-tolerance A-not-B task, this task has been used in human infants and in rhesus monkeys to study working memory and functions of the dorsolateral prefrontal cortex. At 6-8 months in human infants and the equivalent age in rhesus monkeys, subjects reach for a toy or treat in a clear box only at the side through which they are looking. As they get older, subjects can look through the opening, sit up, and reach in while looking through the closed side. Infants 11 or 12 months old and monkeys 4 months old do not need to look along the line of reach. Infants and monkeys progress through a well-demarcated series of five stages in performance of this task (Diamond 1991).
There are wide individual differences in the rate at which infants and monkeys move through the tasks to retrieve the object. However, the age at which a given subject achieves "phase 1B" on the object retrieval task is remarkably close to the age at which that same subject can first uncover a hidden object in the delayed-response A-not-B task. The object retrieval tasks and comparisons with performance of the delayed-response A-not-B task have been mainly studied in relationship to development and function of the prefrontal cortex. The limitations described for the delayed-response A-not-B task also apply to the transparent barrier detour task.

Mobile/Train Conjugate Reinforcement Tasks
The mobile/train conjugate reinforcement tasks are based on operant conditioning and the rationale that infants who lack a verbal response can perform a motoric response (foot kick, lever press) to indicate whether they recognize a stimulus or reinforcement/reward (Rovee-Collier and Barr 2002). The tasks involve acquisition of information regarding the relationship between behavior (kicking a foot or pushing a lever) and a reinforcement or reward (mobile or train moves). These tasks provide a direct means of assessing long-term memory (Fagen and Ohr 2001) because the extent to which the infant retains the learned action can be measured.
Infant model. Infants at 2-3 months and 6 months of age are conditioned to move an overhead crib mobile by kicking one of their feet, which is attached to the mobile by a ribbon. Foot kicks move the mobile in a graded manner that is commensurate with their rate and vigor, providing conjugate reinforcement. At 9 and 12 months, infants are conditioned to activate a musical train and a bank of 10 lights with a lever press response. At each age, 15-min conditioning sessions are conducted in a series of home visits separated by 24 hr. After conditioning sessions, infants are tested after increasing delays (1, 7, or 14 days later) until they exhibit no retention for two successive sessions. From this series of experiments, investigators documented that the duration of retention increases monotonically between 2 and 18 months of age, based on standard parameters of training and testing. Reference curves have been developed to serve as a general model of normal memory development in the infancy period (Hartshorn et al. 1998a, 1998b). Other experiments with similar operant conditioning techniques have also been conducted (Fagen and Ohr 1990, 2001).
From the limited number of longitudinal studies using this method of operant conditioning in human infants, data on the predictive validity are promising. Average correlations between infant memory measures (baseline and retention ratios) and 2-, 3-, and 5-year standardized developmental assessments of 0.45, 0.40, and 0.38, respectively, have been reported (Fagen and Ohr 2001). In a small number of studies, differences between normal infants and high-risk infants (preterm, Down syndrome, and cocaine-exposed infants) in retention of conditioning have been documented and reviewed by Fagen and Ohr (2001).
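The baseline and retention ratios mentioned above are not defined in this review. The sketch below uses one common formulation, assumed here for illustration: the baseline ratio compares the response rate at the delayed test with the pretraining baseline rate, and the retention ratio compares it with the rate at the immediate post-training test. The function names and kick rates are hypothetical.

```python
def baseline_ratio(delayed_test_rate, baseline_rate):
    """Delayed-test response rate relative to the pretraining baseline.

    Assumed interpretation: values above 1 suggest the infant still
    responds above the untrained level, i.e., some retention.
    """
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return delayed_test_rate / baseline_rate


def retention_ratio(delayed_test_rate, immediate_test_rate):
    """Delayed-test response rate relative to the immediate post-training test.

    Assumed interpretation: values near 1 suggest little forgetting
    over the retention interval.
    """
    if immediate_test_rate <= 0:
        raise ValueError("immediate test rate must be positive")
    return delayed_test_rate / immediate_test_rate


# Hypothetical kick rates (kicks per minute) for one infant.
baseline, immediate, delayed = 8.0, 20.0, 16.0
print(baseline_ratio(delayed, baseline), retention_ratio(delayed, immediate))
```

In this hypothetical case the infant kicks at twice the baseline rate after the delay and at 80% of the immediate post-training rate, a pattern that would be read as substantial retention under the assumed definitions.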
Animal model. The investigators developing these operant conditioning tasks compare this work with retention of a learned fear response by rats of five ages (18, 23, 38, 54, and 100 days) (Campbell and Campbell 1962; Campbell and Coulter 1976). Fear was conditioned by administering a series of inescapable shocks on either the black or white side of a shuttle box. At 0, 7, 21, or 42 days later, rats were tested for their persisting fear of the shock side. As with memory development assessed by the mobile/train conjugate reinforcement tasks in human infants, rats of all ages exhibited equivalent retention after the shortest delay, but as the retention interval increased, the amount of conditioned fear varied directly with age.

Delayed Nonmatching-to-Sample Tasks-Infant and Animal Models
In the delayed nonmatching-to-sample (DNMS) task, a sample object is presented. A delay follows, and then the familiar object is presented alongside a novel object. The correct choice is to select the novel object. The task has been widely used in humans and monkeys as an assessment of working memory and attention (Diamond et al. 1999; Paule et al. 1998). In fact, the test is a component of the OTB and has been studied in children 6.5 years and older using the identical automated apparatus used to test monkeys (Paule 2000). Considerable reliability, validity, and population norm data for monkeys, human children, and adults are available.
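The trial logic of the DNMS task can be sketched as follows. The data structure, field names, and example trials are hypothetical and serve only to make the scoring rule explicit: a response is correct when the subject chooses the object that was not the sample.

```python
from dataclasses import dataclass


@dataclass
class DnmsTrial:
    sample: str      # object shown before the delay
    novel: str       # new object paired with the sample after the delay
    delay_s: float   # delay between sample and choice, in seconds
    choice: str      # object the subject selected


def is_correct(trial):
    """A DNMS response is correct when the novel (nonmatching) object is chosen."""
    return trial.choice == trial.novel


def proportion_correct(trials):
    """Fraction of trials with a correct (nonmatching) choice."""
    if not trials:
        raise ValueError("no trials to score")
    return sum(is_correct(t) for t in trials) / len(trials)


# Hypothetical session of four trials at a 5-sec delay.
session = [
    DnmsTrial("cup", "ball", 5.0, "ball"),
    DnmsTrial("block", "ring", 5.0, "ring"),
    DnmsTrial("duck", "car", 5.0, "duck"),   # error: chose the familiar sample
    DnmsTrial("bell", "shoe", 5.0, "shoe"),
]
print(proportion_correct(session))
```

A pass/fail criterion (for example, a required proportion correct over a fixed number of trials) would be applied on top of this score; no particular criterion is assumed here.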
However, human infants generally cannot succeed in the standard DNMS, even with delays of only 5-10 sec, until they are 21 months old (Diamond 1990; Overman et al. 1992, 1993). Likewise, infant monkeys do not reliably reach criterion on DNMS at 10-sec delays until 4 months of age. Diamond et al. (1999) postulated that infants failed on the DNMS not because of the task's memory requirements, but because infants did not understand the relationship between stimulus and reward or because spatial separations between response and reward or between stimulus and response make the task more difficult. Diamond et al. (1999) have designed a DNMS task for infants in which the infants do not displace stimuli to receive rewards, but the objects used as stimuli themselves are the reward. The protocol is the same as for the DNMS except the rewards are attached to the base of the stimuli. With this modification, 70% of 9-month-olds succeeded in the DNMS with a 5-sec delay. When verbal rewards (experimenter cheered and applauded when the infant reached correctly) were provided, 80% of infants passed the DNMS with a 5-sec delay (Diamond et al. 1999). Diamond and colleagues' modification of the standard DNMS has not been well studied as an assessment tool in infants, although there is a growing body of data on the standard DNMS in older children (Chelonis et al. 2000).

Means-End Problem-Solving Task
Means-end problem solving involves the deliberate and planned execution of a sequence of steps to achieve a goal. Means-end behavior develops after 6 months of age and involves the acquisition of knowledge of appropriate means-end relations and abilities such as planning, sequencing actions, and maintenance of attention to a goal (Willatts and Forsyth 2000). There is evidence that development of means-end problem solving is related to development of the prefrontal cortex (Diamond et al. 1997).
Infant model. Infants between 7 and 8 months of age can solve simple problems involving the completion of one intermediate step-for example, pulling a cloth to retrieve a toy sitting on top of it. By 9 months, infants begin to solve more complex problems requiring completion of two intermediate steps to achieve a goal. Infants first watch as a toy is placed at the end of a cloth and then hidden by a cover. To solve the problem, an infant must first pull the cloth to retrieve the cover and next remove the cover to find the toy. At 10 months, infants can solve more complex problems involving three intermediate steps: removing a barrier to grasp a cloth, pulling the cloth to retrieve a cover, and searching under the cover to find a toy (Willatts 1999). Means-end problem-solving tasks are structured so that the infant's sequence of behavior is scored according to specified criteria for evidence of intention to retrieve the hidden toy, with higher scores indicating more mature problem solving (Willatts and Forsyth 2000).
Two-step problem-solving scores at 9 months of age correlate positively with IQ (0.64, p < 0.01) and vocabulary scores (0.42, p < 0.01) at 3 years (Slater 1995; Willatts 1997). In a randomized trial of the role of long-chain polyunsaturated fatty acids in infant cognitive development, higher problem-solving scores were observed on the 9-month two-step problem-solving task in the supplemented infants who at 3 months had demonstrated poorer attention control and had a lower birth weight. At 10 months, all children in the supplemented group displayed higher problem-solving scores on the three-step task (Willatts and Forsyth 2000).
Animal model. The incremental repeated acquisition (IRA) task, part of the OTB, might be considered a parallel test in monkeys and rodents. The animal is required to learn a sequence of lever presses to receive a reinforcer. First, in IRA1, the subject is required to learn the correct response to one of three levers. Next, in IRA2, the subject is required to learn a response on a different lever than for IRA1, and then a two-lever sequence. The tasks are incremented up to a six-lever sequence or until the allotted task time has elapsed (Mayorga et al. 2000a).

Event-Related Potentials
Quantitative electroencephalographic (QEEG) measures have been used in clinical settings to diagnose neuropathology and, in infants, to evaluate gestational age and maturational levels of newborns. The use of electroencephalographic (EEG) recordings in conjunction with other task measures has become a common practice in studying psychophysiologic processes. The use of EEG measures in conjunction with A-not-B tasks is reviewed by Marshall and Fox (2001).
An event-related potential (ERP) is a synchronized portion of the ongoing EEG pattern. The ERP is distinguished from the more traditional baseline EEG measure in that the evoked potential is a portion of the ongoing EEG activity that is time-locked to the onset of some event in the infant's environment (Molfese and Molfese 2001). The ERP reflects both general and specific aspects of the evoking stimulus and the person's perceptions and decisions regarding it (cognition) as reflected by changes in the amplitude or height of the wave at different points in its time course. ERPs are recognized as providing information concerning between-hemisphere differences as well as within-hemisphere differences in the brain's electrical activity under specific stimulus conditions.
Infant model. ERPs have been paired with both visual and auditory assessments in infants and correlated with later intelligence measures. Studies in the 1960s through the 1980s using ERPs had mixed results (Molfese and Molfese 2001). Recent studies on small samples using newer technology and improved study design suggest that ERPs have value as predictors of later functioning. Studies reviewed by Molfese and Molfese (2001) showed that measures obtained in later infancy and early childhood successfully predicted language and cognitive skills in older children. Nelson et al. (2000) used ERPs paired with auditory stimuli to test auditory recognition memory in normal newborn infants and the infants of diabetic mothers. Neonatal ERPs elicited by the maternal voice were compared with those elicited by a stranger's voice. Results were compared with Bayley scores at 1 year of age. The presence of a specific neonatal ERP pattern (greater positive slow wave area in response to the stranger's voice) indicated better 1-year cognitive development. In an earlier study (Nelson and Bloom 1997), ERPs were used for shape recognition at 4 months in high-risk preterm infants and healthy full-term infants. ERPs were recorded while infants were familiarized with one stimulus (a red cross, 15 trials) and then shown a novel stimulus (a red corkscrew). Atypical patterns were found in the high-risk infants.
Animal model. ERPs can be recorded in monkeys (Lilienthal and Winneke 1996; Lilienthal et al. 1994) and rodents (Winneke 1992) and are being used in a parallel paired-comparison task in both monkeys and human infants in the University of Michigan longitudinal study of iron deficiency (Lozoff B. Personal communication). Rhesus monkeys pre- and postnatally exposed to lead had consistent prolongations of latencies in the brainstem auditory evoked potentials (Lilienthal and Winneke 1996) and visually evoked potentials (Lilienthal et al. 1988).

Operant Discrimination Learning (Object Features and Spatial Mapping Discrimination Tasks)
Infant model. Colombo (2001) trained 3-, 6-, and 9-month-olds to an association between an auditory reinforcement and attention to visual/spatial displays. Colombo (2001) reviewed similar studies by Harman et al. (1994) and work by Catherwood et al. (1996) on determining the time course of the processing of visual features and their joining compounds in 5- to 6-month-olds. It is important to note that tasks such as these are currently used to examine how the infant brain develops and functions.
Animal model. Discrimination tasks in nonhuman primates are homologous to this task. The spontaneous alternation task in rats may be a parallel model for this task. Rats exposed to PCBs prenatally showed altered performance on retention of visual discrimination tasks (Lilienthal and Winneke 1991).
[...] (Bayley 1993). The mental and motor scales are known to be well correlated, because several items are scored for both scales.
A reduced set of Bayley items has been developed that can be administered in less time and produces reliable, valid scores equivalent to those of the full set (West and Andreassen 2002). The Early Childhood Longitudinal Study reduced-item Bayley (ECLS-B) for 9-month-olds takes approximately 25 min to administer. Items have been selected for their operational ease and psychometric properties. Multiple items can be scored from one administration, and, in the motor scale especially, several items can be scored from observation.
Animal model. Adaptations of many of the assessments included in the Bayley Scales of Infant Development II (Bayley 1993) have been used in assessment of nonhuman primate infants (Burbacher and Grant 2000).

Discussion
Over the last decade, there have been tremendous advances in the understanding of the development of learning, memory, and attention in infants and in measures to assess these functions. This review identifies validated tests of normal cognitive function in human infants 12 months and younger that have a homologous or parallel test in laboratory animals. The tests vary along many dimensions of desirable properties, including the amount of available validation data, speed of testing, breadth and/or specificity of test results, requirements for equipment and personnel to conduct the test, and extrapolation of results among species. As technology improves our ability to study infant brain function, continued advances are anticipated in the understanding of the integration of motor and mental development in infant cognitive function, and in identifying the factors that arrest normal development.
Tests are currently under study in human infants that seem appropriate for evaluation in animal models. For example, the visual expectation paradigm (VExP) is based on the infant's ability to learn a spatiotemporal pattern (Haith et al. 1993). VExP measures of expectation (anticipations and reaction times) have a moderate amount of reliability and stability (Canfield et al. 1995). Correlations of -0.44 to -0.46 between reaction time in infants (3.5 and 8 months) and standardized IQ measures at 3 to 4 years of age have been reported (DiLilla et al. 1990;Dougherty and Haith 1997).
Some neurophysiologic tests also have promise in future research of brain function and cognitive development, such as functional magnetic resonance imaging and positron emission tomography. At this time, practical limitations restrict use to older children, because the tests require an alert, cooperative child who can overcome the fear of a strange situation and hold his or her head still (Singer 2001).
Because a gold standard does not exist for assessment of cognitive function in animals for comparison with human infants, a battery of tests may also be considered. For example, B. Lozoff at the University of Michigan (personal communication) is leading a longitudinal study assessing cognitive and motor development in 150 human infants and monkeys with iron deficiency from 9 to 12 months of age. The battery of tests assessing cognitive function includes the A-not-B task, a Fagan II novelty preference test modified to include looking time, the Resnick ocular motor spatial task, the paired-comparison task with ERP recordings, and a spatial recognition task.
Although the limited sensorimotor and language development of infancy prevents the assessment of late-maturing higher order skills that might be particularly sensitive to neurotoxicant exposures (e.g., reading, complex problem solving, executive functions such as planning, organizing, and strategizing skills), the concurrent validity of habituation and classical conditioning tests has been established. Bellinger (2002) suggests interpreting the validated infant assessment tests in the same way that neonatologists interpret birth weight. Although not predictive of later weight, birth weight is highly informative as an index of a newborn's general health status.
An important advantage of assessing cognitive outcomes in infancy is reducing the amount of time between gestational or early postnatal neurotoxicant exposure and outcome assessment. This has several applications. Early assessment increases the strength of, and reduces the bias in, the estimate of the neurotoxicant's contribution to the results of the neurobehavioral assessment. By reducing the time available for other factors to influence outcome, early assessment allows observation of the relatively direct effect of the exposure. Conversely, but of perhaps equal importance, results obtained in childhood are assessed longitudinally to help identify the impact on neurodevelopment of confounding factors (e.g., sociodemographic, education). This information can be valuable for developing intervention strategies, which in turn are often more effective when initiated earlier in development.

Children's Health | Comparable measures of cognitive function
Environmental Health Perspectives • VOLUME 111 | NUMBER 13 | October 2003
Comparable measures in laboratory animals are essential to the understanding of the toxicology underlying effects measured in human infants. Assessment of the actual risk to developing humans relies heavily on extrapolation of data from animal studies. When comparable methods are used in laboratory studies and in evaluation of human infants, more confidence can be placed in predictions of levels of exposure that will adversely affect humans. Ethical and economic considerations support the choice of rodent over nonhuman primate studies. Homologous measures, in which the identical methodology is employed, have some advantages over parallel measures, which rely on different techniques to evaluate what are believed to be the same processes in different species. We found several homologous tasks for humans and nonhuman primates, some suitable for the study of infants. We identified only one homologous task, classic EBC, that can be used in humans, nonhuman primates, and rodents. It is not sufficiently developed at this time for use in large-scale studies. However, we identified several parallel measures that are suitable for evaluation of human infants and application in rodent toxicology studies designed to clarify and extend the findings of studies in humans.