Identification of anovulation and transient luteal function using a urinary pregnanediol-3-glucuronide ratio algorithm.

The sensitivity and specificity of a urinary pregnanediol-3-glucuronide (PdG) ratio algorithm to identify anovulatory cycles were studied prospectively in two independent populations of women. Urinary hormone data from the first group were used to develop the algorithm, and data from the second group were used for its validation. PdG ratios were calculated by a cycles method in which daily PdG concentrations indexed by creatinine (CR) from cycle day 11 onward were divided by a baseline PdG (average PdG/CR concentration for cycle days 6-10). In the interval method, daily PdG/CR concentrations from day 1 onward were divided by baseline PdG (lowest 5-day average of PdG/CR values throughout the collection period). Evaluation of the first study population (n = 6) resulted in cycles with PdG ratios ≥3 for ≥3 consecutive days being classified as ovulatory; otherwise they were anovulatory. Using the cycles method, the sensitivity and specificity of the PdG ratio algorithm to identify anovulatory cycles in the second population were 75% and 89.5%, respectively, for all cycles (n = 88) and 50% and 88.3% for first cycles (n = 40). Using the interval method, they were 75% and 92.2%, respectively, for all cycles (n = 89) and 50% and 94.1% for first cycles (n = 40). The "gold standard" for anovulation was weekly serum samples ≤2 ng/ml progesterone. The sensitivity values for all cycles and for the first cycle using both methods were underestimated because of apparent misclassification of cycles using serum progesterone due to infrequent blood collection. Blood collection more than once a week would have greatly improved the sensitivity and modestly improved the specificity of the algorithm. The PdG ratio algorithm provides an efficient approach for screening urine samples collected in epidemiologic studies of reproductive health in women.

Anovulation is a common cause of female infertility (1), and its frequency is reported to increase when women are exposed to reproductive toxicants (2). Environmental exposures associated with anovulation in women may occur in the home or the workplace (3). A number of studies in laboratory animals have suggested that environmental chemicals may have adverse effects on ovarian function that result in anovulation and/or menstrual dysfunction (4). The consequences of these perturbations are not limited to infertility. For example, estrogen deficiency may contribute to risk of cardiovascular disease and/or bone loss (5,6). Anovulation is also associated with an increased risk of breast cancer (7) and endometrial cancer (8). Menstrual dysfunction, such as irregular uterine bleeding, can result from causes other than ovarian dysfunction and also can be associated with infertility (9,10). Epidemiologic studies that rely solely on menstrual calendars to detect adverse reproductive outcomes associated with environmental exposures cannot distinguish between endocrine abnormalities that may result from toxicity to reproductive organs and menstrual bleeding disorders that may result from toxicity to other organ systems (11). Although the occurrence of anovulation in clinic populations can be documented by standard diagnostic procedures including daily basal body temperature assessments, daily measurements of serum progesterone (P), and ultrasound evaluations (12), these methods are not feasible for monitoring the reproductive function of large groups of women in population-based studies. Thus, urinary metabolites of reproductive hormones are now being measured as biomarkers of ovarian function and implantation in epidemiologic studies (13,14).
Methods have been developed to use the daily urinary levels of estrogen and P metabolites for identification of the approximate day of ovulation in ovulatory menstrual cycles (15), but no method is currently available for classifying cycles as anovulatory on the basis of urinary hormone measurements alone.
The present study was undertaken to devise a simple algorithm, based on urinary hormone assays, for identifying anovulatory cycles, which may occur between episodes of vaginal bleeding or which may be associated with prolonged intervals of amenorrhea. The method described here can be used to identify periods of anovulation or to screen cycles for evidence of ovulation before the application of existing algorithms for determining the approximate day of ovulation and/or the occurrence of early pregnancy loss in ovulatory cycles.

Methods
To develop the algorithm, six women who had a recent history of exercise-induced amenorrhea (absence of menstruation during the preceding three months) were recruited at Oregon State University. These women provided daily urine samples and weekly blood samples and completed a daily menstrual calendar for a 40-day study period. They exercised for an average of 8.0 hr (SD 3.5) per week. Their mean age was 23.3 years (SD 4.0), with a range of 19-29 years, and they were in good general health.
Each woman collected first-void morning urine samples every day for 40 consecutive days and stored them in a freezer (-20°C) until analyzed. In addition, 4 weekly blood samples were collected, beginning on day 17 of the 40-day study period. The blood samples were allowed to clot, and serum was removed and stored frozen until analyzed. The weekly blood samples were used to verify that there was no significant rise in P production (i.e., P <2 ng/ml) during the study interval. Although such anovulatory intervals during periods of amenorrhea cannot be considered menstrual cycles, we use the term "cycles" to refer to the urine collection intervals, whether or not vaginal bleeding occurred.
To test the validity of the PdG ratio algorithm for identifying anovulatory cycles, a second set of urine samples was collected from an independent group of 40 women participating in a study of the effects of exercise on ovarian function at the University of Connecticut Health Center. First-void morning urine samples were collected daily by 38 women during 85 menstrual cycles, and by 2 women who experienced no vaginal bleeding during the study.
Each amenorrheic woman collected 2 consecutive sets of 30 daily urine samples. All 40 women also provided weekly blood samples. These women ranged in age from 19 to 36 years with a mean age of 28.2 (SD 5.8) years. Fourteen women were recreational athletes who ran up to 20 miles per week, 15 were conditioned athletes who ran more than 20 miles per week, and 11 women were sedentary. Altogether, they contributed 89 collection cycles. All women enrolled in the study gave informed, written consent, and the research protocols were reviewed and approved by the institutional review boards of the respective universities.
The protocol for urine and blood collections in the University of Connecticut Health Center group was the same as in the group of women from Oregon State University, except that sampling was begun on the first day of vaginal bleeding and was divided into cycles based on the subsequent occurrences of menstrual bleeding. In collection periods in which there was no bleeding for 30 days (amenorrhea), a new sampling interval was begun. In addition to providing daily urine samples and weekly blood samples, each woman completed a daily record of menstrual bleeding for the duration of the study.
We analyzed each serum sample for P using commercial radioimmunoassay kits (16). Urinary pregnanediol-3-glucuronide (PdG) and creatinine (CR) levels for each urine sample were determined by methods described by Munro et al. (17). Each PdG concentration was indexed by the CR concentration of the same sample to adjust for urine dilution. Urine samples with CR concentrations <0.2 mg/ml were considered too dilute for accurate analysis, and samples with CR concentrations >3.0 mg/ml were considered too concentrated; neither type of sample was included in the analysis.
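The creatinine indexing and exclusion step can be sketched as follows. This is a minimal illustration, not the authors' software: the function name and the choice to return `None` for an excluded sample are assumptions, and the PdG concentration is taken to be expressed per ml of urine in whatever mass unit the assay reports.

```python
from typing import Optional

CR_MIN_MG_PER_ML = 0.2  # below this, the sample is treated as too dilute
CR_MAX_MG_PER_ML = 3.0  # above this, the sample is treated as too concentrated


def index_pdg_by_creatinine(pdg_per_ml: float, cr_mg_per_ml: float) -> Optional[float]:
    """Return PdG per mg creatinine, or None if the sample is excluded.

    Hypothetical helper: the exclusion limits come from the text, the
    None-for-excluded convention is an assumption made for this sketch.
    """
    if cr_mg_per_ml < CR_MIN_MG_PER_ML or cr_mg_per_ml > CR_MAX_MG_PER_ML:
        return None
    return pdg_per_ml / cr_mg_per_ml
```

A sample with a CR concentration of exactly 0.2 or 3.0 mg/ml is retained here, because the text excludes only values strictly below or above those limits.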
The development of the algorithms is based on the assessment of P secretion by the corpus luteum during the post-ovulatory phase of the menstrual cycle. Because P is not expected to be metabolized and/or excreted in the same way by every woman, it was necessary to develop an approach for assessing P production that could utilize measurements of a single P metabolite (PdG) but would not be confounded by individual variation in hormone metabolism and excretion. The method takes advantage of the single source and relatively uniform production of P from the adrenal gland before ovulation (18). A mean 5-day baseline value for the excretion of PdG was calculated for each cycle and used as a denominator for the calculation of PdG ratios for that cycle. The ratios constitute a series of dimensionless values that can be used to characterize P metabolite excretion throughout the cycle. A sustained increase in these values should be observed when P production by the corpus luteum after ovulation exceeds its secretion by the adrenal gland.
Two algorithms to assess anovulatory cycles using urinary levels of PdG/CR were developed and evaluated for sensitivity and specificity by comparison to the "gold standard" of weekly serum P levels. The first algorithm (cycles method) was developed for use when intermenstrual intervals were well defined by menstrual bleeding and were assumed to represent menstrual cycles. The second algorithm (interval method) was developed for use with samples collected during periods when menstruation could not be clearly identified, i.e., during intervals of amenorrhea or intermittent vaginal bleeding. All cycles/intervals were evaluated separately by both the cycles method and the interval method.
For the cycles method, the PdG baseline was defined as the average of the daily concentrations of PdG/CR from cycle day 6 through cycle day 10, where day 1 was the first day of vaginal bleeding. This baseline represents an interval in the follicular phase, which begins after clearance of PdG from the previous luteal phase and ends before the rise of PdG at the time of ovulation. During this period, the adrenal gland, not the ovary, is the primary source of P production (19). The ratios of subsequent daily PdG/CR concentrations to this baseline value were calculated beginning with cycle day 11 and were continued to the end of the cycle (day before the first day of the next vaginal bleeding).
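The cycles method can be sketched as below. The function name and the dictionary input (cycle day mapped to PdG/CR) are assumptions made for illustration. Requiring all five baseline days is also an assumption; it mirrors the exclusion of a cycle with missing follicular-phase samples but is not a rule stated in the text.

```python
from statistics import mean
from typing import Dict


def cycles_method_ratios(pdg_cr_by_day: Dict[int, float]) -> Dict[int, float]:
    """Map each cycle day from 11 onward to its PdG ratio.

    Day 1 is the first day of vaginal bleeding. The baseline is the mean
    PdG/CR of cycle days 6-10. Hypothetical helper for illustration.
    """
    missing = [d for d in range(6, 11) if d not in pdg_cr_by_day]
    if missing:
        # Assumption: a cycle without a complete day 6-10 baseline is not evaluated.
        raise ValueError(f"baseline days missing: {missing}")
    baseline = mean(pdg_cr_by_day[d] for d in range(6, 11))
    return {d: v / baseline for d, v in sorted(pdg_cr_by_day.items()) if d >= 11}
```

The input should contain only days belonging to one cycle, so the last key is the day before the next vaginal bleeding.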
For the interval method, the PdG baseline was determined by calculating a series of centered 5-day moving averages of daily concentrations of PdG/CR, beginning with day 3 of collection and continuing to day n - 2 (where n = the last day of collection). We used the lowest of these 5-day average values as the PdG baseline. Daily PdG ratios were computed with the interval method for all collection intervals, beginning with day 1 of the collection period and continuing to the end of the collection period.
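A minimal sketch of the interval method follows. It assumes a complete daily series with no missing or excluded samples, which the text does not guarantee; the function name and return shape are likewise illustrative.

```python
from typing import List, Tuple


def interval_method_ratios(pdg_cr: List[float]) -> Tuple[float, List[float]]:
    """Return (baseline, daily ratios) for one collection interval.

    pdg_cr[0] is day 1 of collection. The baseline is the lowest centered
    5-day moving average, with window centers running from day 3 to day n - 2.
    Assumption: one value per day, no gaps.
    """
    n = len(pdg_cr)
    if n < 5:
        raise ValueError("at least 5 daily values are needed for one 5-day window")
    # Index i is day i + 1, so centers at days 3..n-2 are indices 2..n-3.
    window_means = [sum(pdg_cr[i - 2:i + 3]) / 5 for i in range(2, n - 2)]
    baseline = min(window_means)
    return baseline, [v / baseline for v in pdg_cr]
```

Because the baseline is searched for rather than fixed to cycle days 6-10, the same function applies to intervals of amenorrhea where no day 1 of bleeding is available.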
Cigarette smoking is a potential confounding variable that needed to be taken into account in designing the PdG ratio algorithm because smoking has been shown to increase P production by the adrenal gland by as much as 65% (20-22). To test the potential effects of smoking and the resulting increase in urinary PdG concentration on the PdG ratio algorithm, PdG values were increased artificially (since none of the women studied actually smoked) for all 89 cycles obtained from the 40 University of Connecticut Health Center women. The values for the PdG baseline were increased by 75%, which assumed an increment greater than the maximum reported increase of 65% (20), as well as 100% recovery of circulating adrenal P as urinary PdG. The PdG/CR concentrations for the other days of the collection period were also increased by the same absolute amount that was used to adjust the baseline (75% of the baseline value). Daily PdG ratios then were computed for all cycles using the interval method as described previously.
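The effect of this adjustment on the ratios can be worked through directly. If b is the unadjusted baseline and every daily value v is raised by 0.75b, the lowest 5-day average becomes 1.75b and the adjusted ratio is (v + 0.75b) / 1.75b = (r + 0.75) / 1.75, where r = v / b. The sketch below encodes that reading of the procedure; the function names are hypothetical.

```python
from typing import List


def simulate_smoking(pdg_cr: List[float], baseline: float, increase: float = 0.75) -> List[float]:
    """Add the same absolute increment (increase * baseline) to every daily value."""
    increment = increase * baseline
    return [v + increment for v in pdg_cr]


def adjusted_ratio(ratio: float, increase: float = 0.75) -> float:
    """PdG ratio after the simulated adjustment, given the unadjusted ratio."""
    return (ratio + increase) / (1.0 + increase)
```

Under this reading, an unadjusted ratio of 3.0 falls to about 2.14, and an unadjusted ratio of 4.5 is needed to reach 3.0 after adjustment, so the simulation can only move cycles from the ovulatory to the anovulatory classification.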
Sensitivity, specificity, and confidence intervals (23) for the urinary PdG ratio algorithm were computed for identifying anovulatory cycles in the second set of 40 women, using each of the three procedures described above. The "gold standard" used for the analysis was weekly serum P values for each of the 89 cycles. A cycle was classified as anovulatory if P levels in the weekly blood samples never exceeded 2 ng/ml (12).
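The text cites reference 23 for the confidence intervals without naming the method. As an illustration only, the sketch below computes an exact binomial (Clopper-Pearson) interval for a proportion using the standard library; the choice of that method and the function names are assumptions.

```python
from math import comb
from typing import Callable, Tuple


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Clopper-Pearson interval for x successes in n trials, found by bisection."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")

    def bisect(p_is_too_low: Callable[[float], bool]) -> float:
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if p_is_too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower limit: the p at which P(X >= x | p) equals alpha / 2.
    lower = 0.0 if x == 0 else bisect(lambda p: 1.0 - binomial_cdf(x - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= x | p) equals alpha / 2.
    upper = 1.0 if x == n else bisect(lambda p: binomial_cdf(x, n, p) > alpha / 2)
    return lower, upper
```

For hypothetical counts of 3 correct classifications out of 6, `exact_binomial_ci(3, 6)` returns approximately (0.118, 0.882).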

Results
The weekly serum P values for the six women with presumed anovulation in the Oregon State University study group ranged from 0.186 ng/ml to 1.276 ng/ml, confirming the absence of ovulation during the 40-day urine collection period. When the cycles method was used to evaluate the PdG ratios, in only 1 of 6 cases was the PdG ratio ≥2.5 on 2 consecutive days, and the PdG ratio was never >2.5 on 3 consecutive days (Table 1). When calculated by the interval method, the PdG ratio was never >2.5 on 3 consecutive days (Table 2). Based on these data, an algorithm rule was formulated that classified cycles with PdG ratios ≥3.0 for ≥3 consecutive days as ovulatory; otherwise, cycles were classified as anovulatory.
To test the sensitivity and specificity of the algorithm, the 89 cycles from the second set of women at the University of Connecticut Health Center were divided into ovulatory and anovulatory cycles based on serum P measurements. Twelve cycles from 9 women had serum P <2 ng/ml in all weekly samples and were classified as anovulatory; 2 of these women contributed only anovulatory intervals. The remaining 77 cycles had 1 or more weekly serum P values >2 ng/ml and were classified as ovulatory. Of the 77 ovulatory cycles, 10 were contributed by 7 women who also had anovulatory cycles; the other 67 ovulatory cycles were provided by the remaining 31 women. One cycle from the 77 ovulatory cycles had missing urine samples during the follicular phase and was not used in subsequent analysis by the cycles method but was evaluated using the interval method. Nine of the 12 cycles classified as anovulatory on the basis of weekly serum P measurements were also classified as anovulatory by the PdG ratio algorithm (Table 3). However, 3 of the 12 cycles were misclassified as ovulatory by the PdG ratio algorithm. Evaluation of the PdG profiles in these misclassified cycles revealed transient elevations (<7 days) of urinary PdG in the sampling period (Fig. 1).
The maximum PdG levels of these elevations ranged from 2.5 µg/mg CR to 8.7 µg/mg CR (mean = 4.9 µg/mg CR, SD 2.7). In comparison, the maximum PdG levels in the 77 ovulatory cycles ranged from 0.85 to 15.38 µg/mg CR (mean = 6.4 µg/mg CR, SD 3.0). In two of the three misclassified cycles, there was evidence of menstruation as a result of P withdrawal, because vaginal bleeding followed the period of PdG elevation (Fig. 1). However, this luteal function was not detected by serum P because it occurred during the time when no blood samples were collected, i.e., between weekly blood samples (Fig. 1). Had more frequent blood samples been obtained, it is likely that these two cycles would have been classified as ovulatory with a short luteal phase. In the third misclassified interval, the daily PdG profile revealed evidence of transient luteal function, but there was no record of occurrence of vaginal bleeding. Again, the rise of PdG occurred between the weekly blood samples, so the corresponding P value was not obtained (Fig. 1). This woman was amenorrheic, with the last documented menses 15 months before enrollment in this study.
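The classification rule (PdG ratio of at least 3.0 on at least 3 consecutive days) can be sketched as a run-length check. The function name and string labels are hypothetical, and the sketch assumes one ratio per consecutive day with no gaps. Because the rule looks only at the length of the run, a transient elevation of fewer than 7 days, such as those described above, is still labeled ovulatory.

```python
from typing import Iterable


def classify_cycle(ratios: Iterable[float], threshold: float = 3.0, min_run: int = 3) -> str:
    """Label a cycle from its daily PdG ratios.

    Returns "ovulatory" if the ratio is >= threshold on at least min_run
    consecutive days, otherwise "anovulatory". Assumption: ratios are in
    day order with no missing days.
    """
    run = 0
    for r in ratios:
        run = run + 1 if r >= threshold else 0
        if run >= min_run:
            return "ovulatory"
    return "anovulatory"
```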
Urinary PdG ratios also were calculated for 76 of the 77 cycles classified as ovulatory on the basis of weekly serum P levels. In 8 of these 76 cycles the PdG ratio algorithm misclassified the intervals as anovulatory. Thus, using the serum P values as the "gold standard," the sensitivity and specificity of the PdG ratio algorithm to identify anovulatory intervals using the cycles method were 75% and 89.5%, respectively (Table 3). Because some women contributed multiple cycles, the data were recalculated to address the issue of potential lack of independence of collected cycles. For these calculations, only the first cycle provided by each of the 40 women was used, and the resulting sensitivity and specificity of the PdG/CR ratio algorithm were 50% and 88.2%, respectively (Table 3). Table 4 provides information on the sensitivity and specificity of the PdG ratio algorithm as evaluated by the interval method for the same 89 cycles. The sensitivity and specificity for all intervals were 75% and 92.2%, respectively. When these data were recalculated using only the first cycle from each of the 40 women, the sensitivity and specificity of the PdG ratio algorithm were 50% and 94.1%, respectively. When the artificial adjustment in urinary PdG values was made to simulate increased adrenal production of P as a result of smoking, the sensitivity using the interval method was still 75%, but the specificity dropped to 84.4% (Table 5). The loss of specificity was due to an increase in the number of misclassified ovulatory cycles from 6 cycles to 12 cycles. When only the first cycle for each woman was examined using the interval method, the specificity was 85.3%, with no change in sensitivity (Table 5).
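The all-cycle percentages follow from the counts given in the text, as the short check below shows. Here a "positive" is a cycle that is anovulatory by serum P. The helper names are hypothetical, and the interval-method sensitivity count of 9 of 12 is inferred from the reported 75% rather than stated directly.

```python
def percent(numerator: int, denominator: int) -> float:
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


def specificity_pct(n_ovulatory: int, n_misclassified: int) -> float:
    """Share of serum-ovulatory cycles that the algorithm also calls ovulatory."""
    return percent(n_ovulatory - n_misclassified, n_ovulatory)


sensitivity_all = percent(9, 12)            # 9 of 12 serum-anovulatory cycles detected
spec_cycles_method = specificity_pct(76, 8)   # cycles method, 8 of 76 misclassified
spec_interval_method = specificity_pct(77, 6)  # interval method, 6 of 77 misclassified
spec_smoking_adjusted = specificity_pct(77, 12)  # after simulated smoking adjustment
```

These evaluate to 75.0, 89.5, 92.2, and 84.4, matching the values reported for all cycles.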

Discussion
The algorithm described here is designed to detect anovulatory menstrual cycles and reflects P secretion by the corpus luteum after ovulation. The algorithm is based on detection of a sustained increase in urinary PdG excretion in excess of the basal levels.
Table 4 notes. Sensitivity (95% CI) = 50% (11.8-88.2); specificity (95% CI) = 94.12% (80.2-99.2). PdG, pregnanediol-3-glucuronide. (a) Interval method: baseline = lowest 5-day moving average of daily concentrations of PdG/creatinine from day 3 to day n - 2 of collection (n = number of days of urine samples collected). PdG ratios calculated beginning day 1 of the collection to the end of the collection period. All cycles ≥19 days. (b) Cycles with PdG ratios ≥3 for ≥3 consecutive days were classified as ovulatory; otherwise, anovulatory by the urinary PdG ratio algorithm. (c) Cycles with all weekly serum samples <2 ng/ml progesterone were classified as anovulatory; otherwise, ovulatory. (d) Misclassified cycles: cycles A, B, and C (see Fig. 1).
Table 5 notes. Sensitivity (95% CI) = 50% (11.8-88.2); specificity (95% CI) = 85.29% (68.9-95.1). PdG, pregnanediol-3-glucuronide. (a) Interval method with data artificially adjusted to simulate smoking-induced increase in adrenal production of progesterone: adjusted baseline = lowest 5-day moving average of daily concentrations of PdG/creatinine from day 3 to day n - 2 of collection + 75% increase (n = number of days of urine samples collected). Daily PdG/creatinine adjusted by adding 75% of the lowest 5-day moving average (increased adrenal contribution throughout the collection period). PdG ratios calculated beginning day 1 of collection to the end of the collection period. All cycles ≥19 days. (b) Cycles with PdG ratios ≥3 for ≥3 consecutive days were classified as ovulatory; otherwise, anovulatory by the urinary PdG ratio algorithm. (c) Cycles with all weekly serum samples <2 ng/ml progesterone were classified as anovulatory; otherwise, they were ovulatory. (d) Misclassified cycles: cycles A, B, and C (see Fig. 1).
Because P is not metabolized and/or excreted in the same way by every individual, it is necessary to employ an approach for assessing P production that is not confounded by individual variation in steroid hormone metabolism and excretion. The PdG ratio algorithm takes advantage of the single source and relatively uniform production of P from the adrenal gland before ovulation (18). When intermenstrual intervals are well defined, a baseline value for the excretion of PdG is calculated for the first phase of each intermenstrual interval. In women with amenorrhea or metrorrhagia, a baseline value is generated by searching for the lowest 5-day interval of PdG excretion within the period of urine collection. In this latter application, the number of days of urine collection must be large enough to avoid the false identification of anovulation that would result if the collection interval contained only a follicular phase.
Environmental Health Perspectives, Volume 104, Number 4, April 1996
Figure 2 shows the location in the cycle of the PdG baseline for the interval method when it was determined from the 5-day moving average of the PdG/CR values. These data are shown as the distribution of the median day of the lowest 5-day average. Figure 2 also shows the distribution of the cycle lengths of the 89 cycles studied. The median day of the lowest 5-day moving average occurred by day 10 of collection in 73% of the cycles and by day 15 of collection in 91% of the cycles. The cycle length ranged from 19 days to 50 days with a mean of 27.9 (SD 4.8) days, and 89% of the cycles had cycle lengths between 24 days and 32 days.
The daily PdG ratios constitute a series of dimensionless values that are used to characterize P metabolite excretion throughout the cycle. The advantages of this algorithm are 1) it is based on a well-defined physiologic process that is relatively uniform between individual women; 2) it requires daily measurement of only one hormone analyte and creatinine; and 3) it attenuates individual differences in steroid metabolism and route of steroid metabolite excretion by relying on the relative change in analyte excretion rather than absolute urinary concentrations, which can be influenced by environmental factors such as smoking (20-22). Using this algorithm, a sustained increase in the PdG ratio is observed when P production by the corpus luteum after ovulation exceeds its secretion by the adrenal gland. Theoretically, changes in P secretion by the adrenal gland could affect the values of PdG ratio through changes in the baseline values. Cigarette smoking has been shown to alter adrenal gland P production (20-22), but our calculations demonstrate that even a 75% increase in the adrenal source of P does not significantly affect the sensitivity and specificity of the PdG algorithm.
The PdG ratios that are generated for use in the algorithm are largely unaffected by individual differences in P metabolism and route of excretion. The impact of these variables can be demonstrated by analysis of the daily values for absolute concentrations of PdG/CR in the same sets of cycles studied with the PdG ratio algorithm. Evaluation of the six anovulatory cycles from Oregon State University by the methods previously described led to an absolute cut-off level for ovulatory cycles of PdG ≥3 µg/mg CR for ≥3 consecutive days (data not shown). When this rule was applied to the 89 cycles from the University of Connecticut Health Center, 11 of the 12 anovulatory cycles were correctly identified, but 18 of the 77 ovulatory cycles were misclassified as anovulatory. This relatively low specificity of 76.6% was increased to approximately 90% when PdG ratios instead of absolute cut-offs were used in the algorithm (Tables 3 and 4).
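Why an absolute cut-off loses specificity can be illustrated with invented data. The series below is hypothetical, not taken from the study: it represents a woman whose PdG/CR excretion is low overall but rises fivefold over her own baseline. The helper name and the baseline value are assumptions.

```python
from typing import List


def has_sustained_rise(values: List[float], threshold: float, min_run: int = 3) -> bool:
    """True if values are >= threshold on at least min_run consecutive days."""
    run = 0
    for v in values:
        run = run + 1 if v >= threshold else 0
        if run >= min_run:
            return True
    return False


# Hypothetical daily PdG/CR values: low baseline, then a 5-day rise.
low_excreter = [0.5] * 10 + [2.5] * 5 + [0.5] * 3
baseline = 0.5  # hypothetical baseline for this invented series

# Absolute rule: the rise never reaches 3 units per mg CR, so it is missed.
by_absolute = has_sustained_rise(low_excreter, threshold=3.0)
# Ratio rule: the same rise is 5 times baseline, so it is detected.
by_ratio = has_sustained_rise([v / baseline for v in low_excreter], threshold=3.0)
```

In this invented case the absolute rule would call the cycle anovulatory and the ratio rule would call it ovulatory, which is the kind of misclassification behind the 18 of 77 ovulatory cycles (specificity 59/77, or 76.6%) reported for the absolute cut-off.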
In recent population-based studies of reproductive health in women potentially exposed to environmental hazards, a prospective approach has been taken in which daily urine samples have been collected and analyzed for biomarkers of ovulation and early pregnancy loss (13,14,24,25). The algorithm we report here can be used for the initial assessment of ovulatory function in such studies. The detection of anovulatory cycles may be the primary endpoint for studies that focus on ovarian function. Such cycles could be evaluated further with additional biomarkers to determine whether the cause of anovulation was ovarian failure or pituitary dysfunction. In studies to assess early fetal loss, early identification of anovulatory cycles is a cost-effective strategy because these cycles can be excluded from subsequent laboratory evaluation for conception and loss. The high specificity of the algorithm is particularly advantageous for such screening applications, since few ovulatory cycles will be misclassified as anovulatory.
Some women contributed multiple cycles in this study, and the data therefore were analyzed in two ways, using data from all cycles contributed by the 40 women, and using data from only the first cycle contributed by each woman to control for the lack of independence of multiple cycles contributed by the same woman.
The results of the present study demonstrate that the PdG ratio algorithm has good sensitivity to identify anovulatory cycles. The sensitivity of 75% for all cycles and 50% for the first cycle determined from the data in this study is probably an underestimate of the true sensitivity of the method, as indicated by further evaluation of the misclassified cycles. Examination of the urinary hormone profiles for two of the three misclassified cycles (Fig. 1) revealed evidence for significant urinary PdG excretion (and thus P production) between days of blood collection, and a period of vaginal bleeding that was consistent with P withdrawal. Because of the brief interval of luteal function in these cycles, evidence of P secretion was not obtained with weekly serum samples. Although no vaginal bleeding was associated with the rise of PdG in the third misclassified cycle, it is likely that a significant increase in P secretion also would have been detected in this cycle if blood samples had been collected more frequently. Whether or not ovulation occurred in the third misclassified cycle, the cycle clearly did not have a normal luteal phase, as the PdG elevation was much shorter in duration than has been observed in documented ovulatory cycles (17).
The hormone profiles in Figure 1 demonstrate the heterogeneity of cycles that may be classified as "ovulatory" by the PdG ratio algorithm. As indicated by the high specificity of the algorithm, virtually all episodes of normal luteal function are likely to be identified by this method. Whether episodes of transient luteal function are always a result of ovulation is unclear. However, brief intervals of luteal function, such as those illustrated in Figure 1, are not likely to be fertile cycles, and thus cycles identified as ovulatory by this PdG ratio algorithm cannot necessarily be assumed to be normal cycles (26,27). In future studies, additional algorithms will be developed to analyze ovulatory intervals for identification of normal and abnormal luteal phases.
Previous attempts to document the occurrence of abnormal menstrual cycles in population-based studies have been limited to analyses of menstrual calendars or daily diaries (28,29). Anovulation has been suggested to underlie menstrual cycles that are unusually long (30) and unusually short (31). However, it is generally agreed that menstrual cycle data alone are not adequate for accurate detection of anovulation (34). Recent studies have demonstrated the feasibility of collecting daily urine samples in large population-based studies (13,14,24,25). The methodology described here can be used to process and screen those samples in a cost-effective manner and to analyze the data to identify anovulatory cycles with a high degree of sensitivity and specificity without a requirement for blood sampling.