Point process models in asthma attacks for assessing environmental risk factors.

Point process models are reviewed and discussed for assessing the effects of environmental risk factors on asthma attacks. It is pointed out that the logit model and the proportional intensity model are useful for analyzing data from diaries recorded consecutively over several months or a few years. Some covariates that seem to influence asthmatics are explored using these models. Further work on estimating the smoothed base-line intensity function is briefly discussed in terms of the Bayes model.


Introduction
According to the definition of the joint committee of the American Thoracic Society and American College of Chest Physicians, asthma is a disease characterized by an increased responsiveness of airways to various stimuli and manifested by slowing of forced expiration which changes in severity either spontaneously or with treatment (1). In this study we are interested in the effects of some environmental stimuli on asthma attacks.
Environmental stimuli are generally classified into three categories. The first category consists of stimuli caused by meteorological or aerometric factors such as barometric pressure, temperature and humidity. Kasai and Nemoto (2) investigated the distribution of atmospheric pressure in relation to a panel attack rate and found the nine patterns under which patients were likely to have asthma attacks with high frequency. Wagner et al. (3) explored the weather front manifested by changes in barometric pressure, cold air or altered air ionization; however they failed to show any statistically significant relation between these factors and asthma attacks. The second category consists of the stimuli of air pollutants such as sulfur oxides, oxides of nitrogen, carbon monoxide, and others. The detailed description of air pollution and respiratory disease appears in Whittemore (4). These effects are frequently studied in relation to meteorological influences (5)(6)(7). The third is the category that includes other environmental stimuli such as pollens (8), spores, and house dust.
In this article the effects of environmental stimuli in the first two categories on asthma are examined and evaluated in the light of data from panel studies.

*Institute of Statistical Mathematics, Tokyo, Japan

Point Process Modeling of Asthma Attacks
We denote by $X_i(t)$ the binary random variable which represents the absence or presence of an attack for the i-th asthmatic at time t. For each asthmatic i, repeated measurements are made on the levels of a set of environmental stimuli at different times t and are recorded in a diary. These risk factors are time-dependent covariates $z(t)$ which have been found to play an important role in predicting asthma attacks (5,6). Korn and Whittemore (5) pointed out that the usual linear models of panel attack rate had many problems, including the assumption of independence of successive asthma attacks. They therefore used the logit model with a Markov parameter, thereby taking the dependency of successive random variables into consideration. In their analysis the Markov parameter was found to be highly statistically significant. This finding was first verified by Hasselblad (9) and later by Yanagimoto and Kamakura (6). In this section we present three point process models, modulated by stimuli covariates. We define the discrete intensity $p_i(t)$ for the i-th patient at time t, given his attack history $H_i(t) = \{X_i(u),\ u < t\}$ and covariate process $W_i(t) = \{z_i(u),\ u \le t\}$, as follows:

$$p_i(t) = \Pr\{X_i(t) = 1 \mid H_i(t), W_i(t)\} \qquad (1)$$

The first model is a multiple logistic model based on consecutive observations for patient i:

$$\log \frac{p_i(t)}{1 - p_i(t)} = \mu_i + \sum_{j=1}^{J} \alpha_{ij} X_i(t - j) + \sum_{k=1}^{K} \beta_{ik} z_{ik}(t) \qquad (2)$$

where K is the number of stimuli and $1 \le i \le n$, where n is the number of patients in the study. In the case J = 0, the second term of the right-hand side vanishes.
If we set J = 1, $X_i(t)$ is a two-state Markov chain (5). The two-state Markov chain model was derived theoretically by Whittemore and Keller (10), using the threshold approach. It is assumed that an asthmatic has an attack when the intensity of a random trigger exceeds his threshold.
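The role of the Markov parameter in Eq. (2) can be illustrated with a minimal sketch. The function name `attack_probability` and all parameter values below are hypothetical and chosen only for illustration; they are not estimates from the studies discussed here.

```python
import math


def attack_probability(mu, alpha, beta, past_attacks, z):
    """Attack probability of one patient under the logit model of Eq. (2).

    past_attacks[j-1] holds X(t-j) for j = 1..J; z[k-1] holds z_k(t).
    """
    eta = mu
    eta += sum(a * x for a, x in zip(alpha, past_attacks))
    eta += sum(b * v for b, v in zip(beta, z))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values (assumptions, not fitted estimates).
mu, alpha, beta = -2.0, [2.4], [0.5]
z_today = [0.0]

# With J = 1 the history enters only through X(t-1), so the two values
# below are the transition probabilities of a two-state Markov chain.
p_after_no_attack = attack_probability(mu, alpha, beta, [0], z_today)
p_after_attack = attack_probability(mu, alpha, beta, [1], z_today)
```

Under these assumed values the odds of an attack following an attack day exceed the odds following an attack-free day by the factor exp(alpha[0]), which is how the Markov parameter is interpreted.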
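When we can assume that each individual in the panel has common $\alpha_j$ (j = 1, ..., J) and $\beta_k$ (k = 1, ..., K), we have the second model,

$$\log \frac{p_i(t)}{1 - p_i(t)} = \mu_i + \sum_{j=1}^{J} \alpha_j X_i(t - j) + \sum_{k=1}^{K} \beta_k z_{ik}(t) \qquad (3)$$

The parameter $\mu_i$ is necessary for adjusting the difference among patients' attack intensities. The third one is the proportional intensity model, which is the generalization of Cox's (11) proportional hazard model to the point process case (12,13). The model is expressed in terms of $p_i(t)$ and a base-line intensity function common to all patients, and its parameters are estimated by the partial likelihood,

$$L(\beta) = \prod_t \frac{\exp\{\sum_{i \in \Phi_{0t}} \beta' z_i(t)\}}{\sum_{\phi \in \Phi_t} \exp\{\sum_{i \in \phi} \beta' z_i(t)\}} \qquad (7)$$

where $\Phi_{0t}$ is the set of the patients who have an attack at time t and $\Phi_t$ is the set of all subsets of size $|\Phi_{0t}|$ from the risk set {1, ..., n}. Here $|\Phi_{0t}|$ indicates the cardinal number of the set $\Phi_{0t}$, the set $\phi$ is any element of $\Phi_t$, $\beta$ is the vector of the common parameters, and $z_i(t)$ is the corresponding covariate vector of patient i. If many patients have attacks at the same time, $|\Phi_{0t}|$ becomes large and the computation of Eq. (7) becomes difficult. In such a case one can use the approximation derived by Peto (14) and Breslow (15); however, this approximation gives rise to downward bias (16), so the recursive algorithm (17)(18)(19) should be applied.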
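The computational burden of tied attacks can be sketched for a single time point. The example assumes the standard exact form for tied events (numerator over the observed attack set, denominator over all subsets of the same size) and the Peto-Breslow form in which the denominator is the d-th power of the sum over the risk set. The names `scores`, `exact_term`, and `breslow_term` are illustrative and do not come from the cited papers.

```python
import math
from itertools import combinations


def exact_term(scores, attacked):
    """Contribution of one time point to the exact partial likelihood.

    scores[i] is the linear predictor of patient i at this time;
    attacked is the set of indices of patients who had an attack.
    """
    d = len(attacked)
    numerator = math.exp(sum(scores[i] for i in attacked))
    # Enumerates every subset of size d: the count is C(n, d).
    denominator = sum(
        math.exp(sum(scores[i] for i in subset))
        for subset in combinations(range(len(scores)), d)
    )
    return numerator / denominator


def breslow_term(scores, attacked):
    """Peto-Breslow approximation: denominator is (sum_i exp(s_i)) ** d."""
    d = len(attacked)
    numerator = math.exp(sum(scores[i] for i in attacked))
    return numerator / sum(math.exp(s) for s in scores) ** d
```

With a single attack (d = 1) the two terms coincide. With many simultaneous attacks the exact denominator has C(n, d) summands, which is the difficulty that motivates the approximation and the recursive algorithm.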
Here we note that the proportional intensity model fails to detect differences among patients in the effects of the covariates, since the parameter vector $\beta$ is common to them. On the other hand, the proportional intensity model has the good property that estimates of the parameters of individual covariates are obtained with the covariates common to all asthmatics excluded.

Our Experiences
The statistical models described in the preceding section were applied to three actual data sets. They include surveys in Tokyo, Chiba, and eight large cities in Japan. Physicians asked their outpatients to keep a diary of asthma attacks, health conditions and medications. Uniform asthma diaries were prepared for the purpose of these surveys. We will concentrate on two data sets.

Tokyo Metropolitan Asthma Attack Diary Study
The first data set is obtained from a survey conducted at Bokutoh Metropolitan and Kiyose Pediatric Hospitals under the sponsorship of the Tokyo Metropolitan Government. The survey covers the period from September 1978 through October 1980. It was designed for evaluating the relation of environmental factors, especially concentrations of air pollutants, to asthma attacks, and it was part of a large-scale survey. Among the records of patients who entered the survey, those of 20 patients were carefully selected based on the following criteria. The record length of each patient is longer than 417 days; attack rates lie between 4.4% and 53.0%; and the proportion of missing data is less ...

We first present the results of the analysis for the ten selected persons. Here the levels of environmental factors, such as photochemical factors, on the day prior to an attack are studied in relation to asthma attacks. Table 1 presents (-2) * (maximized log-likelihood) and the signs of the estimates of the regression coefficients using the likelihood $L_I$. The value of J gives the number of $\alpha_{ij}$, and the differences of the (-2) * (maximized log-likelihood) between parameter settings provide tests of significance; for example, for patient I.D. 105 we obtain a highly significant positive estimate of $\alpha_{i1}$, because the difference in chi-squared values between J = 0 and J = 1 is 786.41 - 439.18 = 347.23, with one degree of freedom. Other cases also indicate that the Markov parameters are statistically highly significant.
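The arithmetic of the likelihood-ratio test for patient I.D. 105 can be checked with a short sketch. The function name `lr_test_1df` is illustrative; the p-value uses the identity that, for one degree of freedom, the upper tail of the chi-squared distribution equals erfc(sqrt(x / 2)).

```python
import math


def lr_test_1df(deviance_small, deviance_large):
    """Likelihood-ratio statistic and p-value for one added parameter.

    deviance_small: (-2) * maximized log-likelihood of the smaller model.
    deviance_large: the same quantity for the model with one more parameter.
    """
    stat = deviance_small - deviance_large
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# J = 0 versus J = 1 for patient I.D. 105, values as quoted in the text.
stat, p_value = lr_test_1df(786.41, 439.18)
```

The statistic is 347.23, far beyond any conventional critical value for one degree of freedom, which agrees with the statement that the Markov parameter is highly significant.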
It is known that attack frequency is low in winter. Our logit model, Eq. (2), also confirms this hypothesis of a decrease in the number of asthma attacks in wintertime. The regression coefficients for the winter covariate (1 = winter, 0 = otherwise) are statistically significant and have negative signs. We can see this tendency in all asthmatics. It is noteworthy that when J changes from 0 to 1, the chi-squared values in the model of Eq. (2) are reduced greatly.
We find in the literature many possible attributable factors, including season, weather conditions, rapid change of weather, the rainy season in early summer, and Sunday. These factors are used in the model, Eq. (2), with J = 0. Our analysis suggests that the rainy season and Sunday are important attributable factors. Though biometeorologists insist on the possible relation of weather conditions to asthma attacks, no effect of weather conditions is detected in our study except for the trough pattern (east-west). This is partly explained by the fact that the classification of weather conditions depends highly on subjective judgement. We are afraid that a spurious excess of chi-squared values under the model of Eq. (2) with J = 0 may account for many of the possible attributable factors reported in the literature.
The evaluation of concentrations of air pollutants is difficult, since they are sensitive to many factors. Among the air pollutants, oxidant provides an attractive result. Chi-squared values for three patients are all greater than 2, and the largest one is 13.16. The signs of the estimated parameter for the three patients are all positive. However, we should be careful in interpreting the result. In fact, as stated above, the concentration of oxidant is low in winter, when the attack rate is low. It is recommended to check the result by using another data set and different models. Now we apply the proportional intensity model to this data set. This model requires some homogeneity among patients because we assume common base-line intensity functions and common parameters $\alpha_j$ (j = 1, ..., J) and $\beta_k$ (k = 1, ..., K). To assure the homogeneity, each of the following five selection criteria is used separately: (1) the chi-squared value of the regression coefficient of oxidant is greater than two; (2) in the model which includes only one Markov covariate the patient gives the estimate near the value 2.4; (3) the estimate of the coefficient of the Markov covariate is around 3.4; (4) the attack rate is high; (5) the attack rate is low. The selected patients are presented in Figure 1. We apply the proportional intensity model to the records of patients selected by each criterion. The estimates of the regression coefficients of oxidant covariates are summarized in Table 2. Here we assume the model which has a Markov covariate $\alpha_1$ (J = 1). The detailed description of the data analysis is given by Yanagimoto and Kamakura (6).

Nationwide Asthma Attack Study
The next study is part of the nationwide data set recorded under the financial support of the Environmental Agency. The data include asthma attack processes of both adult and childhood patients who reside in eight large cities spread from the north of Japan to the south: Sapporo, Sendai, Tokyo, Nagoya, Osaka, Hiroshima, Fukuoka, and Naha. The asthma attack data were recorded every 6 hr for each patient who lives near one of the above eight cities. This nationwide data set was obtained in an attempt to investigate the relations between asthma attacks and environmental factors which appear to affect people nationwide. ... to all patients. One example is the case where patients live in the same district and hence are affected by the same meteorological factors. They propose the random effect model based on the assumption that the regression coefficient parameters are normally distributed. Here we use the fixed effect model of Eq. (3) without assuming normality. However, we cannot carry out the computation separately for each patient, and so many parameters, including the individual parameters $\mu_i$ (i = 1, ..., n), must be estimated simultaneously. These calculations may be laborious when the number of patients who enter the study is large. We analyze these data by using several models derived from the likelihood $L_{II}$. Table 3 gives estimates of the regression coefficients and estimated normal deviates for a number of models that were fitted to the data of patients at Naha in Okinawa, without the estimates of $\mu_i$ measuring heterogeneity of attack occurrences.
The regressor variables at time t are defined in the following manner: the Markov covariate, the value of $X_i(t - 1)$ (Z0); barometric pressure (Z1); Z1(t) - Z1(t - 1) (Z2); temperature (Z3); Z3(t) - Z3(t - 1) (Z4); humidity (Z5); Z5(t) - Z5(t - 1) (Z6); and an indicator variable, 0 = daytime or evening, 1 = morning or night (Z7), where the morning is defined to be the time from three o'clock to nine, the day from nine to fifteen, the evening from fifteen to twenty-one, and the night from twenty-one to three of the next day. The most remarkable feature of these data is the high statistical significance of $\alpha$, the regression coefficient of the Markov covariate Z0. The models with the Markov parameter (models 2, 4, 6, 8) require the records at the previous time, t - 1. Hence, patients with any missing information at time t - 1 or t cannot be included in these models. Therefore, it is meaningful to compare the values of the maximum likelihoods among models 1, 3, 5, 7 and, separately, among models 2, 4, 6, 8 (Table 3). The best fitted models according to AIC (deviance plus two times the number of parameters) (20,21) are model 7 and model 8, which include the covariate measuring the effect of night and morning. This corresponds to the fact that patients are likely to have attacks in the morning or at night. These data suggest that asthma attacks are apt to occur at low temperature, when the pressure goes down or the temperature falls. Though models 1, 2, 5 and 6 suggest that high humidity is associated with a higher attack rate, the covariate of humidity (Z5 or Z6) and the covariate of night and morning (Z7) are closely correlated, and therefore the effect of humidity may be overestimated in the absence of the covariate Z7.
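The construction of the regressors Z0 to Z7 and the AIC criterion can be sketched as follows. The function names, the dictionary layout, and the treatment of slot boundaries as half-open intervals are assumptions made for illustration; missing values are represented by `None`.

```python
def night_or_morning(hour):
    """Z7: 1 for morning or night, 0 for day or evening.

    Assumes half-open slots: morning [3, 9), day [9, 15),
    evening [15, 21), night [21, 3).
    """
    return 1 if hour < 9 or hour >= 21 else 0


def build_covariates(attacks, pressure, temperature, humidity, hours):
    """Return one row of Z0..Z7 per time t >= 1, skipping incomplete rows."""
    rows = []
    for t in range(1, len(attacks)):
        needed = [attacks[t], attacks[t - 1],
                  pressure[t], pressure[t - 1],
                  temperature[t], temperature[t - 1],
                  humidity[t], humidity[t - 1]]
        if any(v is None for v in needed):
            continue  # records at both t - 1 and t are required
        rows.append({
            "X": attacks[t],                           # response at time t
            "Z0": attacks[t - 1],                      # Markov covariate
            "Z1": pressure[t],
            "Z2": pressure[t] - pressure[t - 1],
            "Z3": temperature[t],
            "Z4": temperature[t] - temperature[t - 1],
            "Z5": humidity[t],
            "Z6": humidity[t] - humidity[t - 1],
            "Z7": night_or_morning(hours[t]),
        })
    return rows


def aic(deviance, n_params):
    """AIC as used in the text: deviance plus twice the number of parameters."""
    return deviance + 2 * n_params
```

Because rows with a missing value at t - 1 or t are dropped, models with and without the Markov covariate are fitted to different sets of observations, which is why their likelihoods are compared only within each group.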

Discussion
In this section we will discuss the problems yet to be solved in searching for the stimuli which trigger asthma attacks.
When we investigate the relation between weather fronts and asthma attacks, it is preferable to obtain the data of asthma attacks along the path where fronts move. The weather information can be easily obtained from the weather stations, but that of asthmatics is much more difficult to obtain. If such data can be collected, spatial models may be needed that include data of weather fronts and data of asthma attack occurrences of patients who are dispersed in space.
Furthermore, asthma attacks are closely related to immunological mechanisms, so that some episodes of asthma are highly individualized. Therefore, more refined models that incorporate covariables for the immunological mechanism of each patient need to be developed. The effects of medicines should also be taken into consideration.
Other problems are the following: (1) How should the base-line intensity function in the proportional intensity model be estimated? (2) Which model should be selected among the three models described in the second section and other models, including the modulated renewal model? (3) How should the order J (the number of parameters $\alpha_j$ included in the model) of the dependency upon individual history be chosen?
In the proportional intensity model, the base-line intensity function is considered a nuisance. One method to obtain estimates of the regression coefficient parameters is to use the partial likelihood without the information on the base-line intensity function. However, in some cases the intensity function might be of main interest, as is the case when the survival function is required to be estimated in the proportional hazard model. In those cases we should estimate the base-line intensity function.
Here we consider a problem which includes both nuisance parameters and a structural parameter. They correspond to the parameters comprising the base-line intensity function and a regression coefficient parameter, respectively. We assume that two sequences $\{X_t\}$, $\{Y_t\}$ (t = 1, ..., T) have independent binomial distributions,

$$\Pr\{X_t = x_t\} = \binom{n_t}{x_t} p_t^{x_t} (1 - p_t)^{n_t - x_t} \qquad (8)$$

and

$$\Pr\{Y_t = y_t\} = \binom{m_t}{y_t} q_t^{y_t} (1 - q_t)^{m_t - y_t} \qquad (9)$$

respectively, where we have the restrictions that

$$\log [p_t / (1 - p_t)] = \theta + \tau_t \qquad (10)$$

and

$$\log [q_t / (1 - q_t)] = \tau_t \qquad (11)$$

The parameters $\tau_t$ and the parameter $\theta$ are called incidental (or nuisance) and structural (22), respectively, where the log odds ratio $\theta$ is of main interest.
In Table 4 we have two sequences, each pair of which consists of the numbers of days when two patients have attacks in a month. The estimates of both $\tau_t$ and $\theta$ are obtained from maximization of the likelihood,

$$\prod_{t=1}^{T} \binom{n_t}{x_t} \binom{m_t}{y_t} p_t^{x_t} (1 - p_t)^{n_t - x_t} q_t^{y_t} (1 - q_t)^{m_t - y_t} \qquad (12)$$

When the parameters $\tau_t$ are not of interest, the procedure based on the conditional likelihood can be used. Since the binomial distribution belongs to the exponential family explored in Andersen (22), the solution is given by ... (13)

The above estimation procedure is closely related to the partial likelihood of the point process defined above. It may be naturally assumed that the incidental parameters $\tau_t$ change smoothly with t. Using this natural assumption as prior information, the Bayes-type approach (23) can be used. For example we assume that ... Here the matrix $I_{T-1}$ is the (T - 1)-dimensional identity matrix. The parameter $\beta$ is estimated as the mode of the posterior distribution with the estimates of the hyperparameters obtained by minimization of ABIC (23).
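The conditional-likelihood step for the paired binomial sequences can be sketched as follows. The sketch assumes the standard result that, given the totals $x_t + y_t$, the distribution of $x_t$ is free of $\tau_t$ and depends on $\theta$ alone. It solves the resulting score equation by bisection. The function names and the search interval are illustrative assumptions, and no data from Table 4 are used.

```python
import math


def conditional_mean(theta, n, m, s):
    """E[X | X + Y = s] for X ~ Bin(n, p), Y ~ Bin(m, q), log odds ratio theta."""
    lo, hi = max(0, s - m), min(n, s)
    support = range(lo, hi + 1)
    weights = [math.comb(n, u) * math.comb(m, s - u) * math.exp(theta * u)
               for u in support]
    return sum(u * w for u, w in zip(support, weights)) / sum(weights)


def conditional_mle(xs, ys, ns, ms, lower=-10.0, upper=10.0, tol=1e-10):
    """Solve sum_t (x_t - E[X_t | x_t + y_t]) = 0 for theta by bisection."""
    def score(theta):
        return sum(x - conditional_mean(theta, n, m, x + y)
                   for x, y, n, m in zip(xs, ys, ns, ms))

    # The score decreases in theta, so a positive score means theta is too small.
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if score(mid) > 0:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)
```

The incidental parameters never appear in the computation, which mirrors the way the partial likelihood removes the base-line intensity function. If one sequence dominates the other in every month, the estimate runs to the boundary of the assumed search interval.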
ABIC is defined as follows:

$$\mathrm{ABIC} = -2 \log \int L(x, y \mid \beta)\, \pi(\beta)\, d\beta$$

Here $L(x, y \mid \beta)$ is the likelihood expressed as Eq. ... and $\pi(\beta)$ is the prior density. ... Figure 2, where $\sigma_H$ and $\sigma_T$ go to infinity. As for the structural parameter $\theta$, the maximum likelihood method and the smoothing method give the estimates 0.342 and 0.331, respectively. The effect of the prior information on $\tau_t$ upon the estimate of $\theta$ should be investigated, and the smoothing priors should be selected with meticulous care. The problem of model selection is very difficult because the deviance or AIC cannot be utilized in a conventional way when one of the models under study requires the partial likelihood approach. Further work is required on the properties of the regression coefficients in the modulated renewal model and the proportional intensity model in order to evaluate the differences between the estimates from the modulated renewal model and those from the other models described above.
The third problem is closely related to time series analysis, where the order is generally determined from the viewpoint of forecasting. However, in our analysis one of the greatest concerns is not forecasting but obtaining information about the cause-effect mechanism linking environmental stimuli and asthma attacks.