Analysis of cancer risk related to longitudinal information on smoking habits.

The Radiation Effects Research Foundation (RERF) has followed the RERF Life Span Study (LSS) cohort, consisting of atomic bomb survivors and unexposed subjects, for more than 40 years. Information on their lifestyles, including smoking habits, has been collected in the past 25 years through two mail surveys of the entire LSS cohort and three interview surveys of a subcohort for the biennial medical examination program. In the present study, an attempt was made to consolidate the information on smoking habits obtained from the five serial surveys, and a risk analysis was then conducted to evaluate the effect of updating the smoking information on the smoking-related risk estimates for lung cancer. The estimates of smoking-related risk became larger, and the estimates of dose-response became sharper, when the smoking information was updated using all of the data obtained from the five serial surveys. Analyses were also conducted for cancer sites other than lung. The differences in risk estimates between the two approaches were not as evident for the other cancer sites as for lung.


Introduction
In our previous studies (1,2), we compared the smoking-related risk of lung cancer in the Six-Prefecture Cohort Study, the large-scale Japanese cohort study conducted by Hirayama and his colleagues, with that in the British physicians' cohort study (3). We concluded that the relatively low lung cancer risk among cigarette smokers in the Japanese cohort can be explained, at least partially, by the cigarette shortage that lasted for about six years during and immediately after the Second World War. Another interesting difference between the two cohorts noted in our analysis was the strength of association between daily cigarette consumption and lung cancer risk: the dose-response relationship was sharper in the British study than in the Japanese study. A possible explanation for this discrepancy was the difference in the way information was collected on smoking habits, which can change over time. In the Japanese study, the information was collected by a single survey conducted in 1965; therefore, we unavoidably had to ignore possible changes in smoking habits over time. In the British study, serial surveys were conducted to update information on smoking habits. Since direct evaluation of this explanation was not possible in the Six-Prefecture Cohort Study, we used the data obtained from another large-scale cohort study in Japan, the Life Span Study (LSS) of atomic bomb survivors in Hiroshima and Nagasaki. This study covers the observation period, 1966 to 1981, used in our analysis of the Six-Prefecture Cohort Study data. The Radiation Effects Research Foundation (RERF) has followed the LSS cohort, consisting of atomic bomb survivors and unexposed subjects, for more than 40 years (4). Information on their lifestyles, including smoking habits, has been collected in the past 25 years through two mail surveys of the entire LSS cohort and three interview surveys of its subcohort for the biennial medical examination program (5).
In the present study, information on smoking habits obtained from the five sources was consolidated. A risk analysis of cancer of the lung and the other major sites was then conducted to compare the smoking-related risk estimates obtained from two approaches: one with smoking information limited to that available from the first survey, and the other incorporating all of the available information.

Subjects and Methods
The LSS cohort of the RERF originally consisted of 100,000 atomic bomb survivors and nonexposed control subjects. It was expanded around 1968 and in 1985 by adding about 10,000 atomic bomb survivors to the non-Adult Health Survey (AHS) subpopulation of the LSS population each time. Persons already deceased also were included in the expansion of the cohort (5). The cohort now consists of about 120,000 subjects including 27,000 nonexposed controls, i.e., Not In City (NIC) subjects.
The information on smoking was obtained from the five sources listed in Table 1. A subject in the AHS subcohort could have been covered by at most three interview surveys and two mail surveys. In this study, however, the data obtained from the 1965 mail survey for male AHS subjects were not used if they responded to the 1964 to 1968 interview survey.
Information on cancer incidence and mortality follow-up was obtained from the RERF tumor registry and mortality database. Details of these data are given elsewhere (4,6). The tumor classification used in this study is given in Appendix Table 1.
Individual radiation doses from exposure to atomic bombing were estimated using the latest version of the DS86 (7). Shielded kerma was used in all analyses reported in this article.

Statistical Methods
Poisson regression models were used to fit log-linear relative risk (RR) and linear excess RR models. Maximum likelihood parameter estimates, 95% confidence intervals, and likelihood ratio tests for nested models were obtained using the AMFIT regression program (8). The confidence intervals presented in this article were calculated by likelihood methods using large-sample approximations unless otherwise specified (9). Using the DATAB computer program (8), the person-years and the number of cancer cases were aggregated and stratified by city, sex, population group (AHS sample or not), atomic bomb exposure (NIC or 0-, 0.01-, 1.0+ Gy, and dose unknown), 10-year intervals of year of birth (before 1884, 1885-1894, ..., 1935 or later), and 5-year intervals of attained age (less than 39, 40-44, ..., 80 or older), as well as smoking-related variables.
The log-linear RR model used in our analysis was as follows:
$R_{ij} = R_{i0} \exp(\beta_j G_j)$, [1]
where i is the stratum in the cross-classification of city, sex, population group, atomic bomb exposure status, year of birth, and attained age; and j is the category of the exposure variable, e.g., number of cigarettes smoked per day. $R_{i0}$ is the cancer incidence rate for nonsmokers and $G_j$ is the dummy variable for exposure group j. Also used in the analysis was the linear excess RR model of the form
$R_{ij} = R_{i0}(1 + \beta D_{ij})$, [2]
where $D_{ij}$, for example, is the stratum-specific average daily consumption of cigarettes.
Table 1 footnotes: [...] it can be estimated since questions on daily cigarette consumption for current and ex-smokers were asked. c, Categorized according to 1-4, 5-9, 10-14, 15-19, and 20+ g/day. d, Calculated from years since the cessation of smoking.
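The two rate models [1] and [2] can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AMFIT implementation: the function names, the single-stratum setup, and the case and person-year counts are all hypothetical. For one stratum with a nonsmoker group and one exposed group, the maximum likelihood estimate of the relative risk under model [1] is the ratio of the observed rates, and the Poisson log-likelihood peaks at the corresponding coefficient.

```python
import math

def loglinear_rate(r0, beta, g):
    """Model [1]: baseline rate r0 times exp(beta * g), with g a 0/1 dummy."""
    return r0 * math.exp(beta * g)

def linear_err_rate(r0, beta, d):
    """Model [2]: baseline rate r0 times (1 + beta * d), d = cigarettes per day."""
    return r0 * (1.0 + beta * d)

def poisson_loglik(cases, person_years, rate):
    """Poisson log-likelihood for an expected count of rate * person_years.
    The constant term -log(cases!) is dropped."""
    mu = rate * person_years
    return cases * math.log(mu) - mu

# Hypothetical single stratum: cases and person-years (not data from the study).
cases0, py0 = 10, 20000.0   # nonsmokers (reference group)
cases1, py1 = 30, 15000.0   # one exposed group

r0_hat = cases0 / py0                # baseline rate estimate
rr_hat = (cases1 / py1) / r0_hat     # relative risk = exp(beta) in model [1]
beta_hat = math.log(rr_hat)

def total_loglik(beta):
    """Log-likelihood of both groups with the baseline fixed at r0_hat."""
    return (poisson_loglik(cases0, py0, loglinear_rate(r0_hat, beta, 0))
            + poisson_loglik(cases1, py1, loglinear_rate(r0_hat, beta, 1)))
```

In the stratified analysis described above, each stratum i has its own baseline rate, and the coefficients are estimated jointly across strata rather than from a single rate ratio.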

Results
After consolidation of information on smoking, there were 15,304 AHS subjects and 46,201 non-AHS LSS subjects for whom there were smoking data. For 42% of the subjects, information on smoking habits had been obtained at more than one survey.
Although the format of questions on cigarette smoking habits differed between the surveys, they were so constructed that the current smoking habits could be categorized into three groups: never smoked, stopped smoking, and currently smoking. An exception was the 1963 to 1964 interview survey, where ex-smokers could not be distinguished from those who had never smoked. In addition, in this survey it was hard to distinguish between cigarette smokers and smokers of other tobaccos. Those who answered as nonsmokers in the 1963 to 1964 survey, to which about 15% of the subjects analyzed in this article responded, were considered never to have smoked, since smoking cessation was not common in Japan at that time. As shown in Table 2, there are nine possible combinations of the smoking habit variables taken from any two consecutive surveys. Since the combinations E->N and S->N are not possible if the answers were correct, they were replaced with E->E and S->E, respectively. These conflicting combinations, however, were infrequent.
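The correction of impossible transitions can be sketched as follows. The function and constant names are hypothetical, and the codes are read here as N for never smoked, E for ex-smoker, and S for current smoker. The text describes the replacement only for pairs of surveys; carrying the corrected status forward through a longer sequence of answers is an assumption of this sketch.

```python
NEVER, EX, CURRENT = "N", "E", "S"

def correct_pair(prev, curr):
    """Return the later status, corrected for impossible transitions.

    A subject recorded as ex-smoker or current smoker cannot later be a
    never-smoker, so E->N becomes E->E and S->N becomes S->E.
    """
    if curr == NEVER and prev in (EX, CURRENT):
        return EX
    return curr

def consolidate(history):
    """Apply the pairwise correction along a chronological list of statuses."""
    cleaned = []
    for status in history:
        if cleaned:
            # Compare with the already corrected previous answer (assumption).
            status = correct_pair(cleaned[-1], status)
        cleaned.append(status)
    return cleaned
```

For example, `consolidate(["S", "N", "N"])` yields `["S", "E", "E"]`, while a consistent history such as `["N", "S", "E"]` is returned unchanged.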
When serial information on smoking is used in risk analysis, analysts face a problem that is sometimes ignored when risk is analyzed in terms of smoking information obtained at a single time point: how long a period to allow between the time when data on smoking habits were obtained and the beginning of the subsequent observation period (Figure 2). A three-year lag time was allowed in our risk analysis: the smoking habits reported in one survey were related to the observation period starting three years after that survey and ending three years after the next survey or at the end of follow-up, December 31, 1987.
Table 3 compares the results obtained from two data sets, one with the information on smoking from the first survey, the other with all the information available. Although the total numbers of subjects are the same in the two data sets, the latter tended to include smaller numbers of those who had never smoked or who were current smokers and a larger number of ex-smokers, due to the data correction made after consolidating multiple data sets. The relative risk of lung cancer among the current smokers and ex-smokers identified by a single survey was slightly lower than that obtained from the analysis using all the available information.
Table 4 shows the relative risk estimates for lung cancer according to daily cigarette consumption. The numbers of subjects who had never smoked, who constituted the reference category, are not presented in this table, but they were used in the analysis. The numbers of females who smoked 15-24 cigarettes per day and those who smoked 25 or more cigarettes per day were combined since the latter category had only a small number of subjects. The relative risk increased markedly among males smoking 25 or more cigarettes per day, as measured by the updated smoking information. No evident change was noted in the analysis of the data for females.
Table 4 footnotes: a, Ex-smokers were excluded from the analysis. b, The reference category is that consisting of those who had never smoked, shown in Table 2. c, 95% bound could not be obtained. d, In females, the daily cigarette consumption categories of 15-24 and 25+ were combined.
In addition to lung cancer, analyses were conducted for the other major cancer sites (Table 5). Although differences were noted for cancers of some sites, the 95% confidence intervals for the estimates for those sites were also wide. The estimates given here were not affected greatly by changing the lag time for risk analysis from 0 to 5 years.
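The lag-time rule described in the Results can be sketched as below. The function names and the survey dates in the usage note are hypothetical. Only the three-year lag and the end of follow-up on December 31, 1987, come from the text. The sketch ignores events that would end an individual's follow-up earlier, such as death or cancer diagnosis.

```python
from datetime import date

END_OF_FOLLOW_UP = date(1987, 12, 31)
LAG_YEARS = 3

def add_years(d, years):
    """Shift a date by whole years; February 29 falls back to February 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def observation_windows(survey_dates, lag_years=LAG_YEARS, end=END_OF_FOLLOW_UP):
    """Return (survey_date, start, stop) for each survey response.

    The smoking status from one survey applies from `lag_years` after that
    survey until `lag_years` after the next survey, or until the end of
    follow-up, whichever comes first. Empty windows are dropped.
    """
    dates = sorted(survey_dates)
    windows = []
    for i, surveyed in enumerate(dates):
        start = add_years(surveyed, lag_years)
        if i + 1 < len(dates):
            stop = min(add_years(dates[i + 1], lag_years), end)
        else:
            stop = end
        if start < stop:
            windows.append((surveyed, start, stop))
    return windows
```

With hypothetical responses on June 1, 1965, and June 1, 1978, the first answer covers June 1, 1968, to June 1, 1981, and the second covers June 1, 1981, to December 31, 1987. Passing `lag_years=0` or `lag_years=5` reproduces the kind of sensitivity check mentioned above.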

Discussion
The results obtained in this study showed that the estimates of lung cancer risk can be increased by updating smoking information using the data from a series of surveys. The proportion of male smokers in Japan started to decrease in the mid-1960s, when more than 80% of males were smokers, and the trend is still unabated (10). According to a recent survey (11), 60% of male respondents were smokers. On the other hand, the proportion of female smokers has remained at 10 to 15% over the last three decades for which statistics have been available (10). In such a situation, a considerable number of the male smokers identified at the beginning of the follow-up period are expected to have stopped smoking during the follow-up period. Therefore, the risk estimates for male smokers are thought to be underestimated if information on smoking habits is obtained by a single survey in the mid-1960s and the cohort is subsequently followed. The same is true for the relative risk estimates for males who were heavy smokers, since the amount smoked per day is thought to have decreased over time based on two observations: the amount of cigarettes smoked was smaller for older birth cohorts, and it decreased with age in the same birth cohort (RERF, unpublished data).
Table 5 footnotes: a, When the 95% confidence bound could not be obtained, the corresponding column was left blank. b, Relative risk and 95% confidence interval, with the reference category being the group of never smokers. The cancer sites with fewer than 15 cases were excluded from this table. Those cancer sites are other digestive organs (15), other respiratory organs (11), bones (7), soft tissue (11), cutaneous melanoma (3), other female genital organs (11), testis and other male genital organs (6), other urinary organs (9), eye (1), pituitary, pineal gland and other endocrine organs (6), and unknown sites (100).
[...] to become apparent within about 5 years (12). In our analysis, we assumed the lag time to be three years, as Doll and Peto (3) did in their analysis of British physicians' data. The results presented in this article were not strongly affected by changing the lag time from 0 to 5 years.
As is often pointed out, the relative risk of lung cancer in terms of daily cigarette consumption in Japan is much lower than that reported in the US and in Western European countries. At the Japan-US biostatistics seminar in 1989, we reported the results of a reanalysis of the Six-Prefecture Cohort Study data that confirmed this notion (1). In that reanalysis of cancer risk during the period 1966 to 1981, we also noted that the risk associated with smoking increased markedly around the mid-1970s in all the birth cohorts. The observation could not be explained by the increase in the cumulative amount of cigarettes smoked with aging of the cohort, since the risk was more strongly affected by calendar time than by attained age. A subsequent analysis of the Six-Prefecture Cohort data by Mizuno et al. (2) showed that the low lung cancer risk in Japan can be explained by the cigarette shortage, which lasted for about 6 years during and immediately after the Second World War. They reported that lung cancer mortality among males smoking 20 cigarettes per day in the Six-Prefecture Cohort can be expressed by the equation
$1.9 \times 10^{-10} \times (\text{duration of cigarette smoking in years} - 5.8)^{4.5}$.
This was quite similar to the result obtained from the British physicians' data (3), where the lung cancer incidence among male physicians who smoked 20 cigarettes per day was
$1.8 \times 10^{-10} \times (\text{duration of cigarette smoking in years})^{4.5}$.
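The two expressions for men smoking 20 cigarettes per day can be compared numerically. This minimal sketch reads both as a power law in smoking duration with exponent 4.5 and coefficients of order 10^-10. That reading, the function names, and the 40-year example are assumptions of the sketch rather than statements from the cited analyses.

```python
def mortality_rate(duration_years, coefficient, shift_years=0.0, exponent=4.5):
    """Power-law rate: coefficient * (duration - shift) ** exponent.

    Durations shorter than the shift are treated as contributing no risk
    (an assumption of this sketch).
    """
    effective = max(duration_years - shift_years, 0.0)
    return coefficient * effective ** exponent

def japan_rate(duration_years):
    """Six-Prefecture Cohort expression, with 5.8 years subtracted."""
    return mortality_rate(duration_years, 1.9e-10, shift_years=5.8)

def britain_rate(duration_years):
    """British physicians expression, with no subtraction."""
    return mortality_rate(duration_years, 1.8e-10)

# Ratio of the two rates after a hypothetical 40 years of smoking.
ratio_40 = japan_rate(40) / britain_rate(40)
```

Under these assumptions, `ratio_40` comes out at roughly one half. Because the coefficients are nearly equal, most of that gap comes from subtracting 5.8 years of duration, which matches the point that the cigarette shortage accounts for much of the lower risk in the Japanese cohort.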
The agreement for heavier and lighter smokers was, however, not as good as for those smoking around 20 cigarettes per day: after adjustment for the cigarette shortage, the lung cancer mortality became lower for heavy smokers and higher for light smokers when compared with the British data. One possible reason for this discrepancy is the difference in the way the smoking information was collected: in the Japanese cohort, the information was obtained at the beginning of the follow-up period, while in the British cohort, serial surveys were conducted and the information was updated. Since direct evaluation was impossible in the Six-Prefecture Cohort, we examined this problem using the data obtained in the LSS cohort of the RERF. As shown in this article, the relative risk in the heavy smokers was increased by our updating of the smoking information, while that in the light and medium smokers changed only slightly. This suggested the possibility that the risks for male heavy smokers in the Six-Prefecture Cohort would become more similar to the risks for the British cohort if the smoking information were appropriately updated.
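The attenuation argument can be illustrated with a small calculation. The relative risks and the fraction of quitters below are hypothetical values chosen only to show the direction of the bias, and the function name is an assumption. The sketch also assumes that quitters and continuing smokers contribute person-years in proportion to their numbers.

```python
def apparent_relative_risk(rr_current, rr_ex, fraction_quit):
    """Relative risk observed for subjects classified as current smokers at
    baseline when a fraction of them has since quit and carries the
    ex-smoker risk instead (reference group: never smokers)."""
    if not 0.0 <= fraction_quit <= 1.0:
        raise ValueError("fraction_quit must lie between 0 and 1")
    return (1.0 - fraction_quit) * rr_current + fraction_quit * rr_ex

# Hypothetical values: true RR of 5 for continuing smokers, 2 for quitters.
diluted = apparent_relative_risk(5.0, 2.0, 0.3)
```

With these hypothetical inputs, the apparent relative risk is 4.1 rather than 5, which shows how a single baseline survey can understate the risk when smoking habits change during follow-up.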
The smoking-related risks for various cancer sites obtained in this study were similar to those reported in our previous report using the mortality follow-up data obtained from the Six-Prefecture Cohort Study in Japan (1). In the present study, however, the association of cancers of the pharynx and pancreas with smoking habits could not be confirmed. Cancers of the skin, uterine corpus, ovary, kidney, ureter, and thyroid, as well as malignant lymphoma, multiple myeloma, and leukemia, all of which were newly analyzed in this study, did not show any significant association with smoking. A marginally significant result in the statistical test was found in the association of smoking with lymphomas, most of which were non-Hodgkin's lymphoma. This result is interesting since non-Hodgkin's lymphoma was recently reported to be related to smoking by a study of 17,633 US white male insurance policy holders (13). Further investigations on this observation are certainly warranted.
Environmental Health Perspectives
In this study, subjects with smoking information from multiple surveys comprised less than half of the entire population. The exclusion of those with a single response would, however, introduce a bias in risk analysis since nonresponse could be related to the health condition of the subjects.
The health conditions related to cancer can cause changes in smoking habits. It is unlikely, however, that the correlation affected the results substantially in this study since the incidence data were used and cancer risk analysis was conducted after excluding the first three years of the observation interval corresponding to each survey (Figure 2).