Design and Rationale of the HAPIN Study: A Multicountry Randomized Controlled Trial to Assess the Effect of Liquefied Petroleum Gas Stove and Continuous Fuel Distribution.

BACKGROUND
Globally, nearly 3 billion people rely on solid fuels for cooking and heating, the vast majority residing in low- and middle-income countries (LMICs). The resulting household air pollution (HAP) is a leading environmental risk factor, accounting for an estimated 1.6 million premature deaths annually. Previous interventions of cleaner stoves have often failed to reduce exposure to levels that produce meaningful health improvements. There have been no multicountry field trials with liquefied petroleum gas (LPG) stoves, likely the cleanest scalable intervention.


OBJECTIVE
This paper describes the design and methods of an ongoing randomized controlled trial (RCT) of LPG stove and fuel distribution in 3,200 households in 4 LMICs (India, Guatemala, Peru, and Rwanda).


METHODS
We are enrolling 800 pregnant women at each of the 4 international research centers from households using biomass fuels. We are randomly assigning households either to receive LPG stoves, an 18-month supply of free LPG, and behavioral reinforcements or to the control arm. The mother is being followed along with her child until the child is 1 year old. Older adult women (40 to <80 years of age) living in the same households are also enrolled and followed during the same period. Primary health outcomes are low birth weight, severe pneumonia incidence, stunting in the child, and high blood pressure (BP) in the older adult woman. Secondary health outcomes are also being assessed. We are assessing stove and fuel use, conducting repeated personal and kitchen exposure assessments of fine particulate matter with aerodynamic diameter ≤2.5 μm (PM2.5), carbon monoxide (CO), and black carbon (BC), and collecting dried blood spots (DBS) and urinary samples for biomarker analysis. Enrollment and data collection began in May 2018 and will continue through August 2021. The trial is registered with ClinicalTrials.gov (NCT02944682).


CONCLUSIONS
This study will provide evidence to inform national and global policies on scaling up LPG stove use among vulnerable populations. https://doi.org/10.1289/EHP6407.


Introduction
Globally, nearly 3 billion people rely on solid fuels (wood, dung, coal, charcoal, or agricultural crop waste) for cooking and heating (Bonjour et al. 2013). These fuels are often burned in inefficient and poorly ventilated combustion devices (e.g., open fires, traditional stoves). The resulting household air pollution (HAP) accounts for an estimated 1.6 million premature deaths per year and 59.5 million disability-adjusted life-years (GBD 2018). Despite progress in recent years, this largely preventable exposure remains a leading risk factor for morbidity and mortality worldwide. Poor populations in low- and middle-income countries (LMICs) bear most of this burden (GBD 2018).
Several studies have documented associations between HAP and multiple diseases or health conditions, including chronic lung disease, lung cancer, cancers of the aerodigestive tract, cervical cancer in adults, pediatric acute lower respiratory infections (ALRI) or pneumonia, low birth weight, stillbirth, preterm birth, childhood stunting (short length for age), tuberculosis, and impaired cognitive development (Bruce et al. 2015; Smith et al. 2014; Gordon et al. 2014; Quansah et al. 2017; Thakur et al. 2018). However, except for pediatric ALRI, current methods used for the global HAP-attributable disease burden calculations are focused primarily on cardiovascular disease (CVD) and chronic respiratory diseases in adults using integrated exposure-response functions drawn substantially from estimates of effects of other sources of air pollution (Burnett et al. 2014). Additionally, adverse birth outcomes such as low birth weight and preterm birth, along with childhood development and growth, are not included in the global burden of disease estimates. Therefore, the current estimate for the burden of disease related to HAP is uncertain and likely underestimated.
The state of the science illustrates several compelling reasons to undertake a multicountry randomized controlled trial (RCT) for reducing HAP using a clean fuel intervention. We will address several knowledge gaps in this study. First, liquefied petroleum gas (LPG) is currently the most widely available clean fuel in LMICs (IEA 2017), but to date, no LPG trials have been conducted that demonstrate significantly improved health outcomes among children and adults. Currently available cleaner-burning biomass combustion stoves are unlikely to achieve or sustain health-relevant exposure reductions (Clark et al. 2013; Bruce et al. 2015; Anenberg et al. 2013; Sambandam et al. 2015). This trial will provide needed evidence from a randomized LPG stove intervention to support policy formulation on national levels. Second, focusing on a combination of child (birth weight, pneumonia, and stunting) and adult cardiovascular [blood pressure (BP)] primary outcomes is strategic for public health goals in LMICs, as these conditions contribute the most to HAP-associated health burdens in these settings (GBD 2018). Although HAP has been identified as a risk factor for these outcomes, intervention efforts directed at birth outcomes, child health, and noncommunicable disease risk have been included in only a few RCTs aimed at reducing HAP. Furthermore, we will evaluate biomarkers and other indicators that are known to predict noncommunicable disease occurrence and/or severity across the lifespan. Third, establishing exposure-response relationships across a range of personal exposures to HAP is needed to close critical gaps in our current understanding of exposure-disease relationships (Steenland et al. 2018). Exposure-response relationships are also critical for transferability of trial results across settings and for benchmarking future intervention efforts.
Finally, most stove intervention studies have failed to adequately investigate and address the behaviors necessary to overcome concurrent use of polluting stoves (a practice known as stacking) to ensure consistent and sustained use of cleaner stoves and displacement of polluting ones (Rosenthal et al. 2017).
This paper summarizes the rationale, study design, and methods of the Household Air Pollution Intervention Network (HAPIN) trial, a recently launched RCT that seeks to provide the evidence necessary for policy makers to determine what health benefits can be achieved by implementing a scalable (in many areas) intervention aimed directly at reducing HAP in LMICs. The trial represents the first multicountry RCT to assess the effect of an LPG stove intervention on exposure to HAP and on a broad range of maternal, child, and adult health outcomes.

Study Aims
The study has three specific aims. The first is to determine the effect of a randomized LPG stove and fuel intervention on health in four diverse biomass-using LMIC populations across the world using a common protocol. We hypothesize that compared to pregnant women (18 to <35 years of age) in control households (n = 1,600), those who receive LPG stoves and fuel (n = 1,600) will have offspring with increased birth weight, reduced severe pneumonia incidence, and improved growth [less stunting, defined as a length-for-age z-score more than 2 standard deviations (SD) below the median of the World Health Organization (WHO) child growth standards] up to 12 months of age. We also hypothesize that compared to those in control households, older adult women (40 to <80 years of age) living in households that receive LPG stoves and fuel will have reduced BP. In addition to these primary outcomes, the study will assess multiple secondary outcomes on mothers, infants, and older adult women.
The second aim is to evaluate the exposure-response associations for HAP and health outcomes in four diverse LMIC populations. Using repeated 24-h personal and indirect measurements of exposure to fine particulate matter with aerodynamic diameter ≤2.5 μm (PM2.5), carbon monoxide (CO), and black carbon (BC), we will characterize the exposure-response associations for all primary and secondary outcomes (assessing potential nonlinearity) while adjusting for confounders. While evidence for an overall effect of the intervention will be available from the first aim, analysis of exposure-response is critical for quantitative risk assessment and policy determinations of acceptable levels of HAP regardless of the cooking technologies and/or fuels in use.
The third aim is to evaluate the extent to which biomarkers of exposure and health effects, including targeted and exploratory (e.g., metabolomics) analyses, are associated with intervention status or exposure. We hypothesize that participants residing in households that receive LPG stoves and fuel or have lower levels of exposure to HAP will have lower levels of carcinogenic polycyclic aromatic hydrocarbons and volatile organic compounds, such as urinary 1-OH-pyrene, 2-naphthol, 9-phenanthrene, as well as of chronic disease indicators, such as inflammatory, endothelial, oxidative stress, and glycemic control/diabetes biomarkers [e.g., C-reactive protein (CRP), endothelin-1, E-selectin, interleukin 6, and hemoglobin A1c (HbA1c)] when compared to participants in control households.
By comparing an LPG stove and continuous free fuel intervention with standard cooking practices (typically traditional solid biomass in these settings), the HAPIN trial will provide an estimate of a potentially achievable level of HAP reduction and the associated impact on select maternal, infant, and adult health outcomes. Establishing exposure-response relationships in LMIC settings will allow for estimates of the range of improvement that can be expected across real-world conditions where clean stoves and fuels are often combined with traditional biomass stoves.

Study Overview
The study is an RCT of an LPG stove and continuous fuel distribution intervention and promotion of its exclusive use among 3,200 households in four LMICs (India, Guatemala, Peru, and Rwanda). Following an 18-month period of planning, piloting, and formative research, the study began recruiting participants in May 2018 and is expected to complete enrollment in February 2020; follow-up data collection will continue through August 2021. In each country, eligible pregnant women are recruited and their households randomly assigned to intervention and control groups in a 1:1 ratio, and they are followed for ∼18 months until their newborn child is 1 year old. Intervention households receive a free LPG stove and free unlimited supply of LPG for the 18-month follow-up period. Control group households do not receive an LPG stove and fuel during the study period, and it is anticipated that they will continue cooking with solid biomass fuels during the trial. After enrollment, assessments will be made on a regular schedule over the course of the pregnancy (baseline, 24-28 wk gestation, 32-36 wk gestation), at 3 months of age, 6 months of age, and 12 months of age for the child, and at the same time points for the older adult woman in the household (Table 1). Control group compensation is summarized below and described elsewhere.

Study Settings and Formative Research
The study is being conducted across four LMIC settings in which large portions of the population use solid biomass as the primary fuel type. To increase generalizability, the settings were purposefully selected to represent a diversity of characteristics expected to influence intervention effects, including altitude, population density, cooking practices, baseline pollution levels, and sources of pollution other than cooking (Table 2). Other factors that may influence intervention effects, such as fuel types, dwelling characteristics, and socioeconomic conditions, are being measured and recorded. Within each country, candidate sites were selected after evaluation in formative research over 12 months. The formative research consisted of four phases: a) initial scoping to identify potentially suitable sites and develop contextually grounded behavior change strategies to promote intervention adoption, b) pilot intervention to determine the HAP exposure contrast that might be expected from the intervention, c) pilot assessment to test trial procedures and methods, and d) respiratory rate/pulse oximetry assessment to define context-specific tachypnea and oxyhemoglobin saturation thresholds in the study sites.

Eligibility Criteria, Screening, and Recruitment
Study teams led by experienced local investigators work in collaboration with clinics and community health workers in each country to identify candidate pregnant women. To be eligible to participate in the study, a pregnant woman must meet the following inclusion criteria: confirmed pregnancy (human chorionic gonadotropin-positive blood or urine test); 18 to <35 years of age (confirmed by government-issued ID whenever possible); cooks primarily with biomass stoves; lives in the study area; 9 to <20 wk gestation with a viable singleton pregnancy with normal fetal heart rate confirmed by ultrasound; continued pregnancy at the time of randomization (via self-report); and agrees to participate with informed consent. Otherwise eligible pregnant women are excluded if they currently smoke cigarettes or other tobacco products, plan to move permanently outside the study area in the next 12 months, use a clean fuel stove predominantly, or are likely to use LPG or another clean fuel predominantly in the near future. Ultrasound measurements are conducted by trained personnel (who are also certified centrally) in a clinic or home setting to determine eligibility and assess fetal growth using a portable ultrasound [Edge Ultrasound System (FUJIFILM SonoSite Inc.)]. Across the trial locations, up to 800 older adult women 40 to <80 years of age (confirmed by government-issued ID whenever possible) who reside in the same households as an enrolled pregnant woman are being recruited (one per household), provided they do not meet any of the following exclusion criteria: currently smoking cigarettes or other tobacco products, pregnant (via self-report), or planning to permanently move out of the current household in the next 12 months.
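The screening logic for pregnant women described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the trial's screening software: the class name, field names, and function name are hypothetical, and each field simply mirrors one criterion stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PregnantWomanScreening:
    """Hypothetical screening record; one field per stated criterion."""
    pregnancy_confirmed: bool            # hCG-positive blood or urine test
    age_years: float
    cooks_primarily_with_biomass: bool
    lives_in_study_area: bool
    gestational_age_weeks: float
    viable_singleton_normal_fhr: bool    # confirmed by ultrasound
    pregnancy_continuing: bool           # self-report at randomization
    informed_consent: bool
    currently_uses_tobacco: bool
    plans_to_move_within_12_months: bool
    uses_or_likely_to_use_clean_fuel: bool


def is_eligible(s: PregnantWomanScreening) -> bool:
    """Return True when all inclusion criteria hold and no exclusion applies."""
    meets_inclusion = (
        s.pregnancy_confirmed
        and 18 <= s.age_years < 35
        and s.cooks_primarily_with_biomass
        and s.lives_in_study_area
        and 9 <= s.gestational_age_weeks < 20
        and s.viable_singleton_normal_fhr
        and s.pregnancy_continuing
        and s.informed_consent
    )
    excluded = (
        s.currently_uses_tobacco
        or s.plans_to_move_within_12_months
        or s.uses_or_likely_to_use_clean_fuel
    )
    return meets_inclusion and not excluded
```

Note that the age and gestational-age windows are half-open, matching the "18 to <35" and "9 to <20" wording: a woman aged exactly 35 years or at exactly 20 wk gestation is not eligible.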

Baseline Surveys and Assessments; Randomization
Following recruitment and informed consent, a baseline visit is made to the household by a trained fieldworker to conduct surveys and other assessments. This baseline visit includes a survey that covers a range of topics like cooking behaviors, household composition, socioeconomic and demographic information, housing characteristics, and pregnancy-related information. Pregnant women are also surveyed about their medical and gynecological history, including medication use. Separate questionnaires assess physical activity, dietary diversity, household food insecurity, and household expenditures.
Table 2. Summary of key characteristics of international research centers by country based on sampling, government information, or published studies.
The baseline visit includes assessments of health, biomarkers, and exposure for both the pregnant woman and older adult woman. For pregnant women, a trained fieldworker or nurse measures resting BP (model HEM-907XL; Omron®). Further, carotid intima-media thickness (CIMT) and, in Peru only due to its exploratory nature, brachial artery reactivity testing (BART) are measured among older adult women using the ultrasound devices described above. Common carotid artery (CCA) ultrasound will be performed as a marker of atherosclerosis using a high-resolution linear transducer to image the distal 1-cm CCA region (just proximal to the bifurcation) (Stein et al. 2008). CIMT will be obtained from end-diastolic B-mode images as the average of the posterior wall segments from both right and left CCAs, measured using an automated system with an edge detection algorithm and manual override capacity (100 separate dimensional measurements are obtained from the 1-cm segment and averaged to obtain mean and maximal CIMT values). BART will be performed with a portable ultrasound to assess endothelial function using a high-resolution linear transducer to image the brachial artery (BA) above the antecubital fossa to measure rest diameter (Corretti et al. 1995). The BA will be occluded for 5 min by a BP cuff inflated to suprasystolic levels in the forearm; after 5 min of occlusion, the BP cuff will be released, and the BA diameter will be imaged every 30 s for 2.5 min after cuff deflation to calculate percent BA dilation after hyperemia. Urine samples (first morning void) and dried blood spots (DBS) via finger prick are obtained from all participating pregnant women and older adult women.
Finally, pregnant women and the older adult women are monitored for 24-h personal and household level exposure to HAP (PM2.5, CO, and BC) using the procedures described below. Baseline data on ambient temperature and humidity in the home and primary cooking area are also collected by the data loggers of the PM2.5 samplers, and dimensions of the kitchen are measured by the surveyor.
After the baseline surveys and assessments are completed, households are randomly assigned to intervention or control arms, stratified by country. In India and Peru, additional stratified randomization is used to ensure a balance between discrete geographical regions within the study area. In Rwanda and Guatemala, the study areas are deemed homogenous so that such further stratification is not necessary.
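As an illustration of stratified 1:1 allocation, the sketch below uses permuted blocks within each stratum. The text states only that randomization is 1:1 and stratified by country (and by region in India and Peru); the use of permuted blocks, the block size of 4, the seed, and all names here are assumptions made for the example, not the trial's documented procedure.

```python
import random
from collections import defaultdict


def make_allocator(block_size: int = 4, seed: int = 0):
    """Return an assign(stratum) function using permuted blocks per stratum.

    block_size and seed are illustrative assumptions; block_size must be even
    so that each block contains equal numbers of both arms.
    """
    if block_size <= 0 or block_size % 2 != 0:
        raise ValueError("block_size must be a positive even number")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> arms remaining in current block

    def assign(stratum):
        if not pending[stratum]:
            # Start a new balanced block for this stratum and shuffle it.
            block = ["intervention", "control"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return assign


assign = make_allocator()
# A stratum can be a country alone or a (country, region) pair.
arm_rwanda = assign("Rwanda")
arm_india = assign(("India", "region A"))
```

Because every completed block is balanced, the two arms never differ by more than half a block within any stratum, which is one common way to keep a 1:1 ratio within geographic regions.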

Intervention
The intervention consists of an LPG stove, a continuous supply of LPG fuel delivered to the homes for 18 months, as well as education and behavioral messages (described in future papers) to promote safe, exclusive use of the LPG stove for cooking. Stoves are procured in each country and vary (footprint, base, burner size, knob position, and griddle) based on local cooking practices; however, all include at least two burners and meet applicable safety requirements. The intervention (stove and fuel) is provided free of charge to all intervention households after baseline measurements are conducted. On each visit to provide additional fuel cylinders in Rwanda, Guatemala, and Peru, stove condition is examined, any necessary repairs performed, and the weight of LPG cylinders measured and recorded in order to help monitor use and anticipate the need for additional refills. In India, per national governmental regulations, the public sector oil marketing company is responsible for stove installation, cylinder refills, and repairs; study staff facilitate those activities for intervention participants. The rate of LPG usage is monitored by calculating average kilograms of LPG used per household member per day (using data on fuel cylinder weights and the number of days between installation and exchange of each cylinder).
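The usage-rate calculation described above can be sketched as follows. This is a minimal illustration that assumes each cylinder record holds the weight at installation, the weight at exchange, and the number of days in use, and that household size is constant over the period; the function and parameter names are hypothetical.

```python
def lpg_kg_per_person_per_day(cylinders, household_members):
    """Average LPG use in kg per household member per day.

    cylinders: iterable of (installed_weight_kg, exchanged_weight_kg, days_in_use)
    household_members: number of people in the household (assumed constant).
    """
    if household_members <= 0:
        raise ValueError("household_members must be positive")
    total_kg = 0.0
    total_days = 0
    for installed_kg, exchanged_kg, days in cylinders:
        if days <= 0:
            raise ValueError("days_in_use must be positive")
        total_kg += installed_kg - exchanged_kg  # fuel consumed from this cylinder
        total_days += days
    if total_days == 0:
        raise ValueError("at least one cylinder record is required")
    return total_kg / (total_days * household_members)


# Two cylinders, each losing 10 kg of fuel, used for 20 and 30 days by 4 people.
rate = lpg_kg_per_person_per_day([(25.0, 15.0, 20), (25.0, 15.0, 30)], 4)
```

Pooling fuel and days across cylinders before dividing weights each cylinder by how long it was in use, rather than averaging per-cylinder rates.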

Control Compensation
Control households receive compensation designed to meet three aims. First, it must comply with applicable ethics requirements for treatment of controls. Second, we are compensating control participants for the burden associated with this study, with the view of minimizing losses to follow-up. Third, we offset the economic advantage to intervention households accorded by the provision of free stoves and fuel. While the details vary across the four countries, compensation was designed based on a uniform set of trial-wide principles that address the above aims, with details informed by focus group discussions in the communities selected for the intervention. Controls receive either an LPG stove and a supply of fuel at the end of the trial or preferred alternatives of comparable value during or at the end of the trial. Details concerning the development of the compensation strategy are provided elsewhere.

Primary Outcomes
Our primary health outcomes are birth weight, severe pneumonia in the first 12 months of life, stunting at 12 months of age, and BP in the older adult woman over 18 months of follow-up.
Birth weight is measured in duplicate to the nearest gram within 24 h of birth by a trained fieldworker or nurse using a routinely calibrated seca 334 mobile digital baby scale (Seca). If the first two measurements differ by more than 10 g, a third measurement is taken. Newborns are weighed naked or in a preweighed blanket, typically at the health facility where infants are delivered.
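The repeat-measurement rule above can be expressed as a small check. The function name is hypothetical, and the sketch covers only the decision to take a third measurement; how the readings are combined for analysis is not specified here.

```python
def needs_third_measurement(first_g: float, second_g: float,
                            tolerance_g: float = 10.0) -> bool:
    """True when two birth weight readings differ by more than the tolerance."""
    return abs(first_g - second_g) > tolerance_g
```

For instance, readings of 3,000 g and 3,010 g differ by exactly 10 g and do not trigger a third measurement, whereas 3,000 g and 3,011 g do.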
For the purposes of the HAPIN trial, the case definition of severe pneumonia is adapted from the revised WHO classification of childhood pneumonia (WHO 2014b) and is consistent with expert opinion (Goodman et al. 2019; WHO 2014a). This classification includes two independent algorithms for severe pneumonia, which we have enriched with more objective imaging and pulse oximetry criteria (Simkovich et al., in press). The first algorithm requires the presence of cough and/or difficulty breathing, at least one general danger sign, and primary end point pneumonia on either lung ultrasound or chest radiograph imaging. Lung ultrasound is our preferred imaging modality; however, if logistical reasons preclude lung ultrasound, then we will consider chest radiograph images interpreted according to WHO methodology. The second severe pneumonia algorithm requires the presence of cough and/or difficulty breathing and hypoxemia, measured noninvasively by pulse oximetry. We will define hypoxemia as an oxyhemoglobin saturation of ≤92% at altitudes <2,500 m above sea level and ≤86% at altitudes ≥2,500 m above sea level.
BP is measured in the right arm of the older adult woman in triplicate (with at least 2 min resting between repeat measurements) using an automatic BP monitor (model HEM-907XL; Omron®). The first measurement is discarded to reduce the white coat effect, and the average of the second and third measurements will be used for analysis. Measurements are taken after receiving assurances that the older adult woman has not smoked, consumed alcohol or a caffeinated beverage (coffee, tea, or cola), or cooked using biomass in the 30 min prior to the measurement. She is asked to sit in a chair in a quiet room for 5 min with legs uncrossed, back supported by a chair, and arm supported by a table prior to commencing the measurements.
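Two of the rules in this section lend themselves to a short sketch: the altitude-dependent hypoxemia threshold and the BP averaging rule. The function names are hypothetical, and the code is an illustration of the stated rules rather than the trial's analysis code.

```python
def is_hypoxemic(spo2_percent: float, altitude_m: float) -> bool:
    """Apply the altitude-specific oxyhemoglobin saturation threshold."""
    threshold = 92.0 if altitude_m < 2500 else 86.0
    return spo2_percent <= threshold


def analysis_bp(readings):
    """Discard the first of three readings and average the remaining two."""
    if len(readings) != 3:
        raise ValueError("exactly three readings are expected")
    _, second, third = readings
    return (second + third) / 2


hypoxemic_low_altitude = is_hypoxemic(91, altitude_m=1500)
systolic_for_analysis = analysis_bp([142, 130, 128])
```

The same `analysis_bp` helper would be applied separately to systolic and diastolic readings.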

Secondary Outcomes
Secondary outcomes are maternal BP, preterm birth, fetal growth, infant linear growth (as a continuous outcome), infant development using the long-form version of Caregiver Reported Early Development Instruments (CREDI) (McCoy et al. 2017), WHO severe pneumonia among children <12 months old, WHO pneumonia (both severe and nonsevere) confirmed by ultrasound in Guatemala, burns, CIMT, BART (Peru only), SGRQ (respiratory symptoms), SF-36 (quality of life), household expenditures for fuel and healthcare, household time/activity, and chronic disease biomarkers in the older adult woman. Detailed methods on these outcomes will be included in the papers reporting the results.

Exposure Assessment
Two approaches to measure personal HAP exposure are used. For pregnant women (baseline, 24-28 wk gestation, and 32-36 wk gestation) and older adult women (baseline and five additional times during follow-up) (Table 1), exposure is measured through instrumentation placed on the participant. Pollutants measured include PM2.5, CO, and BC. PM2.5 concentrations are assessed using both gravimetric and real-time methods, and the PM2.5 filters are being assessed for BC with SootScan™ Model OT21 Transmissometers (Magee Scientific). The primary instrument for this purpose is the Enhanced Children's MicroPEM™ (ECM) developed by RTI International (Johnson et al. 2020). Real-time PM2.5 concentrations are estimated with the ECM's nephelometer. Real-time CO concentrations are measured personally for all pregnant women and older adult women (and for infants in Guatemala) and at area monitoring locations with EL-USB-300 CO monitors (Lascar Electronics).
For infants for whom the ECM is impractical to use due to their size or participant preference, a microenvironmental approach is employed at 3, 6, and 12 months of age (Balakrishnan et al. 2004; Xu et al. 2018; Zuk et al. 2007), whereby area sampling via ECMs is used in main living/sleeping areas, including the kitchen. The child is outfitted with a coin-sized location beacon (Roximity) linked to receivers in these same areas to objectively assess location. Personal exposures for the child are estimated by integrating corresponding area concentrations with time spent in the respective locations along with time spent near the mother, who is instrumented with an ECM and receiver whenever feasible. Details of this approach can be found in Liao et al. (2019), which presents results from piloting the indirect assessment with proximity beacons for the HAPIN trial in Guatemala.
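One simple way to combine area concentrations with time spent in each location is a time-weighted average, sketched below. This is an assumption made for illustration; the approach actually used in the trial is described in Liao et al. (2019) and may differ. The function name and the location labels are hypothetical.

```python
def time_weighted_exposure(concentration_by_location, minutes_by_location):
    """Time-weighted average concentration across microenvironments.

    concentration_by_location: mean concentration per location (e.g., μg/m3)
    minutes_by_location: minutes the child spent in each location
    """
    total_minutes = sum(minutes_by_location.values())
    if total_minutes <= 0:
        raise ValueError("total time must be positive")
    weighted_sum = sum(
        concentration_by_location[location] * minutes
        for location, minutes in minutes_by_location.items()
    )
    return weighted_sum / total_minutes


# Hypothetical values: 1 h in the kitchen, 3 h in the bedroom.
estimate = time_weighted_exposure(
    {"kitchen": 100.0, "bedroom": 20.0},
    {"kitchen": 60, "bedroom": 180},
)
```

In this example the estimate is (100 × 60 + 20 × 180) / 240 = 40, showing how a short period in a highly polluted kitchen can dominate the average.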
To track changes in ambient PM2.5 concentrations and broadly characterize the potential for regional air quality to affect participant exposure, ambient measurements are being made in at least two locations at each international research center (IRC) field site. Real-time ambient concentrations of PM2.5 are measured using the E-Sampler (Met One Instruments) or comparable systems with an integrated 24-h filter-based measurement.
A full description of the exposure assessment procedures is provided elsewhere (Johnson et al. 2020).

Stove Use Monitoring
Stove use is monitored throughout the 18-month follow-up period in both intervention and control households using a combination of observations, reports, and instruments. Use of the LPG stoves in intervention arm households is also tracked via data on the distribution and use of LPG cylinders. Additionally, all homes in the intervention arm, and a subset of ∼80 homes in the control arm of each country, are equipped with stove use monitors (SUMS), which are temperature data loggers placed on traditional stoves (Ruiz-Mercado et al. 2012; Pillarisetti et al. 2014). SUMS data by household, supplemented with observations made by local study teams, are reviewed weekly to identify any continued traditional stove usage. Where it occurs, local teams, following trial-wide written standard operating procedures (SOPs), visit each home to address barriers to exclusive LPG stove use that may be responsible and to reinforce behavioral messages developed in relation to local cooking needs to minimize future events. For example, barriers related to capabilities and skills are addressed by how-to training, whereas barriers related to motivation are targeted with an appeal to emotions such as trust and security and conscious decision-making. Barriers related to opportunity and context are addressed with suggestions for adapting to contextual realities. The 80 control homes per country comprise the 20% random subsample in which additional exposure measurements are taking place (as described above). The SUMS are Geocene Temperature Loggers™, which attach high-temperature thermocouples to the various stoves used in a home.

Biomonitoring
Biomarker analyses are an integral part of the trial, drawing on existing biomarker research and developing and validating novel biomarkers associated with HAP exposure and health. The Biomarker Center is based at Emory University (Emory), where samples from three international research centers are analyzed. Samples from India are principally analyzed at a laboratory based in India, where extensive infrastructure and capacity already exist, and that laboratory collaborates closely with Emory in methods and data analysis. Using urine and DBS collected at regular intervals from the pregnant woman, older adult woman, and child (Table 1), the Biomarker Center will: a) assess repeated measures, including targeted biomarkers of exposure, biomarkers of tobacco smoke exposure, biomarkers of effect that are predictive of clinical outcomes, and exploratory analyses that include metabolomics, mRNA, miRNA, and DNA methylation; b) perform HAP-specific biomarker development and validation; c) create and operate a biospecimen repository; and d) create a data set for deposit in the National Institutes of Health (NIH) BioLINCC repository and ensure that study procedures are compatible with that program's requirements for biospecimen collection, labeling, and storage. Biomarker sampling methodologies are described in detail elsewhere (Barr et al. 2020; Hu et al. 2000; Liu et al. 2014).
Prior studies link exposure to HAP with cardiometabolic biomarkers and metabolomic profiles as measured in DBS (Clark et al. 2009), a method that overcomes collection, transportation, and storage limitations of venipuncture sampling. The ability to evaluate biomarkers of exposure in young children is an innovative tool for HAP research. Biomarkers of exposure (e.g., polycyclic aromatic hydrocarbons, volatile organic chemicals, and levoglucosan, a marker of wood smoke) are measured at multiple time points in all study participants; correlations will be assessed with measured 24-h pollutant concentrations. In the older adult woman, measurements include a suite of biomarkers of endothelial function, inflammation, oxidative stress, glycemic control/diabetes (HbA1c), a marker with specific relevance to lung cancer (P53 tumor-associated antigen antibodies), and enzyme induction (cytochrome P450). These biomarkers were chosen to inform mechanistic pathways and/or because of their capabilities to predict future disease risk. Biomarker discovery (metabolomics and miRNA evaluations) (Espín-Pérez et al. 2014) will be conducted using DBS collected in a subsample of children (n = 100/site) and older adult women (n = 100/site).
We are also conducting a substudy focused on cancer biomarkers, in collaboration with the National Cancer Institute, among older adult women from the Peru and Guatemala research centers. These centers were chosen because early data indicated that they have a higher percentage of households with older women than the other sites. Among these older women, buccal cells, oral rinse, nasal turbinate, and peripheral (venous) blood are collected at baseline and a year later (in both intervention and control arms) and analyzed to provide additional data on the effects of biomass emission exposure on biomarkers related to cancer. Sample size for the cancer substudy was based on available budgetary resources, in the absence of any prior knowledge of the effect size for change in the prevalence of biomarkers.

Follow-Up Assessments
Each participating household is followed from enrollment until the index child reaches his/her first birthday. The data collection schedule during this period is summarized in Table 1.

Trial Management
The trial is led by a steering committee composed of the study's multiple principal investigators (MPIs), the lead investigators of each of the international research centers, the directors of the biomarker core, one NIH project officer, one NIH scientific officer, and one representative of the Bill and Melinda Gates Foundation (BMGF). The steering committee obtains guidance on particular areas of expertise from five study cores (behavior and economics, exposure, clinical and imaging, biomarkers, and data management) and two working groups (pneumonia; anthropometry and nutrition). Day-to-day management of the trial is led by the trial coordinating center, based at Emory. The steering committee is supported by an external advisory committee and the pneumonia working group by an outside expert group. A data safety monitoring board (DSMB) appointed by the National Heart, Lung, and Blood Institute (NHLBI) is responsible for safeguarding the interests of study participants, assessing the safety and efficacy of study procedures, ensuring data quality, and monitoring the overall conduct of the study.

Data Management and Reporting
Field staff collect data on password-protected tablets, then upload the data (daily) to a secure REDCap™ (research electronic data capture) server [Health Insurance Portability and Accountability Act (HIPAA) and Federal Information Security Management Act (FISMA) compliant] hosted by Emory. Every day after uploads, the tablets are refreshed, and all data are deleted from the mobile device. Emory has a daily offsite backup on the REDCap™ server.
We also implemented algorithms for real-time data quality checking wherever possible, which trigger an error or warning message as the staff collect the data, even if they are offline. For example, if a staff member enters 20 kg for a baby's weight, the tablet shows a warning message that the value is more than the maximum allowed, and she/he can correct the entry.
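As an illustration only, a range check of this kind might be sketched as follows. This is not the trial's REDCap™ configuration; the field name `infant_weight_kg` and the limits of 0.5 and 15 kg are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RangeRule:
    """Plausible range for one numeric field (limits are hypothetical)."""
    minimum: float
    maximum: float
    unit: str


# Hypothetical rule table; the trial's actual limits are not given in the text.
RULES = {
    "infant_weight_kg": RangeRule(minimum=0.5, maximum=15.0, unit="kg"),
}


def check_entry(field: str, value: float) -> Optional[str]:
    """Return a warning message for an out-of-range value, or None if it passes."""
    rule = RULES.get(field)
    if rule is None:
        return None  # no rule defined for this field
    if value > rule.maximum:
        return (f"{field}: {value} {rule.unit} is more than the maximum "
                f"allowed ({rule.maximum} {rule.unit})")
    if value < rule.minimum:
        return (f"{field}: {value} {rule.unit} is less than the minimum "
                f"allowed ({rule.minimum} {rule.unit})")
    return None


# The check needs no network access, so it can run on an offline tablet.
warning = check_entry("infant_weight_kg", 20)
if warning:
    print(warning)
```

Returning a message rather than raising an error mirrors the behavior described above, in which the staff member sees the warning and can correct the entry.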
The data management core (DMC), jointly with data managers in each country, is responsible for collecting survey and other field data on tablet-based platforms that incorporate GIS (geographical information systems) positioning and QR readers, allowing data collection teams to scan barcodes placed on LPG stoves, fuel tanks, exposure monitoring equipment, and biospecimen collection material. The DMC developed and maintains SOPs for all data management activities, which detail specific elements of data storage, entry, quality, transfer, processing, and security. The main study database is maintained in REDCap™ (for forms), LIMS™ (laboratory information management system for laboratory data), and TRICE™ (for clinical imaging). These systems are integrated using LabKey™.
Data reports with all relevant data and metadata for each of the analytic aims of the study are made available regularly to members of the study steering committee, the DSMB, and the relevant study investigators after consultation on exact data needs. However, until results are analyzed and reported by the DMC, investigators are blinded to study arm assignment except for the purpose of monitoring compliance with the intervention. Nevertheless, because of the nature of the intervention, fieldworkers are assumed to have knowledge of study arm assignment for data collected at the household level.
Final study data associated with HAPIN publications will be archived by the DMC to allow data sharing as per NIH requirements following a registration and review process by the DMC. The DMC will manage data requests and document those receiving data.

Protocol Standardization and Data Quality Control
Several activities have been undertaken to ensure standardization of field protocols within and between the four countries involved in this trial. These include the use of SOPs, step-by-step instruction sheets for each case report form (CRF), and a step-by-step visit schedule, all of which were translated into local languages and reviewed during in-person trainings at the field sites. In addition, during these in-person trainings, the representative of each core or work group directly observed field teams conducting practice visits and provided real-time feedback. Finally, field supervisors conducted weekly direct observations of enumerators and completed a supervisor checklist that included key procedures to observe for each measurement, as determined by the respective core or work group.
With specific regard to CRFs, these were labeled as "M" for CRFs collecting information on the pregnant women, "C" for those collecting information on the child, "H" for those collecting information on the household, and "A" for those collecting information on older adult women. While the CRF content was translated into the local language, the CRF codes (e.g., "M10" for the mother's demographic information) and question numbers were not changed to facilitate communication. In addition, bilingual color-coded CRFs were made available to all study staff in order to address any ambiguity on original questions and translations.
Data quality was monitored by each core or work group, as they had the required content knowledge to identify issues and propose appropriate solutions. For example, for the primary outcomes of birth weight and stunting (length-for-age z-score), the nutrition work group received the raw REDCap™ data download for the CRFs containing those measurements every 2 months from the central data management center at Emory. They then ran standard report analysis code to generate a standard set of parameters, established by a group of nutrition experts, to assess data quality. These included summarizing missing data; computing descriptive statistics (mean, standard deviation, minimum, and maximum for continuous variables, and prevalence of low birth weight and stunting, which are compared to DHS estimates); flagging outliers (defined as beyond ±3 SD) and informing field teams so that they can identify potential errors in recording; tracking individual child growth trajectories and flagging children who lose weight or get shorter as they age, as these may be potential errors in recording; and evaluating digit preference, the number of repeat measurements that are identical, the time between repeat measurements, and the time between birth and the birth weight measurement (which should be within 24 h).
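A minimal sketch of three of these checks is given below. It is not the work group's report code: the function names are hypothetical, and outliers are assumed here to be measured in sample standard deviations from the sample mean, which the text does not specify.

```python
from collections import Counter
from statistics import mean, stdev


def flag_outliers(values, n_sd=3.0):
    """Return indices of values more than n_sd sample SDs from the sample mean."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - m) > n_sd * s]


def flag_decreases(measurements):
    """Return indices where a measurement is lower than the previous one.

    `measurements` is one child's weights or lengths, ordered by age.
    """
    return [i for i in range(1, len(measurements))
            if measurements[i] < measurements[i - 1]]


def terminal_digit_counts(values, decimals=1):
    """Count the last recorded digit of each value to screen for digit preference."""
    return Counter(f"{v:.{decimals}f}"[-1] for v in values)


# Made-up birth weights (kg) with one implausible entry at the end.
weights = [3.0, 3.1, 3.2, 3.3, 3.4] * 4 + [9.0]
print(flag_outliers(weights))                      # index of the 9.0 kg entry
print(flag_decreases([50.1, 54.0, 53.2, 57.5]))    # length (cm) falls at index 2
print(terminal_digit_counts([3.0, 3.5, 2.5, 3.1]))
```

A strong excess of terminal digits 0 or 5 in the last function's output would suggest rounding by the person taking the measurement.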

Data Analysis
Analysis of primary outcomes is described below; the analyses for secondary outcomes will be analogous. For aims 1 and 3, comparisons are made between the study arms to which participants were assigned, regardless of their actual adherence to the intended condition. Two-sided tests for each primary outcome will be performed separately at an α level of 0.0125 (using a conservative Bonferroni correction, i.e., an α level of 0.05/4, where 4 is the number of primary outcomes). The primary analyses will include indicator variables to account for the stratification of random sampling with different geographical areas in India and Peru, as well as individual sites.
Analysis for a continuous measure like birth weight, absent confounding because of the randomization, is essentially a t-test for mean birth weight between the two arms. For the two binary primary outcomes, pneumonia incidence over a 1-y follow-up period and stunting at 12 months of age, we will compare the two groups using their relative risk and accounting for randomization strata. BP analyses in older adult women will exclude women who self-report the use of BP medication at baseline (anticipated to be ∼5%). Repeated measurements of BP of the older adult woman will be analyzed using a random intercept model to examine differences in mean BP over the follow-up period, controlling for baseline BP. Women who begin BP medication during follow-up will be censored at that time in this analysis. Using longitudinal BP measurements, we will also consider, in a sensitivity analysis, whether the intervention is associated with the rate of change in BP over time by examining the interaction term between time and intervention.
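For the binary outcomes, a crude relative risk comparison can be sketched as below. This is a simplification rather than the trial's analysis: it ignores the randomization strata, it uses a Wald confidence interval on the log scale as one common choice, and the counts in the example are made up.

```python
import math
from statistics import NormalDist


def relative_risk(events_int, n_int, events_ctl, n_ctl, alpha=0.0125):
    """Crude (unstratified) relative risk, intervention vs. control.

    Returns (rr, lower, upper), where the interval is a two-sided
    (1 - alpha) Wald interval computed on the log scale.
    """
    rr = (events_int / n_int) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_int - 1 / n_int
                          + 1 / events_ctl - 1 / n_ctl)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Made-up counts for illustration only; these are not trial data.
rr, lower, upper = relative_risk(events_int=60, n_int=1000,
                                 events_ctl=90, n_ctl=1000)
print(f"RR = {rr:.2f} ({lower:.2f}, {upper:.2f})")
```

The default `alpha=0.0125` matches the Bonferroni-corrected level described above, so the interval is wider than a conventional 95% interval.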
In a supplemental analysis, women beginning BP medication after enrollment will be included via the method "j" described by Tobin et al. (2005), which is based on the assumption that treated women have BPs that are right censored in a normal distribution, with the censoring occurring above the level at which hypertensives are typically treated.
For exposure-response analyses, the general regression model is given by $g(E[Y_i]) = \beta_0 + \beta_1 X_i + \boldsymbol{\beta}_2^{\top} Z_i$, where $Y_i$ is the primary outcome of interest, $g(\cdot)$ is the appropriate link function (identity for birth weight and BP, logistic for stunting and pneumonia), $X_i$ is the continuous exposure of interest, $Z_i$ is the vector of confounders, and $\beta_0$, $\beta_1$, and $\boldsymbol{\beta}_2$ are regression coefficients. Potential confounders include, but are not limited to, age, sex, body mass index, socioeconomic status, tobacco use, secondhand smoke, physical activity, dietary intake, and season. These potential confounders have been identified as being associated with exposure or outcomes in prior studies. However, we expect the exact list of confounders to differ by outcome. For each outcome, we determine potential confounders using previous literature and directed acyclic graph methods. We will determine final models by evaluating effect estimates for meaningful changes. Exploration of models will include consideration of possible intermediate variables and effect modification. While we do not anticipate a great amount of missing data, we will evaluate whether the missing data patterns are differential between intervention and control groups, as well as evaluate the temporal pattern using Kaplan-Meier curves. For all analyses, we will assume missing data to be missing at random (MAR) a priori. The MAR mechanism specifies that the complete data distribution can be modeled using the observed data.
For repeated measurements of BP, an individual-level random intercept will be included, and baseline BP will be included in the model. We will consider PM2.5 mass to be our primary exposure measurement; we will also evaluate CO and BC in secondary analyses. For birth weight, a gestational exposure level will be obtained by averaging the two or three 24-h average measurements available during pregnancy. For birth weight, analyses will be restricted to full-term births, or we will take into account gestational age by using as the outcome z-scores for birth weight adjusted for gestational age [derived from INTERGROWTH tables (https://intergrowth21.tghn.org)]. For stunting and pneumonia, we will consider average gestational and average first-year-of-life exposures (average of the 24-h measurements). For BP, we will consider time-varying exposures (24-h averages), available at the time of BP measurement.

Study Power
Study power for primary outcomes is given in Table 3, which summarizes the minimal detectable difference in mean (for birth weight and BP) and minimal detectable relative risk (for stunting and pneumonia) associated with an 80% power and a type I error rate (α level) of 0.0125, assuming a 10% attrition during follow-up. For BP, we show power calculations for 200 older adult women per arm, although we expect about 240 per arm, given that initial recruitment data suggest that there will be older adult women in about 15% of households. Table 3 shows we have good power to detect a smaller difference than has been previously reported in the literature.
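The kind of calculation behind a minimal detectable difference in means can be sketched with a textbook normal approximation for a two-arm comparison. This is not necessarily the method used for Table 3: it ignores repeated measurements and baseline adjustment, the standard deviation of 15 is a hypothetical input, and whether attrition is applied to the 200 women per arm is an assumption of the example.

```python
import math
from statistics import NormalDist


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test alpha under a Bonferroni correction."""
    return family_alpha / n_tests


def minimal_detectable_difference(sd, n_per_arm, alpha, power=0.80,
                                  attrition=0.0):
    """Smallest difference in means detectable with a two-sided test.

    Uses the normal approximation
        (z_{1 - alpha/2} + z_{power}) * sd * sqrt(2 / n_effective),
    where n_effective is the per-arm sample size after attrition.
    """
    n_effective = n_per_arm * (1.0 - attrition)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * math.sqrt(2.0 / n_effective)


alpha = bonferroni_alpha(0.05, 4)  # four primary outcomes
mdd = minimal_detectable_difference(sd=15.0, n_per_arm=200, alpha=alpha,
                                    attrition=0.10)
print(f"alpha = {alpha}, minimal detectable difference = {mdd:.2f}")
```

Because the detectable difference scales with the square root of 2/n, moving from 200 to the expected 240 women per arm would shrink it by a factor of about 0.91.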

Table 3 notes: Our key population parameters were the variance of continuous measures ($\sigma^2$) or the incidence rates among controls ($p_1$) for relative risks. We took our key parameter estimates from the previous studies listed in the last column, with the exception of the control rate for pneumonia, which was estimated from our early data from the trial. Thompson et al. (2011) is a substudy of a randomized trial of improved cookstoves in Guatemala (RESPIRE) in which those using stoves were compared to those not using stoves after adjusting for confounders. Smith et al. (2011), from the RESPIRE randomized trial, provides the estimated relative risk (RR) of 0.67, the RR for clinician-diagnosed severe pneumonia for children under 18 months of age. Our estimated background rate of 0.09 for controls is based on our observed severe pneumonia rate in both arms (treatment and control) in the HAPIN study (using the HAPIN severe pneumonia definition), with 20% of the child person time observed, and assuming the RESPIRE RR of 0.67 for intervention vs. controls.

Discussion
HAP is a leading cause of morbidity and mortality, especially among young children in LMICs (GBD 2018). Large-scale programs have succeeded in replacing traditional stoves with improved biomass stoves that reduce fuel consumption (Pope et al. 2017). However, deployment of these stoves has not generally achieved major reductions of fine PM, nor has it achieved meaningful improvements in health (Sambandam et al. 2015). Moreover, continued use of traditional stoves, along with improved solid fuel stoves (i.e., stacking), has reduced the potential contribution of such stoves to improved health. Systematic reviews have linked cleaner cooking fuels with improvements in health (Quansah et al. 2017; Thakur et al. 2018; WHO 2014c; Pope et al. 2017; Bruce et al. 2013), but evidence is still not strong, suffers from few well-designed studies, and varies by end point. Intervention studies of chimney stoves or improved solid biomass stoves have provided mixed evidence of health benefits (Sambandam et al. 2015; McCracken et al. 2007; Thompson et al. 2011; Hartinger et al. 2013; Tielsch et al. 2016; Johnson et al. 2013; Mortimer et al. 2017; Lee et al. 2019; Alexander et al. 2018). Evidence from observational studies is moderate for birth weight, pneumonia, and BP and still sparse for stunting. Recent trials have shown mixed results from clean cooking fuels, including ethanol and LPG (Alexander et al. 2018; Olopade et al. 2017; Alexander et al. 2017).
Taken together, by addressing multiple outcomes at different ages and studying several LMIC contexts, the HAPIN trial is well positioned to fill critical scientific gaps with direct relevance to national policies.
This study addresses many of these gaps while also advancing knowledge in new areas. For example, the prevalence of low birth weight continues to be high in many LMICs, and preterm birth remains the leading cause of death among children under 5 years of age globally (Liu et al. 2015). While substantial literature exists on the effect of active tobacco smoking and ambient air pollution on low birth weight and preterm birth, few studies have examined the effect of these in LMICs or specifically the effect of HAP exposures on these outcomes (Thakur et al. 2018; Sambandam et al. 2015; Balakrishnan et al. 2013, 2018; Alexander et al. 2017; Liu et al. 2015; Amegah et al. 2014; Wylie et al. 2014). Only three studies have quantified the association between HAP and child developmental outcomes (Suter et al. 2018; Dix-Cooper et al. 2012; Munroe and Gauvain 2012), and only two quantified the association between HAP and stunting (Wylie et al. 2017; Kim et al. 2017), but none have done so using a longitudinal approach in children under 1 year of age. Our case definition of pneumonia is designed to balance the need for sensitivity and specificity (Goodman et al. 2019).
Similarly, the CVD burden from HAP is estimated from integrated exposure-response models (Burnett et al. 2014) that are not informed by direct measurements. Demonstrating potential improvements in cardiovascular biomarkers and BP through the LPG stove intervention can provide strategic information for the prevention of CVD itself. This is especially important for LMICs, where the health burden from noncommunicable diseases is increasing rapidly (WHO 2017). Finally, the inclusion of molecular biomarkers of exposure and early biological effect for a range of chronic health end points, including cancer, fills a critical need to establish biomarker-based approaches to assess long-term health impacts from HAP (IEA 2017; Caravedo et al. 2016).
While providing policy-relevant results, our research will incorporate important technical and training innovations, including expert clinical outcomes and imaging support for and training in pulmonary (point-of-care ultrasound for severe pneumonia), noninvasive markers of cardiovascular health (ultrasound assessment of endothelial function and CIMT), and obstetrics-gynecology (fetal growth). We will analyze targeted biomarkers of exposure, susceptibility, and effect, providing the most comprehensive analysis of relevant biomarkers in a HAP study to date. We will conduct discovery metabolomics and miRNA, mRNA, and DNA methylation analyses to evaluate perturbations occurring in response to HAP exposure to discover novel HAP biomarkers. We are implementing comprehensive stove-use monitoring, building on previous research (Ruiz-Mercado et al. 2012; Pillarisetti et al. 2014). These data, in conjunction with information gained in formative research on barriers to adoption and sustained, exclusive use of LPG stoves, will be used to maximize adherence and will help inform efforts to minimize stacking when stoves are delivered.
Our study will include a more comprehensive exposure assessment of PM2.5 and CO than has been attempted in RCTs to date, with multiple measurements, modeling of exposures, and assessments of stove stacking via SUMs. We expect to be able to estimate longer-term exposure averages, which can be used in exposure-response analyses. We anticipate a wide range of intervention-to-control and cross-cohort exposure contrasts across and within our four countries, making exposure-response analyses more powerful.
Moreover, our study will be the first HAP intervention trial to be conducted in multiple country settings using a common protocol, collecting a rich set of data with the potential to help define the generalizability criteria of the findings. While this efficacy study includes elements that are unlikely to be operationalized at scale (free fuel, purposefully selected settings with low to moderate ambient pollution, and frequent behavioral reinforcement to minimize stacking with traditional stoves), it will yield important comparable information about the most scalable clean fuel intervention across different world settings. As governments worldwide undertake efforts to reduce reliance on solid biomass fuels, the trial will provide data-supported evidence of the immediate and long-term health benefits that may be achieved by expanding access to and promoting the exclusive use of clean cooking fuels by those relying on solid biomass.