A Web-Based Database on Exposure to Persistent Organic Pollutants in China

Zhaomin Dong,1,2 Xiarui Fan,1 Yao Li,1 Ziwei Wang,1 Lili Chen,3 Ying Wang,1,2 Xiaoli Zhao,4 Wenhong Fan,1,2 and FengChang Wu4
1 School of Space and Environment, Beihang University, Beijing, China
2 Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, China
3 Beijing Academy of Edge Computing, Beijing, China
4 State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, China


Introduction
To comply with the Stockholm Convention, China has initiated a series of research activities to restrict, reduce, and eliminate the production, use, and emission of persistent organic pollutants (POPs). Major findings from these activities have been regularly documented in peer-reviewed journals (Chen et al. 2020; Jiang et al. 2019). Although many studies have been performed, the monitoring results have been reported in different contexts, which typically vary in sample inclusion criteria, field treatments, target POPs or congeners, analytical methods, statistical descriptions, units, and other aspects (Lau et al. 2012). To better understand the nationwide occurrence of POPs, capacity building in data collection, harmonization, and analysis is urgently needed.
Toward this goal, we first identified publications related to the monitoring of POPs in China and then manually extracted data records on the monitoring of POPs in a range of environmental media (such as air, soil, sediment, dust, water, and food), biota, and human matrices (blood, urine, breast milk, and placenta). In particular, the data extracted from the literature were marked and recorded by the first author and then cross-checked by another independent author. We then developed an online database (https://pops.hhra.net) that collates literature information, exposure data, spatiotemporal analysis, and risk assessments of POPs in China.

Methods
The establishment of this online database consisted of five steps (Figure 1). In step 1, we created a chemical panel that comprised all POPs listed in the Stockholm Convention. In step 2, we identified relevant literature from four databases: PubMed, Web of Science, Scopus, and China National Knowledge Infrastructure (CNKI). The titles, authors, publication dates, journal names, and digital object identifiers (DOIs) of the resulting publications were extracted by text mining (Barupal and Fiehn 2019). Briefly, a systematic search was performed with a query that combined the keywords "chemical names," "exposure pathways," and "China." Multiple chemical names were used for each chemical to retrieve as much of the available literature as possible. We also defined 11 exposure pathways as follows: air, soil, dust, sediment, water, food, biological, blood, serum, plasma, and breast milk. For the PubMed database, we used an R-based web crawler to identify target literature. For the other three databases, we directly exported the search results returned by the query sentences. The supplementary methodology, code for the web crawler, search strategy, and query sentences for each database are available on GitHub (https://github.com/POPs-EXPs/beta-version).
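To illustrate how such a combined query might be assembled, the following is a minimal Python sketch, not the authors' R crawler. The synonym list for PFOS and the function name build_query are hypothetical; the actual chemical names and query sentences are those in the GitHub repository. Only the 11 pathways and the keyword "China" are taken from the text.

```python
# The 11 exposure pathways listed in the Methods.
PATHWAYS = [
    "air", "soil", "dust", "sediment", "water", "food",
    "biological", "blood", "serum", "plasma", "breast milk",
]

# Hypothetical synonym list for one chemical (illustration only).
CHEMICAL_SYNONYMS = {
    "PFOS": [
        "perfluorooctane sulfonate",
        "perfluorooctanesulfonic acid",
        "PFOS",
    ],
}


def quote(term):
    """Wrap a search term in double quotes so it is matched as a phrase."""
    return f'"{term}"'


def build_query(synonyms, pathways, region="China"):
    """Combine chemical names, exposure pathways, and region into one
    boolean query: (name OR ...) AND (pathway OR ...) AND "region"."""
    names = " OR ".join(quote(s) for s in synonyms)
    paths = " OR ".join(quote(p) for p in pathways)
    return f"({names}) AND ({paths}) AND {quote(region)}"


query = build_query(CHEMICAL_SYNONYMS["PFOS"], PATHWAYS)
print(query)
```

Joining synonyms with OR inside each group and joining the groups with AND mirrors the stated aim of retrieving as much literature as possible per chemical while still requiring a pathway and the region to be present. Field tags and exact boolean syntax differ between databases, so a real query would need adapting to each one.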
In step 3, we excluded irrelevant literature based on title, abstract, and full-text examination, and the data from the remaining literature were manually extracted and cross-checked by two authors. For each record, we included the available items: chemical panel, chemical name, congener, DOI, sampling location (province, city, and site), sampling time, sample size, pathway, unit, statistics (range, mean, median, and standard deviation), and method detection limit. In step 4, after standardizing the extracted exposure data (for example, by data cleaning and unit conversion), we implemented the functions for spatiotemporal analysis and risk assessment. Finally, in step 5, the database and data visualization were published online (https://pops.hhra.net) using R "Shiny" (version 1.6.0, R 3.6.0, Windows 10).
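The unit conversion mentioned in the standardization step can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' code: the Record fields are a simplified subset of the items listed above, the target unit ng/g is an arbitrary choice, and the sample values are invented. The conversion factors themselves follow directly from the SI prefixes.

```python
from dataclasses import dataclass, replace

# Factors that convert mass-per-mass units to ng/g ("ug" stands for microgram).
TO_NG_PER_G = {
    "pg/g": 1e-3,
    "ng/g": 1.0,
    "ug/g": 1e3,
    "ng/kg": 1e-3,
    "ug/kg": 1.0,
    "mg/kg": 1e3,
}


@dataclass(frozen=True)
class Record:
    """Simplified, hypothetical exposure record."""
    chemical: str
    pathway: str
    province: str
    unit: str
    mean: float


def standardize(record):
    """Return a copy of the record with its mean expressed in ng/g.

    Raises ValueError for units outside the table (for example volume-based
    units such as ng/L), which need a separate conversion rule.
    """
    try:
        factor = TO_NG_PER_G[record.unit]
    except KeyError:
        raise ValueError(f"no conversion rule for unit {record.unit!r}")
    return replace(record, unit="ng/g", mean=record.mean * factor)


# Invented example values, for illustration only.
raw = [
    Record("PCB-153", "soil", "Zhejiang", "mg/kg", 2.5),
    Record("BDE-47", "dust", "Guangdong", "pg/g", 150.0),
]
clean = [standardize(r) for r in raw]
for r in clean:
    print(r.chemical, r.mean, r.unit)
```

Rejecting unknown units rather than passing them through keeps unconverted values from mixing silently with standardized ones. Measurement bases such as dry weight and lipid weight cannot be interconverted by a fixed factor and would need to be tracked as a separate field.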

Results and Discussion
The literature search yielded 9,131 publications published before the end of May 2020. Irrelevant studies (6,332) were excluded based on the examination of titles, abstracts, and full texts, after which 2,799 studies were retained (Figure 2A). Figure 2B shows a rapid annual increase in publications on POPs over the last two decades. Figure 2C indicates that the most commonly studied POPs group was organochlorine pesticides (OCPs), followed by dioxins and polychlorinated biphenyls (PCBs), brominated flame retardants (BFRs), per- and polyfluoroalkyl substances (PFAS), and other POPs. The trends in the number of studies by environmental matrix are illustrated in Figure 2D.
From the remaining 2,799 publications, we manually extracted 112,878 records on exposure to POPs: 21,690 data items for OCPs; 62,495 for dioxins and PCBs; 22,466 for BFRs; 4,534 for PFAS; and 1,693 for other POPs. All records were combined to form the online dashboard of the exposure database. Data analysis and human health risk assessment can be performed by setting a range of condition filters (such as chemical panel, pathway, sampling information, and unit). In addition, the database will be updated annually, by mid-August of each year, by integrating exposure data extracted from the latest publications.
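As a rough illustration of the record counts and of condition filtering, consider the Python sketch below. The group counts are those reported above; the four sample records, their field names, and the function filter_records are hypothetical and do not reflect the dashboard's actual implementation.

```python
from statistics import median

# Record counts per POPs group, as reported in the text.
GROUP_COUNTS = {
    "OCPs": 21690,
    "Dioxins and PCBs": 62495,
    "BFRs": 22466,
    "PFAS": 4534,
    "Other POPs": 1693,
}
total = sum(GROUP_COUNTS.values())  # equals the reported 112,878 records
shares = {g: round(100 * n / total, 1) for g, n in GROUP_COUNTS.items()}

# Invented records, for illustration only.
RECORDS = [
    {"panel": "PFAS", "pathway": "water", "province": "Beijing", "mean": 3.2},
    {"panel": "PFAS", "pathway": "water", "province": "Shanghai", "mean": 5.0},
    {"panel": "PFAS", "pathway": "soil", "province": "Beijing", "mean": 1.1},
    {"panel": "OCPs", "pathway": "water", "province": "Beijing", "mean": 12.0},
]


def filter_records(records, **conditions):
    """Keep the records whose fields equal every given condition."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in conditions.items())
    ]


selected = filter_records(RECORDS, panel="PFAS", pathway="water")
median_of_means = median(r["mean"] for r in selected)
print(total, shares)
print(len(selected), median_of_means)
```

Expressing each filter as a field-value pair lets any combination of conditions (panel, pathway, province, and so on) be applied with the same function, which is one plausible way to support the filter-then-summarize workflow described above.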
Our database can assist research in at least four ways: a) characterizing the spatiotemporal patterns of external exposure to, and risk from, POPs, which promises to help evaluate the management of POPs; b) optimizing pharmacokinetic models and associated parameters, as exemplified by an existing case study that derived the half-lives of short-chain chlorinated paraffins from the simultaneous monitoring of internal and external exposure (Dong et al. 2020); c) predicting the internal exposure to POPs using the "forward approach" (Sobus et al. 2015); and d) reconstructing external exposure from internal exposure, pharmacokinetic models, and advanced statistical techniques (Lyons et al. 2008).
In summary, this study established an open-access database (https://pops.hhra.net) comprising 112,878 records extracted from the 2,799 publications retained out of 9,131 search results, serving as, to our knowledge, the first tool illustrating exposure to POPs in China. This large data set helps reveal the research status and occurrence of POPs in China and opens further research opportunities that support China's obligations as a signatory to the Stockholm Convention.

The authors declare they have no actual or potential competing financial interests.