Prerequisite for Imputing Non-detects among Airborne Samples in OSHA’s IMIS Databank: Prediction of Sample’s Volume

Author:

Burstyn Igor (1), Sarazin Philippe (2), Luta George (3), Friesen Melissa C (4), Kincl Laurel (5), Lavoué Jérôme (6)

Affiliation:

1. Department of Environmental and Occupational Health, Dornsife School of Public Health, Drexel University, Nesbitt Hall Room 614, 3215 Market Street, Philadelphia, PA 19104, USA

2. Chemical and Biological Hazards Prevention, Institut de recherche Robert-Sauvé en santé et en sécurité du travail, Montréal, Québec H3A 3C2, Canada

3. Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC, USA

4. Occupational and Environmental Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20850, USA

5. College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, USA

6. Department of Environmental and Occupational Health, School of Public Health, Université de Montréal, Montréal, Québec, Canada

Abstract

Introduction: The US Integrated Management Information System (IMIS) contains workplace measurements collected by Occupational Safety and Health Administration (OSHA) inspectors. Its use for research is limited because no limit of detection (LOD) is recorded for non-detected measurements, and the LOD is needed to set the censoring point in statistical analysis. We aimed to remedy this by developing a predictive model of the volume of air sampled (V) for the non-detected results of airborne measurements, so that the LOD can then be estimated from the instrument detection limit (IDL) as IDL/V.

Methods: We obtained the Chemical Exposure Health Data from OSHA’s central laboratory in Salt Lake City, which partially overlaps IMIS and contains information on V. We used classification and regression trees (CART) to develop a predictive model of V for all measurements where the two datasets overlapped. The analysis was restricted to 69 chemical agents with at least 100 non-detected measurements and to records with calculated sampling air flow rates consistent with workplace measurement practices; undefined types of inspections were excluded, leaving 412,201/413,515 records. CART models were fitted on a randomly selected 70% of the data using 10-fold cross-validation and validated on the remaining data. A separate CART model was fitted to styrene data.

Results: Sampled air volume had a right-skewed distribution with a mean of 357 l and a median (M) of 318 l, and ranged from 0.040 to 1868 l. There were 173,131 measurements described as non-detects (42% of the data). For the non-detects, V tended to be greater (M = 378 l) than for measurements characterized as either ‘short-term’ (M = 218 l) or ‘long-term’ (M = 297 l). The CART models were complex and not easy to interpret, but substance, industry, and year were among the top three most important classifiers. The models predicted V well overall (Pearson correlation (r) = 0.73, P < 0.0001; Lin’s concordance correlation (rc) = 0.69) and among records captured as non-detects in IMIS (r = 0.66, P < 0.0001; rc = 0.60). For styrene, the CART model built on measurements for all agents predicted V among 569 non-detects poorly (r = 0.15; rc = 0.04), but the styrene-specific CART model predicted it well (r = 0.87, P < 0.0001; rc = 0.86).

Discussion: Among the limitations of our work is the fact that samples may have been collected on different workers and processes within each inspection, each with its own V. Furthermore, we lack measurement-level predictors because classifiers were captured at the inspection level. We did not study all substances that may be of interest and did not use the information that substances measured on the same sampling media should have the same V. We must note that CART models tend to over-fit data and that their predictions depend on the selected data, as illustrated by the contrast between predictions created using all data and those limited to styrene.

Conclusions: We developed predictive models of sampled air volume that should enable the calculation of LOD for non-detects in IMIS. Our predictions may guide future work on handling non-detects in IMIS, although it is advisable to develop separate predictive models for each substance, industry, and year of interest, while also considering other factors, such as whether the measurement evaluated long-term or short-term exposure.
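As an illustration of two quantities the abstract relies on, the following minimal Python sketch computes an LOD as IDL/V and the two agreement statistics reported for the predictions (Pearson's r and Lin's concordance correlation). It is not the authors' code: the function names, the units (IDL in micrograms, V in litres), and all numeric inputs in the usage lines are assumptions made for illustration, apart from the 318 l median volume quoted in the abstract.

```python
from math import sqrt
from statistics import fmean


def lod_from_volume(idl_ug, volume_l):
    """Estimate the limit of detection as IDL / V.

    Assumed units: IDL in micrograms, V in litres, LOD in micrograms per litre.
    """
    if volume_l <= 0:
        raise ValueError("sampled air volume must be positive")
    return idl_ug / volume_l


def _moments(x, y):
    """Return means, population variances, and population covariance."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mx, my = fmean(x), fmean(y)
    n = len(x)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation: measures linear association only."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / sqrt(sxx * syy)


def lin_ccc(x, y):
    """Lin's concordance correlation: also penalizes shifts in location and scale."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical usage: an IDL of 1.0 microgram and the median volume of 318 l.
example_lod = lod_from_volume(1.0, 318.0)

# Hypothetical observed and predicted volumes (litres), not data from the study.
observed = [120.0, 300.0, 410.0, 520.0]
predicted = [150.0, 280.0, 390.0, 600.0]
example_r = pearson_r(observed, predicted)
example_rc = lin_ccc(observed, predicted)
```

Lin's coefficient never exceeds the absolute value of Pearson's r for the same data, because it also penalizes systematic offsets between predicted and observed values. That is consistent with the abstract reporting rc slightly below r in each comparison.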

Funder

National Cancer Institute

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Public Health, Environmental and Occupational Health
