Handling of dioxin measurement data in the presence of non-detectable values: overview of available methods and their application in the Seveso chloracne study.

BONZINI, MATTEO
2005-01-01

Abstract

Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. We present an overview of statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15%. Distribution-based multiple imputation methods, also known as robust or "fill-in" procedures, may produce dependable results even when 50-70% of the observations are non-detects, and they can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets; the results properly reflect the presence of non-detectable values and yield valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation of subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6% of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method for analyzing environmental data when a substantial proportion of observations are non-detects.
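The contrast between simple substitution and distribution-based imputation can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes concentrations are lognormal with a single known DL, fits the log-scale mean and standard deviation by an EM iteration for left-censored normal data, and fills each non-detect with a random draw from the fitted distribution below the DL. The function names, the simulated data, and the parameter values are all hypothetical. As a simplification, the parameters are fitted once, whereas a fuller multiple-imputation procedure would also propagate the uncertainty in the fitted parameters.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def substitute(values, fill):
    """Simple substitution: replace each non-detect (None) with a constant."""
    return [fill if v is None else v for v in values]


def fit_censored_lognormal(values, dl, n_iter=200):
    """EM estimate of (mu, sigma) of log-concentration, left-censored at dl."""
    c = math.log(dl)
    logs = [math.log(v) for v in values if v is not None]
    n_cens = sum(v is None for v in values)
    n = len(values)
    # Start from the detected values only; the EM steps refine this.
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    for _ in range(n_iter):
        a = (c - mu) / sigma
        lam = STD.pdf(a) / STD.cdf(a)
        # First and second moments of a normal truncated to (-inf, c).
        e1 = mu - sigma * lam
        e2 = sigma ** 2 * (1 - a * lam - lam ** 2) + e1 ** 2
        mu_new = (sum(logs) + n_cens * e1) / n
        ss = sum((x - mu_new) ** 2 for x in logs)
        ss += n_cens * (e2 - 2 * mu_new * e1 + mu_new ** 2)
        mu, sigma = mu_new, math.sqrt(ss / n)
    return mu, sigma


def impute_once(values, dl, mu, sigma, rng):
    """Replace each non-detect with a draw from the fitted distribution below dl."""
    p_below = STD.cdf((math.log(dl) - mu) / sigma)
    out = []
    for v in values:
        if v is None:
            # Inverse-CDF draw restricted to the region below the DL.
            u = max(rng.random() * p_below, 1e-12)
            v = math.exp(mu + sigma * STD.inv_cdf(u))
        out.append(v)
    return out


def mean_log(xs):
    return sum(math.log(x) for x in xs) / len(xs)


# Simulated data (hypothetical): about half the values fall below the DL.
rng = random.Random(0)
true_mu, true_sigma, dl = 1.0, 1.0, math.exp(1.0)
raw = [math.exp(rng.gauss(true_mu, true_sigma)) for _ in range(500)]
observed = [v if v >= dl else None for v in raw]

mu_hat, sigma_hat = fit_censored_lognormal(observed, dl)
imputed_sets = [impute_once(observed, dl, mu_hat, sigma_hat, rng) for _ in range(5)]

# Analyze each imputed dataset, then average the estimates across datasets.
pooled = sum(mean_log(d) for d in imputed_sets) / len(imputed_sets)
half_dl = mean_log(substitute(observed, dl / 2))
print(f"fitted mu={mu_hat:.2f}, sigma={sigma_hat:.2f}")
print(f"mean log-concentration: imputed={pooled:.2f}, DL/2={half_dl:.2f}")
```

Each imputed dataset is complete, so any standard analysis can be run on it. The final step here simply averages the per-dataset estimates.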
2005
http://dx.doi.org/10.1016/j.chemosphere.2005.01.055
Acne Vulgaris; Chemistry Techniques, Analytical; Data Interpretation, Statistical; Environmental Monitoring; Humans; Italy; Linear Models; Tetrachlorodibenzodioxin
Baccarelli, A.; Pfeiffer, R.; Consonni, D.; Pesatori, A. C.; Bonzini, Matteo; Patterson, D. G.; Bertazzi, P. A.; Landi, M. T.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/1718918
Citations
  • PubMed Central 49
  • Scopus 160
  • ISI Web of Science 150