Screening of persistent organic pollutants by QSPR classification models: a comparative study
PAPA, ESTER;GRAMATICA, PAOLA
2008-01-01
Abstract
A Quantitative Structure-Property Relationship (QSPR) study for predicting the environmental persistence of a set of 250 heterogeneous organic compounds is presented. Three a priori defined classes of environmental persistence were generated by Hierarchical Cluster Analysis from the combination of half-life data in air, water, soil and sediment, available for all the studied compounds. QSPR classification models were developed using three techniques (k-NN, CART and CP-ANN) and three interpretable theoretical molecular descriptors. The models were externally validated both by statistical splitting of the data set and on completely new data. The performances of the models, all of which were good, were compared and their structural domains were analyzed. The error analysis highlights a slight tendency to overestimate persistence, that is, to misclassify chemicals from a lower to a higher persistence class, which is in line with the precautionary principle. Finally, the reliability of the proposed QSPR models was further verified with new data from the literature. The structure-based classification models, applicable to the prediction of the potential persistence of heterogeneous organic compounds, could be useful as preliminary support tools for identifying and prioritizing new potential POPs among existing chemicals, as well as in "screening prior to synthesis" procedures intended to avoid the production, and consequent release into the environment, of new POPs.
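As a rough illustration of the k-NN classification and the error-direction analysis described in the abstract, the following minimal sketch may help. Everything in it is an assumption made for illustration: the three descriptor values are synthetic, the choice of k = 3 and the tie-breaking rule are arbitrary, and the function names `knn_predict` and `error_direction` are hypothetical. It does not reproduce the authors' data, descriptors or model settings.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train_X, train_y, x, k=3):
    """Assign x the majority class among its k nearest training compounds."""
    # Pair each training compound's distance to x with its class label.
    neighbours = sorted(zip((dist(p, x) for p in train_X), train_y))[:k]
    votes = Counter(label for _, label in neighbours)
    # Assumption: on a tied vote, prefer the higher (more persistent) class.
    return max(votes, key=lambda c: (votes[c], c))


def error_direction(y_true, y_pred):
    """Count over- and underestimated persistence classes."""
    over = sum(p > t for t, p in zip(y_true, y_pred))
    under = sum(p < t for t, p in zip(y_true, y_pred))
    return over, under


# Synthetic training set: three hypothetical descriptors per compound,
# persistence classes 1 (low), 2 (medium) and 3 (high).
train_X = [
    (0.1, 0.2, 0.1), (0.2, 0.1, 0.3), (0.0, 0.3, 0.2),
    (1.0, 1.1, 0.9), (1.2, 0.9, 1.0), (0.9, 1.0, 1.1),
    (2.0, 2.1, 1.9), (2.2, 1.9, 2.0), (1.9, 2.0, 2.1),
]
train_y = [1, 1, 1, 2, 2, 2, 3, 3, 3]

# Synthetic external set, standing in for compounds held out by splitting.
test_X = [(0.15, 0.15, 0.2), (1.05, 1.0, 1.0), (2.1, 2.0, 2.0), (1.6, 1.6, 1.6)]
test_y = [1, 2, 3, 2]

preds = [knn_predict(train_X, train_y, x) for x in test_X]
over, under = error_direction(test_y, preds)
print("predicted classes:", preds)
print("overestimated:", over, "underestimated:", under)
```

In this toy set the last external compound belongs to class 2 but lies closer to the class 3 training compounds, so it is assigned to the higher class. This is the kind of precautionary overestimation the abstract refers to, shown here only on invented numbers.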