QSAR Modeling is not “Push a Button and Find a Correlation”: A Case Study of Toxicity of (Benzo-)triazoles on Algae

Gramatica, Paola; Papa, Ester
2012-01-01

Abstract

A case study on the toxicity of (benzo)triazoles ((B)TAZs) to the alga Pseudokirchneriella subcapitata is used to discuss some problems and solutions in QSAR modeling, particularly in the environmental context. The study describes the relevance of data curation (not only of experimental data, but also of chemical structures and of input formats for the calculation of molecular descriptors) and the crucial points of QSAR model validation and of potential application to new chemicals (internal robustness, exclusion of chance correlation, external predictivity, applicability domain). These points are illustrated by developing multiple linear regression models fitted by ordinary least squares (MLR-OLS), based on molecular descriptors calculated by various QSAR software tools (the commercial DRAGON, and the free PaDEL-Descriptor and QSPR-THESAURUS). Additionally, the utility of consensus models is highlighted. This work summarizes a methodology for a rigorous statistical approach to obtaining reliable QSAR predictions, including for a large number of (B)TAZs in the ECHA preregistration list of REACH, even when starting from limited experimental data. It also revealed ambiguities and discrepancies in SMILES notations from different databases, highlighted some general problems in QSAR model generation, and was useful in the implementation of the PaDEL-Descriptor software.
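The validation steps named in the abstract (internal robustness, exclusion of chance correlation, external predictivity, applicability domain) can be illustrated with a minimal sketch. This is not the authors' workflow or software. The data are synthetic, all function names are hypothetical, the external statistic is a plain R² computed against the external-set mean (one of several possible measures), and the warning leverage h* = 3p'/n is a commonly used convention assumed here.

```python
import numpy as np

def fit_ols(X, y):
    """OLS coefficients (intercept first) for descriptor matrix X and response y."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y):
    """Leave-one-out cross-validated Q2 (internal robustness)."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_ols(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def y_scramble_r2(X, y, n_iter=100, seed=0):
    """Mean R2 after randomly permuting the responses (chance-correlation check)."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_iter):
        y_perm = rng.permutation(y)
        scores.append(r2(y_perm, predict(fit_ols(X, y_perm), X)))
    return float(np.mean(scores))

def leverages(X_train, X_query):
    """Hat values of query compounds relative to the training descriptor space."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    core = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, core, Q)

# Synthetic stand-in for descriptors and toxicity values (NOT real (B)TAZ data).
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 2))
y = 1.5 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.3, size=40)

# Split into a training set and an external prediction set.
X_train, y_train = X[:30], y[:30]
X_ext, y_ext = X[30:], y[30:]

coef = fit_ols(X_train, y_train)
r2_fit = r2(y_train, predict(coef, X_train))   # goodness of fit
q2 = q2_loo(X_train, y_train)                  # internal robustness
r2_scr = y_scramble_r2(X_train, y_train)       # should be far below r2_fit
r2_ext = r2(y_ext, predict(coef, X_ext))       # external predictivity

# Applicability domain: flag external compounds above the warning leverage.
h_star = 3 * (X_train.shape[1] + 1) / len(X_train)  # assumed convention 3p'/n
inside_ad = leverages(X_train, X_ext) <= h_star
```

A model would normally be accepted only if the fitted R² and the cross-validated Q² are both high and close to each other, the scrambled R² is low, and the predicted compounds fall inside the leverage-based domain.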
2012
Chemical structure pitfalls, Molecular descriptors, QSAR tutorial, REACH, SMILES, Triazole algae toxicity, Validation
Gramatica, Paola; Cassani, S.; Roy, P. P.; Kovarich, S.; Yap, C. W.; Papa, Ester

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/1788319
Citations
  • PubMed Central 30
  • Scopus 185
  • Web of Science 171