On the agreement of external validation parameters for linear regression QSAR models

Chirico, Nicola; Papa, Ester; Gramatica, Paola
2011-01-01

Abstract

Evaluating the performance of linear regression QSAR models, both in fitting and in external prediction, is of pivotal importance. While the leave-one-out (LOO) Q2 internal validation technique (cross-validation) is well established, several external validation parameters have been proposed in the last decade: Q2-F1 (Shi), Q2-F2 (Schuurmann), Q2-F3 (Todeschini), r2m (Roy) and the Tropsha-Golbraikh method. These parameters usually agree, which gives confidence in a model's predictivity, but doubts arise when they give contradictory results. In such cases the QSAR model developer must decide which of these parameters is “the best”. This is not an easy task, mainly because none of them can be considered “the best” in every situation. We therefore looked for a simpler method to evaluate the external predictivity of models, independently of the composition of the sets. In our opinion, the simplest method consists in quantifying the similarity between the experimental data of the external test set and the corresponding values calculated by the model. In this study our new method was used as a reference, and we counted the contradictory and agreeing results among the validation parameters on 210,000 simulated datasets. A wide range of possible scenarios was generated and, for the more realistic ones, 95% agreement was found between our method and all the aforementioned validation parameters taken together. Our proposed method is the most precautionary among those analyzed. We verified that disagreements among results are related to two possible situations: a) the external data points are well predicted (good matching), while at least one of the validation parameters rejects the model (rare); b) the matching is not good and one or more validation parameters accept the model (less rare). The second situation is more dangerous for QSAR models, so a deeper analysis of the results is suggested in that case. Our method, also verified on real models, is proposed as a tool to be used in addition to the aforementioned external validation parameters in order to identify such critical models with doubtful predictivity.
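
As a rough illustration of how the parameters compared in the abstract can disagree, the sketch below computes Q2-F1, Q2-F2 and Q2-F3 from their commonly cited definitions, together with one possible measure of agreement between experimental and calculated values. The abstract does not name its similarity measure, so the use of Lin's concordance correlation coefficient (CCC) here is an assumption, as are the function name `external_validation` and its argument names. The r2m and Tropsha-Golbraikh criteria are left out for brevity.

```python
from statistics import fmean


def _ss(values, center):
    """Sum of squared deviations of `values` from `center`."""
    return sum((v - center) ** 2 for v in values)


def external_validation(y_train, y_ext, y_ext_pred):
    """Hypothetical helper: Q2-F1, Q2-F2, Q2-F3 and CCC for an external set.

    y_train    : experimental responses of the training set
    y_ext      : experimental responses of the external test set
    y_ext_pred : model-calculated responses for the external test set
    """
    n_tr, n_ext = len(y_train), len(y_ext)
    mean_tr = fmean(y_train)
    mean_ext = fmean(y_ext)
    mean_pred = fmean(y_ext_pred)

    # Sum of squared prediction errors on the external set.
    press = sum((o - p) ** 2 for o, p in zip(y_ext, y_ext_pred))

    # Q2-F1: external errors scaled by deviations from the TRAINING mean.
    q2_f1 = 1 - press / _ss(y_ext, mean_tr)
    # Q2-F2: external errors scaled by deviations from the EXTERNAL mean.
    q2_f2 = 1 - press / _ss(y_ext, mean_ext)
    # Q2-F3: mean external error scaled by the training-set variance.
    q2_f3 = 1 - (press / n_ext) / (_ss(y_train, mean_tr) / n_tr)

    # Lin's CCC (assumed agreement measure, not named in the abstract).
    cov = sum((o - mean_ext) * (p - mean_pred)
              for o, p in zip(y_ext, y_ext_pred))
    ccc = 2 * cov / (_ss(y_ext, mean_ext) + _ss(y_ext_pred, mean_pred)
                     + n_ext * (mean_ext - mean_pred) ** 2)

    return {"Q2-F1": q2_f1, "Q2-F2": q2_f2, "Q2-F3": q2_f3, "CCC": ccc}


# Toy data (invented for illustration): the model predicts the same value
# for both external compounds, so it carries no information about them.
result = external_validation([1, 2, 3, 4, 5], [2, 4], [3, 3])
print(result)
```

In this toy case Q2-F3 is clearly higher than Q2-F1, Q2-F2 and the CCC, because it is scaled by the spread of the training set and not by that of the external set. The sketch does not guard against a zero denominator, for example when all external responses are identical.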

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/1727784