An experience in the evaluation of fault prediction
Luigi Lavazza; Sandro Morasca; Gabriele Rotoloni
2024-01-01
Abstract
Background: ROC (Receiver Operating Characteristic) curves are widely used to represent the performance (i.e., the degree of correctness) of fault proneness models. AUC, the Area Under the ROC Curve, is a popular performance metric that summarizes in a single number the goodness of the predictions represented by a ROC curve. Alternative techniques have been proposed for evaluating the performance represented by a ROC curve, among them RRA (Ratio of Relevant Areas) and φ (also known as the Matthews Correlation Coefficient).

Objectives: In this paper, we aim to evaluate AUC as a performance metric, also in comparison with these alternative proposals.

Method: We carry out an empirical study by replicating a previously published fault prediction study and measuring the performance of the obtained faultiness models with AUC, RRA, and a recently proposed way of relating a specific kind of ROC curve to φ, based on iso-φ ROC curves, i.e., ROC curves with constant φ. We take prevalence into account, i.e., the proportion of faulty modules in the dataset that is the object of predictions.

Results: AUC appears to provide indications that are concordant with φ for fairly balanced datasets, while it is much more optimistic than φ for quite imbalanced datasets. RRA's indications appear to be only moderately affected by the degree of balance of a dataset. In addition, RRA appears to agree with φ.

Conclusions: Based on the collected evidence, AUC does not seem suitable for evaluating the performance of fault proneness models used with imbalanced datasets; in these cases, RRA can be a better choice. At any rate, more research is needed to generalize these conclusions.
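To make the metrics concrete, the following is a minimal Python sketch (not the authors' code) of how AUC, φ, and prevalence can be computed with scikit-learn on synthetic data. The faultiness scores, the roughly 10% prevalence, and the 0.5 classification threshold are illustrative assumptions; RRA is omitted because it has no standard library implementation.

    import numpy as np
    from sklearn.metrics import roc_auc_score, matthews_corrcoef

    rng = np.random.default_rng(0)
    n = 1000
    y_true = (rng.random(n) < 0.1).astype(int)    # ~10% faulty modules: an imbalanced dataset (assumed)
    y_score = 0.3 * y_true + 0.7 * rng.random(n)  # synthetic faultiness scores, for illustration only
    y_pred = (y_score >= 0.5).astype(int)         # binary predictions at an assumed 0.5 threshold

    prevalence = y_true.mean()                    # proportion of faulty modules
    auc = roc_auc_score(y_true, y_score)          # threshold-independent summary of the ROC curve
    phi = matthews_corrcoef(y_true, y_pred)       # correlation between predicted and actual faultiness

    print(f"prevalence={prevalence:.2f}  AUC={auc:.2f}  phi={phi:.2f}")

On imbalanced data like this, AUC can look quite favorable (around 0.84 here) while φ remains modest (around 0.27), which is the kind of divergence between the two metrics that the study investigates.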