Software Defect Prediction evaluation: New metrics based on the ROC curve

Luigi Lavazza; Sandro Morasca; Gabriele Rotoloni
2025-01-01

Abstract

Context: ROC (Receiver Operating Characteristic) curves are widely used to represent how well fault-proneness models (e.g., probability models) classify software modules as faulty or non-faulty. AUC, the Area Under the ROC Curve, is usually used to quantify the overall discriminating power of a fault-proneness model. Alternative indicators proposed, e.g., RRA (Ratio of Relevant Areas), consider the area under a portion of a ROC curve. Each point of a ROC curve represents a binary classifier, obtained by setting a specified threshold on the fault-proneness model. Several performance metrics (Precision, Recall, the F-score, etc.) are used to assess a binary classifier.

Objectives: We investigate the relationships linking “under the ROC curve area” indicators such as AUC and RRA to performance metrics.

Methods: We study these relationships analytically. We introduce iso-PM ROC curves, whose points have the same value isoPM for a given performance metric PM. When evaluating a ROC curve, we identify the iso-PM curve with the same value of AUC or RRA. Its isoPM can be seen as a property of the ROC curve and fault-proneness model under evaluation.

Results: There is an S-shaped relationship between isoPM and AUC for performance metrics that do not depend on the proportion ρ of faulty modules, i.e., dataset balancedness. φ (Matthews Correlation Coefficient) depends on ρ: with very imbalanced datasets, AUC appears over-optimistic and φ over-pessimistic. RRA defines the region of interest in terms of ρ, so all performance metrics depend on ρ. RRA is related to performance metrics via S-shaped curves.

Conclusion: Our proposal helps gain a better quantitative understanding of the goodness of a ROC curve, especially in practically relevant regions of interest. Also, showing a ROC curve and iso-PM curves provides an intuitive perception of the goodness of a fault-proneness model.
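As an illustration of two points made in the abstract (each ROC point is a binary classifier, and φ depends on the proportion ρ of faulty modules), the following is a minimal Python sketch. It is not the authors' code and does not construct iso-PM curves. The function names `metrics_at_roc_point` and `auc_trapezoid` are hypothetical, and the sketch assumes the standard confusion-matrix definitions of Precision, Recall, F1 and φ, with cell counts expressed as fractions of the dataset.

```python
import math


def metrics_at_roc_point(fpr, tpr, rho):
    """Metrics of the classifier at ROC point (fpr, tpr).

    rho is the proportion of faulty (positive) modules in the dataset.
    Confusion-matrix cells are expressed as fractions of the dataset.
    (Hypothetical helper, for illustration only.)
    """
    tp = rho * tpr
    fn = rho * (1.0 - tpr)
    fp = (1.0 - rho) * fpr
    tn = (1.0 - rho) * (1.0 - fpr)

    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    recall = tpr
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den > 0 else float("nan")

    # phi (Matthews Correlation Coefficient), standard definition
    phi_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    phi = (tp * tn - fp * fn) / phi_den if phi_den > 0 else float("nan")

    return {"precision": precision, "recall": recall, "f1": f1, "phi": phi}


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve given as (fpr, tpr) pairs."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    roc = [(0.0, 0.0), (0.2, 0.8), (1.0, 1.0)]  # made-up ROC curve
    print("AUC:", auc_trapezoid(roc))
    # Same ROC point, balanced vs. very imbalanced dataset
    for rho in (0.5, 0.05):
        print(rho, metrics_at_roc_point(0.2, 0.8, rho))
```

In this made-up example the ROC point (0.2, 0.8) has the same Recall for both values of ρ, while φ is lower for the imbalanced dataset (ρ = 0.05) than for the balanced one (ρ = 0.5), which is in line with the ρ-dependence of φ stated in the abstract.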
2025
https://www.sciencedirect.com/science/article/pii/S0950584925002046?via=ihub
Fault-proneness models; Binary classifiers; Fault prediction; Accuracy; Performance metrics; ROC curves; Area under the curve (AUC); Area under the ROC curve (AUROC); Pearson φ; Matthews Correlation Coefficient
Lavazza, Luigi; Morasca, Sandro; Rotoloni, Gabriele
Files in this item:
File: 1-s2.0-S0950584925002046-main.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 4.88 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2196991