Background. When estimating whether a software module is faulty based on the value of a measure X for a software internal attribute (e.g., size, structural complexity, cohesion, coupling), it is sensible to set a threshold on fault-proneness first and then induce a threshold on X by using a fault-proneness model in which X is the independent variable. However, some modules cannot be estimated as either faulty or non-faulty with confidence: they belong to a “grey zone,” and estimating them either way would be largely a matter of chance and may result in several erroneous decisions.
Objective. We propose and evaluate an approach to setting thresholds on X to identify which modules can be confidently estimated faulty or non-faulty, and which ones cannot be estimated either way.
Method. Suppose that, for a subset of a set of modules, we do not know whether the modules are faulty, as happens in practice with the modules whose faultiness needs to be estimated. We build two fault-proneness models by using the whole set of modules as the training set. The “pessimistic” model is built by assuming that all modules whose faultiness is unknown are actually faulty, and the “optimistic” model by assuming that they are actually non-faulty. The optimistic and pessimistic models are used to set two thresholds, an optimistic and a pessimistic one. A module is estimated faulty by the optimistic (resp., pessimistic) model with the optimistic (resp., pessimistic) threshold if its fault-proneness is above the threshold, and non-faulty otherwise. A module that is estimated faulty (resp., non-faulty) by both the optimistic model with the optimistic threshold and the pessimistic model with the pessimistic threshold is estimated faulty (resp., non-faulty). Modules for which the estimates of the two models with their associated thresholds conflict are in the “grey zone,” i.e., no reliable faultiness estimation can be made for them.
Results. We applied our approach to datasets from the PROMISE repository, carried out cross-validations, and assessed accuracy via commonly used indicators. We also compared our results with those obtained with the conventional approach, which uses a single Binary Logistic Regression model. Our results show that our approach is effective in identifying the grey zone of values of X in which modules cannot be reliably estimated as either faulty or non-faulty and, conversely, the intervals in which modules can be estimated faulty or non-faulty. In terms of F-measure, our approach is more accurate than the conventional one in the majority of cases. In addition, its F-measure values are tightly concentrated, i.e., it consistently identifies the intervals in which modules can be estimated faulty or non-faulty.
Conclusions. Our method can be used in practice to identify the “grey zones” in which it does not make much sense to estimate modules’ faultiness based on measure X and, therefore, the zones in which modules’ faultiness can be estimated with confidence.
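The Method paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy data in MODULES, the helper names (fit_logistic, fit_pair, classify, x_threshold), the Newton-Raphson fitting routine, and the default fault-proneness thresholds of 0.5 are all assumptions made for illustration, since the abstract does not say how the two thresholds are chosen.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, iters=50):
    """Univariate binary logistic regression via Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def fit_pair(modules):
    """modules: list of (x, label), label is 1 (faulty), 0 (non-faulty) or None (unknown)."""
    xs = [x for x, _ in modules]
    optimistic = fit_logistic(xs, [0 if y is None else y for _, y in modules])
    pessimistic = fit_logistic(xs, [1 if y is None else y for _, y in modules])
    return optimistic, pessimistic

def classify(x, optimistic, pessimistic, t_opt=0.5, t_pes=0.5):
    # t_opt and t_pes are assumed fault-proneness thresholds, not values from the paper.
    faulty_opt = sigmoid(optimistic[0] + optimistic[1] * x) > t_opt
    faulty_pes = sigmoid(pessimistic[0] + pessimistic[1] * x) > t_pes
    if faulty_opt and faulty_pes:
        return "faulty"
    if not faulty_opt and not faulty_pes:
        return "non-faulty"
    return "grey"  # the two models disagree

def x_threshold(model, t):
    """Threshold on X induced by fault-proneness threshold t (requires b1 != 0)."""
    b0, b1 = model
    return (math.log(t / (1.0 - t)) - b0) / b1

# Hypothetical toy data: X values 1..12, two modules of unknown faultiness.
MODULES = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, None),
           (7, None), (8, 0), (9, 1), (10, 1), (11, 1), (12, 1)]
OPT, PES = fit_pair(MODULES)

if __name__ == "__main__":
    for x in range(1, 13):
        print(x, classify(x, OPT, PES))
    print("induced X thresholds:", x_threshold(OPT, 0.5), x_threshold(PES, 0.5))
```

Values of X that fall between the two induced thresholds are those on which the two models disagree, which is how the sketch represents the grey zone; other threshold choices can be passed through t_opt and t_pes.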

Identifying Thresholds for Software Faultiness via Optimistic and Pessimistic Estimations

LAVAZZA, LUIGI ANTONIO; MORASCA, SANDRO
2016-01-01

2016
A. Jedlitschka, M. Jorgensen
ESEM 2016: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
9781450344272
ESEM 2016 ACM/IEEE 10th International Symposium on Empirical Software Engineering and Measurement
Ciudad Real, Spain
September 08 - 09, 2016
Files in this item:
CameraReady_v2.pdf (not available)
Description: Main article
Type: Pre-print document
License: DRM not defined
Size: 302.94 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2051584