Risk-averse slope-based thresholds: Definition and empirical evaluation

Morasca, Sandro; Lavazza, Luigi Antonio
2017

Abstract

Background. Practical use of a measure X for an internal attribute (e.g., size, complexity, cohesion, coupling) of software modules often requires setting a threshold on X, to make decisions as to which modules may be estimated to be potentially faulty. To keep quality under control, practitioners may want to set a threshold on X to identify "early symptoms" of possible faultiness of those modules that should be closely monitored and possibly modified.

Objective. We propose and evaluate a risk-averse approach to setting thresholds on X based on properties of the slope of statistically significant fault-proneness models, to identify "early symptoms" of module faultiness.

Method. To this end, we introduce four ways for setting thresholds on X. First, we use the value of X where a fault-proneness model curve changes direction the most, i.e., it has maximum convexity. Then, we use the values of X where the slope has specific values: one-half of the maximum slope, and the median and mean slope in the interval between minimum and maximum slopes.

Results. We provide the theoretical underpinnings for our approach and we apply our approach to data from the PROMISE repository by building Binary Logistic and Probit regression fault-proneness models. The empirical study shows that the proposed thresholds effectively detect "early symptoms" of module faultiness, while achieving a level of accuracy in classifying faulty modules close to other usual fault-proneness thresholds.

Conclusions. Our method can be practically used for setting "early symptom" thresholds based on evidence captured by statistically significant models. Also, the thresholds depend on characteristics of the models alone, so project managers do not need to devise the thresholds themselves. The proposed thresholds correspond to increasing risk levels, so project managers can choose the threshold that best suits their needs in a risk-averse framework.
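
As an illustration of the slope-based idea in the Method paragraph, the following is a minimal sketch, not the authors' implementation. It assumes a univariate Binary Logistic model p(x) = 1 / (1 + exp(-(b0 + b1*x))) with b1 > 0, and it assumes that the risk-averse choice is the smaller of the two values of X at which the slope equals one-half of its maximum. The function names and the coefficients b0 and b1 are hypothetical. The maximum-convexity, median-slope, and mean-slope thresholds are not reproduced here, because the abstract does not give their exact definitions.

```python
import math


def logistic(x, b0, b1):
    """Fault-proneness estimated by a univariate logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def logistic_slope(x, b0, b1):
    """First derivative of the logistic curve: b1 * p * (1 - p)."""
    p = logistic(x, b0, b1)
    return b1 * p * (1.0 - p)


def slope_fraction_threshold(b0, b1, fraction=0.5):
    """Smallest x where the slope equals `fraction` of the maximum slope.

    The maximum slope of the logistic curve is b1 / 4, reached at p = 0.5.
    Solving p * (1 - p) = fraction / 4 gives two roots; the lower root
    (assumed here to be the risk-averse one) lies before the steepest point.
    """
    if b1 <= 0:
        raise ValueError("this sketch assumes an increasing model (b1 > 0)")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    p = (1.0 - math.sqrt(1.0 - fraction)) / 2.0
    # Invert the logistic function: b0 + b1 * x = log(p / (1 - p)).
    return (math.log(p / (1.0 - p)) - b0) / b1


if __name__ == "__main__":
    # Hypothetical coefficients, not taken from the paper or from PROMISE data.
    b0, b1 = -3.0, 0.05
    threshold = slope_fraction_threshold(b0, b1)
    print("half-max-slope threshold on X:", threshold)
    print("fault-proneness at threshold:", logistic(threshold, b0, b1))
```

Under these assumptions the threshold falls where the estimated fault-proneness is about 0.146, well below the 0.5 level at which the curve is steepest, which is consistent with flagging modules early.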
http://www.elsevier.com/wps/find/journaldescription.cws_home/525444/description#description
http://www.sciencedirect.com/science/article/pii/S0950584917302409
Fault-proneness; Faultiness; Logistic regression; Probit regression; Risk-aversion; Software measures; Threshold; Software; Information Systems; Computer Science Applications; Computer Vision and Pattern Recognition

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2061916
