Risk-averse slope-based thresholds: Definition and empirical evaluation

Background. Practical use of a measure X for an internal attribute (e.g., size, complexity, cohesion, coupling) of software modules often requires setting a threshold on X, to make decisions as to which modules may be estimated to be potentially faulty. To keep quality under control, practitioners may want to set a threshold on X to identify "early symptoms" of possible faultiness of those modules that should be closely monitored and possibly modified.

Objective. We propose and evaluate a risk-averse approach to setting thresholds on X based on properties of the slope of statistically significant fault-proneness models, to identify "early symptoms" of module faultiness.

Method. To this end, we introduce four ways for setting thresholds on X. First, we use the value of X where a fault-proneness model curve changes direction the most, i.e., it has maximum convexity. Then, we use the values of X where the slope has specific values: one-half of the maximum slope, and the median and mean slope in the interval between minimum and maximum slopes.

Results. We provide the theoretical underpinnings for our approach and we apply our approach to data from the PROMISE repository by building Binary Logistic and Probit regression fault-proneness models. The empirical study shows that the proposed thresholds effectively detect "early symptoms" of module faultiness, while achieving a level of accuracy in classifying faulty modules close to other usual fault-proneness thresholds.

Conclusions. Our method can be practically used for setting "early symptom" thresholds based on evidence captured by statistically significant models. Also, the thresholds depend on characteristics of the models alone, so project managers do not need to devise the thresholds themselves. The proposed thresholds correspond to increasing risk levels, so project managers can choose the threshold that best suits their needs in a risk-averse framework.
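As a rough illustration of the slope-based idea in the Method paragraph, the sketch below computes two thresholds for a univariate Binary Logistic model fp(x) = 1 / (1 + exp(-(b0 + b1*x))). It is not the authors' implementation. The coefficient values, the function names, and the reading of "maximum convexity" as the point where the second derivative of fp is largest are all assumptions made for this example; the median-slope and mean-slope thresholds are not covered because the abstract does not define them precisely. The sketch also assumes b1 > 0 and picks the solution on the low-risk side of the curve.

```python
import math


def logistic(x, b0, b1):
    """Fault-proneness fp(x) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def x_at_probability(p, b0, b1):
    """Invert the logistic curve: return the x at which fp(x) == p."""
    return (math.log(p / (1.0 - p)) - b0) / b1


def half_max_slope_threshold(b0, b1):
    """x where the slope first reaches one-half of its maximum.

    The slope is fp' = b1 * fp * (1 - fp), with maximum b1 / 4 at fp = 0.5.
    Half of that maximum requires fp * (1 - fp) = 1/8; the smaller root
    is fp = (1 - sqrt(1/2)) / 2, roughly 0.146.
    """
    p = (1.0 - math.sqrt(0.5)) / 2.0
    return x_at_probability(p, b0, b1)


def max_second_derivative_threshold(b0, b1):
    """x where fp'' is largest (assumed reading of "maximum convexity").

    fp'' = b1**2 * fp * (1 - fp) * (1 - 2 * fp), which is maximal where
    6 * fp**2 - 6 * fp + 1 = 0; the smaller root is
    fp = (3 - sqrt(3)) / 6, roughly 0.211.
    """
    p = (3.0 - math.sqrt(3.0)) / 6.0
    return x_at_probability(p, b0, b1)


# Hypothetical coefficients, chosen only for illustration (assumes b1 > 0).
B0, B1 = -3.0, 0.05

t_half = half_max_slope_threshold(B0, B1)
t_convex = max_second_derivative_threshold(B0, B1)
print(f"half-max-slope threshold: x = {t_half:.2f}")
print(f"max-second-derivative threshold: x = {t_convex:.2f}")
```

Both thresholds fall where fp(x) is still well below 0.5, which is in keeping with the risk-averse, "early symptom" intent described in the abstract. For a Probit model the same approach would apply, but the closed-form probabilities above would differ.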

Morasca, Sandro; Lavazza, Luigi Antonio

2017-01-01

Year: 2017
Journal: INFORMATION AND SOFTWARE TECHNOLOGY
URL:
	http://www.elsevier.com/wps/find/journaldescription.cws_home/525444/description#description
	http://www.sciencedirect.com/science/article/pii/S0950584917302409
DOI: https://dx.doi.org/10.1016/j.infsof.2017.03.005
Web of Science code: WOS:000403861300004
Scopus code: 2-s2.0-85017357880
Keywords: Fault-proneness; Faultiness; Logistic regression; Probit regression; Risk-aversion; Software measures; Threshold; Software; Information Systems; Computer Science Applications; Computer Vision and Pattern Recognition
All authors: Morasca, Sandro; Lavazza, Luigi Antonio
Type: Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2061916
