Comparing the Effectiveness of Using Design and Code Measures in Software Faultiness Estimation
Morasca, Sandro; Lavazza, Luigi
2019-01-01
Abstract
Background. Early identification of software modules that are likely to be faulty helps practitioners take timely action to improve these modules' quality and reduce development costs in the remainder of the development process. To this end, module faultiness estimation models can be built at any point during development by using the measures collected up to that time. Models available in later phases are expected to be more accurate than those available in earlier phases. However, waiting until late in the development process may reduce the effectiveness of any software quality improvement actions and increase their cost.
Aims. Our goal is to investigate to what extent using software code measures along with software design measures improves the accuracy of module faultiness estimation, compared with using software design measures alone.
Method. We built faultiness estimation models, using Binary Logistic Regression, Naive Bayes, Support Vector Machines, and Decision Trees, for 54 datasets from the PROMISE repository. These datasets contain design measures, code measures, and faultiness data for software modules of real-life projects. We compared the models built by using code measures and design measures together against the models built by using design measures alone, via a few accuracy indicators.
Results. The results indicate that the models built by using code measures and design measures together are only slightly more accurate than the models built by using design measures alone.
Conclusions. Our analysis shows that measures that can be obtained during design can provide models that are almost as accurate as the models that can be built in later development phases. This is good news for practitioners, who can start quality improvement initiatives early, and hence more cheaply and effectively, based on fairly reliable models.
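As a rough illustration of the comparison described under Method, the sketch below scores two sets of faultiness predictions for the same modules, one from a design-only model and one from a design-plus-code model, and reports the gain on each indicator. The abstract does not name the accuracy indicators used, so precision, recall, and F-measure are assumed here. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    precision: float
    recall: float
    f_measure: float


def indicators(actual, predicted):
    """Compute precision, recall and F-measure for binary faultiness labels.

    `actual` and `predicted` are equal-length sequences of booleans,
    where True means "faulty". Undefined ratios are reported as 0.0.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # faulty, flagged
    fp = sum(1 for a, p in pairs if not a and p)    # clean, flagged
    fn = sum(1 for a, p in pairs if a and not p)    # faulty, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return Indicators(precision, recall, f_measure)


def compare(actual, pred_design, pred_design_code):
    """Return the gain of the design+code model over the design-only model."""
    d = indicators(actual, pred_design)
    dc = indicators(actual, pred_design_code)
    return {
        "precision": dc.precision - d.precision,
        "recall": dc.recall - d.recall,
        "f_measure": dc.f_measure - d.f_measure,
    }


# Toy labels, invented purely for illustration (not data from the study).
actual           = [True, True, True, False, False, False, False, True]
pred_design      = [True, True, False, False, True, False, False, False]
pred_design_code = [True, True, True, False, True, False, False, False]

gain = compare(actual, pred_design, pred_design_code)
for name, delta in gain.items():
    print(f"{name}: {delta:+.3f}")
```

In a real replication, the two prediction lists would come from models trained on the same dataset with and without the code measures, and the comparison would be repeated for each dataset and each modeling technique.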