Deriving Models of Software Fault-proneness
MORASCA, SANDRO;
2002-01-01
Abstract
The effectiveness of the software testing process is a key issue for meeting the increasing demand for quality without increasing the overall costs of software development. Estimating software fault-proneness is important for assessing costs and quality, and thus for better planning and tuning of the testing process. Unfortunately, no general techniques are available for estimating software fault-proneness and the distribution of faults in order to identify the correct level of testing for the required quality. Although software complexity and testing thoroughness are intuitively related to the costs of quality assurance and the quality of the final product, single software metrics and coverage criteria provide limited help in planning the testing process and assuring the required quality. Using logistic regression, this paper shows how to build models that relate software measures to software fault-proneness for classes of homogeneous software products. It also proposes the use of cross-validation for selecting valid models even for small data sets. The early results show that it is possible to build statistical models based on historical data for estimating the fault-proneness of software modules before testing, and thus for better planning and monitoring of the testing activities.
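As a rough illustration of the approach the abstract describes, the following minimal sketch fits a logistic regression model that maps module-level measures to a probability of being faulty, and checks it with leave-one-out cross-validation, one form of cross-validation that suits small data sets. Everything specific here is an assumption made for illustration and is not taken from the paper: the two normalized measures (size and complexity), the ten data points, the function names, and the use of plain gradient descent in place of the maximum-likelihood routine of a statistics package.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w = [bias, w1, ..., wk]; x = [x1, ..., xk] (software measures of one module).
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the mean log-loss (an illustrative choice).
    k = len(X[0])
    n = len(X)
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def leave_one_out_accuracy(X, y, threshold=0.5):
    # Hold out each module once, train on the rest, and score the held-out one.
    hits = 0
    for i in range(len(X)):
        w = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predicted = 1 if predict_proba(w, X[i]) >= threshold else 0
        hits += int(predicted == y[i])
    return hits / len(X)

# Hypothetical historical data: [normalized size, normalized complexity]
# per module, with label 1 if a fault was found in it. Invented values.
X = [
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.30], [0.30, 0.25], [0.25, 0.15],
    [0.70, 0.80], [0.80, 0.60], [0.90, 0.90], [0.65, 0.75], [0.85, 0.70],
]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

if __name__ == "__main__":
    print("leave-one-out accuracy:", leave_one_out_accuracy(X, y))
    model = fit_logistic(X, y)
    print("estimated fault-proneness of a new module:",
          predict_proba(model, [0.60, 0.50]))
```

In such a setup, the estimated probabilities could be used to rank modules before testing so that more testing effort goes to those most likely to be faulty. With real data, the measures would need to come from a class of homogeneous products, as the abstract stresses.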