A tool for the measurement, storage, and pre-elaboration of data supporting the release of public datasets

LAVAZZA, LUIGI ANTONIO
2006-01-01

Abstract

The large number of projects producing open source software gives researchers the opportunity to measure software artefacts, producing a huge amount of data available for analysis. To be efficient and reliable, the process of data retrieval and analysis needs adequate tool support. In particular, measurement tools should guarantee that a large number of artefacts are measured in a coherent and efficient way. They should also guarantee that the delivered measures have a well-specified structure and meaning, agreed upon by the community of researchers interested in analysing the data. A problem that such tools have to face is that all the elements involved are highly variable: the data source can be available in different versions; the measures to be carried out can be defined in (often only slightly) different ways; and different types of analysis usually require different output. Another non-trivial problem is that measured data have to be stored persistently in a way that lets the user retrieve not only the data, but also the meta-data describing the measurements themselves. In this paper we describe a tool that addresses the requirements described above, and present a first implementation that satisfies several of these requirements.
2006
Jesús M. González-Barahona, Megan Conklin, Gregorio Robles
Proceedings of the Workshop on Public Data about Software Development (WoPDaSD 2006)
Workshop on Public Data about Software Development (WoPDaSD 2006)
Como
10 June 2006

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/1501513