Atmospheric monitoring produces huge amounts of data. Univariate and bivariate statistics are widely used to investigate variations in the parameters. To summarize information graphs are usually used in the form of histograms or tendency profiles (e.g., variable concentration vs. time), as well as bidimensional plots where two-variable correlations are considered. However, when dealing with big data sets at least two problems arise: a great quantity of numbers (statistics) and graphs are produced, and only two-variable interactions are often considered. The aim of this article is to show how the use of multivariate statistics helps in handling atmospheric data sets. Multivariate modeling considers all the variables simultaneously and returns the main results as bidimensional graphs that are easy-to-read. Principal Component Analysis (PCA; the most known multivariate method) and multiway-PCA (Tucker3) are compared from methodological and interpretative points of view. The article demonstrates the ability to emphasize different information depending on the data handling performed. The results and benefits achieved using a more complex model that allows for the simultaneous consideration of the entire variability of the system are compared with the results provided by the simpler but better-known model. Atmospheric monitoring (SO2, NOx, NO2, NO, and O3) data from the Lake Como Area (Italy) since 1992 to 2007 were chosen for consideration for the case study.

Bidimensional and multidimensional principal component analysis in long term atmospheric monitoring

GIUSSANI, BARBARA;RECCHIA, SANDRO;POZZI, ANDREA
2016-01-01

Abstract

Atmospheric monitoring produces huge amounts of data. Univariate and bivariate statistics are widely used to investigate variations in the parameters. To summarize information graphs are usually used in the form of histograms or tendency profiles (e.g., variable concentration vs. time), as well as bidimensional plots where two-variable correlations are considered. However, when dealing with big data sets at least two problems arise: a great quantity of numbers (statistics) and graphs are produced, and only two-variable interactions are often considered. The aim of this article is to show how the use of multivariate statistics helps in handling atmospheric data sets. Multivariate modeling considers all the variables simultaneously and returns the main results as bidimensional graphs that are easy-to-read. Principal Component Analysis (PCA; the most known multivariate method) and multiway-PCA (Tucker3) are compared from methodological and interpretative points of view. The article demonstrates the ability to emphasize different information depending on the data handling performed. The results and benefits achieved using a more complex model that allows for the simultaneous consideration of the entire variability of the system are compared with the results provided by the simpler but better-known model. Atmospheric monitoring (SO2, NOx, NO2, NO, and O3) data from the Lake Como Area (Italy) since 1992 to 2007 were chosen for consideration for the case study.
2016
http://www.mdpi.com/2073-4433/7/12/155/pdf
Atmospheric monitoring; Multivariate analysis; Multiway analysis; PCA; Tropospheric pollutants; Tucker3; Environmental Science (miscellaneous); Atmospheric Science
Giussani, Barbara; Roncoroni, Simone; Recchia, Sandro; Pozzi, Andrea
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11383/2060368
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 4
social impact