Bidimensional and multidimensional principal component analysis in long term atmospheric monitoring
GIUSSANI, BARBARA;RECCHIA, SANDRO;POZZI, ANDREA
2016-01-01
Abstract
Atmospheric monitoring produces huge amounts of data. Univariate and bivariate statistics are widely used to investigate variations in the monitored parameters. To summarize the information, graphs are usually used, in the form of histograms or trend profiles (e.g., variable concentration vs. time), as well as two-dimensional plots showing correlations between two variables. However, when dealing with large data sets, at least two problems arise: a great quantity of numbers (statistics) and graphs is produced, and often only two-variable interactions are considered. The aim of this article is to show how multivariate statistics help in handling atmospheric data sets. Multivariate modeling considers all the variables simultaneously and returns the main results as two-dimensional graphs that are easy to read. Principal Component Analysis (PCA, the best-known multivariate method) and multiway PCA (Tucker3) are compared from methodological and interpretative points of view. The article demonstrates that different information can be emphasized depending on the data handling performed. The results and benefits obtained with the more complex model, which considers the entire variability of the system simultaneously, are compared with those provided by the simpler but better-known model. Atmospheric monitoring data (SO2, NOx, NO2, NO, and O3) from the Lake Como area (Italy), collected from 1992 to 2007, were chosen as the case study.
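As a rough illustration of the two-way approach described in the abstract, the sketch below runs an ordinary PCA (via singular value decomposition) on a data cube that has been unfolded into a matrix. Everything in it is an assumption made for illustration: the values are random numbers, not the Lake Como measurements, and the years x months x variables layout, the autoscaling step, and all variable names are hypothetical rather than taken from the article. A Tucker3 model would instead keep the three modes separate and estimate a loading matrix for each.

```python
import numpy as np

rng = np.random.default_rng(0)
variables = ["SO2", "NOx", "NO2", "NO", "O3"]

# Hypothetical three-way array: 16 years x 12 months x 5 variables.
# Synthetic values only; the real monitoring data are not reproduced here.
cube = rng.normal(size=(16, 12, len(variables)))

# Two-way PCA needs a matrix, so the cube is unfolded:
# each row is one year-month sample, each column one variable.
X = cube.reshape(-1, len(variables))

# Autoscale (mean-center and scale to unit variance) each variable.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA through the SVD: Xs = U * diag(s) * Vt.
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = U * s                      # sample coordinates on the components
loadings = Vt.T                     # variable contributions to the components
explained = s**2 / np.sum(s**2)     # fraction of variance per component

# The first two columns of scores and loadings are what a
# two-dimensional score plot and loading plot would display.
print(scores[:, :2].shape, loadings[:, :2].shape)
print(explained.round(3))
```

Unfolding merges the year and month modes into a single sample mode, so a PCA of this matrix cannot separate between-year from within-year patterns on its own. That is the kind of structure a three-way model such as Tucker3 is designed to retain.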