Estimation of spatial econometric linear models with large datasets: How big can spatial Big Data be?

Antonietta Mira
2019-01-01

Abstract

Spatial econometrics is currently experiencing the Big Data revolution, both in the volume of data and in the velocity with which they accumulate. Regional data, traditionally employed in spatial econometric modeling, can be very large, with information increasingly available at very fine resolution levels such as census tracts, local markets, town blocks, regular grids, or other small partitions of the territory. When dealing with spatial microeconometric models that refer to granular observations on individual economic agents, the number of available observations can be far larger. This paper reports the results of a systematic simulation study on the limits of current methodologies when estimating spatial models with large datasets. In our study we simulate a Spatial Lag Model (SLM), estimate it by Maximum Likelihood (ML), Two-Stage Least Squares (2SLS) and a Bayesian estimator (B), and compare their performance across different sample sizes and different levels of sparsity of the weight matrix. We consider three performance indicators: computing time, required storage, and accuracy of the estimators. The results show that, using standard computing capabilities, the analysis becomes prohibitive and unreliable when the sample size exceeds 70,000, even for low levels of sparsity. This suggests that new approaches are needed to analyze the big datasets that are quickly becoming the new standard in spatial econometrics.
Keywords: Bayesian estimator, Big spatial data, Computational issues, Dense matrix, Maximum Likelihood, Spatial econometric models, Spatial two-stage least squares
Authors: Ghiringhelli, Chiara; Arbia, Giuseppe; Mira, Antonietta
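
To make the setting concrete, the following minimal Python sketch reproduces, on a toy scale, the type of experiment described in the abstract: data are simulated from a Spatial Lag Model y = rho*W*y + X*beta + eps with a sparse, row-standardized weight matrix W, and rho and beta are then recovered by spatial two-stage least squares with the instrument set [X, WX, W^2 X] (the Kelejian-Prucha instruments). This is an illustrative sketch, not the authors' code: the sample size, parameter values, and the random construction of W are assumptions made here purely for demonstration.

    # Minimal, self-contained sketch (assumed setup, not the authors' code).
    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    rng = np.random.default_rng(0)
    n, rho, beta = 5000, 0.5, np.array([1.0, -0.5])   # illustrative values only

    # Sparse weight matrix with ~5 random neighbours per unit, row-standardized.
    W = sp.random(n, n, density=5 / n, random_state=0, format="csr")
    W.setdiag(0)
    row_sums = np.asarray(W.sum(axis=1)).ravel()
    W = sp.diags(1.0 / np.maximum(row_sums, 1e-12)) @ W

    # Simulate the SLM: y = rho*W*y + X*beta + eps, i.e. (I - rho*W) y = X*beta + eps.
    X = rng.standard_normal((n, beta.size))
    eps = rng.standard_normal(n)
    y = spsolve(sp.eye(n, format="csc") - rho * W.tocsc(), X @ beta + eps)

    # Spatial 2SLS: endogenous regressor W*y, instruments H = [X, WX, W^2 X].
    Wy = W @ y
    Z = np.column_stack([Wy, X])                     # regressors [W*y, X]
    H = np.column_stack([X, W @ X, W @ (W @ X)])     # instrument set
    PZ = H @ np.linalg.solve(H.T @ H, H.T @ Z)       # projection of Z on H
    theta_hat = np.linalg.solve(PZ.T @ Z, PZ.T @ y)  # (rho_hat, beta_hat)
    print("estimated (rho, beta):", theta_hat)

Growing n and the density of W in this kind of script is a simple way to see how the memory and computing-time limits discussed in the paper arise on a standard machine.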

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2085930

Citations
  • Scopus: 16
  • Web of Science (ISI): 10