Document image classification combining textual and visual features.

This research addresses the problem of classifying document images. The main contribution of this thesis is the joint exploitation of textual and visual features through an approach based on Convolutional Neural Networks. The study combines Optical Character Recognition and Natural Language Processing algorithms to extract and process relevant textual concepts from document images. This content information is then embedded within the document images, adding elements that help improve the classification results of a Convolutional Neural Network. The experiments show that the overall classification accuracy of a Convolutional Neural Network trained on these text-augmented document images is considerably higher than that of a similar model trained solely on the original, unaugmented document images, especially when different document classes share similar visual characteristics. A comparison between our method and state-of-the-art approaches demonstrates the effectiveness of combining visual and textual features. Although this thesis concerns document image classification, the idea of combining textual and visual features is not restricted to this context; it stems from the observation that textual and visual information are complementary and synergistic in many respects.
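The abstract does not specify how the textual information is embedded in the images, so the following is only a minimal sketch of one plausible way to do it, not the thesis's implementation. It assumes that an OCR step has already produced word-level bounding boxes and that an NLP step has produced a table of key terms; the `TERM_COLOURS` table, the `embed_text_cues` function, and the `(text, left, top, width, height)` tuple format are all hypothetical names introduced for illustration.

```python
import numpy as np

# Hypothetical mapping from a key term to an RGB colour. In practice such a
# table would come from an NLP step that scores terms per document class.
TERM_COLOURS = {
    "invoice": (255, 0, 0),
    "total": (255, 0, 0),
    "abstract": (0, 0, 255),
}


def embed_text_cues(image, ocr_words, term_colours=TERM_COLOURS):
    """Return a copy of `image` with a filled box painted over each key term.

    image: H x W x 3 uint8 array.
    ocr_words: iterable of (text, left, top, width, height) tuples, an
    assumed format for word-level OCR output.
    """
    out = image.copy()
    h, w = out.shape[:2]
    for text, left, top, width, height in ocr_words:
        colour = term_colours.get(text.lower())
        if colour is None:
            continue  # not a key term: leave the pixels untouched
        # Clip the box to the image bounds before painting.
        x0, y0 = max(left, 0), max(top, 0)
        x1, y1 = min(left + width, w), min(top + height, h)
        out[y0:y1, x0:x1] = colour
    return out


# Usage: a blank white page with one key term and one ordinary word.
page = np.full((100, 200, 3), 255, dtype=np.uint8)
words = [("Invoice", 10, 10, 60, 12), ("hello", 80, 10, 40, 12)]
augmented = embed_text_cues(page, words)
```

The augmented array keeps the shape of the input, so under these assumptions it could be passed to an image classifier in place of the original page.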

Document image classification combining textual and visual features / Noce, Lucia. - (2016).

Noce, Lucia

2016-01-01

Year of defense: 2016
Keywords: Document image classification, convolutional neural network, natural language processing
Citation: Document image classification combining textual and visual features / Noce, Lucia. - (2016).
Appears in types: Doctoral thesis (formerly InsubriaSPACE)

Files in this item:

File	Size	Format
Phd_Thesis_Nocelucia_completa.pdf (open access; description: full text of the thesis; type: doctoral thesis; license: not specified)	13.08 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2090588

Notice: the University validates only the attached PDF files.
