This research contributes to the problem of classifying document images. The main addition of this thesis is the exploitation of textual and visual features through an approach that uses Convolutional Neural Networks. The study uses a combination of Optical Character Recognition and Natural Language Processing algorithms to extract and manipulate relevant text concepts from document images. Such content information are embedded within document images, with the aim of adding elements which help to improve the classification results of a Convolutional Neural Network. The experimental phase proves that the overall document classification accuracy of a Convolutional Neural Network trained using these text-augmented document images, is considerably higher than the one achieved by a similar model trained solely on classic document images, especially when different classes of documents share similar visual characteristics. The comparison between our method and state-of-the-art approaches demonstrates the effectiveness of combining visual and textual features. Although this thesis is about document image classification, the idea of using textual and visual features is not restricted to this context and comes from the observation that textual and visual information are complementary and synergetic in many aspects.

Document image classification combining textual and visual features / Noce, Lucia. - (2016).

Document image classification combining textual and visual features.

Noce, Lucia
2016-01-01

Abstract

This research contributes to the problem of classifying document images. The main addition of this thesis is the exploitation of textual and visual features through an approach that uses Convolutional Neural Networks. The study uses a combination of Optical Character Recognition and Natural Language Processing algorithms to extract and manipulate relevant text concepts from document images. Such content information are embedded within document images, with the aim of adding elements which help to improve the classification results of a Convolutional Neural Network. The experimental phase proves that the overall document classification accuracy of a Convolutional Neural Network trained using these text-augmented document images, is considerably higher than the one achieved by a similar model trained solely on classic document images, especially when different classes of documents share similar visual characteristics. The comparison between our method and state-of-the-art approaches demonstrates the effectiveness of combining visual and textual features. Although this thesis is about document image classification, the idea of using textual and visual features is not restricted to this context and comes from the observation that textual and visual information are complementary and synergetic in many aspects.
2016
Document image classification, convolutional neural network, natural language processing
Document image classification combining textual and visual features / Noce, Lucia. - (2016).
File in questo prodotto:
File Dimensione Formato  
Phd_Thesis_Nocelucia_completa.pdf

accesso aperto

Descrizione: testo completo tesi
Tipologia: Tesi di dottorato
Licenza: Non specificato
Dimensione 13.08 MB
Formato Adobe PDF
13.08 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11383/2090588
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact