Ensemble learning on visual and textual data for social image emotion classification

Corchs, S
2019-01-01

Abstract

Texts, images, and other information are posted every day on social networks, providing a large amount of multimodal data. The aim of this work is to investigate whether combining and integrating visual and textual data makes it possible to identify the emotions elicited by an image. We focus on image emotion classification over eight emotion categories: amusement, awe, contentment, excitement, anger, disgust, fear, and sadness. For this classification task we propose ensemble learning approaches based on Bayesian model averaging that combine five state-of-the-art classifiers. The proposed ensemble approaches consider the predictions of several classification models, based on visual and textual data, through late and early fusion schemes, respectively. Our investigations show that an ensemble method based on a late fusion of unimodal classifiers achieves high classification performance on all eight emotion classes. The improvement is larger when deep image representations are adopted as visual features rather than hand-crafted ones.
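
As a rough illustration of the late-fusion scheme described in the abstract, the sketch below combines the class-probability outputs of independently trained unimodal classifiers through Bayesian model averaging, weighting each model by a posterior approximated from its held-out log-likelihood. All names (bma_late_fusion, the dummy visual and textual outputs, the validation log-likelihoods) and the uniform model prior are assumptions made for illustration only, not the authors' implementation, which combines five state-of-the-art classifiers trained on social-image features.

```python
import numpy as np

# The eight emotion categories considered in the paper.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]

def bma_late_fusion(prob_list, val_log_likelihoods):
    """Combine per-model class probabilities via Bayesian model averaging.

    prob_list: list of (n_samples, n_classes) arrays, one per base classifier.
    val_log_likelihoods: per-model log-likelihood on held-out validation data,
        used here to approximate the posterior probability of each model
        (uniform model prior assumed).
    Returns BMA-weighted class probabilities of shape (n_samples, n_classes).
    """
    ll = np.asarray(val_log_likelihoods, dtype=float)
    # Posterior over models: normalized likelihoods (softmax of log-likelihoods).
    weights = np.exp(ll - ll.max())
    weights /= weights.sum()
    # P(class | x) = sum_m P(class | x, m) * P(m | data)
    return sum(w * p for w, p in zip(weights, prob_list))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4
    # Dummy probability outputs of a visual and a textual classifier
    # (each row sums to 1); real outputs would come from trained models.
    visual = rng.dirichlet(np.ones(len(EMOTIONS)), size=n)
    textual = rng.dirichlet(np.ones(len(EMOTIONS)), size=n)
    fused = bma_late_fusion([visual, textual],
                            val_log_likelihoods=[-120.0, -135.0])
    for probs in fused:
        print(EMOTIONS[int(np.argmax(probs))])
```

In this late-fusion view each unimodal classifier is trained on its own modality and only the predicted class probabilities are merged, which is the configuration the abstract reports as performing best.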
Image emotion; Multimodal ensemble learning; Bayesian model averaging; Visual and textual social data
Corchs, S; Fersini, E; Gasparini, F
Files in this record:
File: Ensemble learning on visual and textual data for social image emotion classification.pdf (not available)
Type: Publisher's Version (PDF)
License: DRM not defined
Size: 6.19 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2127913
Citations
  • PMC: ND
  • Scopus: 54
  • Web of Science (ISI): 34