Ensemble learning on visual and textual data for social image emotion classification
Corchs, S
2019-01-01
Abstract
Texts, images, and other information are posted every day on social networks and provide a large amount of multimodal data. The aim of this work is to investigate whether combining and integrating visual and textual data makes it possible to identify the emotions elicited by an image. We focus on image emotion classification within eight emotion categories: amusement, awe, contentment, excitement, anger, disgust, fear, and sadness. For this classification task, we propose ensemble learning approaches based on the Bayesian model averaging method that combine five state-of-the-art classifiers. The proposed ensemble approaches consider predictions given by several classification models, based on visual and textual data, through a late and an early fusion scheme, respectively. Our investigations show that an ensemble method based on a late fusion of unimodal classifiers achieves high classification performance across all eight emotion classes. The improvement is higher when deep image representations are adopted as visual features, compared with hand-crafted ones.
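To make the late fusion idea concrete, the following is a minimal sketch of a Bayesian-model-averaging-style combination of unimodal classifiers. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `bma_late_fusion`, the use of a per-model reliability score as a stand-in for the model posterior, and the toy probabilities are all hypothetical. The abstract does not specify how model weights are estimated, and the early fusion variant is not covered here.

```python
import numpy as np

# The eight emotion categories named in the abstract.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]


def bma_late_fusion(probas, reliabilities):
    """Fuse class probabilities from several classifiers (late fusion).

    probas: array-like of shape (n_models, n_samples, n_classes), where each
        slice holds one classifier's predicted class probabilities.
    reliabilities: array-like of shape (n_models,), non-negative scores used
        here as an ASSUMED proxy for the model posterior P(model | data),
        for example a validation-set score.

    Returns the fused probabilities (n_samples, n_classes) and the index of
    the most probable class for each sample.
    """
    probas = np.asarray(probas, dtype=float)
    weights = np.asarray(reliabilities, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1

    # Weighted sum over the model axis: sum_i w_i * P(class | x, model_i).
    fused = np.tensordot(weights, probas, axes=1)
    return fused, fused.argmax(axis=1)


if __name__ == "__main__":
    # Toy example: one image, one visual and one textual classifier.
    p_visual = np.full((1, 8), 0.05)
    p_visual[0, 1] = 0.65  # visual model favours "awe"
    p_text = np.full((1, 8), 0.05)
    p_text[0, 2] = 0.65    # textual model favours "contentment"

    fused, labels = bma_late_fusion([p_visual, p_text], [0.8, 0.2])
    print(EMOTIONS[labels[0]], fused.round(3))
```

Because the weights are normalised and each classifier outputs a probability distribution, the fused scores remain a valid distribution over the eight classes. A more reliable classifier therefore pulls the final decision toward its own prediction without silencing the other modality.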
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.