Combining Textual and Visual Features to Identify Anomalous User-generated Content

NOCE, LUCIA; GALLO, IGNAZIO; ZAMBERLETTI, ALESSANDRO
2015-01-01

Abstract

Anomaly detection is used extensively in a wide variety of applications; such techniques aim to find patterns in data that do not conform to expected behavior. In this work we apply anomaly detection to the task of discovering anomalies in user-generated commercial product descriptions. While most other works in the literature rely exclusively on textual features, we combine those textual descriptors with visual information extracted from the media resources associated with each product description. Given a large corpus of documents, the proposed system infers the key features describing the behavioral traits of expert users, and automatically reports whenever a newly generated description contains suspicious or low-quality textual/visual elements. We show that the joint use of textual and visual features helps in obtaining a robust detection model that can be employed in an enterprise environment to automatically mark suspicious descriptions for further manual inspection.
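The abstract does not specify the features or the detection model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes two toy textual features (word count and vocabulary richness), two hypothetical precomputed visual features (image count and mean image resolution in megapixels), and a simple per-feature z-score detector fitted on descriptions written by trusted users. All names, features, sample data and the threshold are illustrative assumptions.

```python
from statistics import mean, pstdev


def text_features(text):
    """Two toy textual features: word count and vocabulary richness."""
    words = text.lower().split()
    n = len(words)
    richness = len(set(words)) / n if n else 0.0
    return [float(n), richness]


def joint_features(text, visual):
    """Concatenate textual features with precomputed visual features.

    `visual` is a hypothetical list such as [image_count, mean_megapixels].
    """
    return text_features(text) + list(visual)


class ZScoreDetector:
    """Flags a feature vector that deviates strongly from the reference corpus."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.means = [mean(c) for c in cols]
        # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
        self.stds = [pstdev(c) or 1.0 for c in cols]
        return self

    def score(self, row):
        # Largest absolute z-score across all textual and visual features.
        return max(abs(x - m) / s for x, m, s in zip(row, self.means, self.stds))

    def is_anomalous(self, row, threshold=3.0):
        return self.score(row) > threshold


# Hypothetical reference corpus: (description, [image_count, mean_megapixels]).
reference = [
    ("compact steel kettle with a one litre capacity and auto shut off", [3, 2.0]),
    ("soft cotton towel set in four colours with reinforced hems", [4, 2.1]),
    ("wireless mouse with adjustable sensitivity and a two year warranty", [3, 1.9]),
    ("ceramic plant pot with drainage hole and matching saucer included", [4, 2.0]),
]

detector = ZScoreDetector().fit([joint_features(t, v) for t, v in reference])

suspicious = joint_features("buy buy buy now", [0, 0.1])
normal = joint_features(
    "stainless steel water bottle with leak proof lid and carry loop", [3, 2.0]
)

flag_suspicious = detector.is_anomalous(suspicious)
flag_normal = detector.is_anomalous(normal)
```

In this sketch a description is marked for manual inspection when any single feature, textual or visual, lies more than `threshold` standard deviations from the reference corpus. A real system would likely use richer descriptors and a learned model.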
2015
Alexander F. Gelbukh
Computational Linguistics and Intelligent Text Processing, 16th International Conference, CICLing 2015, Cairo, Egypt
16th International Conference on Intelligent Text Processing and Computational Linguistics
Cairo, Egypt
April 14–20, 2015

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2050393