Document clustering techniques have been applied in several areas, with the web among the most recent and influential. Both general-purpose and text-oriented techniques exist and can be used to cluster a collection of documents in many ways. This work proposes a novel heuristic online document clustering model that can be specialized with a variety of text-oriented similarity measures. The proposed model was evaluated experimentally in the e-commerce domain. Performance was measured using a clustering-oriented metric based on the F-measure and compared with that of other well-known approaches. The results confirm the validity of the proposed method in both batch scenarios and online scenarios, where document collections can grow over time.
|Title:||An online document clustering technique for short web contents|
|Publication date:||2009|
|Appears in types:||Journal article|