Proximity Estimation and Hardness of Short-Text Corpora
2008 19th International Conference on ...
In this work, we investigate the relative hardness of short-text corpora in clustering problems and how this hardness relates to traditional similarity measures. Our approach attempts to establish a connection between the hardness of a corpus and the precision level exhibited by similarity measures, according to the results obtained with different cluster validity measures on the "ideal" clustering of each corpus. We also propose a new validity measure, named contiguity error, which allowed us to observe this connection consistently across all the collections considered.
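As a rough illustration of scoring a similarity measure against the "ideal" (gold) clustering of a corpus, the sketch below computes the gap between mean same-cluster and mean cross-cluster cosine similarity. This is only an assumed stand-in for a generic validity measure: it is not the contiguity error proposed in the paper, whose definition is not given here, and the function names, toy documents, and labels are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def intra_inter_gap(docs, labels):
    """Mean same-cluster similarity minus mean cross-cluster similarity.

    Hypothetical helper: a larger gap suggests the similarity measure
    separates the given clustering more cleanly on this corpus.
    """
    vecs = [Counter(d.lower().split()) for d in docs]
    intra, inter = [], []
    for i, j in combinations(range(len(docs)), 2):
        sim = cosine(vecs[i], vecs[j])
        (intra if labels[i] == labels[j] else inter).append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(intra) - mean(inter)


# Toy corpus and gold labels, invented for illustration only.
docs = [
    "clustering short text corpora",
    "short text clustering methods",
    "protein folding simulation",
    "simulation of protein structure",
]
gold = [0, 0, 1, 1]
gap = intra_inter_gap(docs, gold)
```

Under these assumptions, a corpus on which the gold clustering yields a small or negative gap would be one where the similarity measure gives little help to a clustering algorithm, which is the kind of relationship between hardness and proximity estimation the abstract describes.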
Index Terms:
clustering, short-text corpora, proximity estimation, cluster validity measures
Citation:
Marcelo Luis Errecalde, Diego Ingaramo, Paolo Rosso, "Proximity Estimation and Hardness of Short-Text Corpora," dexa, pp. 15-19, 2008 19th International Conference on Database and Expert Systems Application, 2008