Fourth International Conference Docum ...
Jiangying Zhou, Panasonic Information and Networking Technology Laboratory
Daniel Lopresti, Panasonic Information and Networking Technology Laboratory
In this paper, we examine the problem of locating and extracting text from in-line images of World Wide Web pages. We describe a text detection algorithm based on color clustering and connected component analysis. The algorithm first quantizes the color space of the input image into a number of color classes using a parameter-free clustering procedure. It then identifies text-like connected components in each color class based on their shapes. Finally, a post-processing procedure aligns text-like components into text lines. The experimental results show that our text extraction algorithm works well on a variety of test images.
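The three stages named in the abstract (color quantization, shape-based filtering of connected components, and alignment into text lines) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the parameter-free clustering procedure, so uniform color binning is swapped in as a stand-in. The function names, the `levels`, `max_frac`, and `min_pixels` parameters, and the vertical-overlap grouping rule are all hypothetical choices made for illustration.

```python
from collections import deque


def quantize(image, levels=2):
    """Map each RGB pixel to a coarse color class by uniform binning.

    Assumption: this replaces the paper's parameter-free clustering,
    which the abstract does not describe.
    """
    step = 256 // levels
    return [[tuple(min(c // step, levels - 1) for c in px) for px in row]
            for row in image]


def connected_components(classes):
    """Return the pixel lists of 4-connected regions sharing a color class."""
    h, w = len(classes), len(classes[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            label = classes[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx]
                            and classes[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a pixel list."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def is_text_like(pixels, image_h, image_w, max_frac=0.5, min_pixels=2):
    """Hypothetical shape test: not a speck, and small relative to the image."""
    top, left, bottom, right = bounding_box(pixels)
    height = bottom - top + 1
    width = right - left + 1
    return (len(pixels) >= min_pixels
            and height <= max_frac * image_h
            and width <= max_frac * image_w)


def group_into_lines(boxes):
    """Group boxes whose vertical extents overlap, scanning left to right."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):
        for line in lines:
            line_top = min(b[0] for b in line)
            line_bottom = max(b[2] for b in line)
            if box[0] <= line_bottom and box[2] >= line_top:
                line.append(box)
                break
        else:
            lines.append([box])
    return lines


def extract_text_lines(image):
    """Run the three stages on an image given as rows of (r, g, b) tuples."""
    h, w = len(image), len(image[0])
    classes = quantize(image)
    boxes = [bounding_box(px) for px in connected_components(classes)
             if is_text_like(px, h, w)]
    return group_into_lines(boxes)
```

Given an image as a list of rows of `(r, g, b)` tuples, `extract_text_lines(image)` returns a list of text lines, each a list of `(top, left, bottom, right)` boxes. Large regions such as the page background are rejected by the size test, and the remaining components are grouped by vertical overlap. A real system would need a more discriminating shape test and a clustering step that adapts to the image's color distribution.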
Index Terms:
text detection, information retrieval, World Wide Web.
Citation:
Jiangying Zhou, Daniel Lopresti, "Extracting Text from WWW Images," Fourth International Conference Document Analysis and Recognition (ICDAR'97), p. 248, 1997.