Abstract
While the problem to find needed information on the Web is critical, it is arguably much less pressing nowadays than it was over a decade ago when the Web was emerging. Back then it was much more difficult to find a Web resource of interest, because the search engines were in their infancy covering much lesser portion of the Web by their indices, armed with embryonic page ranking algorithms. Now, Web-search is by far not perfect yet, but definitely went a long way to become an everyday “go-to” resource for millions of people. By contrast, access to textual information is not even close to what Web-search algorithms offer today. In fact, it does not differ much from what everyone had a decade ago. That is keyword-search (exact substring match) is often the only way to find needle in a haystack in most modern word processors and text corpora search engines. Here we demonstrate ReadFast — a system, capable to extract certain structure from any natural language text corpus and use it to provide more relevant search results than keyword-search for specific classes of queries. Our evaluation justified significant relevance gain (20–30%) for two large Biomedical text corpora.