This is more a question of theory than of practice. I am working on a project that is a fairly simple directory of links. The model is similar to the Dmoz or Yahoo directory, except that each entry has some additional attributes.
I have a hierarchical taxonomy attached to all entries through a many-to-many relationship. All entries are sorted into these categories, and everything works fine. But what use is a directory if there is no search option?
Here is a little more detail about my models. Each entry has a name, description, URL and several social profiles: YouTube, Twitter, Flickr and a couple of others. Each entry can have a logo attached to it and a hidden tag field. In addition, the title and description are stored in three different languages. So basically I would like the search results to be ranked by:
- Relevance (including taxonomy)
- Perhaps whether the entry has a logo
- Perhaps whether its profile is 100% complete
I tried Sphinx and am currently working with Lucene, but it seems I don't understand the theory behind search well enough. I hope it makes sense that completed entries should appear higher than the others, but I can't work out how the scoring numbers should be combined. I would not want irrelevant entries to be displayed on top just because a word happens to appear somewhere in the description, since the titles are more relevant.
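To make the idea concrete, here is a rough sketch in plain Python of the kind of ranking I have in mind. It is not Sphinx or Lucene code, and everything in it is an assumption made up for illustration: the `Entry` fields, the `FIELD_WEIGHTS` values, and the boost factors of 0.2 and 0.3. The text score is a naive weighted term count, so a title match counts more than a description match, and the logo and profile completeness only multiply an existing match rather than creating one.

```python
from dataclasses import dataclass, field

# Hypothetical per-field weights: a hit in the title is worth
# more than a hit in the description.
FIELD_WEIGHTS = {"title": 5.0, "tags": 3.0, "categories": 2.0, "description": 1.0}


@dataclass
class Entry:
    title: str
    description: str
    tags: str = ""
    categories: str = ""  # names of the taxonomy terms the entry belongs to
    has_logo: bool = False
    # Social profiles, e.g. {"twitter": "...", "youtube": ""}; empty means missing.
    profiles: dict = field(default_factory=dict)


def text_score(entry: Entry, query: str) -> float:
    """Naive relevance: weighted count of query terms in each field."""
    terms = query.lower().split()
    score = 0.0
    for name, weight in FIELD_WEIGHTS.items():
        words = getattr(entry, name).lower().split()
        score += weight * sum(words.count(term) for term in terms)
    return score


def completeness(entry: Entry) -> float:
    """Fraction of social profiles that are filled in (0.0 to 1.0)."""
    if not entry.profiles:
        return 0.0
    filled = sum(1 for value in entry.profiles.values() if value)
    return filled / len(entry.profiles)


def rank_score(entry: Entry, query: str) -> float:
    """Text relevance multiplied by a quality boost (assumed factors)."""
    base = text_score(entry, query)
    if base == 0:
        return 0.0  # quality alone must never surface a non-matching entry
    boost = 1.0 + 0.2 * entry.has_logo + 0.3 * completeness(entry)
    return base * boost
```

With this sketch, an entry whose title contains the query term outranks one that only repeats the term in its description, and among equal text matches the entry with a logo and a complete profile comes first. What I don't know is how to express the same thing properly in a real engine.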
So my question is: are there any books, methods, or even other search engines (if Sphinx and Lucene aren't good enough) that you would recommend for this problem? I would like not only to get full control over the search results and their ranking, but also to give my visitors correct and relevant information.
Links to interesting articles are also appreciated!
And no, I'm not trying to rebuild Google :)
Thanks :)
search search-engine full-text-search lucene sphinx
kovshenin