Best text search engine for integrating with a custom web application?

Question

We have a web application that allows users to upload documents, create their own documents, and so on. The uploaded files are stored on Amazon S3, and the created content is stored in a MySQL database. What I'm looking for is some kind of search engine that I can feed all of our text documents, each with a unique identifier, and that builds an index or something similar. Later I can give it search queries, and it will return the best-matching documents (by their identifier), ideally along with snippets of the relevant text.

Basically, we want our users to be able to search through their own repository of uploaded material, as well as anything other users have flagged as public. The solution should run on a standard Linux server, and ideally it would be open source, but I will also consider paid solutions if they are not outrageously priced.
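
To make the requirement above concrete, here is a toy Python sketch of the interface being described: documents go in with an identifier, and a query returns matching identifiers with a text snippet. The class name `TinyIndex` and its methods are hypothetical and purely illustrative. Real engines add ranking, stemming, on-disk storage and much more.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy in-memory inverted index (illustrative only, not a real engine)."""

    def __init__(self):
        self.docs = {}                    # doc_id -> original text
        self.postings = defaultdict(set)  # term -> set of doc_ids

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        # Store the text and record which terms occur in which document.
        self.docs[doc_id] = text
        for term in self.tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query, width=30):
        # Score each document by how many distinct query terms it contains.
        terms = list(dict.fromkeys(self.tokenize(query)))
        scores = defaultdict(int)
        for term in terms:
            for doc_id in self.postings.get(term, ()):
                scores[doc_id] += 1
        ranked = sorted(scores, key=lambda d: (-scores[d], str(d)))
        return [(d, self.snippet(self.docs[d], terms, width)) for d in ranked]

    @staticmethod
    def snippet(text, terms, width):
        # Return some context around the first query term found in the text.
        lower = text.lower()
        for term in terms:
            pos = lower.find(term)
            if pos != -1:
                start = max(0, pos - width)
                return text[start:pos + len(term) + width]
        return text[:2 * width]


index = TinyIndex()
index.add("s3-001", "The quick brown fox")
index.add("db-042", "A lazy dog and a quick cat")
for doc_id, fragment in index.search("quick fox"):
    print(doc_id, "|", fragment)
```

The candidates listed below do the same job at scale: each accepts documents keyed by an identifier and returns ranked identifiers for a query.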

So far I have found three potential candidates:

MySQL full-text search - some posts I have read say it is very slow.
Apache Lucene - unfortunately written in Java, but I will use it if necessary. Supposedly fast.
Sphinx - does not seem to be very popular; ideally, any solution I choose would have strong community support.

Please let me know if there are any other good options that I have missed, or if you have experience with any of the above.

+3

linux web-applications search full-text-search

davr Sep 22 '08 at 22:22

6 answers

Sphinx may be worth a look, as it works well with several common RDBMSs (MySQL in particular).

+2

Marc Gear Sep 29 '08 at 16:15

There is also Xapian, which is fast and completely customizable.

It supports custom indexes that allow you to index data that is not stored in the database, which may be useful for your documents stored on S3.

+1

sock Sep 29 '08 at 15:34

I believe that Google will have a solution that suits your needs. Start here: Google Enterprise

0

teratorn Sep 22 '08 at 22:26

There is a Ruby port of Lucene called Ferret. In addition to the Ruby API, you can get a C implementation called cFerret.

0

AShelly Sep 22 '08 at 22:42

Lucene is very good. And although it was originally written in Java, there is a PHP implementation: http://framework.zend.com/manual/en/zend.search.lucene.html

0

Ryan White Sep 22 '08 at 22:46

Mauricio Scheffer · Accepted Answer · 2008-09-29 16:12

Take a look at Solr. It is based on Lucene, so it is very fast, and it is very easy to use from any platform.
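
As a rough illustration of the "any platform" point: Solr is queried over plain HTTP, so a client only needs to build a URL and parse the response. The Python sketch below only constructs such a URL. The host, the core name `docs`, and the field names `id` and `text` are assumptions for illustration, not something from the original answer; they depend on how your Solr instance and schema are set up.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, query, rows=10, highlight_field="text"):
    """Build a Solr select URL that asks for document ids plus highlighted snippets.

    base_url is assumed to point at a Solr core, e.g. http://localhost:8983/solr/docs
    (hypothetical host and core name).
    """
    params = {
        "q": query,                # the user's search query
        "rows": rows,              # maximum number of hits to return
        "fl": "id",                # only return the unique identifier field (assumed name)
        "hl": "true",              # enable highlighting (text snippets)
        "hl.fl": highlight_field,  # field to take snippets from (assumed name)
        "wt": "json",              # ask for a JSON response
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


print(solr_select_url("http://localhost:8983/solr/docs", "quarterly report"))
```

Fetching that URL with any HTTP client should return the matching identifiers along with highlighted fragments, which is the interface the question asks for.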
