What are the disadvantages of using Lucene?

I am thinking about using Lucene in my project for very fast search. I know that Lucene creates its own files, where it stores all the data and indexes.

What are the disadvantages of using Lucene? Are there any?

Do you need to do anything to maintain the index files, or does it work fine without any external help?

PS I know that there is also Lucene.NET, and I am sure that the same rules apply there.

+6
java full-text-search lucene
4 answers

Lucene is wonderful. It has a very flexible, surprisingly fast, and robust API. The mailing list is extremely helpful.

The index files need a little maintenance, but this can be done with the provided tools. Optimizing the index is the main task, and it is only necessary if you update the index regularly.

I would suggest looking at Solr. It is essentially a web app and a set of tools that sit on top of Lucene. It makes it easy to create new indexes, tune their performance, and set up master/slave synchronization for a scalable search cluster. Whether you need it, of course, depends on your actual requirements.

As a personal example, I ran the search index for a large, well-known gaming company. The index contained hundreds of thousands of entries in several languages (worldwide) and locales. It served a million requests every day on the cluster, using virtually no CPU and a reasonable amount of memory. It handled a load of up to about 300 million requests per day on the hardware we had, and it would scale linearly by simply adding more boxes to the cluster. Solr and Lucene were the main tools for this.
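To put those figures in perspective, here is a rough back-of-the-envelope sketch in Java. It only converts requests per day into an average rate per second and estimates a node count under the linear-scaling assumption. The class name, the method names, and the 500 requests/second per-node capacity are all hypothetical and chosen for illustration; real traffic is bursty, so peak load will be higher than these averages.

```java
import java.util.Locale;

final class LoadSketch {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Average request rate per second for a given daily volume. */
    static double averagePerSecond(long requestsPerDay) {
        return (double) requestsPerDay / SECONDS_PER_DAY;
    }

    /** Nodes needed if throughput scales linearly with node count (an assumption). */
    static int nodesNeeded(double targetPerSecond, double perNodePerSecond) {
        return (int) Math.ceil(targetPerSecond / perNodePerSecond);
    }

    public static void main(String[] args) {
        double normal = averagePerSecond(1_000_000L);    // everyday load from the answer
        double heavy = averagePerSecond(300_000_000L);   // heaviest load from the answer
        double perNode = 500.0;                          // hypothetical per-node capacity

        System.out.printf(Locale.ROOT, "normal: %.1f req/s%n", normal);
        System.out.printf(Locale.ROOT, "heavy:  %.1f req/s%n", heavy);
        System.out.printf(Locale.ROOT, "nodes for heavy load: %d%n", nodesNeeded(heavy, perNode));
    }
}
```

A million requests a day averages out to only about a dozen per second, which is consistent with the cluster barely using any CPU at that level.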

If I had to name a downside, it would be the learning curve. It takes a little while to understand, and if you want a truly optimized solution you need to know it well. However, that is true of any search tool if you run it yourself. The documentation, wiki, and mailing list provide great support for the climb.

+9

I have limited experience with Lucene, though what I have had was great. The disadvantages I see are mainly from a business perspective:

  • I would have to actively sell Lucene to my boss; by default we would use SQL Server. To do that, I would have to prove beyond doubt that Lucene works better (not just comparably) for our use case. I guess this comes down to "nobody ever got fired for buying IBM."
  • Ongoing maintenance and bug fixes for Lucene.Net in particular are questionable at the moment, which again makes it a tougher sell. I hope the community can rally.
+2

Lucene works great for many people and companies. However, your mileage may vary. One possible problem is Lucene's scoring model. It uses a combination of TF-IDF and the Boolean model, while other IR tools use the probabilistic BM25, which is stronger. That said, you can work with Lucene for many years and the search results will be pretty good. In addition, scaling to many millions of documents is not easy.
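To make the TF-IDF versus BM25 contrast concrete, here is a minimal self-contained Java sketch. It uses the textbook formulas, not Lucene's actual scoring code, and the class name, method names, and corpus numbers are hypothetical. The `K1` and `B` values are the commonly quoted BM25 defaults and should be treated as an assumption. The point it illustrates is that plain TF-IDF grows linearly with term frequency, while BM25 saturates and also normalizes for document length.

```java
import java.util.Locale;

final class ScoringSketch {
    // Commonly quoted BM25 defaults (assumed here; tune for your corpus).
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Textbook TF-IDF: raw term frequency times ln(N / df). */
    static double tfIdf(int tf, long docCount, long docFreq) {
        return tf * Math.log((double) docCount / docFreq);
    }

    /** BM25 IDF component (the variant that never goes negative). */
    static double bm25Idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 score of one term in one document. */
    static double bm25(int tf, long docCount, long docFreq,
                       double docLen, double avgDocLen) {
        // Length normalization: longer-than-average documents are penalized.
        double norm = K1 * (1.0 - B + B * docLen / avgDocLen);
        return bm25Idf(docCount, docFreq) * (tf * (K1 + 1.0)) / (tf + norm);
    }

    public static void main(String[] args) {
        long docCount = 100_000;  // hypothetical corpus size
        long docFreq = 1_000;     // hypothetical number of docs containing the term
        for (int tf : new int[] {1, 2, 5, 20, 100}) {
            System.out.printf(Locale.ROOT, "tf=%3d  tf-idf=%8.2f  bm25=%6.2f%n",
                    tf,
                    tfIdf(tf, docCount, docFreq),
                    bm25(tf, docCount, docFreq, 100.0, 100.0));
        }
    }
}
```

Running it shows the TF-IDF column growing in proportion to `tf`, while the BM25 column levels off below `bm25Idf * (K1 + 1)`, so a document that repeats a term a hundred times does not swamp the ranking.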

It comes down to your specific use case. It is best to start with a trial using Solr and see whether it meets your needs.

+2

Lucene has a scalability issue. Its performance worsens as the index gets bigger and bigger.

+2
