Why are document stores such as Lucene / Solr not included in NoSQL conversations?

We have all come across the recent promotion of NoSQL solutions. MongoDB, CouchDB, BigTable, Cassandra, and others have been listed as NoSQL options. Here is an example:

http://architects.dzone.com/articles/what-nosql-store-should-i-use

However, three years ago a co-worker and I used Lucene.NET in a way that seems to fit the description of NoSQL. We did not use it only for user-entered search queries; we also used it to make access to several reindexed RDBMS tables extremely fast. To manage and expose these indexes, we implemented our own .NET service. When I left the company, the team switched to Solr. (For those not in the know, Solr is a web service that wraps Lucene with RESTful queries and index dumps.)

I don't understand why Solr is not considered in the typical lists of NoSQL options. Am I missing something? I assume there are technical reasons why Solr is not comparable to the likes of CouchDB, and in fact I understand that CouchDB uses Lucene as a data store (yes?), but what is holding Solr back?

I am not asking as some kind of Solr fan. I just don't understand why Solr and the like do not meet the definition of NoSQL, and if Solr technically does meet the definition, what it is about Solr that makes people dismiss it. I ask because I am having trouble deciding whether I should keep using Lucene-based solutions (like Solr) for the systems I build, or whether I should research these other options further.

+59
nosql lucene solr
Jul 26 '10 at 23:46
6 answers

I once listened to an interview with fiction writer Ursula K. Le Guin. The interviewer asked her about authors who work in different genres of writing. What makes one author a romance writer, another a mystery writer, and another a science fiction writer? Le Guin responded by explaining:

Genre is marketing, not content.

It was a candid statement.

I think the same goes for technology solutions. The NoSQL movement is attracting attention because it has marketing energy behind it right now. NoSQL data stores such as Hadoop, CouchDB, and MongoDB have commercial enterprises supporting them, promoting their solutions as new, innovative, and exciting so that they can grow their business. The term "NoSQL" is a marketing brand that helps them explain their value.

You are right that Lucene / Solr is technically very similar to a NoSQL document store: it is a denormalized bag of documents (their term) with fields that are not necessarily consistent across the collection. It is indexed in a sophisticated way, allowing you to search across all fields or within specific fields.
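As a rough illustration of that description, here is a toy inverted index in Python. It is only a sketch of the general idea (documents with inconsistent fields, searchable across all fields or within one); it is not how Lucene is actually implemented, and the sample documents and the `search` helper are made up for this example.

```python
from collections import defaultdict

# Documents are plain dicts; note that they do not share the same fields.
docs = {
    1: {"title": "Red car for sale", "tag": "car"},
    2: {"title": "Bike repair guide", "author": "Sam"},
    3: {"body": "A car and a bike", "tag": "misc"},
}

# Inverted index: (field, term) -> set of document ids.
index = defaultdict(set)
for doc_id, fields in docs.items():
    for field, text in fields.items():
        for term in text.lower().split():
            index[(field, term)].add(doc_id)


def search(term, field=None):
    """Return sorted ids of documents containing `term`, optionally in one field."""
    term = term.lower()
    if field is not None:
        return sorted(index.get((field, term), set()))
    hits = set()
    for (_, indexed_term), ids in index.items():
        if indexed_term == term:
            hits |= ids
    return sorted(hits)


print(search("car"))               # any field
print(search("car", field="tag"))  # only the "tag" field
```

A real engine adds analysis (stemming, stop words), relevance scoring, and on-disk segment files on top of this basic structure.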

But this is not the genre that Lucene uses to explain its value. The project does not have the same marketing and business development mission, because it is managed by the Apache Foundation. Its developers are happy to focus on the full-text search use case, even though the technology could be used in other ways. They follow the principle of software success: do one thing and do it well.

+70
Jul 26 '10 at 23:58

After more searching on Google, I found this article, which covers the question well enough:

https://web.archive.org/web/20100504055638/http://www.lucidimagination.com/blog/2010/04/30/nosql-lucene-and-solr/

The fact is, Lucene / Solr is NoSQL and can be considered one of the more mature "ancestors" of NoSQL. It just doesn't get the NoSQL hype it deserves, because it didn't coin the term "NoSQL" and its users don't use that term, so the hype machine has ignored it.

+12
Jul 27 '10 at 0:37

I think the most important reason Solr / Lucene is not on the NoSQL lists is that, until recently, using Lucene as a real-time system was a pain. The usual workflow for any production application was to index incremental updates in batches and, for example, refresh the index every 5 minutes.
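The batch workflow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Lucene or Solr code: the `BatchedIndex` class, its method names, and the injectable `clock` parameter are all hypothetical, and the 300-second default simply mirrors the "every 5 minutes" example.

```python
import time


class BatchedIndex:
    """Toy index whose pending updates become searchable only after a commit."""

    def __init__(self, commit_interval_s=300.0, clock=time.monotonic):
        self.commit_interval_s = commit_interval_s
        self.clock = clock
        self.visible = {}   # documents that readers can see
        self.pending = {}   # updates waiting for the next commit
        self.last_commit = clock()

    def add(self, doc_id, doc):
        """Buffer an update; it is not visible until the next commit."""
        self.pending[doc_id] = doc
        self.maybe_commit()

    def maybe_commit(self):
        """Publish pending updates if the commit interval has elapsed."""
        if self.clock() - self.last_commit >= self.commit_interval_s:
            self.visible.update(self.pending)
            self.pending.clear()
            self.last_commit = self.clock()
            return True
        return False

    def get(self, doc_id):
        """Read a document as a searcher would see it."""
        return self.visible.get(doc_id)
```

The gap between `add` and the next commit is the point of the sketch: a document written just after a commit stays invisible for up to the full interval, which is awkward for anything expected to behave like a primary data store.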

+4
Oct 08 2018-10-10

I think stimpy77 is partly right about NoSQL branding. But NoSQL also implies a storage platform that is simpler and more lightweight than SQL-based solutions. And while Solr / Lucene shares some aspects (it stores data), I don't think it follows that Solr / Lucene can be used as the primary data store for anything that has relationships. Of course, you can throw a lot of documents into it, and powerful search will pull them back out. But as soon as you want relationships, CouchDB and the others, which have a query syntax, do much better. In that case, search is a clumsy solution. Think of the use case "find all documents tagged with the word 'car'". If I have some structure in my data, then it is easy to look up the "car" tag and return all of its documents, compared to a search query that includes fq=tag:"car". Search becomes more powerful the fewer relationships you have; the more relationships you have, the better off you are with a data store like CouchDB and its siblings. That's why you still see CouchDB and friends paired with Solr, and vice versa! Let each one do what it does best.
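To make the "car" tag comparison concrete, here is a small Python sketch of the two access styles. The Solr host, port, and core name (`localhost:8983`, `mycore`) are placeholders, the `tag` field comes from the example above, and `docs_by_tag` is a made-up stand-in for a keyed lookup in a document store.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; host, port, and core name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def tag_filter_url(tag):
    """Build a select URL that matches all documents, filtered by the tag field."""
    # Note: this sketch does not escape quotes inside the tag value.
    params = {"q": "*:*", "fq": f'tag:"{tag}"'}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


# Keyed lookup: a precomputed mapping from tag to document ids (made-up data).
docs_by_tag = {"car": ["doc1", "doc7"], "bike": ["doc2"]}


def docs_for_tag(tag):
    """Return the ids stored under a tag, or an empty list if there are none."""
    return docs_by_tag.get(tag, [])


print(tag_filter_url("car"))
print(docs_for_tag("car"))
```

The first style asks the search engine to filter its whole index on every request; the second follows structure that already exists in the data.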

Of course, this does not mean that you cannot keep your source data in Solr; it can be a powerful tool to use!

+2
Jul 29 '10 at 2:03 p.m.

The main differences between NoSQL and Solr in operational terms are the following, in my opinion.

  • Solr requires an intermediate data store (a database or XML files), while a NoSQL system is itself the direct data store.
  • You cannot write to Solr continuously (it seems that Solr 4.0 adds support for this), and you can index at most every 2 minutes and 200 records (which is very slow for high-throughput writes, so you are forced to have intermediate storage).
  • You need to change / define the schema when you change what is stored in a document. NoSQL has no such definitions.
  • Solr's performance degrades as its index size grows, while NoSQL is optimized for it (or claims to be :)
  • Solr comes with the underlying Lucene search algorithms, but in NoSQL you need to build them yourself. This refers to the excellent search capabilities and the fast document lookup provided by Solr.
0
Jun 13 '13 at 20:08

Last but not least, one difference not yet mentioned here is the marketing strategy, which is where Solr parts ways with NoSQL.

Lucene / Solr: I will refer to Solr, since Solr uses Lucene internally and has additional features. So Solr is basically Lucene upgraded with an additional layer.

  • Solr is mainly used for faceting and for indexing plain text for a search engine.

  • Solr can work with most databases as the store for its data. Using Solr itself as the data store is not advisable, since it works directly against the disk.

  • NoSQL databases are easy to learn compared to Solr. Solr has quite a lot of configuration and concepts (for example, fields).

  • Performance is what we should compare between the two. Solr provides high performance compared to other NoSQL databases.

Note: Combining Solr with a database provides better performance.

Summary: Solr is also a NoSQL store, and a forerunner of all NoSQL databases. It was not promoted the way the others were, but it is still in the field because of its performance and power.

0
Sep 27 '15 at 11:10


