Elasticsearch, NEST and Lucene.Net

I know that Elasticsearch is built on Lucene, but I wonder whether Elasticsearch offers any advantages when developing a search engine, compared with coding against Lucene.Net directly. Sorry if the question is a bit basic, but I am confused after researching the options for building a search engine.

I found more examples of simple search with Lucene.Net than with Elasticsearch and NEST. Another question: what is the difference between NEST and Elasticsearch? Are they the same?

If someone could shed some light on this, ideally with a good example, I would appreciate it. What do I need? A simple, quick and fast search engine. What would be the best option? Other alternatives are welcome too, but only for .NET (C# or VB).

+5
2 answers

Lucene

Lucene, and its .NET port Lucene.Net, is a search engine library that adds full-text search to an application. It builds an inverted index from the documents (and the fields within each document) that you feed it, and answers full-text queries against that index. An example is the search in the NuGet gallery source, where each NuGet package and its properties are converted into a document that is passed to Lucene. The inverted index is stored as files inside a directory.
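
To make the idea of an inverted index concrete, here is a minimal sketch in plain Python. It is not Lucene's API or its on-disk format; the names `tokenize`, `build_index` and `search`, and the sample package documents, are made up for illustration. It only shows the core mapping from each term to the set of documents that contain it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every term to the set of document ids whose fields contain it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for value in fields.values():
            for term in tokenize(value):
                index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, loosely modelled on package metadata.
docs = {
    1: {"id": "Lucene.Net", "description": "Full-text search library for .NET"},
    2: {"id": "NEST", "description": "High-level Elasticsearch client for .NET"},
    3: {"id": "Newtonsoft.Json", "description": "JSON framework for .NET"},
}
index = build_index(docs)
matches = search(index, "search library")
```

A real library adds much more on top of this (analyzers, relevance scoring, segment files on disk), but lookups stay fast for the same reason: a query only touches the entries for its own terms instead of scanning every document.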

Elasticsearch

Elasticsearch is a distributed search engine that uses Lucene under the hood. An Elasticsearch cluster consists of one or more nodes, where each node can hold several shards and replicas; each shard is a complete Lucene index. This infrastructure provides fast performance and allows horizontal scaling to search large amounts of data, since you are no longer bound by the limits of a single Lucene index on a single machine. In addition, you can achieve high availability with fault tolerance and disaster recovery, because the data can be replicated across shards, which means there is no single point of failure. An Elasticsearch example with NEST is included in my blog.
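
The shard idea can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not how Elasticsearch is implemented: the `Shard` and `Cluster` classes are hypothetical, `zlib.crc32` stands in for whatever hash the real router uses, and replicas are left out entirely. It only shows that each document is routed to exactly one shard, each shard is an independent index, and a search fans out to every shard and merges the results.

```python
import zlib
from collections import defaultdict


class Shard:
    """One independent inverted index (stands in for a Lucene index)."""

    def __init__(self):
        self.index = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, term):
        return set(self.index.get(term.lower(), set()))


class Cluster:
    """Routes documents to shards and merges per-shard search results."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def shard_for(self, doc_id):
        # Deterministic hash of the document id, modulo the shard count.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)

    def add(self, doc_id, text):
        self.shards[self.shard_for(doc_id)].add(doc_id, text)

    def search(self, term):
        hits = set()
        for shard in self.shards:  # fan out, then merge
            hits |= shard.search(term)
        return hits


cluster = Cluster(num_shards=3)
cluster.add("doc-1", "Lucene is a search library")
cluster.add("doc-2", "Elasticsearch is a distributed search engine")
cluster.add("doc-3", "Solr is built on Lucene")
lucene_hits = cluster.search("lucene")
```

In a real cluster the shards live on different machines, so the fan-out runs in parallel, and replica copies of each shard let another node answer when one node fails.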

What to use?

Well, it depends on your use case (as it almost always does, right?). If your application is installed on a single computer and all data is stored locally, you can embed the Lucene library in the application and keep the index directory on the local disk. Similarly, if you have a simple web application running on a single server with a small number of users, Lucene can also be a sensible choice. On the other hand, if your application runs on multiple machines in a web farm and needs search capabilities, a distributed search engine such as Elasticsearch would be a good idea.

How good is Elasticsearch? Back in 2013, GitHub used Elasticsearch to index 2 billion documents, i.e. all the code files in every repository on the site, across 44 separate Amazon EC2 instances, each with two terabytes of ephemeral SSD storage, for a total of 30 terabytes of primary data. Stack Overflow also uses Elasticsearch to power search on this site (maybe a developer could comment with some numbers or metrics?).

+13

Lucene and Elasticsearch are two completely different kinds of software.

Lucene is a library that implements an inverted index, together with search and ranking over it using Lucene's basic query language. It is not a standalone application that you can simply run and use (to index documents, search them, retrieve them, ...).

Elasticsearch is a distributed server built on top of Lucene. It gives you a nice REST API that you can use to index, search, and retrieve documents. It also implements a query language with far more features than Lucene's. Because it is a distributed server, you can run Elasticsearch as a cluster on multiple machines, and it will automatically take care of distributing and replicating the data between them.
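
As a rough illustration of what using the REST API looks like, the sketch below builds (but does not send) a search request with Python's standard library. The helper `build_search_request`, the index name `movies` and the field `title` are hypothetical; the address assumes a local node on Elasticsearch's default port 9200.

```python
import json
import urllib.request


def build_search_request(base_url, index_name, field, text):
    """Build a POST request to the _search endpoint with a match query."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed values: a local node, a hypothetical "movies" index and "title" field.
request = build_search_request(
    "http://localhost:9200", "movies", "title", "blade runner"
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running node, which replies with JSON. A client library such as NEST wraps these same HTTP calls behind typed .NET objects.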

Similarly, Solr is also a search engine built on top of Lucene.

So it really depends on what exactly you want to achieve. If you just want to add a full-text search feature to an existing application, then Lucene may be all you need. On the other hand, if you want to build, let's say, the search engine for your movie website, then you will be much better off using Elasticsearch or Solr.

+2
