How should I think about search engine indexes?

I use Elasticsearch and don't understand exactly what an index is. For example, if I have 3 models (backpack, boot and glove), should I put each model in its own index, or should I index the attributes of each model, for example a boot's laces, its sole, and so on?

I'm trying to figure out whether searching across indexes is slow. For example, if I index every attribute of my models and end up with, say, 20 indexes, is a search that has to look at data in all of them slower than a search against a single index that stores those 20 attributes?

1 answer

In Elasticsearch, an index consists of one or more primary shards, where each shard is an instance of Lucene. Each primary shard can have zero or more replicas, which give you high availability and improve search performance.

A single shard can hold a lot of data. However, with multiple shards it is easier to distribute the workload across multiple processors and multiple servers.

However, you need to strike a balance. The right number of shards depends on your data and your context. Shards are not free: while you might benefit from thousands of shards on a 100-node cluster, you would not want that many on a single node.
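To make the trade-off concrete, here is a rough sketch of the arithmetic in Python. The function names are made up for illustration, and the even-spread assumption is a simplification: it ignores the fact that Elasticsearch never places a replica on the same node as its primary.

```python
import math


def total_shards(primaries: int, replicas: int) -> int:
    """Each primary has `replicas` copies, so an index holds
    primaries * (1 + replicas) shards in total."""
    return primaries * (1 + replicas)


def shards_per_node(primaries: int, replicas: int, nodes: int) -> int:
    """Shards on the busiest node, assuming an even spread
    (a simplification; real allocation has extra rules)."""
    return math.ceil(total_shards(primaries, replicas) / nodes)


# 5 primaries with 1 replica each means 10 shards overall.
print(total_shards(5, 1))

# 1000 primary shards spread over 100 nodes versus all on one node.
print(shards_per_node(1000, 0, 100))
print(shards_per_node(1000, 0, 1))
```

Every shard is a Lucene instance with its own memory and file overhead, which is why the per-node figure is the one worth watching.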

In Elasticsearch, in addition to indexes, you have the concept of types. Think of an index as being like a database and a type as being like a table.

Using different types has no overhead, and fits your example better than separate indexes.

You can search across all types (or a selected list of types) and across all indexes (or a selected list of indexes), in any combination.
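As an illustration of those combinations, the sketch below builds the search endpoint path for a given set of indexes and types. It assumes an older Elasticsearch release that still accepts type names in the URL (mapping types were removed in later versions); the helper `search_path` and the index name `gear` are hypothetical.

```python
from typing import Sequence


def search_path(indices: Sequence[str] = (), types: Sequence[str] = ()) -> str:
    """Build the REST path for a search over the given indexes and types.

    Empty `indices` means all indexes; empty `types` means all types.
    """
    index_part = ",".join(indices)
    type_part = ",".join(types)
    # A type filter needs an index segment in front of it; `_all`
    # stands for every index.
    if type_part and not index_part:
        index_part = "_all"
    parts = [p for p in (index_part, type_part) if p]
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())                           # everything
print(search_path(["gear"]))                   # one index, all types
print(search_path(["gear"], ["boot", "glove"]))  # one index, two types
print(search_path(types=["boot"]))             # one type in all indexes
```

The query itself would go in the request body sent to that path; only the path changes when you widen or narrow the search.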

Each type can have its own fields (like columns in a table).

So, for your example, I would use one index containing 3 types, each with its own fields. Start with the default number of primary shards (5) and the default number of replicas (1), and change them only once you understand your data better.
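A minimal sketch of what that single index could look like, written as the JSON body of a create-index request. It assumes an Elasticsearch version that still allows several types per index (roughly 5.x, hence the `keyword` field type); the function name, the index name `gear`, and all field names are invented for illustration.

```python
import json


def create_index_body(primaries: int = 5, replicas: int = 1) -> dict:
    """Body for creating one index with three types (hypothetical fields)."""
    return {
        "settings": {
            "number_of_shards": primaries,    # primary shards
            "number_of_replicas": replicas,   # copies of each primary
        },
        "mappings": {
            # One entry per type, each with its own fields.
            "backpack": {"properties": {"capacity_litres": {"type": "integer"}}},
            "boot": {
                "properties": {
                    "laces": {"type": "keyword"},
                    "sole": {"type": "keyword"},
                }
            },
            "glove": {"properties": {"material": {"type": "keyword"}}},
        },
    }


# This would be sent as the body of `PUT /gear` (hypothetical index name).
print(json.dumps(create_index_body(), indent=2))
```

The number of primary shards is fixed when the index is created, while the replica count can be changed later, so the shard setting is the one that deserves more thought.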

Note: do not confuse an index in Elasticsearch with an index in a database.
