In Elasticsearch, an index consists of one or more primary shards, where each shard is an instance of Lucene. Each primary shard can have zero or more replicas, which give you high availability and better search throughput.
A single shard can hold a lot of data. However, with multiple shards, it is easier to distribute the workload across multiple processors and multiple servers.
That said, you want a balance. The right number of shards depends on your data and your context. Shards are not free: thousands of shards may be reasonable on a 100-node cluster, but you would not want that many on a single node.
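To get a feel for how quickly shards add up, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and the calculation only assumes that every primary shard has the same number of replicas and that shards are spread evenly, which a real cluster does not guarantee.

```python
def total_shards(primaries: int, replicas_per_primary: int) -> int:
    """Count every primary shard plus all of its replica copies."""
    return primaries * (1 + replicas_per_primary)


def average_shards_per_node(primaries: int, replicas_per_primary: int, nodes: int) -> float:
    """Average number of shards a node would hold if they were spread evenly."""
    return total_shards(primaries, replicas_per_primary) / nodes


if __name__ == "__main__":
    # 5 primaries with 1 replica each, on a 2-node cluster
    print(total_shards(5, 1))
    print(average_shards_per_node(5, 1, 2))
    # 1000 primaries with 1 replica each: 100 nodes versus a single node
    print(average_shards_per_node(1000, 1, 100))
    print(average_shards_per_node(1000, 1, 1))
```

The last two lines show the point about balance: the same shard count is a modest load per node on a large cluster and a very heavy one on a single machine.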
In Elasticsearch, in addition to indexes, you have the concept of types. Think of an index as being like a database and a type as being like a table.
Using different types has no overhead and is better suited for your example than individual indexes.
You can search across all types (or a selected list of types) and across all indexes (or a selected list), or any combination of the two.
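As a rough illustration of those combinations, the sketch below builds the search URL path for a given list of indexes and types. The helper `search_path` is hypothetical, and it assumes the older REST layout of `/index/type/_search` with comma-separated names and `_all` as the index wildcard, which only applies to Elasticsearch releases that still support types.

```python
def search_path(indexes=None, types=None):
    """Build a search URL path for the given indexes and types.

    An empty or missing list means "all" for that part of the path.
    """
    index_part = ",".join(indexes) if indexes else "_all"
    if types:
        return f"/{index_part}/{','.join(types)}/_search"
    return f"/{index_part}/_search"


if __name__ == "__main__":
    print(search_path())                                 # all indexes, all types
    print(search_path(["blog"]))                         # one index, all types
    print(search_path(["blog"], ["post", "comment"]))    # one index, two types
    print(search_path(["blog", "shop"], ["post"]))       # two indexes, one type
    print(search_path(types=["post"]))                   # all indexes, one type
```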
Each type can have its own fields (like columns in a table).
So, for your example, I would have one index containing 3 types, each with its own fields. Start with the default number of primary shards (5) and the default number of replicas (1), and change them only once you understand your data better.
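A minimal sketch of what the body of such an index-creation request could look like is shown below. The type names, field names and the `index_body` helper are invented for illustration. `number_of_shards` and `number_of_replicas` are the usual index settings, and the `mappings` layout with several types assumes an older Elasticsearch release that still allows more than one type per index.

```python
import json


def index_body(type_fields, shards=5, replicas=1):
    """Build an index-creation body with one mapping per type.

    type_fields maps a type name to a dict of {field name: field datatype}.
    """
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            type_name: {
                "properties": {
                    field: {"type": datatype} for field, datatype in fields.items()
                }
            }
            for type_name, fields in type_fields.items()
        },
    }


if __name__ == "__main__":
    # Three hypothetical types, each with its own fields
    body = index_body({
        "user": {"age": "integer", "signed_up": "date"},
        "order": {"total": "double", "placed_at": "date"},
        "product": {"stock": "long"},
    })
    print(json.dumps(body, indent=2))
```

The resulting JSON would be sent when creating the index. Note that the number of primary shards is fixed at creation time, while the number of replicas can be changed later.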
Note: do not confuse an index in Elasticsearch with an index in a database.