Recommended / Standard Elasticsearch Database Strategy

I'm thinking through an index maintenance strategy for Elasticsearch. I found a plugin that could do the job well, but I would like to work a little more closely with Elasticsearch itself, since I really like it, and the plugin would make the experience less hands-on, if you see what I mean.

Anyway, if I have a dataset with fairly frequent updates (say about 1 update every 10 seconds), would I run into performance problems with Elasticsearch? Can the index be partially updated when a single row changes, or does the whole index have to be rebuilt? The strategy I plan to implement is to modify the index whenever my application (Python, PostgreSQL) performs a CRUD operation, so there will be some code overhead. That doesn't bother me much; performance does. Is this a common strategy?

I previously used Sphinx, which had partial reindexing that ran from a cron job to keep things in sync; the mapping between indexes and MySQL tables was defined in its config. This was the recommended approach for Sphinx. Is there a recommended approach for Elasticsearch?

1 answer

There are many different strategies for handling this; there is no one-size-fits-all solution.

To answer some of your questions: first, there is no true partial update in Elasticsearch / Lucene. If you update one field in a document, the entire document is rewritten. Be aware of the performance implications of this when designing your schema. That said, if you update a single document, the change should be searchable almost immediately. Elasticsearch is a near-real-time search engine, so you do not need to worry about a constantly updated index.
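The whole-document rewrite described above can be illustrated with a toy in-memory model. This is only a sketch: `ToyIndex` and its methods are hypothetical names, it does not use the Elasticsearch client, and it merges fields shallowly for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Hypothetical in-memory stand-in for an index, NOT the Elasticsearch client."""
    docs: dict = field(default_factory=dict)    # doc_id -> (version, full source)
    writes: list = field(default_factory=list)  # log of every full document written

    def index(self, doc_id, source):
        # Writing always stores a complete document and bumps the version.
        version = self.docs.get(doc_id, (0, None))[0] + 1
        full = dict(source)
        self.docs[doc_id] = (version, full)
        self.writes.append(full)
        return version

    def update(self, doc_id, partial):
        # A "partial" update is really: read the current document, merge the
        # changed fields (shallow merge, an assumption of this sketch), and
        # write the complete document again.
        _, current = self.docs[doc_id]
        merged = {**current, **partial}
        return self.index(doc_id, merged)


idx = ToyIndex()
idx.index("1", {"title": "old title", "body": "a long body that never changes"})
new_version = idx.update("1", {"title": "new title"})
```

Even though only `title` changed, the last entry in `idx.writes` contains the unchanged `body` as well, which is why large documents with frequently changing small fields deserve some thought at schema design time.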

For your write load of one update every 10 seconds, the default settings should be fine. That is actually a very low load for ES; it can scale much higher. Netflix, for example, performs 7 million updates per minute in one of its clusters.

Regarding synchronization strategies, I wrote an in-depth article about this: "Keeping Elasticsearch in Sync".
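As a rough illustration of one possible approach (not necessarily the one the article recommends), the application can collect the changes made by its CRUD code and send them in a single request to the bulk API. The sketch below only builds the newline-delimited JSON body; the function name, the `(op, doc_id, doc)` tuple shape, and the index name are assumptions made for the example, and actually sending the body is left out.

```python
import json


def build_bulk_body(index_name, changes):
    """Build an NDJSON body for the _bulk endpoint (hypothetical helper).

    `changes` is an iterable of (op, doc_id, doc) tuples, where op is
    "upsert" or "delete"; this tuple shape is an assumption of the sketch.
    """
    lines = []
    for op, doc_id, doc in changes:
        if op == "delete":
            # A delete action is a single metadata line with no source line.
            lines.append(json.dumps({"delete": {"_index": index_name, "_id": doc_id}}))
        elif op == "upsert":
            # An "index" action creates the document or replaces it entirely;
            # the metadata line is followed by the full document source.
            lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
            lines.append(json.dumps(doc))
        else:
            raise ValueError(f"unknown operation: {op!r}")
    if not lines:
        return ""
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Example: changes gathered after a few writes to the database rows.
pending = [
    ("upsert", "42", {"title": "Hello", "views": 10}),
    ("delete", "7", None),
]
body = build_bulk_body("articles", pending)
```

At one update every 10 seconds, indexing each change individually would also work; batching mainly matters once the write rate grows.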
