Elasticsearch indexing slows over time with a constant number of indexes and documents

I am observing that bulk indexing performance using the .NET NEST client against Elasticsearch degrades over time, even though the number of indexes and the number of documents stay constant.

We are running Elasticsearch 0.19.11 (JVM 23.5-b02) on an Amazon m1.large instance with Ubuntu Server 12.04.1 LTS 64-bit and Sun Java 7. Nothing else runs on this instance beyond what ships with the Ubuntu installation.

Amazon M1 Large Instance (from http://aws.amazon.com/ec2/instance-types/):

- 7.5 GiB memory
- 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
- 850 GB instance storage
- 64-bit platform
- I/O performance: High
- EBS-Optimized available: 500 Mbps
- API name: m1.large

ES_MAX_MEM is set to 4g and ES_MIN_MEM is set to 2g.

Every night we index/reindex ~15,000 documents using NEST from our .NET application. At any given time there is only one index, containing <= 15,000 documents.

When the server was first set up, indexing and searching were fast for the first few days; then indexing became slower and slower. We index 100 documents at a time, and after a while a single bulk operation can take up to 15 seconds to complete. Eventually we start seeing the following exception and indexing grinds to a halt:

    System.Net.WebException: The request was aborted: The request was canceled.
       at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)

The bulk indexing implementation is as follows:

    private ElasticClient GetElasticClient()
    {
        var setting = new ConnectionSettings(ConfigurationManager.AppSettings["elasticSearchHost"], 9200);
        setting.SetDefaultIndex("products");
        var elastic = new ElasticClient(setting);
        return elastic;
    }

    private void DisableRefreshInterval()
    {
        var elasticClient = GetElasticClient();
        var s = elasticClient.GetIndexSettings("products");
        var settings = s != null && s.Settings != null ? s.Settings : new IndexSettings();
        settings["refresh_interval"] = "-1";
        var result = elasticClient.UpdateSettings(settings);
        if (!result.OK)
            _logger.Warn("unable to set refresh_interval to -1, {0}",
                result.ConnectionStatus == null || result.ConnectionStatus.Error == null
                    ? ""
                    : result.ConnectionStatus.Error.ExceptionMessage);
    }

    private void EnableRefreshInterval()
    {
        var elasticClient = GetElasticClient();
        var s = elasticClient.GetIndexSettings("products");
        var settings = s != null && s.Settings != null ? s.Settings : new IndexSettings();
        settings["refresh_interval"] = "1s";
        var result = elasticClient.UpdateSettings(settings);
        if (!result.OK)
            _logger.Warn("unable to set refresh_interval to 1s, {0}",
                result.ConnectionStatus == null || result.ConnectionStatus.Error == null
                    ? ""
                    : result.ConnectionStatus.Error.ExceptionMessage);
    }

    public void Index(IEnumerable<Product> products)
    {
        var enumerable = products as Product[] ?? products.ToArray();
        var elasticClient = GetElasticClient();
        try
        {
            DisableRefreshInterval();
            _logger.Info("Indexing {0} products", enumerable.Count());
            var status = elasticClient.IndexMany(enumerable as IEnumerable<Product>, "products");
            if (status.Items != null)
                _logger.Info("Done, Indexing {0} products, duration: {1}", status.Items.Count(), status.Took);
            if (status.ConnectionStatus.Error != null)
            {
                _logger.Error(status.ConnectionStatus.Error.OriginalException);
            }
        }
        catch (Exception ex)
        {
            _logger.Error(ex);
        }
        finally
        {
            EnableRefreshInterval();
        }
    }
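For reference, what DisableRefreshInterval / EnableRefreshInterval do via NEST corresponds to the following REST calls (host and port here are assumptions for the example; the index name matches the code above):

```shell
# Pause automatic refreshes before the bulk load:
curl -s -XPUT "http://localhost:9200/products/_settings" \
     -d '{"index": {"refresh_interval": "-1"}}'

# ... run the bulk indexing ...

# Restore the default 1-second refresh afterwards:
curl -s -XPUT "http://localhost:9200/products/_settings" \
     -d '{"index": {"refresh_interval": "1s"}}'
```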

Restarting the elasticsearch daemon does not seem to make any difference, but deleting the index and re-indexing from scratch fixes everything. After a few days, though, the slow-indexing problem returns.

For now I have simply deleted the index again, and added an Optimize call after re-enabling the refresh interval at the end of each bulk indexing operation, in the hope that this will prevent the index from degrading:

    ...
    finally
    {
        EnableRefreshInterval();
        elasticClient.Optimize("products");
    }

Am I doing something terribly wrong here?

2 answers

Sorry, I started writing this as a rather long comment and decided to turn it into an answer in case it benefits someone else...

ES_HEAP_SIZE

The first thing I noticed here is that you have set the max and min heap values for elasticsearch to different values. They should be the same. In your config / init.d script there should be an ES_HEAP_SIZE variable. Make sure you set this one value (and not the min and max values separately), as it sets the min and max to the same value, which is what you want. If you don't, the JVM will block the Java process whenever it needs more memory (up to the max) while it allocates it. See this great write-up of a recent GitHub outage (here is a quote):

Set the ES_HEAP_SIZE environment variable so that the JVM uses the same value for minimum and maximum memory. Configuring the JVM with different minimum and maximum values means that each time the JVM needs additional memory (up to the maximum), it will block the Java process to allocate it. Combined with an old Java version, this explains the pauses our nodes exhibited when put under higher load and continuous memory allocation once they were opened up to public searches. The elasticsearch team recommends a setting of 50% of system RAM.
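As a sketch of what that looks like in practice (the exact file varies by install method, e.g. /etc/default/elasticsearch for Ubuntu packages, so treat the location as an assumption):

```shell
# Set the single heap variable instead of ES_MIN_MEM / ES_MAX_MEM;
# elasticsearch's startup script expands it to matching -Xms/-Xmx flags.
# 4g is roughly 50% of the m1.large's 7.5 GiB of RAM.
ES_HEAP_SIZE=4g
```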

Also check out this great article on elasticsearch configuration from the trenches.

Lock the memory to stop swapping

From my research I found that you should also lock the memory available to the Java process so that it cannot be swapped out. I am not an expert in this area, but I am told that swapping will also kill performance. Look for bootstrap.mlockall in the elasticsearch.yml configuration file.
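A minimal sketch of the relevant setting in elasticsearch.yml:

```
bootstrap.mlockall: true
```

Note that on Linux the elasticsearch user generally also needs permission to lock memory (e.g. an appropriate `ulimit -l` in the init script); if locking fails, elasticsearch warns about it in the startup logs rather than refusing to start, so it is worth checking the logs after enabling this.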

Upgrade

Elasticsearch is still very new. Plan to upgrade fairly often, because the bug fixes made between the version you are on (0.19.11) and the current version (0.20.4) are significant. See the ES site for details. You are on Java 7, which is definitely the right way to go; I started on Java 6 and quickly realized it just wasn't good enough, especially for bulk inserting.

Plugins

Finally, for anyone who runs into problems like this: get a decent plugin to monitor your nodes and the JVM. I recommend bigdesk. Start bigdesk, then hit elasticsearch with some bulky inserts and watch for strange heap memory patterns, an excessive number of threads, and so on. It's all there!

Hope someone finds this helpful!

Cheers, James


Just to take a shot in the dark:

As the indexing performance degrades, have you noticed the index taking up more and more disk space?

Perhaps, instead of replacing the old index or the old documents when reindexing, you are adding a bunch of new documents each night, effectively multiplying the number of documents with mostly duplicated data. It might be worth taking the old, slow index and loading it into some kind of index viewer to inspect it (Luke, for example). If you see far more documents than you expected, you may want to look at creating a fresh index on each rebuild and swapping it in for the old one.
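A quicker first check than loading the index into a viewer is to ask elasticsearch itself how many documents it holds (host and port here are assumptions for the example):

```shell
# Should report ~15,000; a much larger count suggests old documents
# are accumulating instead of being replaced on reindex.
curl -s "http://localhost:9200/products/_count"

# Segment-level detail; a large number of small segments can also
# slow indexing down over time.
curl -s "http://localhost:9200/products/_segments"
```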

Since restarting the daemon does not fix the problem, I would guess that lingering open file handles, runaway processes, connections, and the like can be ruled out, although I would still check those statistics to see whether the server shows any suspicious behavior.

As for Optimize: you may well see some performance gains from it, but it is a very expensive operation. I would recommend optimizing only after a complete rebuild, not after every incremental indexing operation.
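For example, after the nightly full rebuild you could issue a single optimize over the REST API instead of calling it per bulk batch (host and port are assumptions; max_num_segments=1 forces a full merge, which is the expensive part):

```shell
# Run once after the complete rebuild, not after every bulk operation:
curl -s -XPOST "http://localhost:9200/products/_optimize?max_num_segments=1"
```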

