Elasticsearch unavailable while the garbage collector runs?

I have a two-node Elasticsearch cluster. A live website uses this cluster directly, constantly running search and index queries against it.

My problem is that, on a regular (and unpredictable) basis, the entire cluster becomes unreachable while one of the nodes runs a garbage collection. The message in the node's log looks like this:

[2015-07-01 06:43:19,525][INFO ][monitor.jvm] [my_node] [gc][old][205450][116] duration [5.7s], collections [1]/[6.3s], total [5.7s]/[1m], memory [22.3gb]->[4.9gb]/[30.9gb], all_pools {[young] [392.9mb]->[17.2mb]/[665.6mb]} {[survivor] [29.1mb]->[0b]/[83.1mb]} {[old] [21.9gb]->[4.9gb]/[30.1gb]} 

From what I understand (I'm not a Java person), these lines indicate that the JVM is running an old-generation garbage collection. So during those 5.7 seconds the node is unresponsive, and so are my cluster and my site. This downtime occurs 5 to 10 times a day.

Am I doing something wrong here, or is this downtime inevitable? Should I add a client node to the cluster (i.e. a node with node.data: false and node.master: false) to act as a load balancer and point my site at it (a sketch of what I mean is below)? Or should I put some other kind of load balancer (HAProxy?) in front of my nodes? Or does this mean something is wrong with the servers or the data?
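To make the first option concrete, here is a minimal sketch of the client node I have in mind, assuming ES 1.x; the cluster name is a placeholder for my real one:

    # Start a client ("load balancer") node: it holds no shards and is never
    # elected master, it only routes requests to the two data nodes.
    # ES 1.x accepts settings as -Des.* system properties.
    ./bin/elasticsearch -Des.node.data=false -Des.node.master=false \
        -Des.cluster.name=my_cluster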

Thank you very much in advance

Some details about the cluster configuration:

  • Elasticsearch 1.6.0, a cluster of 2 nodes (5 shards, 1 replica)
  • The cluster holds ~10 million documents, occupying ~30 GB.
  • Each node is a server with 64 GB of RAM, with MAX_HEAP_SIZE set to 31g
  • The website issues ~300 search queries per second and ~100 index queries per second
  • JVM heap utilization always sits between 50% and 75%, never higher (measured as shown below)
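For reference, the heap figure above was read from the node stats API. A minimal sketch, assuming ES answers on localhost:9200 (the host is a placeholder for my actual nodes):

    # Report current JVM heap usage per node (ES 1.x node stats API)
    curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty' | grep heap_used_percent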
1 answer

Since the GC run takes the heap from 21.9gb down to 4.9gb, I doubt that heap usage really sits between 50% and 75%; more likely the GC is triggered at around 75%, and usage then drops to ~15%. If you don't have Marvel, install it and watch metrics such as the segment count and heap usage. To decrease the segment count, optimize indices that receive few or no writes ( https://www.elastic.co/guide/en/elasticsearch/reference/2.1/indices-optimize.html ); a sketch of the commands is below. If you still see slowdowns when the GC runs, try reducing the heap size. I know this sounds counterintuitive, but in this case it makes sense. There is a good post about it on the Elastic blog: https://www.elastic.co/blog/a-heap-of-trouble
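To make this concrete, a minimal sketch of the commands involved, assuming ES 1.x on localhost:9200 and a rarely-written index named old_index (both placeholders):

    # Inspect segment counts per shard with the _cat API
    curl -s 'http://localhost:9200/_cat/segments?v'

    # Merge a rarely-written index down to one segment per shard.
    # On ES 1.x this is the _optimize endpoint (later renamed _forcemerge).
    curl -XPOST 'http://localhost:9200/old_index/_optimize?max_num_segments=1'

    # To try a smaller heap, lower ES_HEAP_SIZE before restarting the node,
    # e.g. from 31g to 16g, and re-test under production load.
    export ES_HEAP_SIZE=16g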

