Cassandra compaction tasks stuck

I am running DataStax Enterprise on a cluster of 3 nodes. They all run on the same hardware: 2 Intel Xeon 2.2 GHz cores, 7 GB of RAM, and 4 TB in RAID-0.

This should be enough to run a lightly loaded cluster that stores less than 1 GB of data.

In most cases everything is fine, but sometimes the running tasks related to the repair service in OpsCenter get stuck; this makes the node unstable and increases its load.

However, if the node is restarted, the stuck tasks no longer show up and the load returns to a normal level.

Because we do not have much data in our cluster, we use the min_repair_time parameter defined in opscenterd.conf to delay the repair service so that it does not run too often.

It seems a little strange that tasks marked as “Complete” and showing 100% progress do not go away. We waited an hour for them to disappear, but they did not; the only way we have found to resolve this is to restart the nodes.

Nodes with running tasks

Running tasks

Edit:

Here is the output of nodetool compactionstats:

[image: nodetool compactionstats output]

Edit 2:

I am running DataStax Enterprise 4.6.0 with Cassandra 2.0.11.83.

Edit 3:

This is the output of dstat on a node that behaves normally:

dstat from normal node

This is the output of dstat on a node with stuck compaction:

dstat from node with stuck compaction

Edit 4:

Output of iostat on the node with stuck compaction; note the high "iowait":

[image: iostat output]



RAID-0, Striping, , 1 . - 4x IOPS Stripe, , - RAID.

, , , , , node "". , - IO , , RAID- . MDADM .., RAID-.

Azure Premium Storage ( ). , , SSD. , , SSDs = > IOPS, . RAID SSD. SSD- VM.

Cluster 3 , , .


  • ( > IOPS)
  • RAID ,

, , , . SSDs - , SSD.

- , Azure RAID-0 , .


azure storage

Azure . .

DSE [ cassandra] , , DSE [ cassandra] . node 16 . 500 IOPS. 8000 IOPS RAID-0. , 16 000 IOPS, .
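The figures that survive in this answer (16, 500 IOPS, 8000 IOPS with RAID-0) are consistent with simple striping arithmetic, assuming that 16 is the number of data disks per node and 500 IOPS is the per-disk limit. A minimal sketch of that calculation follows; the function name `raid0_iops` is hypothetical, and the result is an ideal upper bound, not a measured value:

```python
def raid0_iops(disks, iops_per_disk):
    """Ideal aggregate IOPS of a RAID-0 stripe.

    Assumes requests spread evenly, so every disk contributes its
    full per-disk limit. Real throughput is usually lower.
    """
    return disks * iops_per_disk


# Assumed reading of the answer: 16 data disks at 500 IOPS each.
total = raid0_iops(16, 500)
print(total)  # 8000
```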


, , , , , 1 node . system.log , , .


rollups_60 OpsCenter ( ) Cassandra, OS DSE. , , . , .

OpsCenter, . opscenterd.conf:

  • Exclude keyspaces (for example, opsc) with ignored_keyspaces
  • Lower the TTL, for example with 1min_ttl
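To make the two settings named in the list above concrete, here is a rough sketch that parses a sample configuration fragment with Python's standard `configparser`. The `[cassandra_metrics]` section name, the keyspace list, and the TTL value are assumptions chosen for illustration and do not come from the answer; check the OpsCenter documentation for your version before changing a real opscenterd.conf.

```python
import configparser

# Hypothetical fragment: the section name and both values are assumptions.
SAMPLE_CONF = """
[cassandra_metrics]
ignored_keyspaces = system, OpsCenter, opsc
1min_ttl = 86400
"""


def read_metrics_settings(text):
    """Return (ignored keyspaces, 1-minute rollup TTL in seconds)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["cassandra_metrics"]
    # The keyspace list is comma-separated; strip whitespace around names.
    keyspaces = [k.strip() for k in section["ignored_keyspaces"].split(",") if k.strip()]
    ttl_seconds = section.getint("1min_ttl")
    return keyspaces, ttl_seconds


keyspaces, ttl = read_metrics_settings(SAMPLE_CONF)
print(keyspaces, ttl)
```

Reading the values back like this is only a way to check a fragment before deploying it; OpsCenter itself reads the file when the service starts.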

: Opscenter DataStax DataStax
