Node Cluster Monitoring

I have more than 10 nodes in a cluster. I installed the Hadoop stack on the cluster using Cloudera (YARN, HBase, Hue, Hadoop FS, Spark, Flink). Is there an easy way to collect global statistics for all nodes (CPU usage, memory usage, and network usage) and read them from Python? I want to use Python so that I can fully control the graphs and keep a consistent plot style in my report. What software can be used for this? It does not need to be distributed; a library is enough.

2 answers

I made the package myself: http://github.com/kevin91nl/isa

The tutorial can be found at https://www.data-blogger.com/2016/07/18/monitoring-your-cluster-in-just-a-few-minutes/

If anyone knows a better alternative, please let me know.


I would suggest using Ansible for this. Here is a simple playbook that collects the load average from every node listed in the inventory file and appends it to a file on the local machine:

    - hosts: all
      remote_user: your_user
      tasks:
        - name: collect load average
          shell: cat /proc/loadavg
          register: cluster_node_la
        - name: write to local disk
          lineinfile: dest=/tmp/cluster_stat create=yes line="{{ ansible_fqdn }}:{{ cluster_node_la.stdout_lines }}"
          delegate_to: 127.0.0.1

You can run it like this:

    ansible-playbook -i ansible_inventory stats-playbook.yml --forks=1

  • ansible_inventory is a file containing a list of your hosts
  • stats-playbook.yml is the file printed above
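
Since the goal is to read the statistics from Python, here is a minimal sketch of parsing the file written by the playbook. It assumes each line looks roughly like `node1.example.com:[u'0.10 0.20 0.30 1/234 5678']`; the exact rendering of `stdout_lines` depends on the Ansible version, so the parser only looks for the first three decimal numbers after the host name. The function names `parse_line` and `read_cluster_stat` are made up for this example.

```python
import re
from typing import Dict, Tuple

# /proc/loadavg starts with three decimal numbers: the 1, 5 and 15 minute averages.
LOAD_RE = re.compile(r"\d+\.\d+")

Loads = Tuple[float, float, float]


def parse_line(line: str) -> Tuple[str, Loads]:
    """Split 'host:<loadavg text>' into (host, (1min, 5min, 15min))."""
    # Split on the first colon only, so digits in the host name are never matched.
    host, _, rest = line.strip().partition(":")
    values = LOAD_RE.findall(rest)
    if len(values) < 3:
        raise ValueError("no load averages found in line: %r" % line)
    one, five, fifteen = (float(v) for v in values[:3])
    return host, (one, five, fifteen)


def read_cluster_stat(path: str = "/tmp/cluster_stat") -> Dict[str, Loads]:
    """Read the file written by the playbook; a later line for a host wins."""
    stats = {}  # type: Dict[str, Loads]
    with open(path) as handle:
        for line in handle:
            if line.strip():
                host, loads = parse_line(line)
                stats[host] = loads
    return stats
```

The resulting dictionary maps each host name to its three load averages, which can then be passed to whatever plotting library you use for the report.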

Of course, the implementation will differ depending on how you want to store the collected data, but the general idea should be clear. There are many ways to solve this problem with Ansible.

In addition, Ansible has a Python API, so you can do most things directly from Python. For example, here is how you can gather the facts for every node in your cluster:

    import pprint

    import ansible.inventory
    import ansible.runner  # Ansible 1.x Python API; removed in Ansible 2.0

    inventory_file = 'ansible_inventory'  # see ansible inventory files
    inventory = ansible.inventory.Inventory(inventory_file)

    # The 'setup' module gathers facts (CPU, memory, network, ...) from each host.
    runner = ansible.runner.Runner(
        module_name='setup',
        module_args='',
        pattern='all',
        inventory=inventory
    )

    cluster_facts = runner.run()
    pprint.pprint(cluster_facts)
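
To turn that result into something you can plot, a rough sketch follows. It assumes the Ansible 1.x result layout, where reachable hosts appear under a `contacted` key and each host carries an `ansible_facts` dictionary, and it reads the `ansible_processor_vcpus`, `ansible_memtotal_mb` and `ansible_memfree_mb` facts. The helper name `summarize_facts` is made up for this example, and the fact names may differ between Ansible versions, so missing values are returned as `None`.

```python
from typing import Any, Dict, List


def summarize_facts(cluster_facts: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one row per reachable host with CPU count and memory figures."""
    rows = []
    # Assumption: reachable hosts are listed under 'contacted' (Ansible 1.x layout).
    for host, result in sorted(cluster_facts.get("contacted", {}).items()):
        facts = result.get("ansible_facts", {})
        total = facts.get("ansible_memtotal_mb")
        free = facts.get("ansible_memfree_mb")
        # Used memory can only be derived when both figures are present.
        used = total - free if total is not None and free is not None else None
        rows.append({
            "host": host,
            "vcpus": facts.get("ansible_processor_vcpus"),
            "mem_total_mb": total,
            "mem_used_mb": used,
        })
    return rows
```

Note that facts are a snapshot taken when the module runs, so you would need to call this periodically and store the rows yourself to get a time series.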