More than 120 counters in Hadoop

Hadoop limits the number of counters per job; the default is 120. I tried to change this with the configuration property "mapreduce.job.counters.limit", but it has no effect. I looked at the source code, and it seems that the JobConf instance in the class "org.apache.hadoop.mapred.Counters" is private. Has anyone run into this before? How did you solve it? Thanks :)

+6
5 answers

You can override this property in mapred-site.xml on the JobTracker (JT), the TaskTrackers (TT), and your client nodes, but keep in mind that this is a system-wide change:

 <configuration>
   ...
   <property>
     <name>mapreduce.job.counters.limit</name>
     <value>500</value>
   </property>
   ...
 </configuration>

Then restart the mapreduce service in your cluster.
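To confirm which value a given node's mapred-site.xml actually sets, a small check along these lines may help. It is a minimal sketch in Python using only the standard library; the function name `read_property` and the sample XML are illustrative assumptions, not part of Hadoop, and the function simply returns the first matching property it finds.

```python
import xml.etree.ElementTree as ET

# Sample content for illustration only; in practice, read your real mapred-site.xml.
SAMPLE_MAPRED_SITE = """<configuration>
  <property>
    <name>mapreduce.job.counters.limit</name>
    <value>500</value>
  </property>
</configuration>"""


def read_property(xml_text, name):
    """Return the value of the first property with this name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


if __name__ == "__main__":
    print(read_property(SAMPLE_MAPRED_SITE, "mapreduce.job.counters.limit"))
```

Running the same check on each node is a quick way to spot a machine where the file was not updated.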

+5

In Hadoop 2, this configuration parameter is called

 mapreduce.job.counters.max 

Setting it on the command line or in your Configuration object is not enough. You also need to call the static method

org.apache.hadoop.mapreduce.counters.Limits.init()

in the setup() method of your mapper or reducer to force the parameter to take effect.

Tested with 2.6.0 and 2.7.1.

+4

The parameters are set in the configuration file; the parameters below are the ones that take effect:

 mapreduce.job.counters.max=1000
 mapreduce.job.counters.groups.max=500
 mapreduce.job.counters.group.name.max=1000
 mapreduce.job.counters.counter.name.max=500
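To see which of these four settings a planned set of counters would run into, a pre-flight check like the one below can be used. It is a sketch in plain Python, not Hadoop code. It rests on my reading of the property names, which is an assumption: `counters.max` caps the total number of counters in a job, `groups.max` caps the number of counter groups, and the two `name.max` settings cap the length of group and counter names. The function name `check_counters` and the sample counter names are hypothetical.

```python
# Limits copied from the values listed above.
LIMITS = {
    "mapreduce.job.counters.max": 1000,
    "mapreduce.job.counters.groups.max": 500,
    "mapreduce.job.counters.group.name.max": 1000,
    "mapreduce.job.counters.counter.name.max": 500,
}


def check_counters(counters, limits=LIMITS):
    """Return the names of the limits that the given (group, name) pairs exceed."""
    counters = set(counters)
    groups = {group for group, _ in counters}
    exceeded = []
    if len(counters) > limits["mapreduce.job.counters.max"]:
        exceeded.append("mapreduce.job.counters.max")
    if len(groups) > limits["mapreduce.job.counters.groups.max"]:
        exceeded.append("mapreduce.job.counters.groups.max")
    if any(len(g) > limits["mapreduce.job.counters.group.name.max"] for g in groups):
        exceeded.append("mapreduce.job.counters.group.name.max")
    if any(len(n) > limits["mapreduce.job.counters.counter.name.max"] for _, n in counters):
        exceeded.append("mapreduce.job.counters.counter.name.max")
    return exceeded


if __name__ == "__main__":
    # Hypothetical job that defines 1200 counters in a single group.
    planned = [("Quality", f"bad_field_{i}") for i in range(1200)]
    print(check_counters(planned))
```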
0

Adding this in case someone else runs into the same problem we did: raising the counter limits with MRJob.

To increase the number of counters, add emr_configurations to your mrjob.conf (or pass it to MRJob as a configuration parameter):

 runners:
   emr:
     emr_configurations:
     - Classification: mapred-site
       Properties:
         mapreduce.job.counters.max: 1024
         mapreduce.job.counters.counter.name.max: 256
         mapreduce.job.counters.groups.max: 256
         mapreduce.job.counters.group.name.max: 256
0

We can set the limits as command-line parameters for specific jobs only, instead of changing mapred-site.xml.

 -Dmapreduce.job.counters.limit=x -Dmapreduce.job.counters.groups.max=y 

Note: x and y are values you choose based on your environment and requirements.
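If the job is launched from a script, the -D arguments can be generated rather than typed by hand. The following is a small sketch in Python; the helper name `counter_limit_args` and the values 500 and 100 are placeholders chosen for illustration, not recommendations.

```python
def counter_limit_args(limits):
    """Turn a {property: value} mapping into a list of -Dproperty=value arguments."""
    return [f"-D{name}={value}" for name, value in limits.items()]


# Placeholder values standing in for x and y.
args = counter_limit_args({
    "mapreduce.job.counters.limit": 500,
    "mapreduce.job.counters.groups.max": 100,
})

if __name__ == "__main__":
    print(" ".join(args))
```

As far as I know, -D options are only picked up when the job's driver parses Hadoop's generic options (for example by running through ToolRunner), and they have to appear before the job's own arguments.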

-1

Source: https://habr.com/ru/post/923846/

