According to http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/ , the number of tasks that can run simultaneously on a node is given by:
min(yarn.nodemanager.resource.memory-mb / mapreduce.[map|reduce].memory.mb, yarn.nodemanager.resource.cpu-vcores / mapreduce.[map|reduce].cpu.vcores)
However, I set these parameters on a cluster of c3.2xlarge instances:
yarn.nodemanager.resource.memory-mb = 14336
mapreduce.map.memory.mb = 2048
yarn.nodemanager.resource.cpu-vcores = 8
mapreduce.map.cpu.vcores = 1
With these settings, at most 4 tasks run at the same time on a node, whereas the formula indicates 7. What explains the difference?
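To make the expected value explicit, here is a minimal sketch of the formula applied to the settings above. The function name max_concurrent_tasks is only illustrative, and the use of integer division is an assumption (a fractional container cannot be scheduled); this is not code from Hadoop itself.

```python
def max_concurrent_tasks(node_memory_mb, task_memory_mb, node_vcores, task_vcores):
    """Apply the formula from the blog post: the smaller of the memory-bound
    and the CPU-bound container counts for a single node."""
    # Number of task containers that fit into the node's memory budget
    by_memory = node_memory_mb // task_memory_mb
    # Number of task containers that fit into the node's vcore budget
    by_cpu = node_vcores // task_vcores
    return min(by_memory, by_cpu)


# Values from the question (map tasks on a c3.2xlarge node)
expected = max_concurrent_tasks(
    node_memory_mb=14336,  # yarn.nodemanager.resource.memory-mb
    task_memory_mb=2048,   # mapreduce.map.memory.mb
    node_vcores=8,         # yarn.nodemanager.resource.cpu-vcores
    task_vcores=1,         # mapreduce.map.cpu.vcores
)
print(expected)  # memory allows 7, CPU allows 8, so the result is 7
```

Since memory is the limiting term here (7 versus 8), a lower observed count suggests checking whether the memory values actually in effect on the nodes match the ones listed.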
I am running Hadoop 2.4.0 on AMI 3.1.0.
verve