In Hadoop v1, I assigned each of the 7 mapper and reducer slots 1 GB of memory, and my mappers and reducers ran fine. My machine has 8 GB of memory and 8 processors. Now with YARN, when I run the same application on the same machine, I get a container error. By default, I have the following settings:
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
This gave me an error:
Container [pid=28920,containerID=container_1389136889967_0001_01_000121] is running beyond virtual memory limits. Current usage: 1.2 GB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
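I guess the 2.1 GB virtual memory limit in that message is the 1 GB physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1. If so, the properties controlling that check would be something like the following in yarn-site.xml (I have not set these myself; these are just the stock defaults as I understand them):

  <property>
    <!-- virtual memory allowed per MB of physical memory; default 2.1 -->
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <!-- whether the NodeManager enforces the virtual memory limit at all -->
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value>
  </property>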
Then I tried to set a memory limit in mapred-site.xml:
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
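I did not change the JVM heap options, so the tasks are presumably still launched with whatever -Xmx they inherit. If that matters, I assume the companion settings would look roughly like this, with the heap sized below the 4096 MB container (the -Xmx value here is only an illustration, not what I actually have configured):

  <property>
    <!-- heap for each map task JVM, kept under mapreduce.map.memory.mb -->
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx3276m</value>
  </property>
  <property>
    <!-- heap for each reduce task JVM, kept under mapreduce.reduce.memory.mb -->
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value>
  </property>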
But I still get an error:
Container [pid=26783,containerID=container_1389136889967_0009_01_000002] is running beyond physical memory limits. Current usage: 4.2 GB of 4 GB physical memory used; 5.2 GB of 8.4 GB virtual memory used. Killing container.
I am confused about why the map task needs so much memory. In my understanding, 1 GB of memory is enough for my map/reduce task. Why does the task use more memory when I assign more memory to the container? Is it because each task gets more splits? I feel it would be more efficient to reduce the container size a bit and create more containers, so that more tasks run in parallel. The problem is: how can I make sure each container is not assigned more splits than it can handle?
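If limiting the input per map task is really the issue, I assume the knob would be the input split size rather than the container size, something like the following (this is the FileInputFormat property name from Hadoop 2.x; the 256 MB value, given in bytes, is only an example I have not tested):

  <property>
    <!-- cap each input split at 256 MB so one map task never gets more data than this -->
    <name>mapreduce.input.fileinputformat.split.maxsize</name>
    <value>268435456</value>
  </property>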
mapreduce hadoop yarn mrv2
Lishu Jan 08 '14 at 20:18