Limit YARN containers programmatically

I have a cluster in which 10 nodes have 32 GB of RAM and one node has 64 GB.

On the 32 GB nodes, yarn.nodemanager.resource.memory-mb is set to 26 GB, and on the 64 GB node it is set to 52 GB (the remaining memory is needed by other services that run on that node).

The problem is that when I run ordinary jobs whose mappers need 8 GB each, the 32 GB nodes run 3 mappers in parallel (26 / 8 = 3) while the 64 GB node runs 6. That node usually finishes last because of the CPU load.
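To make those numbers concrete, here is a minimal sketch of the arithmetic (the values are only the ones quoted above, not read from any cluster, and ContainerMath is just a hypothetical class name):

public class ContainerMath {
    public static void main(String[] args) {
        int node32 = 26 * 1024;   // yarn.nodemanager.resource.memory-mb on the 32 GB nodes
        int node64 = 52 * 1024;   // yarn.nodemanager.resource.memory-mb on the 64 GB node
        int perMap = 8 * 1024;    // mapreduce.map.memory.mb requested per mapper
        System.out.println("32 GB node: " + (node32 / perMap) + " mappers in parallel"); // 26624 / 8192 = 3
        System.out.println("64 GB node: " + (node64 / perMap) + " mappers in parallel"); // 53248 / 8192 = 6
    }
}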

I want to limit the container resources of a job programmatically, e.g. cap the container memory at 26 GB per node for most jobs. How can I do that?

2 answers

First of all, yarn.nodemanager.resource.memory-mb (memory) and yarn.nodemanager.resource.cpu-vcores (vcores) are configuration properties of the NodeManager daemon/service and cannot be overridden from YARN client applications. You have to restart the NodeManager services if you change these properties.

For CPU to be taken into account, CPU scheduling must be enabled: the YARN FairScheduler has to use the DRF (Dominant Resource Fairness) scheduling policy, so that containers are allocated based on both memory and vcores. You can then influence how many containers run in parallel on a node through the memory and vcores requested for each task (mapper/reducer/AM).

schedulingPolicy: sets the scheduling policy of a queue. Allowed values: "fifo" / "fair" / "drf"

For details, see the Apache Hadoop Fair Scheduler documentation.

Assuming the scheduler is configured with DRF, you can then limit how many containers each node runs in parallel through the per-mapper/per-reducer memory and vcore settings.

For MapReduce jobs, these values can be set programmatically in the job configuration:

Configuration conf = new Configuration();

conf.set("mapreduce.map.memory.mb", "4096");
conf.set("mapreduce.reduce.memory.mb", "4096");

conf.set("mapreduce.map.cpu.vcores", "1");
conf.set("mapreduce.reduce.cpu.vcores", "1");

Reference: https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

The default cpu.vcores for a mapper/reducer is 1, so memory is usually the only limiting dimension. With DRF enabled, you can raise the vcores requested per mapper/reducer, which reduces how many of them run in parallel on a node.
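For example, a sketch of that idea (assuming the cluster really does run the FairScheduler with DRF): requesting 2 vcores per mapper makes CPU, not just memory, cap how many mappers land on the 64 GB node.

Configuration conf = new Configuration();
// keep the 8 GB per-mapper memory from the question
conf.set("mapreduce.map.memory.mb", "8192");
// ask for 2 vcores per mapper/reducer instead of the default 1;
// under DRF the scheduler then packs fewer of them onto each node
conf.set("mapreduce.map.cpu.vcores", "2");
conf.set("mapreduce.reduce.cpu.vcores", "2");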


You can set everything on the job configuration in your driver code. A basic MapReduce driver looks like this:

// create a configuration
Configuration conf = new Configuration();
// create a new job based on the configuration
Job job = new Job(conf);
// here you have to put your mapper class
job.setMapperClass(Mapper.class);
// here you have to put your reducer class
job.setReducerClass(Reducer.class);
// here you have to set the jar which is containing your 
// map/reduce class, so you can use the mapper class
job.setJarByClass(Mapper.class);
// key/value of your reducer output
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
// this is setting the format of your input, can be TextInputFormat
job.setInputFormatClass(SequenceFileInputFormat.class);
// same with output
job.setOutputFormatClass(TextOutputFormat.class);
// here you can set the path of your input
SequenceFileInputFormat.addInputPath(job, new Path("files/toMap/"));
// this deletes possible output paths to prevent job failures
FileSystem fs = FileSystem.get(conf);
Path out = new Path("files/out/processed/");
fs.delete(out, true);
// finally set the empty out path
TextOutputFormat.setOutputPath(job, out);

// this waits until the job completes and prints debug out to STDOUT or whatever
// has been configured in your log4j properties.
job.waitForCompletion(true); 

To run the job on YARN, you also need to set the following in the configuration:

// this should be like defined in your yarn-site.xml
conf.set("yarn.resourcemanager.address", "yarn-manager.com:50001"); 

// for a 26 GB limit (26624 MB)
conf.set("yarn.nodemanager.resource.memory-mb", "26624"); 


// framework is now "yarn", should be defined like this in mapred-site.xml
conf.set("mapreduce.framework.name", "yarn");

// like defined in hdfs-site.xml
conf.set("fs.default.name", "hdfs://namenode.com:9000");
