The master instance acts as the manager and coordinates everything that happens in the cluster. It must therefore exist in every job flow you run, but a single instance is all you need. Unless you deploy a single-node cluster (in which case the master instance is the only node), it does none of the heavy lifting of the actual MapReduce work, so it does not have to be a powerful machine.
The number of core instances you need depends on the job and on how quickly you want it processed, so there is no single correct answer. The good news is that you can resize the core and task instance groups, so if you think your job is running slowly, you can add more instances to the running job flow.
One important difference between the core instance group and the task instance group is that core instances store the actual data in HDFS, while task instances do not. As a result, you can only grow the core instance group (removing running instances would lose the data stored on them). The task instance group, on the other hand, can be both grown and shrunk by adding or removing task instances.
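To make the resize rule concrete, here is a minimal Python sketch. `can_resize` is a hypothetical helper, not part of any AWS SDK; it only encodes the rule as described above (the HDFS-storing core group may only grow, the task group may grow or shrink). Treating the master group as fixed at one instance is my own assumption for the sketch.

```python
def can_resize(role, current_count, requested_count):
    """Return True if a resize is allowed under the rule described above.

    Hypothetical helper for illustration only; not an AWS API.
    """
    if requested_count < 0:
        return False
    role = role.upper()
    if role == "CORE":
        # Core instances hold HDFS data, so the group may only grow.
        return requested_count >= current_count
    if role == "TASK":
        # Task instances hold no HDFS data, so the group may grow or shrink.
        return True
    # Assumption: the master group stays at one instance and is never resized.
    return False
```

For example, `can_resize("core", 4, 2)` is rejected, while `can_resize("task", 4, 0)` is allowed.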
Thus, these two types of instances can be used to tune the processing power of your job. Typically, you use on-demand instances for core instances because they must run the whole time and cannot be lost, and you use spot instances for task instances because losing task instances does not kill the whole job (for example, tasks not finished by task instances will be rerun on core instances). This is one way to run a large cluster cheaply using spot instances.
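As a rough illustration of that split, the sketch below builds an instance group list with an on-demand master, on-demand core instances, and spot task instances. The dictionary keys follow what I understand to be the `Instances.InstanceGroups` shape accepted by boto3's `run_job_flow`; the function name, the `m5.xlarge` default, the counts, and the bid price are all placeholders of mine, not recommendations.

```python
def build_instance_groups(core_count, task_count, bid_price,
                          instance_type="m5.xlarge"):
    """Build an instance group list: on-demand master/core, spot task.

    Hypothetical helper; instance_type and bid_price are placeholder values.
    """
    groups = [
        {
            # Exactly one master coordinates the cluster.
            "Name": "master",
            "InstanceRole": "MASTER",
            "Market": "ON_DEMAND",
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        {
            # Core instances store HDFS data, so they must not be lost.
            "Name": "core",
            "InstanceRole": "CORE",
            "Market": "ON_DEMAND",
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
    ]
    if task_count > 0:
        groups.append({
            # Task instances hold no HDFS data, so spot interruptions are tolerable.
            "Name": "task",
            "InstanceRole": "TASK",
            "Market": "SPOT",
            "BidPrice": bid_price,  # passed as a string, e.g. "0.10"
            "InstanceType": instance_type,
            "InstanceCount": task_count,
        })
    return groups


groups = build_instance_groups(core_count=2, task_count=8, bid_price="0.10")
```

The resulting list could then be passed as `Instances={"InstanceGroups": groups, ...}` when starting a job flow; check the current EMR documentation for the other required arguments.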
A general description of each type of instance is available here:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/InstanceGroups.html
In addition, this video may be useful for efficient use of EMR:
https://www.youtube.com/watch?v=a5D_bs7E3uc