I am running Spark on EMR. I submit my EMR step with the following spark-submit command:
spark-submit --deploy-mode cluster --master yarn --num-executors 15 --executor-cores 3 --executor-memory 3G
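For reference, and assuming I understand the docs correctly, those shorthand flags should map to the Spark properties below (the application path here is just a placeholder for my actual job):

spark-submit --deploy-mode cluster --master yarn \
  --conf spark.executor.instances=15 \
  --conf spark.executor.cores=3 \
  --conf spark.executor.memory=3g \
  my_job.py   # placeholder, not my real application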
Despite this, the ResourceManager UI shows that each of the 3 nodes has 4 or 6 YARN containers, each with 1 core and 3 GB of memory.
Each node has 16 cores and 30 GB of RAM.
It seems that YARN creates as many 1-core / 3 GB containers as it can until the node runs out of memory, which leaves 10+ cores unused on each node.
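To put rough numbers on it: with 15 executors spread evenly over 3 nodes I would expect about 5 executors per node, i.e. 5 × 3 = 15 of the 16 cores and roughly 15 GB of the 30 GB of RAM in use per node (ignoring memory overhead). Instead I see 4-6 containers per node using only 4-6 cores in total.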
Why doesn't Spark honor my --executor-cores setting?