How to limit the number of retries when a Spark application fails on YARN?

We run our Spark job through spark-submit, and I can see that the job is resubmitted in the event of a failure.

How can I prevent attempt #2 in the event of a YARN container failure or any other exception?

The failures were due to a lack of memory and the "GC overhead limit exceeded" error.

+21
scala yarn apache-spark
3 answers

There are two settings that control the number of retries (i.e., the maximum number of ApplicationMaster registration attempts with YARN before the attempt, and hence the entire Spark application, is considered failed):

  • spark.yarn.maxAppAttempts - Spark's own setting. See MAX_APP_ATTEMPTS:

     private[spark] val MAX_APP_ATTEMPTS = ConfigBuilder("spark.yarn.maxAppAttempts")
       .doc("Maximum number of AM attempts before failing the app.")
       .intConf
       .createOptional
  • yarn.resourcemanager.am.max-attempts - YARN's own setting, with a default of 2.

As you can see in YarnRMClient.getMaxRegAttempts, the actual number is the minimum of the YARN and Spark configuration settings, with YARN's being the last resort.
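
For illustration, here is a minimal sketch of that resolution logic (not Spark's actual source; it assumes a SparkConf and a Hadoop Configuration are at hand and mirrors the min-of-the-two behaviour described above):

 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.yarn.conf.YarnConfiguration
 import org.apache.spark.SparkConf

 // Sketch: resolve the effective number of AM attempts from the two settings.
 def effectiveMaxAppAttempts(sparkConf: SparkConf, yarnConf: Configuration): Int = {
   // YARN's upper bound: yarn.resourcemanager.am.max-attempts (default 2)
   val yarnMax = yarnConf.getInt(
     YarnConfiguration.RM_AM_MAX_ATTEMPTS,
     YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS)
   // Spark's own setting, if the user provided one
   val sparkMax = sparkConf.getOption("spark.yarn.maxAppAttempts").map(_.toInt)
   // The smaller value wins; YARN's value is the fallback when Spark's is unset
   sparkMax.map(_.min(yarnMax)).getOrElse(yarnMax)
 }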

+27

A solution that does not depend on the programming language / API is to pass the maximum number of YARN attempts as a command-line argument:

 spark-submit --conf spark.yarn.maxAppAttempts=1 <application_name> 

See @code's answer.
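
If you prefer to set it programmatically, a sketch along these lines should work in yarn-client mode (the application name is a placeholder); in yarn-cluster mode the setting has to be in place before submission, e.g. via --conf or spark-defaults.conf, because the ApplicationMaster is launched before this code runs:

 import org.apache.spark.sql.SparkSession

 // Sketch: the same setting applied through the SparkSession builder.
 val spark = SparkSession.builder()
   .appName("my-app")                          // placeholder application name
   .config("spark.yarn.maxAppAttempts", "1")   // allow only one AM attempt
   .getOrCreate()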

+11

Add the yarn.resourcemanager.am.max-attempts property to the yarn-default.xml file. It determines the maximum number of application attempts.
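
For example, the property stanza in the Hadoop configuration file would look something like this (the value 1 is only illustrative):

 <property>
   <name>yarn.resourcemanager.am.max-attempts</name>
   <value>1</value>
   <description>Maximum number of application attempts.</description>
 </property>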

See link for more details.

+2
