How to configure automatic restart of the application driver on YARN

From the Spark Streaming Programming Guide:

To automatically recover from a driver failure, the deployment infrastructure used to run the streaming application must monitor the driver process and relaunch the driver if it fails. Different cluster managers provide different tools for this.

  • Spark Standalone: A Spark application driver can be submitted to run within the Spark Standalone cluster (see cluster deploy mode), that is, the application driver itself runs on one of the worker nodes. In addition, the Standalone cluster manager can be instructed to supervise the driver and relaunch it if the driver fails, either because of a non-zero exit code or because the node running the driver fails. See cluster mode and supervise in the Spark Standalone guide for more details.
  • YARN: YARN supports a similar mechanism for automatically restarting an application. See the YARN documentation for more details. ...

So the question is: how do I enable automatic restart of a Spark Streaming driver on YARN?

Thanks and best regards,

Tao

1 answer

What you are looking for are the instructions for launching the application in YARN "cluster" deploy mode: https://spark.apache.org/docs/latest/running-on-yarn.html

This means that your driver runs inside the cluster, in a process managed by YARN (not on your local machine). As a result, YARN can restart it if it fails.
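As a rough illustration, here is a minimal Python sketch that assembles a `spark-submit` command line for YARN cluster mode. The helper name `yarn_cluster_submit_command`, the jar name, and the main class are hypothetical placeholders. The attempt limits shown (`spark.yarn.maxAppAttempts` and `spark.yarn.am.attemptFailuresValidityInterval`) are Spark-on-YARN configuration properties, but the values used here are assumptions that you should tune for your cluster.

```python
import shlex


def yarn_cluster_submit_command(app_jar, main_class,
                                max_app_attempts=4,
                                validity_interval="1h"):
    """Build a spark-submit argument list for YARN cluster mode.

    app_jar and main_class are placeholders for your own application.
    The default attempt count and interval are illustrative assumptions.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        # Cluster mode: the driver runs inside the YARN application master,
        # so YARN can start a new attempt when it fails.
        "--deploy-mode", "cluster",
        "--class", main_class,
        # Maximum number of application attempts YARN will make.
        "--conf", f"spark.yarn.maxAppAttempts={max_app_attempts}",
        # Failures older than this interval no longer count toward the limit,
        # which matters for long-running streaming jobs.
        "--conf",
        f"spark.yarn.am.attemptFailuresValidityInterval={validity_interval}",
        app_jar,
    ]


if __name__ == "__main__":
    # Hypothetical jar and class names, printed as a shell-ready command.
    cmd = yarn_cluster_submit_command("streaming-app.jar",
                                      "com.example.StreamingApp")
    print(shlex.join(cmd))
```

Note that `spark.yarn.maxAppAttempts` should not exceed the cluster-wide limit set by `yarn.resourcemanager.am.max-attempts` in the YARN configuration.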
