I am running Spark Streaming on YARN in cluster mode, and I am trying to implement a graceful shutdown, so that when the application is killed it completes the current micro-batch before stopping.
Following some tutorials, I set spark.streaming.stopGracefullyOnShutdown to true, and I added the following code to the application:
sys.ShutdownHookThread {
  log.info("Gracefully stopping Spark Streaming Application")
  ssc.stop(true, true)
  log.info("Application stopped")
}
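For context, this is roughly how the setting and the hook fit together in the application. This is a minimal sketch: the object name MySparkJob matches the logger name in the driver log below, but the 10-second batch interval and the stream setup are illustrative, not my exact code.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MySparkJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("MySparkJob")
      // ask Spark to stop the StreamingContext gracefully on JVM shutdown
      .set("spark.streaming.stopGracefullyOnShutdown", "true")

    // illustrative 10-second batch interval
    val ssc = new StreamingContext(conf, Seconds(10))

    // ... input streams and transformations are set up here ...

    // stop both the StreamingContext and the SparkContext, letting
    // the micro-batch that is currently running finish first
    sys.ShutdownHookThread {
      ssc.stop(stopSparkContext = true, stopGracefully = true)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```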
However, when I kill the application with
yarn application -kill application_1454432703118_3558
The micro-batch being processed at that point is not completed.
In the driver log I see the first message printed ("Gracefully stopping Spark Streaming Application"), but not the last one ("Application stopped"):
ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
INFO streaming.MySparkJob: Gracefully stopping Spark Streaming Application
INFO scheduler.JobGenerator: Stopping JobGenerator gracefully
INFO scheduler.JobGenerator: Waiting for all received blocks to be consumed for job generation
INFO scheduler.JobGenerator: Waited for all received blocks to be consumed for job generation
INFO streaming.StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook
The following error appears in the executor log:
ERROR executor.CoarseGrainedExecutorBackend: Driver 192.168.6.21:49767 disassociated! Shutting down.
INFO storage.DiskBlockManager: Shutdown hook called
WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.6.21:49767] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
INFO util.ShutdownHookManager: Shutdown hook called
I think the problem is related to the way YARN sends the kill signal to the application. Any ideas on how I can make the application shut down gracefully?
yarn spark-streaming
nicola