Spark Yarn-client mode over the network via VPN

I am trying to get Spark yarn-client mode working through a VPN. More specifically, the Spark driver runs locally on my laptop, while the YARN cluster is on its own private network, accessible via a non-bridged VPN. The first task was to make the Spark driver reachable from the YARN cluster: since the VPN is one-way, my laptop is not routable from the cluster. I managed to get this working by adding an entry to /etc/hosts that maps a public domain name to my LAN IP, something like

192.168.0.6 spark.driver.mydomain
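
To make the mapping concrete, here is a minimal sketch (Python, standard library only) of how a hosts-file entry like this one associates a name with an address. The helper name `parse_hosts` is purely illustrative and not part of any tool, and the "first entry wins" rule is an assumption of the sketch.

```python
def parse_hosts(text):
    """Map each hostname found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Format: <ip> <name> [<alias> ...]
        ip, *names = line.split()
        for name in names:
            # Assumption of this sketch: the first entry for a name wins.
            mapping.setdefault(name, ip)
    return mapping


sample = "192.168.0.6 spark.driver.mydomain\n"
print(parse_hosts(sample)["spark.driver.mydomain"])  # prints 192.168.0.6
```

With this entry in place, the name resolves to the LAN address on the laptop only; other machines still resolve it through DNS.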

Then I set spark.driver.host=spark.driver.mydomain. Now the Spark driver can successfully bind to spark.driver.mydomain and tell the YARN application manager to connect to spark.driver.mydomain. I also needed to point spark.driver.mydomain at my public IP address by changing the DNS record for my domain, and to configure the firewall to make the service publicly reachable. Now I can launch Spark from my laptop to control the cluster, so I am almost there. However, the Spark UI does not work. I cannot connect to it, even though the log message says it was started successfully at spark.driver.mydomain:4040. I opened all the ports on the local network firewall using DMZ. I also tried using the LAN IP. I noticed that it redirects to the YARN ResourceManager link, http://resourcemanager/proxy/application_id, but that just ended up timing out, and I do not understand how the proxy works. The Spark session also occasionally emits warning messages such as

WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@executor:port] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

Basic Spark actions all work despite the warning. Still, a number of problems and questions remain:

  • Is the traffic between the Spark driver and the YARN cluster unencrypted in this scenario? Are there any data security problems (assuming the VPN itself is secure)?
  • The Spark UI is unreachable, which is hard to live with.
  • The warning messages shown above.

Possibly related JIRA issue: https://issues.apache.org/jira/browse/SPARK-5113
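
For reference, the driver-side settings described above can be written out as spark-submit style arguments. This is only a sketch under stated assumptions: `spark.driver.host`, `spark.driver.port` and `spark.ui.port` are Spark configuration properties and `--conf` is the spark-submit flag for passing them, but the helper `driver_conf_args` and the port number 50000 are placeholders made up for illustration. Pinning the driver port is assumed here only because fixed ports are easier to open on a firewall; it is not known to fix the UI problem.

```python
def driver_conf_args(host, driver_port, ui_port):
    """Render --conf arguments for the driver-related Spark properties."""
    conf = {
        # Hostname the driver advertises to the cluster.
        "spark.driver.host": host,
        # Fixed driver port (hypothetical value chosen by the caller).
        "spark.driver.port": str(driver_port),
        # Port for the Spark UI; 4040 is the one mentioned in the question.
        "spark.ui.port": str(ui_port),
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(driver_conf_args("spark.driver.mydomain", 50000, 4040)))
```

Each of these ports would then need a matching forwarding rule on the router so that connections from the cluster can reach the laptop.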
