Dataproc launches Spark on top of YARN, so you won't find the typical Spark standalone ports. Instead, once your Spark job has started, visit port 8088 on the master node, which shows the main page of the YARN ResourceManager. Any running Spark job is reachable through the Application Master link on that page. The Spark Application Master home page looks just like the regular Spark standalone landing page, which you would normally find on port 8080 with default Spark settings.
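If you would rather find the Application Master link programmatically than click through the page, the ResourceManager also exposes a REST endpoint on the same port. The following is only a rough sketch: the host name `my-cluster-m` and the helper names are made up for illustration, and it only builds the URL and parses a response body you have already fetched from somewhere that can reach the master.

```python
import json
from urllib.parse import urlencode


def running_apps_url(rm_host, port=8088):
    """Build the ResourceManager REST URL that lists running applications."""
    query = urlencode({"states": "RUNNING"})
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps?{query}"


def tracking_urls(response_text):
    """Map application id -> tracking (Application Master) URL.

    `response_text` is the JSON body returned by the URL above.
    YARN returns {"apps": null} when nothing matches, hence the `or {}`.
    """
    payload = json.loads(response_text)
    apps = (payload.get("apps") or {}).get("app", [])
    return {app["id"]: app.get("trackingUrl") for app in apps}
```

The tracking URL for a running Spark job is the same target as the Application Master link on the ResourceManager page.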
Because the workers are registered on the internal network, YARN links use the internal cluster node host names (which carry the Dataproc cluster name as a prefix). This means that if you access the pages from an external network, the links may not work at first; if you are using a firewall-based approach, you need to replace the host name in each link with that node's external IP address.
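To make the host name substitution concrete, here is a small sketch of rewriting a YARN link by hand. The function name, the example node name `my-cluster-w-0`, and the IP address are all hypothetical; you would fill the mapping with the external IPs of your own nodes.

```python
from urllib.parse import urlsplit, urlunsplit


def externalize(url, external_ips):
    """Swap an internal node host name in `url` for its external IP.

    `external_ips` maps internal host names (short or fully qualified)
    to external IP addresses. Unknown hosts are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname
    # Also try the short name, in case the link uses a fully qualified one.
    short = host.split(".")[0] if host else None
    ip = external_ips.get(host) or external_ips.get(short)
    if ip is None:
        return url
    netloc = ip if parts.port is None else f"{ip}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, `externalize("http://my-cluster-w-0:8042/node", {"my-cluster-w-0": "203.0.113.7"})` gives back the same link with the IP in place of the host name. The firewall still has to allow the port in question.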
A simpler experience would be to use the SOCKS proxy approach, as described here: https://cloud.google.com/dataproc/cluster-web-interfaces
In that case, you just use gcloud compute ssh to start a lightweight local SOCKS proxy, then open a browser pointed at that proxy; all the YARN links will then be clickable as usual.
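As a rough sketch of what those two steps look like, the helpers below only assemble the command lines; they do not run anything. The assumptions here are mine: the master node is named `<cluster>-m` (the Dataproc default for a single-master cluster), port 1080 is an arbitrary local port, and the browser binary is called `google-chrome` on your machine. Check the linked page for the exact commands for your setup.

```python
def socks_tunnel_command(cluster, zone, port=1080):
    """Argument list for an SSH session that opens a local SOCKS proxy.

    Everything after "--" is passed to ssh itself:
    -D opens the dynamic (SOCKS) forward, -N skips the remote shell.
    """
    master = f"{cluster}-m"  # assumed default master node name
    return ["gcloud", "compute", "ssh", master, f"--zone={zone}",
            "--", "-D", str(port), "-N"]


def browser_command(cluster, port=1080, chrome="google-chrome"):
    """Argument list for a browser that sends its traffic through the proxy."""
    master = f"{cluster}-m"
    return [chrome,
            f"--proxy-server=socks5://localhost:{port}",
            # A separate profile directory keeps the proxy setting
            # out of your normal browser session.
            f"--user-data-dir=/tmp/{master}",
            f"http://{master}:8088"]
```

You could hand either list to `subprocess.run`, or just join it with spaces and paste it into a terminal. Because the browser resolves host names through the tunnel, the internal node names in the YARN links work without any rewriting.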
Dennis Huo