LAST EDIT
For anyone who has this problem, there is a simpler answer: here.
EDIT 2
I realized after the first edit that it was a bit confusing, so here is a new edit for those who may find it useful in the future.
The problem is that Spark no longer ships the ec2 directory as part of the official distribution. If you are used to deploying your standalone clusters this way, that is a problem.
The solution is simple:
- Download the official ec2 directory as described in the Spark 2.0.0 documentation.
- If you simply copy the directory into Spark 2.0.0 and run the spark-ec2 executable, mimicking how things worked in Spark 1.*, you can deploy your cluster as usual. But when you SSH into it, you will realize that none of the binaries exist anymore.
- So, once you have deployed your cluster (as usual, with the spark-ec2 you downloaded in step 1), you need to rsync your local Spark 2.0.0 directory to the master of your newly created cluster. Once this is done, you can spark-submit your jobs as usual.
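The three steps above can be sketched as a small Python helper that only builds and prints the commands, so nothing is launched by accident. This is a rough sketch, not a tested deployment script: the key pair name, the file paths, the cluster name, the `root` login (the spark-ec2 default user) and the `spark://...:7077` master URL are all assumptions on my side and need to be replaced with your own values.

```python
import shlex

# All values below are hypothetical placeholders, not taken from a real setup.
KEY_PAIR = "my-keypair"
IDENTITY_FILE = "/path/to/my-keypair.pem"
CLUSTER = "my-cluster"
LOCAL_SPARK = "/opt/spark-2.0.0/"    # local Spark 2.0.0 directory (assumed path)
REMOTE_SPARK = "/root/spark-2.0.0/"  # destination on the master (assumed path)


def launch_cmd(slaves):
    """spark-ec2 launch, run from the directory that holds the downloaded ec2 folder."""
    return ["./ec2/spark-ec2", "-k", KEY_PAIR, "-i", IDENTITY_FILE,
            "-s", str(slaves), "launch", CLUSTER]


def rsync_cmd(master_host):
    """Copy the local Spark 2.0.0 directory to the master over SSH (assumes user root)."""
    return ["rsync", "-az", "-e", f"ssh -i {IDENTITY_FILE}",
            LOCAL_SPARK, f"root@{master_host}:{REMOTE_SPARK}"]


def submit_cmd(master_host, app):
    """spark-submit, meant to be run on the master after the rsync.

    Assumes a standalone master is listening on the default port 7077.
    """
    return [REMOTE_SPARK + "bin/spark-submit",
            "--master", f"spark://{master_host}:7077", app]


def show(cmd):
    """Render an argv list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    host = "MASTER_PUBLIC_DNS"  # placeholder: use the master hostname of your cluster
    for cmd in (launch_cmd(16), rsync_cmd(host), submit_cmd(host, "my_job.py")):
        print(show(cmd))
```

Printing the commands instead of executing them is deliberate: you can check each line, then paste it into a terminal once the placeholders are filled in.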
Really simple, but it seems to me that the Spark docs could make this clearer for the rest of us.
EDIT: That was indeed the right thing to do. For those who have the same question: download the ec2 directory from AMPLab, as Spark suggests, put this folder inside your local Spark-2.0.0 directory and run the scripts as usual. Apparently, they only split the directory off for maintenance purposes, but the logic is the same. It would be nice to have a few words about this in the Spark docs.
I tried the following: I cloned the spark-ec2-branch-1.6 directory from the AMPLab link into my spark-2.0.0 directory and tried to launch a cluster with the usual ./ec2/spark-ec2. Maybe this is what they want us to do?
I am launching a small 16-node cluster. I can see the instances on the AWS dashboard, but the terminal has been stuck printing the usual SSH error for ... almost two hours.
Warning: SSH connection error. (This could be temporary.)
Host: ec2-54-165-25-18.compute-1.amazonaws.com
SSH return code: 255
SSH output: ssh: connect to host ec2-54-165-25-18.compute-1.amazonaws.com port 22: Connection refused
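"Connection refused" means the host answers but nothing is accepting connections on port 22 yet. A minimal sketch for checking this independently of spark-ec2, assuming plain TCP reachability is a good enough signal (the function name `ssh_port_open` is my own, not part of any Spark tool):

```python
import socket


def ssh_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    print(ssh_port_open("ec2-54-165-25-18.compute-1.amazonaws.com"))
```

If this keeps returning False long after the instances show as running, the security group rules and the instance status checks are probably the next things to look at.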
I will update this if I find anything useful.
xv70