Hadoop HDFS with Spark

I am new to cluster computing and I am trying to set up a minimal two-node Spark cluster. What I am still a bit confused about: do I need to install a full Hadoop distribution first, or does Spark already ship with a bundled version of Hadoop inside?

What I find in the Spark documentation doesn't really make this clear. I understand that Spark is meant as an extension of Hadoop, not a replacement, but whether it requires an independently running Hadoop system is the part I don't understand.

I do need HDFS. Is it enough to install just the file-system part of Hadoop?

Can anyone point out this probably obvious thing to me?

1 answer

Apache Spark does not strictly depend on Hadoop. Spark can use HDFS as its storage layer, but it does not have to, and it can run either on its own standalone cluster manager or on an external one (e.g., YARN, Mesos).
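To make this concrete, here is a minimal Scala sketch showing how the master URL decides where Spark runs; the host name and port are placeholders for your own cluster, not values from the question:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: the master URL selects the deployment mode.
// "master-host:7077" is a placeholder for your own standalone master.
object MasterUrlExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("master-url-example")
      // Pick one:
      // .master("local[*]")                  // single machine, no cluster at all
      // .master("spark://master-host:7077")  // Spark's own standalone cluster manager
      // .master("yarn")                      // requires a Hadoop/YARN installation
      .master("local[*]")
      .getOrCreate()

    // Trivial job to confirm the session works.
    println(spark.range(10).count())
    spark.stop()
  }
}
```

Note that only the "yarn" master requires a Hadoop installation; the other two modes need nothing from Hadoop at all.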

So if all you need alongside Spark is HDFS, you will still have to install and run the HDFS part of Hadoop yourself: the Hadoop libraries bundled with Spark are client libraries for talking to HDFS, not the HDFS daemons themselves.
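Once an HDFS cluster is up, Spark addresses it through an hdfs:// URI. A minimal sketch, assuming a NameNode reachable at the hypothetical address namenode-host:8020 and made-up paths:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of Spark reading from and writing to HDFS.
// "namenode-host:8020" and the paths are placeholders for your cluster.
object HdfsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hdfs-example")
      .getOrCreate()

    // Read a text file stored in HDFS; the URI points at the NameNode.
    val lines = spark.read.textFile("hdfs://namenode-host:8020/data/input.txt")
    println(s"line count: ${lines.count()}")

    // Write results back into HDFS.
    lines.write.text("hdfs://namenode-host:8020/data/output")

    spark.stop()
  }
}
```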

