Error "Connection Refused" when starting Spark Streaming on the local machine

I know there are already many threads about Spark Streaming issues, but most of them are on Linux, or at least point to HDFS. I am running this on my local Windows laptop.

I am launching a very simple, basic standalone Spark Streaming application to see how streaming works. There is nothing complicated:

    import org.apache.spark.streaming.Seconds
    import org.apache.spark.streaming.StreamingContext
    import org.apache.spark.SparkConf

    object MyStream {
      def main(args: Array[String]) {
        val sc = new StreamingContext(new SparkConf(), Seconds(10))
        val mystreamRDD = sc.socketTextStream("localhost", 7777)
        mystreamRDD.print()
        sc.start()
        sc.awaitTermination()
      }
    }

I get the following error:

    2015-07-25 18:13:07 INFO ReceiverSupervisorImpl:59 - Starting receiver
    2015-07-25 18:13:07 INFO ReceiverSupervisorImpl:59 - Called receiver onStart
    2015-07-25 18:13:07 INFO SocketReceiver:59 - Connecting to localhost:7777
    2015-07-25 18:13:07 INFO ReceiverTracker:59 - Registered receiver for stream 0 from 192.168.19.1:11300
    2015-07-25 18:13:08 WARN ReceiverSupervisorImpl:92 - Restarting receiver with delay 2000 ms: Error connecting to localhost:7777
    java.net.ConnectException: Connection refused

I tried using different port numbers, but that does not help. The receiver just keeps repeating the restart cycle and getting the same error. Does anyone have an idea?

1 answer

In the code for socketTextStream, Spark creates an instance of SocketInputDStream, which uses java.net.Socket: https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L73

java.net.Socket is a client socket, which means it expects a server to already be listening at the specified address and port. If you do not have a service listening on port 7777 of your local machine, the error you see is expected.
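You can see this outside of Spark with a minimal sketch in plain Scala (the object name ClientProbe is just illustrative). It reproduces the same ConnectException when nothing is listening on the port:

    import java.net.{ConnectException, Socket}

    object ClientProbe {
      def main(args: Array[String]): Unit = {
        try {
          // java.net.Socket connects as a client: the constructor throws
          // immediately if no server is accepting connections on localhost:7777
          val socket = new Socket("localhost", 7777)
          println("Connected to " + socket.getInetAddress)
          socket.close()
        } catch {
          case e: ConnectException =>
            println("Connection refused, same as the Spark receiver sees: " + e.getMessage)
        }
      }
    }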

To understand what I mean, try the following (you may not need to set master or appName in your environment):

    import org.apache.spark.streaming.Seconds
    import org.apache.spark.streaming.StreamingContext
    import org.apache.spark.SparkConf

    object MyStream {
      def main(args: Array[String]) {
        val sc = new StreamingContext(new SparkConf().setMaster("local").setAppName("socketstream"), Seconds(10))
        val mystreamRDD = sc.socketTextStream("bbc.co.uk", 80)
        mystreamRDD.print()
        sc.start()
        sc.awaitTermination()
      }
    }

This does not print any content, because the application does not speak HTTP to the BBC website, but it does not get a connection refused error either.

To start a local server on Linux, I would use netcat with a simple command like

    cat data.txt | ncat -l -p 7777
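Note that the exact flags vary between netcat implementations; the word-count example in the Spark documentation uses nc -lk 9999, where -k keeps the listener running across client reconnects.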

I am not sure what your best approach is on Windows. You could write another small application that listens as a server on that port and sends some data.
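As a rough sketch of what such an application could look like (plain Scala; the name LineServer and the message text are just illustrative, and it assumes a single client and a fixed number of lines):

    import java.io.PrintWriter
    import java.net.ServerSocket

    object LineServer {
      def main(args: Array[String]): Unit = {
        val server = new ServerSocket(7777)   // listen on the port Spark connects to
        println("Listening on port 7777...")
        val client = server.accept()          // blocks until the receiver connects
        val out = new PrintWriter(client.getOutputStream, true)  // autoflush each line
        for (i <- 1 to 100) {
          out.println("hello from line server, message " + i)   // one record per line
          Thread.sleep(1000)
        }
        out.close()
        client.close()
        server.close()
      }
    }

Start this first, then launch your Spark application; the print() output should then show a batch of lines every 10 seconds.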

