The order of the start and await calls is indeed inverted.
Beyond that, the easiest way to feed data into a Spark Streaming application for testing is a QueueDStream. It is backed by a mutable queue of RDDs holding arbitrary data, so you can generate the data programmatically or load it from disk into an RDD and pass it to your Spark Streaming code.
For example, to sidestep the timing problems with the file-based input, you can try something like this:
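    import scala.collection.mutable.Queue
    import org.apache.spark.rdd.RDD

    // Load the test data once and hand it to the stream through a queue
    val rdd = sparkContext.textFile(...)
    val rddQueue: Queue[RDD[String]] = Queue()
    rddQueue += rdd
    val dstream = streamingContext.queueStream(rddQueue)
    doMyStuffWithDstream(dstream)
    // Start first, then block until the context terminates
    streamingContext.start()
    streamingContext.awaitTermination()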
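
If you would rather build the test data in code than read it from disk, the same idea can be sketched as a small standalone program. This is only an illustration under a few assumptions that are not in the snippet above: the object name QueueStreamSketch, the local[2] master, the 1-second batch interval, the sample strings and the 5-second timeout are all made up for the example, and the word count stands in for whatever doMyStuffWithDstream does.

```scala
import scala.collection.mutable.Queue

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical test harness; names and settings are illustrative assumptions.
object QueueStreamSketch {
  def main(args: Array[String]): Unit = {
    // local[2] and a 1-second batch interval are assumed values for a local test run
    val conf = new SparkConf().setMaster("local[2]").setAppName("QueueStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Build the input programmatically instead of loading it from disk
    val rddQueue: Queue[RDD[String]] = Queue()
    rddQueue += ssc.sparkContext.parallelize(Seq("a b", "b c"))
    rddQueue += ssc.sparkContext.parallelize(Seq("c d"))

    // With the default oneAtATime = true, each batch takes one RDD from the queue
    val dstream = ssc.queueStream(rddQueue)

    // Stand-in for doMyStuffWithDstream: count the words seen in each batch
    dstream.flatMap(_.split(" ")).countByValue().print()

    // Start first, then wait; the timeout lets the test finish on its own
    ssc.start()
    ssc.awaitTerminationOrTimeout(5000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

All RDDs are queued before start() here, which avoids having to synchronize on the queue while the stream is running.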