I have a sherlock.txt text file containing several lines of text. I load it into the Spark shell using:
val textFile = sc.textFile("sherlock.txt")
My goal is to count the number of words in the file. I came across two alternative ways to do this.
The first uses flatMap followed by count:
textFile.flatMap(line => line.split(" ")).count()
The second uses map, followed by a reduce:
textFile.map(line => line.split(" ").size).reduce((a, b) => a + b)
Both correctly give the same result. I want to know the differences in time and space complexity between these two implementations, if there really are any.
Does Scala translate both into the most efficient form?
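
For reference, this is roughly how I would compare the wall-clock time of the two jobs directly in the Spark shell (just an illustrative sketch with a hypothetical time helper around System.nanoTime, not a rigorous benchmark; results will vary with caching and JVM warm-up):

// hypothetical helper: measures elapsed wall-clock time of a block
def time[T](label: String)(body: => T): T = {
  val start = System.nanoTime()
  val result = body
  println(s"$label took ${(System.nanoTime() - start) / 1e6} ms")
  result
}

time("flatMap + count") {
  textFile.flatMap(line => line.split(" ")).count()
}

time("map + reduce") {
  textFile.map(line => line.split(" ").size).reduce((a, b) => a + b)
}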