How to parse CSV data with empty columns in Scala?

Raw data is as follows:

YAPM1,20100901,23:36:01.563,Quote,,,,,,,4563,,,,,,
YAPM1,20100901,23:36:03.745,Quote,,,,,4537,,,,,,,,

The first row has extra empty columns. I parse each line as follows:

val tokens = List.fromString(line, ',')

Result:

List(YAPM1, 20100901, 23:36:01.563, Quote, 4563)
List(YAPM1, 20100901, 23:36:03.745, Quote, 4537)

The resulting lists drop the empty fields, so there is no way to tell which rows had the extra columns. How can I keep that information?

1 answer

Use String.split and pass -1 as the second argument!

scala> "a,b,c,d,,,,".split(",")
res1: Array[java.lang.String] = Array(a, b, c, d)

scala> "a,b,c,d,,,,".split(",", -1)
res2: Array[java.lang.String] = Array(a, b, c, d, "", "", "", "")

FYI, List.fromString is deprecated in favor of String.split.
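
Once the empty fields are kept, the column position of each value can be recovered. Here is a minimal sketch of that idea; the names CsvFields, fields and filledColumns are made up for illustration and are not part of any library, and it assumes plain comma-separated lines with no quoted fields.

```scala
object CsvFields {
  // Split on commas; the -1 limit keeps empty fields, including trailing ones.
  def fields(line: String): Array[String] = line.split(",", -1)

  // Pair every non-empty field with its zero-based column index,
  // so rows that put the value in different columns can be told apart.
  def filledColumns(line: String): Seq[(Int, String)] =
    fields(line).toSeq.zipWithIndex.collect {
      case (value, index) if value.nonEmpty => (index, value)
    }
}
```

For example, CsvFields.fields("a,b,,") has four elements, and CsvFields.filledColumns("a,,4563,,") gives the pairs (0, a) and (2, 4563). Comparing the indices returned for two rows shows which one had the extra empty columns.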
