Is there a good way in Scala to split a CSV by column?

I have a CSV with column headers. One column heading is DATE. I want to generate two CSVs: one with the columns before DATE, and one with DATE and the columns after it. Is there any way to do this without a procedural loop? I noticed that most list functions are suited to filtering by row.

1 answer

Suppose you have already parsed your data like this:

    val myDoc = List(
      List("ID", "NAME", "DATE", "DESC"),
      List("1", "a", "1990", "x"),
      List("2", "b", "1991", "y")
    )

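Now we can use splitAt and unzip to split the list: splitAt cuts every row at the index of the "DATE" column, and unzip turns the resulting list of pairs into a pair of lists. Note that I am assuming a lot about the data here; in real code we would want to check that the list is not empty and that the header actually contains the "DATE" column.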

    def split(doc: Seq[Seq[String]]) = {
      val i = doc.head.indexOf("DATE")
      doc.map(_.splitAt(i)).unzip
    }

We can apply it to our test data:

    scala> val (b, a) = split(myDoc)
    b: List[Seq[String]] = List(List(ID, NAME), List(1, a), List(2, b))
    a: List[Seq[String]] = List(List(DATE, DESC), List(1990, x), List(1991, y))
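To get from the two halves back to CSV text, something like the following might do. This is only a sketch: toCsv and the CsvRender wrapper are hypothetical names, not part of the answer's code, and it assumes no field contains a comma, a quote, or a newline, since nothing is escaped.

```scala
object CsvRender {
  // Hypothetical helper: joins the fields of each row with commas and the
  // rows with newlines. No quoting or escaping is applied.
  def toCsv(rows: Seq[Seq[String]]): String =
    rows.map(_.mkString(",")).mkString("\n")

  // Same shape as the two halves produced by the split of the test data.
  val before: Seq[Seq[String]] =
    List(List("ID", "NAME"), List("1", "a"), List("2", "b"))
  val after: Seq[Seq[String]] =
    List(List("DATE", "DESC"), List("1990", "x"), List("1991", "y"))

  val beforeCsv: String = toCsv(before)
  val afterCsv: String = toCsv(after)
}
```

Like the split itself, this uses only map and mkString, so no explicit loop is needed. For data with quoted fields a proper CSV library would be the safer choice.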

It looks reasonable to me.
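The checks mentioned earlier (non-empty list, header contains the column) could be folded in along these lines. This is a sketch under my own assumptions: splitAtColumn and SafeSplit are made-up names, and returning an Option is just one way to report the failure cases. Without the checks, an empty list makes doc.head throw, and a missing column makes indexOf return -1, so splitAt puts every column into the second half.

```scala
object SafeSplit {
  // Hypothetical variant of split: yields None when the document is empty
  // or the header does not contain the requested column.
  def splitAtColumn(
      doc: Seq[Seq[String]],
      column: String
  ): Option[(Seq[Seq[String]], Seq[Seq[String]])] =
    doc.headOption                 // None for an empty document
      .map(_.indexOf(column))      // position of the column in the header
      .filter(_ >= 0)              // indexOf returns -1 when it is absent
      .map(i => doc.map(_.splitAt(i)).unzip)
}
```

Calling SafeSplit.splitAtColumn(myDoc, "DATE") would then give the same two halves wrapped in Some, and None for an empty document or a header without "DATE".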
