I need to process large JSON files by deserializing individual substrings into objects as I iterate over (or stream in) the file.
For example:
Say I can only deserialize to the following case class:
case class Data(a: Int, b: Int, c: Int)
and expected JSON format:
{
  "foo": [ {"a": 0, "b": 0, "c": 0 }, {"a": 0, "b": 0, "c": 1 } ],
  "bar": [ {"a": 1, "b": 0, "c": 0 }, {"a": 1, "b": 0, "c": 1 } ],
  .... MANY ITEMS .... ,
  "qux": [ {"a": 0, "b": 0, "c": 0 } ]
}
What I would like to do:
import com.codahale.jerkson.Json

// NOTE: this will not compile since I pulled the "advanceToValue" out of thin air.
val dataSeq: Seq[Data] = Json.advanceToValue("foo").stream[Data](fileStream)
As a last note, I would prefer a solution that uses Jerkson or any other library that comes with the Play platform, but if another Scala library handles this task with greater ease and decent performance, I am not against trying it. If there is a clean way to manually search through the file and then use the JSON library to continue parsing from that point, I'm fine with that too.
What I don't want to do is swallow the entire file without streaming or using an iterator, since storing the entire file in memory at one time will be overly expensive.
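To make the desired behaviour concrete, here is a rough sketch of the "search manually, then parse from there" idea, assuming Jackson 2.x's streaming API (`jackson-core`, the Java library that Jerkson wraps) is on the classpath. The names `StreamSketch`, `advanceToValue`, `stream` and `readData` are hypothetical helpers invented for this illustration, not part of Jerkson or Jackson, and the fields of `Data` are read by hand rather than through a data-binding module.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import com.fasterxml.jackson.core.{JsonFactory, JsonParser, JsonToken}

case class Data(a: Int, b: Int, c: Int)

object StreamSketch {
  // Hypothetical helper: position the parser on the value of a top-level field.
  // Returns false if the document is not an object or the field is absent.
  def advanceToValue(parser: JsonParser, field: String): Boolean = {
    if (parser.nextToken() != JsonToken.START_OBJECT) return false
    while (parser.nextToken() == JsonToken.FIELD_NAME) {
      val name = parser.getCurrentName
      parser.nextToken()            // move onto the field's value
      if (name == field) return true
      parser.skipChildren()         // skip unwanted arrays/objects without building them
    }
    false
  }

  // Read one {"a": .., "b": .., "c": ..} object; the parser sits on START_OBJECT.
  private def readData(parser: JsonParser): Data = {
    var a, b, c = 0
    while (parser.nextToken() == JsonToken.FIELD_NAME) {
      val name = parser.getCurrentName
      parser.nextToken()
      name match {
        case "a" => a = parser.getIntValue
        case "b" => b = parser.getIntValue
        case "c" => c = parser.getIntValue
        case _   => parser.skipChildren() // ignore unknown fields
      }
    }
    Data(a, b, c)
  }

  // Lazily yield one Data per array element; the parser must sit on START_ARRAY.
  // Elements are only read as the iterator is consumed.
  def stream(parser: JsonParser): Iterator[Data] = {
    require(parser.getCurrentToken == JsonToken.START_ARRAY, "expected an array")
    Iterator
      .continually(parser.nextToken())
      .takeWhile(_ == JsonToken.START_OBJECT)
      .map(_ => readData(parser))
  }

  def main(args: Array[String]): Unit = {
    val json =
      """{"foo":[{"a":0,"b":0,"c":0},{"a":0,"b":0,"c":1}],"bar":[{"a":1,"b":0,"c":0}]}"""
    // A real run would pass a FileInputStream here instead of an in-memory string.
    val in: InputStream = new ByteArrayInputStream(json.getBytes("UTF-8"))
    val parser = new JsonFactory().createParser(in)
    try {
      if (advanceToValue(parser, "bar")) stream(parser).foreach(println)
    } finally parser.close()
  }
}
```

In this sketch only the element currently being read is materialized, so memory use should stay roughly constant regardless of file size, provided the iterator is consumed once and not collected into a `Seq`.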
Ryan delucchi