Best way to parse a huge JSON file in Ruby

I'm having a hard time parsing a huge JSON file.

The file is > 1 GB, and I've tried two gems: ruby-stream and yajl, and neither of them works.

Here is an example of what is happening.

fileStr = File.read("hugeJSONfile.json") 

^ This part is ok.

But when I try to parse that string into a JSON hash (via ruby-stream or yajl), my computer freezes.

Any other ideas on how to do this more efficiently? Thanks.

+5
2 answers

Take a look at json-stream or yajl:

Quoting from the documentation:

json-stream:

the document itself is never fully read into memory.

yajl:

The main advantage of this library is its memory usage. Because it is able to parse the stream in chunks, its memory requirements are very, very low.

You register callbacks for the events you are interested in, and keys/values are handed back to you as the JSON is read, instead of the whole document being loaded into a Ruby data structure (and therefore into memory).
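The push-style idea behind both gems can be sketched with nothing but the standard library: feed the file to a consumer in small, bounded chunks and emit each value as soon as it is complete, so memory use stays flat no matter how big the file is. The file layout, field name, and regex below are made-up stand-ins for illustration; json-stream and yajl do real JSON tokenizing, not regex matching.

```ruby
require 'json'
require 'tempfile'

# Build a small stand-in for the huge file: a JSON array of records.
# (Hypothetical structure -- the real file's layout will differ.)
file = Tempfile.new(['huge', '.json'])
file.write(JSON.generate((1..100).map { |i| { 'name' => "item#{i}" } }))
file.close

names  = []
buffer = +''
File.open(file.path) do |f|
  # Read in small fixed-size chunks; the whole file is never in memory at once.
  while (chunk = f.read(64))
    buffer << chunk
    consumed = 0
    buffer.scan(/"name":"([^"]+)"/) do |m|
      names << m[0]                       # "emit" a value as soon as it is complete
      consumed = Regexp.last_match.end(0)
    end
    buffer = buffer[consumed..]           # keep only the unconsumed tail
  end
end

puts names.length   # 100
```

The buffer never grows beyond a chunk plus one partial record, which is the same property that keeps the gems' memory requirements low.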

+3

OK, I was able to figure it out.

Honestly, this is not the most elegant solution, but in desperate times, one quick way to parse a huge JSON file is to inspect the file manually, notice the pattern, and rip out what you need.

In my case, here is what I did, in pseudocode:

 fileStr = File.read("hugeJSONfile.json")
 arr = fileStr.split("[some pattern]")
 arr.each do |str|
   # extract desired value from str
 end

Again, not the most elegant solution, but it needs no extra dependencies and, depending on the circumstances, can be tuned to whatever your crappy laptop can handle.
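As a concrete toy version of the pseudocode above (the file contents, the split pattern, and the regex are all hypothetical stand-ins for whatever your real file looks like):

```ruby
require 'json'
require 'tempfile'

# A small stand-in for the huge file: a JSON array of flat records.
file = Tempfile.new(['huge', '.json'])
file.write(JSON.generate((1..5).map { |i| { 'id' => i, 'name' => "item#{i}" } }))
file.close

fileStr = File.read(file.path)
# '},{' is the hand-spotted pattern separating records in this layout.
arr = fileStr.split('},{')
# Extract the desired value from each fragment with a plain regex.
names = arr.map { |str| str[/"name":"([^"]+)"/, 1] }

p names   # ["item1", "item2", "item3", "item4", "item5"]
```

Note that `File.read` still pulls the whole file into memory; the savings here come from skipping the full JSON parse, not from streaming.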

0
