Why does repeatedly parsing JSON consume more and more memory?

Parsing the same JSON file over and over in Ruby seems to use ever larger amounts of memory. Consider the code and the output below:

  • Why is memory not freed up after the first iteration?
  • Why does parsing it blow the process up from 116 MB to roughly 2-3 GB of RAM? This is surprising given that the text file is just being converted to hashes and arrays. What am I missing here?

The code:

    require 'json'

    def memused
      `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1] / 1024
    end

    text = IO.read('../data-grouped/2012-posts.json')
    puts "before parsing: #{memused}MB"

    iter = 1
    while true
      items = JSON.parse(text)
      GC.start
      puts "#{iter}: #{memused}MB"
      iter += 1
    end

Output:

    before parsing: 116MB
    1: 1840MB
    2: 2995MB
    3: 2341MB
    4: 3017MB
    5: 2539MB
    6: 3019MB
Tags: json, ruby, memory, memory-leaks
1 answer

When Ruby parses a JSON file, it creates many intermediate objects along the way. These objects stay in memory until the GC runs.

If the JSON file has a complex structure, with many arrays and nested objects, the number of those objects grows quickly.
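
One way to see those intermediate objects is to compare `ObjectSpace.count_objects` before and after a parse. This is a minimal sketch, not from the original post; the file path is a placeholder:

    require 'json'

    # Count live strings, arrays and hashes, forcing a GC first so the
    # numbers reflect objects that are actually still reachable.
    def live_counts
      GC.start
      ObjectSpace.count_objects.select { |k, _| [:T_STRING, :T_ARRAY, :T_HASH].include?(k) }
    end

    before = live_counts
    data   = JSON.parse(File.read('sample.json'))  # 'sample.json' is a placeholder path
    after  = live_counts

    after.each { |type, n| puts "#{type}: +#{n - before[type]}" }

Every string key, nested hash and array in the document becomes a separate Ruby object, which is why the parsed result is so much larger than the raw text.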

Have you tried calling `GC.start` to hint Ruby to reclaim unused memory? If memory usage drops significantly, you can assume most of it was intermediate objects created during parsing; otherwise your data structure is complex, or there is something in your data that the library cannot free.
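
A minimal sketch of that check, reusing `text` and the `memused` helper from the question:

    items = JSON.parse(text)
    puts "after parse:    #{memused}MB"

    GC.start                     # collect the intermediate objects created during parsing
    puts "after GC.start: #{memused}MB"

    items = nil                  # drop the only reference to the parsed result
    GC.start
    puts "after release:  #{memused}MB"

Note that even after the objects are collected, the RSS reported by `ps` typically does not fall back to the starting value: MRI's heap and the underlying malloc rarely return freed pages to the operating system, so the peak tends to remain visible even once the objects themselves are gone.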

For large JSON documents, I use yajl-ruby ( https://github.com/brianmario/yajl-ruby ). It is implemented in C and has a small memory footprint.
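
A minimal sketch of parsing straight from an IO with yajl-ruby, following the gem's README; treat the exact API as something to verify against the version you install:

    require 'yajl'

    # Parse directly from the file handle; Yajl reads it in chunks
    # instead of first materialising the whole document as one Ruby String.
    json_io = File.new('../data-grouped/2012-posts.json', 'r')
    parser  = Yajl::Parser.new
    items   = parser.parse(json_io)   # returns the usual Hash/Array structure
    json_io.close

    puts items.class

This avoids keeping the raw 100+ MB text in memory alongside the parsed result, though the parsed hashes and arrays themselves still have to fit in RAM.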
