Rails - HTTP request loop causes memory leak

I use the Curb gem (I also tried httparty) to execute many HTTP requests, and this works well. But in one of my rake tasks (where I make 20k+ requests) I have a memory problem: Rails "eats" more than 2 GB of RAM until there is no free memory left.

It seems that Rails "does not wait" for the response and moves on to the next iteration of the loop; the problem is that this way a lot of objects are created that are never collected by the garbage collector (I think), and this causes the memory leak.

Is there any way to tell Rails to wait for the response? (I tried sleep, but that was not a sustainable solution.)

Here is some pseudocode:

    def the_start
      while start_date <= end_date do # ~140 iterations
        a_method_that_do_sub_specifics_call
      end
    end

    def a_method_that_do_sub_specifics_call
      some_data.each do |r| # ~180 iterations
        do_a_call
        # do something with models (update/create entries, ...)
      end
    end

    def do_a_call # called ~25k times
      # with the Curb gem
      req = Curl::Easy.new do |curl|
        curl.ssl_verify_peer = false
        curl.url = url
        curl.headers['Content-type'] = 'application/json'
      end
      req.perform

      # current version, with the httparty gem
      req = HTTParty.get("#{url}", :headers => {'Content-type' => 'application/json'})
    end

Rails does not seem to wait for req.perform results.

EDIT:
I also tried initializing the Curl::Easy object only once, using Curl::Easy.perform() and req.close after each call (which should implicitly let the GC reclaim the handle), but without success: the memory usage stays huge. The only solution that (I think) can work is to "block" Rails until the response arrives, but how?
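Roughly, that attempt looked like this (a simplified sketch, not the exact task code):

    # curb's class-level perform creates the handle, runs the request,
    # and returns it; close releases the underlying libcurl handle.
    req = Curl::Easy.perform(url) do |curl|
      curl.ssl_verify_peer = false
      curl.headers['Content-type'] = 'application/json'
    end
    req.close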

EDIT 2
In another task, I only call a_method_that_do_sub_specifics_call without any problems.

EDIT 3
After some performance modifications (adding find_each(:batch_size => ...), GC.start, ...) the task works a little better: memory stays stable for roughly the first 100 do_a_call cycles, after which the usage jumps from 100 MB to 2 GB+ again.

2 answers

After several days of debugging and reading tons of forums and posts, I found the solution: a string variable that kept growing on every call until it caused the memory leak.

Some helpful notes I made along the way:

Curb vs HTTParty
Of the two gems that perform curl requests, Curb is the better performer. http://bibwild.wordpress.com/2012/04/30/ruby-http-performance-shootout-redux/

Pay attention to class variables
My problem was a debug/info string that kept growing on every call; avoid accumulating data in a class variable that is never released to the garbage collector. In my particular case, it was:

 @status = "#{@status} Warning - response is empty for #{description}\n" 
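One way to avoid this is to stream such messages to a logger instead of accumulating them in a variable. A minimal sketch (the stdout logger here is just an example, not what the task originally used):

    require 'logger'

    # Assumption: logging to stdout; a file or any IO works the same way.
    LOGGER = Logger.new($stdout)

    def record_empty_response(description)
      # Each warning is written out and discarded, so nothing accumulates.
      LOGGER.warn("response is empty for #{description}")
    end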

Manual garbage collection
Run GC.start manually at critical points to free memory that is no longer needed. Keep in mind that calling GC.start does not force an immediate collection; it only suggests one to the garbage collector.
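For example, in a loop like the one in the question you can suggest a collection every N iterations (a sketch; the method names and the interval of 100 are illustrative):

    some_data.each_with_index do |r, i|
      do_a_call
      # Hint a collection every 100 iterations; Ruby decides when it actually runs.
      GC.start if (i % 100).zero?
    end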

ActiveRecord batch loading
When iterating over a large number of ActiveRecord rows, use .find_each, for example:

    Model.find_each(:batch_size => 50) do |row|
      # process row
    end

This executes a query for only 50 rows at a time (or anything smaller than the default), which is better than fetching 1k rows with a single query. (I believe the default batch_size is 1000.)



Try reusing Curl instances, for example from a pool, instead of creating a new object for every request.
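A minimal sketch of that idea (the urls list is a stand-in for the real request data): build one Curl::Easy handle up front and reuse it for every request, changing only the URL:

    require 'curb'

    # One reusable handle instead of ~25k short-lived objects.
    curl = Curl::Easy.new do |c|
      c.ssl_verify_peer = false
      c.headers['Content-type'] = 'application/json'
    end

    urls.each do |url| # `urls` stands in for the real request list
      curl.url = url
      curl.perform
      # curl.body_str holds the response body for this request
    end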

