Multithreaded rake task

I am writing a rake task that will be called every minute (maybe every 30 seconds in the future), and it contacts a polling API endpoint (once for each user in our database). Obviously, this is inefficient to run as a single thread, but is multithreading possible? If not, is there a good event-based HTTP library that can do the job?

2 answers

"I am writing a rake task that will be called every minute (maybe every 30 seconds in the future)..."

Beware of the Rails boot time; it might be better to use a forking model such as Resque or Sidekiq. Resque provides https://github.com/bvandenbos/resque-scheduler , which should be able to do what you need. I can't speak for Sidekiq, but I'm sure it has something similar (Sidekiq is much newer than Resque).

"Obviously, this is inefficient to run as a single thread, but is multithreading possible? If not, is there a good event-based HTTP library that can do the job?"

I would advise you to look at ActiveRecord's find_each and find_in_batches for tips on making the lookup more efficient. Once you have your batches, you can easily do something with threads, such as:

    # find_in_batches yields arrays of records (find_each yields them one
    # at a time). The default batch size is 1000; pass batch_size: to tune
    # it for larger or smaller batches depending on your available RAM.
    User.find_in_batches(batch_size: 50) do |batch_of_users|
      # Each batch is an Array of users, never larger than batch_size.
      # Start one new thread for each user in the batch.
      batch_threads = batch_of_users.collect do |user|
        # Pass the user to the thread as an argument. This is a good habit
        # for shared variables; in this case it doesn't make much difference.
        Thread.new(user) do |u|
          # Do the API call here, using `u` (not `user`) to access the
          # user instance.
          #
          # We shouldn't need an evented HTTP library: Ruby threads pass
          # control while they wait on IO and get it back when the
          # scheduler decides. HTTP and network IO are about the best
          # use of threads you can make in Ruby.
        end
      end

      # Joining the threads means waiting for them all to finish
      # before moving on to the next batch.
      batch_threads.each(&:join)
    end

This starts no more than batch_size threads at a time and waits for each batch to finish before starting the next.

You could do it this way, but then the number of threads is not well controlled. There is an alternative that may serve you better, although it gets much more complicated: a thread pool with a shared list of work to be done. I posted it as a gist on GitHub so as not to spam Stack Overflow: https://gist.github.com/6767fbad1f0a66fa90ac
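
For illustration only, here is a minimal sketch of the thread pool idea described above. It is not the code from the gist. It assumes Ruby 2.3 or later (for Queue#close); process_with_pool and fake_poll are hypothetical names, and a short sleep stands in for the real API call.

```ruby
require "thread"

# Hypothetical helper (not from the gist): runs the block once per item,
# using at most pool_size threads. Assumes Ruby 2.3+ for Queue#close.
def process_with_pool(items, pool_size: 4, &work)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once closed and drained, pop returns nil instead of blocking

  results = Queue.new
  workers = Array.new(pool_size) do
    Thread.new do
      # Items must not be nil or false, since that would end the loop early.
      while (item = queue.pop)
        results << work.call(item)
      end
    end
  end

  workers.each(&:join)
  # Results come back in completion order, not input order.
  Array.new(results.size) { results.pop }
end

# Stand-in for the real API call; the sleep simulates network latency.
fake_poll = ->(user_id) { sleep 0.01; "polled #{user_id}" }

user_ids = (1..20).to_a
statuses = process_with_pool(user_ids, pool_size: 5) { |id| fake_poll.call(id) }
puts statuses.size
```

Here pool_size caps the number of concurrent threads no matter how many users there are, which is the main difference from starting one thread per record in each batch.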


I would suggest using Sidekiq, which is great for multithreading. You can then enqueue a separate job for each user to poll the API. clockwork can be used to make the jobs you enqueue run on a recurring schedule.


Source: https://habr.com/ru/post/927125/

