Rails, Heroku, and Resque: optimizing background jobs

We are building a Tinder-style app that lets users like or dislike events. Each event has about 100 keywords associated with it. When a user likes or dislikes an event, we associate that event's keywords with the user. A user can quickly accumulate thousands of keywords.

We use join tables to associate users and events with keywords (event_keywords and user_keywords). The join tables have an additional relevance_score column, a float (for example, 0.1 if a keyword is barely relevant, 0.9 if it is very relevant).

Our goal is to show users the most relevant events based on their keywords. To that end, an Event has many event_rankings, each belonging to a user. In theory, every event is ranked differently for each user.

Here are the models:

User.rb:

  has_many :user_keywords, :dependent => :destroy
  has_many :keywords, :through => :user_keywords
  has_many :event_rankings, :dependent => :destroy
  has_many :events, :through => :event_rankings

Event.rb:

  has_many :event_keywords, :dependent => :destroy
  has_many :keywords, :through => :event_keywords
  has_many :event_rankings, :dependent => :destroy
  has_many :users, :through => :event_rankings

UserKeyword.rb:

  belongs_to :user
  belongs_to :keyword

EventKeyword.rb:

  belongs_to :keyword
  belongs_to :event

EventRanking.rb:

  belongs_to :user
  belongs_to :event

Keyword.rb:

  has_many :event_keywords, :dependent => :destroy
  has_many :events, :through => :event_keywords
  has_many :user_keywords, :dependent => :destroy
  has_many :users, :through => :user_keywords
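For reference, the join tables behind these associations might be created with a migration along these lines (table and column names are assumed from the description above; the relevance_score float is the score mentioned earlier):

```ruby
# Hypothetical migration sketch for the join tables described above.
class CreateKeywordJoinTables < ActiveRecord::Migration
  def change
    create_table :user_keywords do |t|
      t.references :user,    index: true
      t.references :keyword, index: true
      t.float :relevance_score  # e.g. 0.1 (barely relevant) .. 0.9 (very relevant)
      t.timestamps
    end

    create_table :event_keywords do |t|
      t.references :event,   index: true
      t.references :keyword, index: true
      t.float :relevance_score
      t.timestamps
    end
  end
end
```

Indexes on the foreign keys matter here, since the background job queries these tables heavily.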

We have a method that calculates how relevant an event is to a particular user based on their keywords. This method is very fast, since it is just math.

User.rb:

 def calculate_event_relevance(event_id)
   ## Step 1: Find which of the event's keywords the user has
   ## Step 2: Compare those keywords and do the math to calculate a score
   ## Step 3: Update the event_ranking for this user
 end
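The actual math in Step 2 is not shown in the question. Purely as an illustration, one simple scoring scheme is to multiply the user's and the event's relevance_score for each shared keyword and normalize by the event's keyword count; the method name, inputs, and formula below are assumptions, not the app's real implementation:

```ruby
# Toy sketch of the scoring step, detached from ActiveRecord.
# Inputs are plain hashes of keyword_id => relevance_score.
def relevance_score(user_keywords, event_keywords)
  shared = user_keywords.keys & event_keywords.keys
  return 0.0 if shared.empty?

  # Weight each shared keyword by both scores, then normalize by the
  # number of keywords on the event so scores stay comparable.
  total = shared.sum { |id| user_keywords[id] * event_keywords[id] }
  total / event_keywords.size
end
```

For example, `relevance_score({1 => 0.9, 2 => 0.1}, {1 => 0.5, 3 => 0.4})` shares only keyword 1, giving 0.9 × 0.5 / 2 = 0.225. Whatever the real formula is, keeping it pure like this makes it cheap to call thousands of times per job.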

Each time a user likes or dislikes an event, a background job is created:

RecalculateRelevantEvents.rb:

 def self.perform(event_id)
   ## Step 1: Find any events that share keywords with Event.find(event_id)
   ## Step 2: Call calculate_event_relevance(event) for each event from the step above
 end
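Step 1 boils down to a set intersection over the event_keywords table. A plain-Ruby illustration of the logic, with the keyword data passed in as a hash of event_id => array of keyword ids (in the real app this would be a SQL query; all names here are hypothetical):

```ruby
# Given every event's keyword ids, return the ids of events that share
# at least one keyword with the event identified by event_id.
def events_sharing_keywords(event_keywords_by_id, event_id)
  target = event_keywords_by_id.fetch(event_id)
  event_keywords_by_id.select do |id, keyword_ids|
    id != event_id && !(keyword_ids & target).empty?
  end.keys
end
```

With 1,000 candidate events of ~100 keywords each, doing this intersection in the database (one query against event_keywords, indexed on keyword_id) rather than in Ruby is the first optimization worth making.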

So, here is a brief description of the process:

  • A user likes or dislikes an event.
  • A background job is created that finds events sharing keywords with that event (Step 1).
  • Each of those similar events is re-scored against the user's keywords.

I'm trying to figure out ways to optimize my approach, since it can quickly get out of hand. The average user will swipe through about 20 events per minute. An event can share keywords with up to 1,000 other events. And each event has about 100 keywords.

So with my approach, for each swipe I have to go through up to 1,000 events, and then through about 100 keywords for each of those events. And this happens 20 times per minute per user.

How should I approach this?

ruby-on-rails heroku resque
1 answer

Does this need to be up-to-the-second accurate? Can you debounce it and recalculate for a user no more than once every 5 minutes?

This data does not need to be updated 20 times per minute to be useful; in fact, updating it that often is probably far more frequent than is useful.

With a 5-minute debounce, you go from 100 (20 × 5) recalculations per user to 1 over the same period, which is quite a big saving.

I would also recommend switching to Sidekiq if possible; with its multi-threaded processing you get a huge boost in the number of concurrent jobs. I'm a big fan.

And once you're on it, you can try this gem: https://github.com/hummingbird-me/sidekiq-debounce

... which provides the kind of debounce I suggested.
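The idea behind debouncing, independent of that gem: remember when each user was last recalculated and refuse to run again inside the window. A toy version of the bookkeeping in plain Ruby (no Redis or Sidekiq; the class and method names are invented for illustration; sidekiq-debounce itself implements this with scheduled jobs instead of timestamps):

```ruby
# Toy debouncer: runs the given block at most once per window per key.
class Debouncer
  def initialize(window_seconds)
    @window = window_seconds
    @last_run = {}  # key => Time of last execution
  end

  # Yields only if the window has elapsed for this key; returns true if it ran.
  def call(key, now = Time.now)
    last = @last_run[key]
    return false if last && (now - last) < @window
    @last_run[key] = now
    yield
    true
  end
end
```

With a 300-second window keyed on the user id, 100 swipes in 5 minutes would trigger a single recalculation instead of 100.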

