ActiveRecord has_many :through in a single SQL call

I have these 3 models:

 class User < ActiveRecord::Base
   has_many :permissions, :dependent => :destroy
   has_many :roles, :through => :permissions
 end

 class Permission < ActiveRecord::Base
   belongs_to :role
   belongs_to :user
 end

 class Role < ActiveRecord::Base
   has_many :permissions, :dependent => :destroy
   has_many :users, :through => :permissions
 end

I want to find a user and his roles in one SQL statement, but I cannot achieve this.

The following statement:

 user = User.find_by_id(x, :include => :roles) 

Gives me the following queries:

 User Load (1.2ms)  SELECT * FROM `users` WHERE (`users`.`id` = 1) LIMIT 1
 Permission Load (0.8ms)  SELECT `permissions`.* FROM `permissions` WHERE (`permissions`.user_id = 1)
 Role Load (0.8ms)  SELECT * FROM `roles` WHERE (`roles`.`id` IN (2,1))

Not quite what I want. How can I make this run as a single SQL query with joins, loading the user's roles into memory, so that:

 user.roles 

does not issue a new SQL query?

3 answers

Loading the roles in a separate SQL query is actually an optimization called Optimized Eager Loading.

 Role Load (0.8ms) SELECT * FROM `roles` WHERE (`roles`.`id` IN (2,1)) 

(This is done instead of loading each role with its own query, which would be the N+1 query problem.)
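To see why the batched IN query matters, here is a minimal pure-Ruby simulation (no ActiveRecord involved; the `FakeDB` class and the data are hypothetical) that counts the queries issued by the naive one-at-a-time strategy versus the batched strategy Rails uses:

```ruby
# Simulated "database" that counts every query it receives.
class FakeDB
  attr_reader :query_count

  def initialize(roles_by_user_id)
    @roles_by_user_id = roles_by_user_id
    @query_count = 0
  end

  # One query per call, like loading each association separately.
  def roles_for(user_id)
    @query_count += 1
    @roles_by_user_id.fetch(user_id, [])
  end

  # One query for many ids, like WHERE user_id IN (...).
  def roles_for_many(user_ids)
    @query_count += 1
    user_ids.flat_map { |id| @roles_by_user_id.fetch(id, []) }
  end
end

data = { 1 => ["admin"], 2 => ["editor"], 3 => [] }
user_ids = data.keys

# Naive strategy: 1 query for the users plus N queries for roles.
naive = FakeDB.new(data)
naive_queries = 1 # the initial "SELECT * FROM users"
user_ids.each { |id| naive.roles_for(id); naive_queries += 1 }

# Batched strategy: 1 query for the users plus 1 IN query for roles.
batched = FakeDB.new(data)
batched_queries = 1
batched.roles_for_many(user_ids)
batched_queries += 1

puts "naive: #{naive_queries} queries, batched: #{batched_queries} queries"
# With 3 users: naive issues 4 queries, batched issues 2.
```

The gap grows linearly with the number of parent records, while the batched strategy stays at two queries per association.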

The Rails team found that it is generally faster to issue an IN query against the already-loaded ids than to perform one large join.

A join will only occur in this query if you add conditions that reference one of the other tables. Rails detects this and performs the join instead.

For instance:

 User.all(:include => :roles, :conditions => "roles.name = 'Admin'") 

See the original ticket, this previous Stack Overflow question, and Fabio Akita's post about optimized eager loading.


As Damien remarked, if you really want a single query every time, you have to use a join.

But you may not need a single SQL call. Here's why (from here):

Optimized Eager Loading


Let's look at this:

 Post.find(:all, :include => [:comments]) 

Prior to Rails 2.0, we would see something like the following SQL query in the log:

 SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, `posts`.`body` AS t0_r2,
        `comments`.`id` AS t1_r0, `comments`.`body` AS t1_r1
 FROM `posts`
 LEFT OUTER JOIN `comments` ON comments.post_id = posts.id

But now, in Rails 2.1, the same command produces different SQL: in fact, at least 2 queries instead of 1. "And how could this be an improvement?" Let's look at the generated SQL queries:

 SELECT `posts`.`id`, `posts`.`title`, `posts`.`body` FROM `posts`

 SELECT `comments`.`id`, `comments`.`body` FROM `comments`
 WHERE (`comments`.post_id IN (130049073,226779025,269986261,921194568,972244995))

The :include keyword for eager loading was implemented to solve the 1+N problem. This problem occurs when you have associations: you load the parent object, then load its associated objects one at a time, hence 1+N. If your parent object has 100 children, you will run 101 queries, which is not good. One way to optimize this is to combine everything with an OUTER JOIN clause in SQL, so that both the parent and child objects are loaded at once in a single query.

It seemed like a good idea indeed. But in some situations, the monster outer join becomes slower than several smaller queries. There has been a lot of discussion; you can look at the details in tickets 9640, 9497, 9560, L109.

Bottom line: as a rule, it's better to split the monster query into smaller ones, as you saw in the example above. This avoids the row-duplication problem. For the uninitiated, let's run the outer join version of the query:

 mysql> SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, `posts`.`body` AS t0_r2,
     ->        `comments`.`id` AS t1_r0, `comments`.`body` AS t1_r1
     -> FROM `posts` LEFT OUTER JOIN `comments` ON comments.post_id = posts.id;
 +-----------+-----------------+--------+-----------+---------+
 | t0_r0     | t0_r1           | t0_r2  | t1_r0     | t1_r1   |
 +-----------+-----------------+--------+-----------+---------+
 | 130049073 | Hello RailsConf | MyText | NULL      | NULL    |
 | 226779025 | Hello Brazil    | MyText | 816076421 | MyText5 |
 | 269986261 | Hello World     | MyText | 61594165  | MyText3 |
 | 269986261 | Hello World     | MyText | 734198955 | MyText1 |
 | 269986261 | Hello World     | MyText | 765025994 | MyText4 |
 | 269986261 | Hello World     | MyText | 777406191 | MyText2 |
 | 921194568 | Rails 2.1       | NULL   | NULL      | NULL    |
 | 972244995 | AkitaOnRails    | NULL   | NULL      | NULL    |
 +-----------+-----------------+--------+-----------+---------+
 8 rows in set (0.00 sec)

Pay attention to this: do you see the duplicates in the first three columns (t0_r0 to t0_r2)? Those are the Post model's columns; the rest are the Comment columns. Note that the "Hello World" post was repeated 4 times. That is what a join does: parent rows are repeated for each child. This particular post has 4 comments, so it appears 4 times.
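The duplication is easy to quantify. Here is a rough sketch in plain Ruby (the data is hypothetical, mirroring the result set above) of how many rows each strategy returns:

```ruby
# Comment counts per post, mirroring the join output above.
comments_per_post = {
  "Hello RailsConf" => 0,
  "Hello Brazil"    => 1,
  "Hello World"     => 4,
  "Rails 2.1"       => 0,
  "AkitaOnRails"    => 0,
}

# A LEFT OUTER JOIN returns one row per comment, but at least one
# row per post (posts with no comments appear once, with NULLs).
join_rows = comments_per_post.values.sum { |n| [n, 1].max }

# The split strategy: one row per post, plus one row per comment.
split_rows = comments_per_post.size + comments_per_post.values.sum

puts "join result: #{join_rows} rows (post columns repeated per comment)"
puts "split queries: #{split_rows} rows total, no duplication"
# join_rows is 8, matching "8 rows in set" above; split is 5 + 5 = 10 rows,
# but each row is narrower and no post data is repeated.
```

The join returns fewer rows here only because the dataset is tiny; each joined row carries every parent column again, and the duplication factor grows with the number of children per parent.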

The problem is that this hits Rails hard, because it has to build many small, short-lived objects. The pain is felt on the Rails side, not on the MySQL side. Now compare this with the smaller queries:

 mysql> SELECT `posts`.`id`, `posts`.`title`, `posts`.`body` FROM `posts`;
 +-----------+-----------------+--------+
 | id        | title           | body   |
 +-----------+-----------------+--------+
 | 130049073 | Hello RailsConf | MyText |
 | 226779025 | Hello Brazil    | MyText |
 | 269986261 | Hello World     | MyText |
 | 921194568 | Rails 2.1       | NULL   |
 | 972244995 | AkitaOnRails    | NULL   |
 +-----------+-----------------+--------+
 5 rows in set (0.00 sec)

 mysql> SELECT `comments`.`id`, `comments`.`body` FROM `comments`
     -> WHERE (`comments`.post_id IN (130049073,226779025,269986261,921194568,972244995));
 +-----------+---------+
 | id        | body    |
 +-----------+---------+
 | 61594165  | MyText3 |
 | 734198955 | MyText1 |
 | 765025994 | MyText4 |
 | 777406191 | MyText2 |
 | 816076421 | MyText5 |
 +-----------+---------+
 5 rows in set (0.00 sec)

Actually, I'm cheating a little: I manually removed the created_at and updated_at fields from all the queries above so you can see the point more clearly. So there you have it: the post result set, split out and with no duplicates, and the comment result set with the same size as before. The longer and more complex the result set, the more this matters, because of the larger number of Rails objects that have to be dealt with. Allocating and releasing hundreds or thousands of small duplicated objects is never good.

But this new feature is smart. Let's say you need something like this:

 >> Post.find(:all, :include => [:comments], :conditions => ["comments.created_at > ?", 1.week.ago.to_s(:db)]) 

In Rails 2.1, it will understand that there is a filter condition on the comments table, so it will not split this into smaller queries; instead, it will generate the old outer join version, like:

 SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, `posts`.`body` AS t0_r2,
        `posts`.`created_at` AS t0_r3, `posts`.`updated_at` AS t0_r4,
        `comments`.`id` AS t1_r0, `comments`.`post_id` AS t1_r1, `comments`.`body` AS t1_r2,
        `comments`.`created_at` AS t1_r3, `comments`.`updated_at` AS t1_r4
 FROM `posts`
 LEFT OUTER JOIN `comments` ON comments.post_id = posts.id
 WHERE (comments.created_at > '2008-05-18 18:06:34')

Thus, nested joins, conditions, and so on against joined tables should still work fine. In general, this should speed up your queries. Some have reported that, because of the larger number of individual queries, MySQL seems to take a harder CPU hit. Do your homework and run your own stress tests and benchmarks to find out what works for you.
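The decision Rails makes here can be sketched roughly like this (a simplification with a hypothetical helper, not actual Rails internals): if the conditions string mentions any of the included association tables, fall back to the single joined query; otherwise split into per-table queries.

```ruby
# Hypothetical sketch of the join-vs-split decision, not real Rails code.
# Returns true when the conditions reference one of the included tables,
# i.e. when a single joined query is required.
def references_included_tables?(conditions, included_tables)
  return false if conditions.nil?
  included_tables.any? { |table| conditions =~ /\b#{Regexp.escape(table)}\./ }
end

# A condition on the comments table forces the single joined query:
puts references_included_tables?("comments.created_at > '2008-05-18'", ["comments"])
# => true

# No reference to an included table: Rails can split into smaller queries.
puts references_included_tables?("posts.title = 'Hello'", ["comments"])
# => false
```

The real detection is more involved (it also considers :order and hash conditions), but the principle is the same: conditions that can only be evaluated against the joined table rule out the split strategy.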


Using :include loads the data, but issues a second query.
What you want is the :joins option:

 user = User.find_by_id(x, :joins => :roles) 
