Are inner queries okay?

I often see something like ...

SELECT events.id, events.begin_on, events.name FROM events WHERE events.user_id IN ( SELECT contacts.user_id FROM contacts WHERE contacts.contact_id = '1') OR events.user_id IN ( SELECT contacts.contact_id FROM contacts WHERE contacts.user_id = '1') 

Is it OK to have a query within a query? Is this called an "inner query"? A "subquery"? Does it count as three queries (in my example)? If this is bad to do, how can I rewrite my example?

4 answers

Your example is not so bad. The biggest problems usually arise from so-called "correlated subqueries", where the subquery depends on a column from the outer query. These are especially bad because the subquery effectively has to be re-run for each row in the potential results.

You can rewrite your subqueries as joins with GROUP BY, but whether that beats what you have now can vary, especially depending on your RDBMS.
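To make the "correlated subquery" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and sample rows are assumptions for illustration only (the question does not show its schema), and SQLite stands in for whatever RDBMS you actually use. It counts contacts per event owner, first with a correlated subquery and then with a join plus GROUP BY.

```python
import sqlite3

# Hypothetical schema and data, loosely modelled on the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events   (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
    CREATE TABLE contacts (user_id INTEGER, contact_id INTEGER);
    INSERT INTO events   VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 3, 'c');
    INSERT INTO contacts VALUES (1, 2), (1, 3), (2, 1);
""")

# Correlated: the inner query refers to e.user_id from the outer query,
# so conceptually it is evaluated once per row of events.
correlated = conn.execute("""
    SELECT e.id,
           (SELECT COUNT(*) FROM contacts c WHERE c.user_id = e.user_id) AS n
    FROM events e
    ORDER BY e.id
""").fetchall()

# Same result expressed as a join plus GROUP BY.
# LEFT JOIN keeps events whose owner has no contacts (count 0).
joined = conn.execute("""
    SELECT e.id, COUNT(c.user_id) AS n
    FROM events e
    LEFT JOIN contacts c ON c.user_id = e.user_id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(correlated)  # [(1, 2), (2, 1), (3, 0)]
print(joined)      # [(1, 2), (2, 1), (3, 0)]
```

Both forms return the same rows here. Which one is faster depends on the optimizer and the indexes, so it is worth checking the query plan on your own database.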


It varies from database to database, especially depending on whether the compared columns are

  • indexed or not
  • nullable or not

... but generally, if your query does not use any columns from the table you would join to, you should use either IN or EXISTS:

 SELECT e.id, e.begin_on, e.name FROM EVENTS e WHERE EXISTS (SELECT NULL FROM CONTACTS c WHERE ( c.contact_id = '1' AND c.user_id = e.user_id ) OR ( c.user_id = '1' AND c.contact_id = e.user_id )) 

Using a JOIN (INNER or OUTER) can inflate the number of rows returned if the child table has more than one record per parent table record. That is fine if you need the extra rows, but if not, you need GROUP BY or DISTINCT to get a set of unique results, and that can add to the cost of the query.
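A small sketch of that row inflation, run with Python's sqlite3 module. The schema and rows are made up for illustration: user 2 is linked to user 1 in both directions, so the event owned by user 2 matches two contacts rows.

```python
import sqlite3

# Hypothetical schema and data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events   (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
    CREATE TABLE contacts (user_id INTEGER, contact_id INTEGER);
    INSERT INTO events   VALUES (1, 2, 'a'), (2, 3, 'b'), (3, 4, 'c');
    INSERT INTO contacts VALUES (2, 1), (1, 2), (1, 3);
""")

join_sql = """
    SELECT {select} e.id
    FROM events e
    JOIN contacts c
      ON (c.contact_id = 1 AND c.user_id = e.user_id)
      OR (c.user_id = 1 AND c.contact_id = e.user_id)
    ORDER BY e.id
"""

# Plain JOIN: one output row per matching (event, contact) pair.
plain = conn.execute(join_sql.format(select="")).fetchall()

# DISTINCT collapses the duplicates, at the price of an extra dedup step.
unique = conn.execute(join_sql.format(select="DISTINCT")).fetchall()

print(plain)   # [(1,), (1,), (2,)]
print(unique)  # [(1,), (2,)]
```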

EXISTS

Although EXISTS clauses look like correlated subqueries, they are not executed as such (RBAR: Row By Agonizing Row). EXISTS returns a boolean based on the supplied criteria and exits on the first match that is true, which can make it faster than IN when the child table contains duplicates.
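As a quick check that the EXISTS form gives the same answer as the IN form from the question, here is a sketch with Python's sqlite3 module. The schema and data are assumptions for illustration, and the literal 1 is written as an integer rather than '1'. Note that the event owned by user 2 matches two contacts rows but is still returned once.

```python
import sqlite3

# Hypothetical schema and data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events   (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
    CREATE TABLE contacts (user_id INTEGER, contact_id INTEGER);
    INSERT INTO events   VALUES (1, 2, 'a'), (2, 3, 'b'), (3, 4, 'c');
    INSERT INTO contacts VALUES (2, 1), (1, 2), (1, 3);
""")

# The question's query: two uncorrelated IN subqueries.
with_in = conn.execute("""
    SELECT e.id FROM events e
    WHERE e.user_id IN (SELECT c.user_id FROM contacts c WHERE c.contact_id = 1)
       OR e.user_id IN (SELECT c.contact_id FROM contacts c WHERE c.user_id = 1)
    ORDER BY e.id
""").fetchall()

# The EXISTS rewrite: a single subquery that only has to find one match per event.
with_exists = conn.execute("""
    SELECT e.id FROM events e
    WHERE EXISTS (SELECT NULL FROM contacts c
                  WHERE (c.contact_id = 1 AND c.user_id = e.user_id)
                     OR (c.user_id = 1 AND c.contact_id = e.user_id))
    ORDER BY e.id
""").fetchall()

print(with_in)      # [(1,), (2,)]
print(with_exists)  # [(1,), (2,)]
```

Neither form needs DISTINCT or GROUP BY, because each event row is only tested for a match, never multiplied by one.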


Instead, JOIN against the contacts table:

 SELECT events.id, events.begin_on, events.name FROM events JOIN contacts ON (contacts.contact_id = '1' AND contacts.user_id = events.user_id) OR (contacts.user_id = '1' AND contacts.contact_id = events.user_id) GROUP BY events.id -- exercise: without the GROUP BY, how many duplicate rows can you end up with? 

This leaves the decision to the database: "Should we scan the entire contacts table for every '1' in either column, or do something else?", whereas your original SQL didn't give it much choice.


The most common term for this type of query is "subquery". There is nothing wrong with using them, and they can make your life easier. However, performance can often be improved by rewriting queries that use subqueries to use a JOIN instead, because the server can then find optimizations.

In your example, three queries are executed: the main SELECT query and two SELECT subqueries.

 SELECT events.id, events.begin_on, events.name FROM events JOIN contacts ON (contacts.contact_id = '1' AND contacts.user_id = events.user_id) OR (contacts.user_id = '1' AND contacts.contact_id = events.user_id) GROUP BY events.id 

In your case, I believe the JOIN version will be better, since a single JOIN lets you avoid the two SELECT queries on contacts.

See the MySQL docs on the topic.
