MySQL GROUP BY does not respect ORDER BY

I am trying to write a MySQL query to find the most recently active threads (and the most recent comment for each thread) in a web forum. The topics are stored in two tables, forum_topics and forum_responses, where each forum_topic has many forum_responses.

Here I query forum_responses joined to forum_topics, sorted in descending order by forum_responses.id:

    select t.id, t.title, r.id, r.body
    from forum_responses r
    inner join forum_topics t on (r.forum_topic_id = t.id)
    order by r.id desc;

    +----+--------------+----+----------------------------------+
    | id | title        | id | body                             |
    +----+--------------+----+----------------------------------+
    | 17 | New Topic    | 69 | yes                              |
    | 19 | Test Topic 1 | 68 | This is a test                   |
    | 17 | New Topic    | 64 | hey yo                           |
    | 19 | Test Topic 1 | 63 | Test Topic Starter               |
    | 18 | Test Topic   | 62 | Test.                            |
    | 18 | Test Topic   | 61 | Test                             |
    | 17 | New Topic    | 60 | Another test response.           |
    | 17 | New Topic    | 59 | Test response.                   |
    | 17 | New Topic    | 54 | What should this topic be about? |
    +----+--------------+----+----------------------------------+

OK, so far so good. But it returns each topic several times, and I only want the most recent response for each forum topic. So I add GROUP BY to my query to group by topic id:

    select t.id, t.title, r.id, r.body
    from forum_responses r
    inner join forum_topics t on (r.forum_topic_id = t.id)
    group by t.id
    order by r.id desc;

    +----+--------------+----+----------------------------------+
    | id | title        | id | body                             |
    +----+--------------+----+----------------------------------+
    | 19 | Test Topic 1 | 63 | Test Topic Starter               |
    | 18 | Test Topic   | 61 | Test                             |
    | 17 | New Topic    | 54 | What should this topic be about? |
    +----+--------------+----+----------------------------------+

But now we have a problem: the rows are grouped by forum topic id, but the topics are not sorted by latest activity, and the response shown for each topic is not its latest one.

What's going on here? Is there a way to modify this query to get a list of the most recent forum topics, as well as their latest comments?

2 answers

Diagnosis

The problem with your second query is that you GROUP BY t.id, but the other selected columns (t.title, r.id and r.body) are neither aggregated nor listed in the GROUP BY clause. This is an error in standard SQL, but MySQL will not complain unless the ONLY_FULL_GROUP_BY SQL mode is enabled. Instead, MySQL takes the values for those columns from an arbitrary row in each group, so the results are non-deterministic. The ORDER BY is applied only after grouping, so it cannot influence which response is picked for each topic.

Proposed solution

    select t.id, t.title, r.id, r.body
    from (
        select forum_topic_id, max(id) as id
        from forum_responses
        group by forum_topic_id
    ) as latest
    inner join forum_topics t on latest.forum_topic_id = t.id
    inner join forum_responses r on latest.id = r.id
    order by r.id desc;
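To check the query above against the sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for MySQL only so the example is self-contained; the column types in the `create table` statements and the helper name `latest_responses` are assumptions, since the question does not show the schema.

```python
import sqlite3

# Sample rows copied from the question's first result set.
TOPICS = [(17, "New Topic"), (18, "Test Topic"), (19, "Test Topic 1")]
RESPONSES = [
    (54, 17, "What should this topic be about?"),
    (59, 17, "Test response."),
    (60, 17, "Another test response."),
    (61, 18, "Test"),
    (62, 18, "Test."),
    (63, 19, "Test Topic Starter"),
    (64, 17, "hey yo"),
    (68, 19, "This is a test"),
    (69, 17, "yes"),
]

# The proposed query: find max(id) per topic first, then join back.
LATEST_PER_TOPIC = """
select t.id, t.title, r.id, r.body
from (
    select forum_topic_id, max(id) as id
    from forum_responses
    group by forum_topic_id
) as latest
inner join forum_topics t on latest.forum_topic_id = t.id
inner join forum_responses r on latest.id = r.id
order by r.id desc
"""


def latest_responses():
    """Return one row per topic: (topic id, title, latest response id, body)."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema; only the column names come from the question.
    conn.execute("create table forum_topics (id integer primary key, title text)")
    conn.execute(
        "create table forum_responses ("
        "id integer primary key, forum_topic_id integer, body text)"
    )
    conn.executemany("insert into forum_topics values (?, ?)", TOPICS)
    conn.executemany("insert into forum_responses values (?, ?, ?)", RESPONSES)
    rows = conn.execute(LATEST_PER_TOPIC).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_responses():
        print(row)
```

Each topic appears once, with response 69 for topic 17, 68 for topic 19 and 62 for topic 18, ordered by the latest response id.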

Beyond MySQL

With a database that supports window functions, also called analytic functions, you can write this differently and avoid one of the joins. On PostgreSQL, MS SQL Server or Oracle (and on MySQL 8.0 or later), the query will look something like this:

    select t.id, t.title, r.id, r.body
    from (
        select forum_topic_id, id, body,
               rank() over (partition by forum_topic_id order by id desc) as rnk
        from forum_responses
    ) r
    inner join forum_topics t on t.id = r.forum_topic_id
    where r.rnk = 1
    order by r.id desc;
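As a rough check of the window-function approach, the sketch below runs a ranked subquery in an in-memory SQLite database from Python. It assumes the bundled SQLite is version 3.25 or later (the first release with window functions); the schema types and the helper name `latest_responses_windowed` are also assumptions. Note that the rank has to be computed in the subquery and filtered outside it, because window functions are not allowed in a WHERE clause.

```python
import sqlite3

# Sample rows copied from the question's first result set.
TOPICS = [(17, "New Topic"), (18, "Test Topic"), (19, "Test Topic 1")]
RESPONSES = [
    (54, 17, "What should this topic be about?"),
    (59, 17, "Test response."),
    (60, 17, "Another test response."),
    (61, 18, "Test"),
    (62, 18, "Test."),
    (63, 19, "Test Topic Starter"),
    (64, 17, "hey yo"),
    (68, 19, "This is a test"),
    (69, 17, "yes"),
]

# Rank responses within each topic, newest first, then keep rank 1.
WINDOWED = """
select t.id, t.title, r.id, r.body
from (
    select forum_topic_id, id, body,
           rank() over (partition by forum_topic_id order by id desc) as rnk
    from forum_responses
) r
inner join forum_topics t on t.id = r.forum_topic_id
where r.rnk = 1
order by r.id desc
"""


def latest_responses_windowed():
    """Return the newest response per topic using rank() over a partition."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema; only the column names come from the question.
    conn.execute("create table forum_topics (id integer primary key, title text)")
    conn.execute(
        "create table forum_responses ("
        "id integer primary key, forum_topic_id integer, body text)"
    )
    conn.executemany("insert into forum_topics values (?, ?)", TOPICS)
    conn.executemany("insert into forum_responses values (?, ?, ?)", RESPONSES)
    rows = conn.execute(WINDOWED).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_responses_windowed():
        print(row)
```

Because response ids are unique, rank() and row_number() give the same result here; with a non-unique ordering column, row_number() would be the safer choice for exactly one row per topic.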

This is what I suggested in my original comment; you need to do the GROUP BY separately, like this:

    SELECT t.id, t.title, r2.id, r2.body
    FROM (
        SELECT forum_topic_id, MAX(id) AS lastResponseID
        FROM forum_responses
        GROUP BY forum_topic_id
    ) AS r
    INNER JOIN forum_responses AS r2
        ON r.forum_topic_id = r2.forum_topic_id AND r.lastResponseID = r2.id
    INNER JOIN forum_topics AS t ON r.forum_topic_id = t.id
    -- newest activity first, as the question asks
    ORDER BY r2.id DESC;

It may be faster (because it makes better use of indexes) to include forum_topics, along with its title field, in the subquery, but that depends heavily on the distribution of the data: joining millions of indexed rows (every response to its topic) can be slower than joining a relatively small number of non-indexed rows (only the most recent response per topic).
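One possible reading of "including forum_topics in the subquery" is sketched below: the topic join and the title move inside the grouped subquery, so only forum_responses is joined afterwards. This is an interpretation rather than a query from the answer, and it says nothing about which variant is faster on real data. It runs against an in-memory SQLite database from Python; the schema types and the helper name `latest_responses_topics_in_subquery` are assumptions.

```python
import sqlite3

# Sample rows copied from the question's first result set.
TOPICS = [(17, "New Topic"), (18, "Test Topic"), (19, "Test Topic 1")]
RESPONSES = [
    (54, 17, "What should this topic be about?"),
    (59, 17, "Test response."),
    (60, 17, "Another test response."),
    (61, 18, "Test"),
    (62, 18, "Test."),
    (63, 19, "Test Topic Starter"),
    (64, 17, "hey yo"),
    (68, 19, "This is a test"),
    (69, 17, "yes"),
]

# Variant: join topics to responses inside the grouped subquery,
# then join back to forum_responses once to fetch the body.
TOPICS_IN_SUBQUERY = """
SELECT latest.topic_id, latest.title, r2.id, r2.body
FROM (
    SELECT t.id AS topic_id, t.title AS title, MAX(r.id) AS lastResponseID
    FROM forum_topics AS t
    INNER JOIN forum_responses AS r ON r.forum_topic_id = t.id
    GROUP BY t.id, t.title
) AS latest
INNER JOIN forum_responses AS r2 ON r2.id = latest.lastResponseID
ORDER BY r2.id DESC
"""


def latest_responses_topics_in_subquery():
    """Return the newest response per topic, grouping topics in the subquery."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema; only the column names come from the question.
    conn.execute("create table forum_topics (id integer primary key, title text)")
    conn.execute(
        "create table forum_responses ("
        "id integer primary key, forum_topic_id integer, body text)"
    )
    conn.executemany("insert into forum_topics values (?, ?)", TOPICS)
    conn.executemany("insert into forum_responses values (?, ?, ?)", RESPONSES)
    rows = conn.execute(TOPICS_IN_SUBQUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_responses_topics_in_subquery():
        print(row)
```

Both variants return the same rows; comparing their query plans (for example with EXPLAIN in MySQL) on your own data is the only reliable way to decide between them.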
