Get the maximum count from each group

I'm having trouble getting the output I need from a group function in SQL. The table details are below.

I have one table, named "checks", with two columns: pid and cid.

    Name                                      Null?    Type
    ----------------------------------------- -------- ----------------------------
    PID                                                VARCHAR2(20)
    CID                                                VARCHAR2(20)

Below are the rows in the table.

    select * from checks;

    PID                  CID
    -------------------- --------------------
    p1                   c1
    p1                   c1
    p1                   c2
    p1                   c2
    p1                   c2
    p2                   c1
    p2                   c1
    p2                   c1
    p2                   c1
    p2                   c1
    p2                   c1
    p2                   c2
    p2                   c2
    p2                   c2
    p2                   c2
    p2                   c2

Here p stands for a participant and c stands for a category.

Question

I need to know which participants take part in more than one category and, for each such participant, the category in which they participate the most.

Expected Result:

    pid cid count(cid)
    --- --- ----------
    p1  c2  3
    p2  c1  6
+7
5 answers

Assuming a database system (you did not specify one, but I suspect Oracle?) that supports window functions and CTEs, I would write:

    ;With Groups as (
        select pid, cid, COUNT(*) as cnt
        from checks
        group by pid, cid
    ), Ordered as (
        select pid, cid, cnt,
               ROW_NUMBER() OVER (PARTITION BY pid ORDER BY cnt desc) as rn,
               COUNT(*) OVER (PARTITION BY pid) as multi
        from Groups
    )
    select pid, cid, cnt
    from Ordered
    where rn = 1 and multi > 1

The first CTE (Groups) simply finds the count for each unique combination of pid and cid. The second CTE (Ordered) assigns row numbers to these results based on the count, with the highest count getting row number 1. We also count how many rows were produced in total for each pid.

Finally, we select the rows that were assigned row number 1 (the highest count) and for which we got more than one result for the same pid.
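The two-CTE approach described above can be tried locally with a minimal sketch like the one below. It is not the answer's original fiddle: it assumes Python's standard sqlite3 module with an SQLite build of 3.25 or newer (needed for window functions) as a stand-in for Oracle, and the names ROWS, QUERY, top_categories, grouped and ordered are made up for illustration.

```python
import sqlite3

# Sample rows from the question, as (pid, cid) pairs.
ROWS = (
    [("p1", "c1")] * 2 + [("p1", "c2")] * 3
    + [("p2", "c1")] * 6 + [("p2", "c2")] * 5
)

# Same logic as the answer's query; CTE names are illustrative.
QUERY = """
WITH grouped AS (
    SELECT pid, cid, COUNT(*) AS cnt
    FROM checks
    GROUP BY pid, cid
),
ordered AS (
    SELECT pid, cid, cnt,
           ROW_NUMBER() OVER (PARTITION BY pid ORDER BY cnt DESC) AS rn,
           COUNT(*) OVER (PARTITION BY pid) AS multi
    FROM grouped
)
SELECT pid, cid, cnt
FROM ordered
WHERE rn = 1 AND multi > 1
ORDER BY pid
"""


def top_categories(rows):
    """Load rows into an in-memory table and return each participant's top category."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE checks (pid TEXT, cid TEXT)")
    conn.executemany("INSERT INTO checks VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_categories(ROWS):
        print(row)
```

A participant with only one category has multi = 1 and is therefore filtered out, which matches the "more than one category" requirement in the question.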

Here is an Oracle fiddle to play with. And here is the SQL Server version (thanks to Andriy M for producing the Oracle one).

+4

This will give you the basic idea:

[image not available]

The results are shown below. Note that since p1 participated in more than one category, p1 appears once per category, each on a separate row, when we use 'group by PID, CID'.

[image not available]

+1

Step by step:

First, get the number of rows per (PID, CID). That's simple:

    SELECT PID, CID, COUNT(*) AS cnt
    FROM checks
    GROUP BY PID, CID

And you get this result set for your example:

    PID CID cnt
    --- --- ---
    p1  c1  2
    p1  c2  3
    p2  c1  6
    p2  c2  5

Now add COUNT(*) OVER (PARTITION BY PID) to return the number of categories per person:

    SELECT PID, CID,
           COUNT(*) AS cnt,
           COUNT(*) OVER (PARTITION BY PID) AS cat_cnt
    FROM checks
    GROUP BY PID, CID

The OVER clause turns the "normal" aggregate function COUNT() into a window aggregate function. That makes COUNT(*) operate on the grouped row set rather than the original one. So COUNT(*) OVER ... in this case counts the rows per PID, which for us means the number of categories per person. And this is the updated result set:

    PID CID cnt cat_cnt
    --- --- --- -------
    p1  c1  2   2
    p1  c2  3   2
    p2  c1  6   2
    p2  c2  5   2
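To see that the windowed COUNT(*) really counts grouped rows rather than original rows, here is a small sketch. It assumes Python's sqlite3 module with SQLite 3.25 or newer standing in for Oracle, and it adds a hypothetical participant p3 (not in the question's data) who has four rows in a single category. The names ROWS, QUERY and counts_with_categories are illustrative.

```python
import sqlite3

# The question's sample rows plus a hypothetical p3 with one category, four rows.
ROWS = (
    [("p1", "c1")] * 2 + [("p1", "c2")] * 3
    + [("p2", "c1")] * 6 + [("p2", "c2")] * 5
    + [("p3", "c1")] * 4
)

QUERY = """
SELECT pid, cid,
       COUNT(*) AS cnt,                              -- rows per (pid, cid)
       COUNT(*) OVER (PARTITION BY pid) AS cat_cnt   -- grouped rows per pid
FROM checks
GROUP BY pid, cid
ORDER BY pid, cid
"""


def counts_with_categories(rows):
    """Return (pid, cid, cnt, cat_cnt) tuples for the given (pid, cid) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE checks (pid TEXT, cid TEXT)")
    conn.executemany("INSERT INTO checks VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in counts_with_categories(ROWS):
        print(row)
```

For p3 the plain aggregate gives cnt = 4 while the window count gives cat_cnt = 1, because after grouping there is only one row left for that participant.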

One more thing to do: rank the cnt values per PID. This can be tricky, as there may be ties for the top count. If you always want a single row per PID and are completely indifferent to which CID, cnt pair is returned in case of a tie, you can modify the query like this:

    SELECT PID, CID,
           COUNT(*) AS cnt,
           COUNT(*) OVER (PARTITION BY PID) AS cat_cnt,
           ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn
    FROM checks
    GROUP BY PID, CID

And it will look like this:

    PID CID cnt cat_cnt rn
    --- --- --- ------- --
    p1  c1  2   2       2
    p1  c2  3   2       1
    p2  c1  6   2       1
    p2  c2  5   2       2

At this point, the results contain all the data needed to produce the final output; you just need to filter on cat_cnt and rn. However, you cannot do that directly, because window functions are not allowed in the WHERE clause of the same query. Instead, use the last query as a derived table, be it a WITH table expression or a "normal" subquery. The following is an example using WITH:

    WITH grouped AS (
        SELECT PID, CID,
               COUNT(*) AS cnt,
               COUNT(*) OVER (PARTITION BY PID) AS cat_cnt,
               ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn
        FROM checks
        GROUP BY PID, CID
    )
    SELECT PID, CID, cnt
    FROM grouped
    WHERE cat_cnt > 1
      AND rn = 1
    ;
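The reason for wrapping the query can be demonstrated with a short sketch: filtering on a window function in WHERE is rejected, while filtering on its alias from an outer query works. This assumes Python's sqlite3 module with SQLite 3.25 or newer as a stand-in for Oracle; the exact error text differs between databases, so the sketch only reports whether an error occurred. DIRECT, WRAPPED and run are illustrative names.

```python
import sqlite3

# Sample rows from the question, as (pid, cid) pairs.
ROWS = (
    [("p1", "c1")] * 2 + [("p1", "c2")] * 3
    + [("p2", "c1")] * 6 + [("p2", "c2")] * 5
)

# Attempt to filter on the window function directly: expected to be rejected.
DIRECT = """
WITH grouped AS (
    SELECT pid, cid, COUNT(*) AS cnt FROM checks GROUP BY pid, cid
)
SELECT pid, cid, cnt
FROM grouped
WHERE ROW_NUMBER() OVER (PARTITION BY pid ORDER BY cnt DESC) = 1
"""

# Compute the window function first, then filter on its alias one level up.
WRAPPED = """
WITH grouped AS (
    SELECT pid, cid, COUNT(*) AS cnt FROM checks GROUP BY pid, cid
),
ranked AS (
    SELECT pid, cid, cnt,
           ROW_NUMBER() OVER (PARTITION BY pid ORDER BY cnt DESC) AS rn
    FROM grouped
)
SELECT pid, cid, cnt FROM ranked WHERE rn = 1 ORDER BY pid
"""


def run(sql):
    """Return (rows, None) on success or (None, error message) on failure."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE checks (pid TEXT, cid TEXT)")
    conn.executemany("INSERT INTO checks VALUES (?, ?)", ROWS)
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    print("direct: ", run(DIRECT))
    print("wrapped:", run(WRAPPED))
```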

Here's the SQL Fiddle demo (using Oracle): http://sqlfiddle.com/#!4/cd62d/8

To expand a bit more on the ranking part: if you still want to return a single CID, cnt pair per PID but would prefer more control over which row is picked as the "winner", you will need to add a tie-breaker to the ORDER BY clause of the ranking function. For example, you can replace the original expression,

 ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn 

with this:

    ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC, CID) AS rn

i.e. with CID as the tie-breaker, which means that of two or more CIDs sharing the top count, the one that sorts before the others wins.

However, you may want to return all the top counts per PID. In that case, use RANK() or DENSE_RANK() instead of ROW_NUMBER() (and without a tie-breaker), for example:

 RANK() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn 
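The difference between the two ranking choices shows up only when counts tie, so the sketch below uses made-up data in which c1 and c2 share the top count for p1. It assumes Python's sqlite3 module with SQLite 3.25 or newer in place of Oracle, and it groups in a CTE first so that the window function orders by a plain cnt column. TIED_ROWS, TEMPLATE and top_rows are illustrative names.

```python
import sqlite3

# Hypothetical data: c1 and c2 tie for the top count of participant p1.
TIED_ROWS = [("p1", "c1")] * 2 + [("p1", "c2")] * 2 + [("p1", "c3")]

# {ranking} is filled with one of the fixed expressions below, never user input.
TEMPLATE = """
WITH grouped AS (
    SELECT pid, cid, COUNT(*) AS cnt FROM checks GROUP BY pid, cid
),
ranked AS (
    SELECT pid, cid, cnt, {ranking} AS rn FROM grouped
)
SELECT pid, cid, cnt FROM ranked WHERE rn = 1 ORDER BY pid, cid
"""

ROW_NUMBER_WITH_TIEBREAK = "ROW_NUMBER() OVER (PARTITION BY pid ORDER BY cnt DESC, cid)"
RANK_NO_TIEBREAK = "RANK() OVER (PARTITION BY pid ORDER BY cnt DESC)"


def top_rows(ranking):
    """Return the rows ranked first per pid under the given ranking expression."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE checks (pid TEXT, cid TEXT)")
    conn.executemany("INSERT INTO checks VALUES (?, ?)", TIED_ROWS)
    result = conn.execute(TEMPLATE.format(ranking=ranking)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print("ROW_NUMBER + tie-breaker:", top_rows(ROW_NUMBER_WITH_TIEBREAK))
    print("RANK, no tie-breaker:    ", top_rows(RANK_NO_TIEBREAK))
```

With the tie-breaker, ROW_NUMBER() keeps only c1 because it sorts first; RANK() gives both tied categories rank 1, so both are returned.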
+1

    select pid, cid, count
    from (
        select pid, cid, count(*) as count
        from checks
        group by pid, cid
        order by count DESC
    ) as temp
    group by pid;

The same thing works in MySQL.

-1

Here is the MySQL solution:

    SELECT tbl1.pid, tbl1.cid, tbl1.pairCount
    FROM (
        SELECT checks.pid, checks.cid, COUNT(*) AS pairCount
        FROM checks
        GROUP BY checks.pid, checks.cid
    ) AS tbl1
    INNER JOIN (
        SELECT checks.pid, checks.cid, COUNT(*) AS pairCount
        FROM checks
        GROUP BY checks.pid, checks.cid
    ) AS tbl2
        ON tbl1.pid = tbl2.pid
       AND tbl1.pairCount > tbl2.pairCount  -- compare the pairCount aliases defined above

Sorry, I used two subqueries, but I couldn't come up with anything better. At least it works. Fiddle

I couldn't just use GROUP BY, because with GROUP BY the values returned for non-grouped columns are arbitrary and not necessarily from the same row as the MAX(): MYSQL shows invalid rows when using GROUP BY
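For databases without window functions, one possible variation on the join idea is to join the grouped counts against the per-participant maximum instead of against every smaller count. This is a different technique from the query above, not the answer's own method, and it is only a sketch: it runs on Python's sqlite3 module rather than MySQL, and the names ROWS, QUERY, top_by_max_join, g, m and x are made up for illustration.

```python
import sqlite3

# Sample rows from the question, as (pid, cid) pairs.
ROWS = (
    [("p1", "c1")] * 2 + [("p1", "c2")] * 3
    + [("p2", "c1")] * 6 + [("p2", "c2")] * 5
)

# g: counts per (pid, cid); m: top count and number of categories per pid.
QUERY = """
SELECT g.pid, g.cid, g.cnt
FROM (
    SELECT pid, cid, COUNT(*) AS cnt FROM checks GROUP BY pid, cid
) AS g
JOIN (
    SELECT pid, MAX(cnt) AS max_cnt, COUNT(*) AS cat_cnt
    FROM (
        SELECT pid, cid, COUNT(*) AS cnt FROM checks GROUP BY pid, cid
    ) AS x
    GROUP BY pid
) AS m
  ON m.pid = g.pid AND m.max_cnt = g.cnt
WHERE m.cat_cnt > 1
ORDER BY g.pid, g.cid
"""


def top_by_max_join(rows):
    """Return the top (pid, cid, cnt) rows for participants with several categories."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE checks (pid TEXT, cid TEXT)")
    conn.executemany("INSERT INTO checks VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_by_max_join(ROWS):
        print(row)
```

Joining on the maximum keeps working when a participant has three or more categories, and it returns every tied category when the top count is shared.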

-1
