SQL - SELECT MAX() and accompanying field

I have a problem that would be easy to solve with several tables, but here I have only one table to work with.

Consider the following database table:

 UserID  UserName  EmailAddress          Source
 3K3S9   Ben       ben@myisp.com         user
 SF13F   Harry     lharry_x@hotbail.com  3rd_party
 SF13F   Harry     reside@domain.com     user
 76DSA   Lisa      cake@insider.com      user
 OL39F   Nick      stick@whatever.com    3rd_party
 8F66S   Stan      myman@lol.com         user

I need to select all of the fields, but with each user appearing only once, alongside one of their email addresses (the "largest" one, as determined by the MAX() function). This is the result I am after:

 UserID  UserName  EmailAddress          Source
 3K3S9   Ben       ben@myisp.com         user
 SF13F   Harry     lharry_x@hotbail.com  3rd_party
 76DSA   Lisa      cake@insider.com      user
 OL39F   Nick      stick@whatever.com    3rd_party
 8F66S   Stan      myman@lol.com         user

As you can see, "Harry" appears only once, with his "highest" email address and the "Source" that belongs to that address.

Currently, we group by UserID and UserName and use MAX() for both EmailAddress and Source. The problem is that the maximum values of those two fields do not always come from the same row, and they must be from the same record.

I also tried a different approach, joining the table to itself. That got me the correct email address, but not the matching "Source" for that address.

Any help would be appreciated, as I have been trying to solve this problem for far too long :)

+6
sql database join
4 answers

If you are using SQL Server 2005 or higher,

 SELECT UserID, UserName, EmailAddress, Source
 FROM (
   SELECT UserID, UserName, EmailAddress, Source,
          ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EmailAddress DESC) AS RowNumber
   FROM MyTable
 ) AS a
 WHERE a.RowNumber = 1

Of course, there are ways to accomplish the same task without the (SQL-standard) ranking functions such as ROW_NUMBER, which SQL Server has only implemented since 2005. These include correlated subqueries and left self-joins whose ON condition uses '>' combined with WHERE ... IS NULL. The ranking functions, however, produce code that is easy to read and that the SQL Server engine can (in theory) optimize well.
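For illustration, here is a rough sketch of the join-based and subquery-based alternatives mentioned above. It is not tied to SQL Server: as an assumption for the sake of a runnable example, it loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module, and the names ANTI_JOIN and NOT_EXISTS are just labels I made up. The SQL inside the strings is plain enough that it should carry over to other engines with little change.

```python
import sqlite3

# Assumption: SQLite stands in for the real database so the sketch can run.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (UserID TEXT, UserName TEXT, EmailAddress TEXT, Source TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        ("3K3S9", "Ben", "ben@myisp.com", "user"),
        ("SF13F", "Harry", "lharry_x@hotbail.com", "3rd_party"),
        ("SF13F", "Harry", "reside@domain.com", "user"),
        ("76DSA", "Lisa", "cake@insider.com", "user"),
        ("OL39F", "Nick", "stick@whatever.com", "3rd_party"),
        ("8F66S", "Stan", "myman@lol.com", "user"),
    ],
)

# Left self-join: pair each row with any row of the same user that has a
# larger address. Rows with no such partner (b.UserID IS NULL) hold the maximum.
ANTI_JOIN = """
SELECT a.UserID, a.UserName, a.EmailAddress, a.Source
FROM MyTable AS a
LEFT JOIN MyTable AS b
  ON b.UserID = a.UserID
 AND b.EmailAddress > a.EmailAddress
WHERE b.UserID IS NULL
ORDER BY a.UserID
"""

# Correlated subquery expressing the same condition with NOT EXISTS.
NOT_EXISTS = """
SELECT a.UserID, a.UserName, a.EmailAddress, a.Source
FROM MyTable AS a
WHERE NOT EXISTS (
    SELECT 1
    FROM MyTable AS b
    WHERE b.UserID = a.UserID
      AND b.EmailAddress > a.EmailAddress
)
ORDER BY a.UserID
"""

anti_join_rows = conn.execute(ANTI_JOIN).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS).fetchall()

for row in anti_join_rows:
    print(row)
```

Both queries keep EmailAddress and Source from the same row, which is the point of the exercise. Note that SQLite compares these strings byte by byte, so with the rows as typed here it keeps reside@domain.com (Source user) for Harry. If a user has the same largest address on two rows, both rows come back, so these forms behave like RANK rather than ROW_NUMBER.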

Edit: this article is a good tutorial on ranking, but its examples use RANK instead of ROW_NUMBER (or the other ranking function, DENSE_RANK). The difference matters when there are ties between rows in the same partition according to the ordering criteria. This post does a great job of explaining the difference.
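To make the tie behaviour concrete, here is a small sketch comparing the three ranking functions on rows that compare equal on the ordering column. The data is hypothetical (one user whose largest address appears twice), and it again assumes SQLite via Python's sqlite3 module rather than SQL Server; SQLite needs version 3.25 or later for window functions.

```python
import sqlite3

# Hypothetical rows: one user whose largest address appears twice (a tie).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (UserID TEXT, EmailAddress TEXT, Source TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("SF13F", "z@example.com", "user"),
        ("SF13F", "z@example.com", "3rd_party"),
        ("SF13F", "a@example.com", "user"),
    ],
)

# Source is added as a tie-breaker for ROW_NUMBER only, so that the numbering
# of the two tied rows is deterministic in this sketch.
rows = conn.execute(
    """
    SELECT EmailAddress, Source,
           ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EmailAddress DESC, Source) AS rn,
           RANK()       OVER (PARTITION BY UserID ORDER BY EmailAddress DESC) AS rk,
           DENSE_RANK() OVER (PARTITION BY UserID ORDER BY EmailAddress DESC) AS drk
    FROM MyTable
    ORDER BY rn
    """
).fetchall()

# Filtering on "= 1" keeps one row with ROW_NUMBER but both tied rows with RANK.
row_number_top = [r for r in rows if r[2] == 1]
rank_top = [r for r in rows if r[3] == 1]

for row in rows:
    print(row)
```

Here ROW_NUMBER numbers the rows 1, 2, 3, RANK gives 1, 1, 3 and DENSE_RANK gives 1, 1, 2. So a filter such as RowNumber = 1 returns exactly one row per user only with ROW_NUMBER; with RANK or DENSE_RANK every tied row passes the filter.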

+7
 select distinct *
 from MyTable t1
 where EmailAddress = (
   select max(EmailAddress)
   from MyTable t2
   where t1.userId = t2.userId
 )
+5
 select distinct a.*
 from SomeTable a
 inner join (
   select max(emailAddress) as emailAddress, userId
   from SomeTable
   group by userId
 ) b
   on a.emailAddress = b.emailAddress
  and a.userId = b.userId
0

I think I have a solution that differs from those already proposed:

 select *
 from foo
 where id = (
   select F.id
   from foo F
   where F.bar = foo.bar
   order by F.baz desc  -- largest baz first
   limit 1
 )

This gives you all of the foo records that have the largest baz among the foo records with the same bar.

0