Selecting columns with DISTINCT in PostgreSQL

I am fetching bus stops from the database and want only one stop per bus line / direction. This query does just that:

Stop.select("DISTINCT line_id, direction") 

Except that it won't give me any attributes other than those two. I tried a couple of other queries to return the id in addition to the line_id and direction fields (ideally, all columns would be returned), with no luck:

 Stop.select("DISTINCT line_id, direction, id") 

and

 Stop.select("DISTINCT(line_id || '-' || direction), id") 

In both cases, the query loses its DISTINCT effect, and all rows are returned.
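
For what it's worth, this matches how DISTINCT works: it compares entire output rows, not just the first column, and the parentheses in the second attempt only group an expression. Since id is unique, adding it to the select list makes every row distinct. A plain-Ruby sketch of that behaviour follows; the sample rows and the `select_distinct` helper are made up for illustration and are not part of ActiveRecord or the real stops table:

```ruby
# Made-up sample rows standing in for the stops table.
STOPS = [
  { id: 1, line_id: 1, direction: "N" },
  { id: 2, line_id: 1, direction: "N" },
  { id: 3, line_id: 1, direction: "S" },
  { id: 4, line_id: 2, direction: "N" }
].freeze

# Models SELECT DISTINCT <columns>: project each row onto the listed
# columns, then drop projected rows that are identical in every column.
def select_distinct(rows, *columns)
  rows.map { |row| row.values_at(*columns) }.uniq
end

p select_distinct(STOPS, :line_id, :direction)
# => [[1, "N"], [1, "S"], [2, "N"]]
p select_distinct(STOPS, :line_id, :direction, :id).length
# => 4
```

With only line_id and direction the duplicates collapse; once id is included, all four rows survive.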

Someone helped me out and suggested using a subquery to return the ids as well:

 Stop.find_by_sql("SELECT DISTINCT a1.line_id, a1.direction, (SELECT a2.id FROM stops a2 WHERE a2.line_id = a1.line_id AND a2.direction = a1.direction ORDER BY a2.id ASC LIMIT 1) AS id FROM stops a1")

Then I can extract all the ids and run a second query to get the full attributes for each stop.

Is there a way to do all of this in one query and return all attributes?

ruby-on-rails activerecord postgresql
2 answers
 Stop.select("DISTINCT ON (line_id, direction) *") 

Not so fast: the other answer picks the stop_id arbitrarily.

That is why your question does not make sense. We can pull out stop_ids with distinct line_id and direction, but we have no idea why we got the stop_id we did.

  create temp table test(line_id integer, direction char(1), stop_id integer);
  insert into test values
    (1, 'N', 1), (1, 'N', 2),
    (1, 'S', 1), (1, 'S', 2),
    (2, 'N', 1), (2, 'N', 2),
    (2, 'S', 1), (2, 'S', 2);
  select distinct on (line_id, direction) * from test;

  -- do this again, but reverse the order of the stop_ids
  -- could it possibly change our Robust Query?!!!
  drop table test;
  create temp table test(line_id integer, direction char(1), stop_id integer);
  insert into test values
    (1, 'N', 2), (1, 'N', 1),
    (1, 'S', 2), (1, 'S', 1),
    (2, 'N', 2), (2, 'N', 1),
    (2, 'S', 2), (2, 'S', 1);
  select distinct on (line_id, direction) * from test;

First select:

  line_id | direction | stop_id
 ---------+-----------+---------
        1 | N         |       1
        1 | S         |       1
        2 | N         |       1
        2 | S         |       1

Second select:

  line_id | direction | stop_id
 ---------+-----------+---------
        1 | N         |       2
        1 | S         |       2
        2 | N         |       2
        2 | S         |       2

So we got away without grouping on stop_id, but we have no guarantee about why we got the one we did. All we know is that it is a valid stop_id. Any updates, inserts, or other operations whose effect on row order the RDBMS does not guarantee may change the physical order of the rows.

This is what I meant in the top comment. There is no known reason to pick one stop_id over another, but for some reason you desperately need this stop_id (or something else).
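
If there is a rule for which stop should win, stating it in ORDER BY removes the arbitrariness. PostgreSQL keeps the first row of each DISTINCT ON group in ORDER BY order, so adding `order by line_id, direction, stop_id` to the select above should return the lowest stop_id per group for either insert order. "Lowest stop_id wins" is only an assumption for illustration; the question never says which stop it wants. A plain-Ruby model of that selection, with a hypothetical helper `first_per_group`, run on the two insert orders from the test above:

```ruby
# The two insert orders from the test above, as [line_id, direction, stop_id].
FIRST_ORDER = [
  [1, "N", 1], [1, "N", 2], [1, "S", 1], [1, "S", 2],
  [2, "N", 1], [2, "N", 2], [2, "S", 1], [2, "S", 2]
].freeze

SECOND_ORDER = [
  [1, "N", 2], [1, "N", 1], [1, "S", 2], [1, "S", 1],
  [2, "N", 2], [2, "N", 1], [2, "S", 2], [2, "S", 1]
].freeze

# Hypothetical helper modelling
#   DISTINCT ON (line_id, direction) ... ORDER BY line_id, direction, stop_id
# Sort by all three columns, then keep the first row of each
# (line_id, direction) group.
def first_per_group(rows)
  rows.sort.uniq { |line_id, direction, _stop_id| [line_id, direction] }
end

p first_per_group(FIRST_ORDER)
# => [[1, "N", 1], [1, "S", 1], [2, "N", 1], [2, "S", 1]]
p first_per_group(FIRST_ORDER) == first_per_group(SECOND_ORDER)
# => true
```

In ActiveRecord terms this would presumably mean chaining `.order(:line_id, :direction, :id)` onto the `DISTINCT ON` select from the other answer, with id standing in for stop_id.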
