How to combine DISTINCT and ORDER BY in array_agg of jsonb values in PostgreSQL

Note: I am using the latest version of Postgres (9.4).

I am trying to write a query that does a simple join of two tables, groups by the primary key of the first table, and runs array_agg over several fields of the second table, which I want returned as objects. The array must be sorted by a combination of two fields in the JSON objects and must also be deduplicated.

So far I have come up with the following:

    SELECT zoo.id,
           ARRAY_AGG(
               DISTINCT ROW_TO_JSON((
                   SELECT x FROM (SELECT animals.type, animals.name) x
               ))::JSONB
               -- ORDER BY animals.type, animals.name
           )
    FROM zoo
    JOIN animals ON animals.zooId = zoo.id
    GROUP BY zoo.id;

This gives one row per zoo, with an aggregated array of jsonb objects, one per animal, without duplicates.

However, I cannot figure out how to also sort the array by the fields in the commented-out part of the code.

If I remove the DISTINCT, I can ORDER BY the original fields, which works fine, but then I have duplicates.

1 answer

If you use row_to_json(), you lose the column names unless you pass in a row of a named (typed) row type. If you build the jsonb object "manually" with json_build_object() and explicit key names, you get them back:

    SELECT zoo.id,
           array_agg(za.jb ORDER BY za.jb->>'animal_type', za.jb->>'animal_name') AS animals
    FROM zoo
    JOIN (
        SELECT DISTINCT ON (zooId, "type", "name")
               zooId,
               json_build_object('animal_type', "type", 'animal_name', "name")::jsonb AS jb
        FROM animals
        -- DISTINCT ON needs its expressions to lead the ORDER BY, and sorting on
        -- the plain columns is far more efficient than sorting on jsonb elements
        ORDER BY zooId, "type", "name"
    ) AS za ON za.zooId = zoo.id
    GROUP BY zoo.id;
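To see the column-name point in isolation, here is a minimal sketch comparing the two functions on a single row. The literal values 'lion' and 'Leo' are made up for illustration and do not come from the question.

```sql
-- An anonymous ROW(...) carries no column names, so row_to_json()
-- falls back to the generic keys f1 and f2.
SELECT row_to_json(ROW('lion'::text, 'Leo'::text));

-- json_build_object() takes the key names explicitly; the cast to jsonb
-- mirrors the query above (9.4 has no jsonb_build_object()).
SELECT json_build_object('animal_type', 'lion'::text,
                         'animal_name', 'Leo'::text)::jsonb;
```

The first statement should produce an object keyed by f1 and f2, the second one keyed by animal_type and animal_name.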

You can ORDER BY elements of a jsonb object as shown above, but (as far as I know) you cannot use DISTINCT on a jsonb object. In your case that would be rather inefficient anyway (first building all the jsonb objects, then throwing out the duplicates), and at the aggregate level it is simply not possible with standard SQL. You can, however, get the same result by applying DISTINCT before building the jsonb object.
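As a rough sketch of "DISTINCT first, jsonb afterwards", the variant below deduplicates the plain columns in a subquery and then sorts inside the aggregate. It sidesteps the PostgreSQL rule that, in an aggregate with DISTINCT, the ORDER BY expressions must appear in the argument list. The temp tables and sample rows are assumptions for illustration only; the real zoo and animals tables will have more columns.

```sql
-- Hypothetical sample data, just enough to show one duplicate row.
CREATE TEMP TABLE zoo (id int PRIMARY KEY);
CREATE TEMP TABLE animals (zooId int, "type" text, "name" text);

INSERT INTO zoo VALUES (1), (2);
INSERT INTO animals VALUES
    (1, 'lion',  'Leo'),
    (1, 'lion',  'Leo'),   -- duplicate, should appear only once
    (1, 'ape',   'Bo'),
    (2, 'zebra', 'Zed');

SELECT z.id,
       array_agg(
           json_build_object('animal_type', a."type",
                             'animal_name', a."name")::jsonb
           ORDER BY a."type", a."name"   -- sort on plain columns, no DISTINCT here
       ) AS animals
FROM zoo z
JOIN (
    -- Deduplicate before any jsonb object is built.
    SELECT DISTINCT zooId, "type", "name"
    FROM animals
) a ON a.zooId = z.id
GROUP BY z.id
ORDER BY z.id;
```

With this sample data, zoo 1 should end up with two array elements (the ape before the lion) and zoo 2 with one.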

Also, avoid using SQL keywords such as "type" and standard data types such as "name" as column names. Both are non-reserved keywords, so you can use them in your own contexts, but in practice your commands can become really confusing. For example, you could have a schema with a table, a column in that table, and a data type all called "type", and then you could write this:

 SELECT type::type FROM type.type WHERE type = something; 

While PostgreSQL will graciously accept this, it is confusing at best and error-prone in anything more complex. You can get a long way by double-quoting keywords, but they are best avoided as identifiers altogether.
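If changing the schema is an option, one possible way to follow this advice is to rename the columns once. The new names animal_type and animal_name are only a suggestion, borrowed from the JSON keys used above; this assumes the animals table from the question.

```sql
-- One-time rename; the old names must be double-quoted here.
ALTER TABLE animals RENAME COLUMN "type" TO animal_type;
ALTER TABLE animals RENAME COLUMN "name" TO animal_name;

-- Later queries no longer need any quoting.
SELECT zooId, animal_type, animal_name
FROM animals
ORDER BY zooId, animal_type, animal_name;
```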
