The correct way to create a pivot table in PostgreSQL using CASE WHEN

I am trying to create a pivot-table-style view in PostgreSQL and am almost there! Here is the basic query:

    select acc2tax_node.acc, tax_node.name, tax_node.rank
    from tax_node, acc2tax_node
    where tax_node.taxid=acc2tax_node.taxid
      and acc2tax_node.acc='AJ012531';

And the data:

       acc    |          name           |     rank
    ----------+-------------------------+--------------
     AJ012531 | Paromalostomum fusculum | species
     AJ012531 | Paromalostomum          | genus
     AJ012531 | Macrostomidae           | family
     AJ012531 | Macrostomida            | order
     AJ012531 | Macrostomorpha          | no rank
     AJ012531 | Turbellaria             | class
     AJ012531 | Platyhelminthes         | phylum
     AJ012531 | Acoelomata              | no rank
     AJ012531 | Bilateria               | no rank
     AJ012531 | Eumetazoa               | no rank
     AJ012531 | Metazoa                 | kingdom
     AJ012531 | Fungi/Metazoa group     | no rank
     AJ012531 | Eukaryota               | superkingdom
     AJ012531 | cellular organisms      | no rank

What I'm trying to get is the following:

       acc    |         species         |     phylum
    ----------+-------------------------+-----------------
     AJ012531 | Paromalostomum fusculum | Platyhelminthes

I am trying to do this with CASE WHEN, so I got the following:

    select acc2tax_node.acc,
           CASE tax_node.rank WHEN 'species' THEN tax_node.name ELSE NULL END as species,
           CASE tax_node.rank WHEN 'phylum' THEN tax_node.name ELSE NULL END as phylum
    from tax_node, acc2tax_node
    where tax_node.taxid=acc2tax_node.taxid
      and acc2tax_node.acc='AJ012531';

Which gives me the result:

       acc    |         species         |     phylum
    ----------+-------------------------+-----------------
     AJ012531 | Paromalostomum fusculum |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         | Platyhelminthes
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |
     AJ012531 |                         |

Now I know that at some point I need to group acc, so I'm trying

    select acc2tax_node.acc,
           CASE tax_node.rank WHEN 'species' THEN tax_node.name ELSE NULL END as sp,
           CASE tax_node.rank WHEN 'phylum' THEN tax_node.name ELSE NULL END as ph
    from tax_node, acc2tax_node
    where tax_node.taxid=acc2tax_node.taxid
      and acc2tax_node.acc='AJ012531'
    group by acc2tax_node.acc;

But I get a scary error:

 ERROR: column "tax_node.rank" must appear in the GROUP BY clause or be used in an aggregate function 

All the previous examples that I could find use something like SUM() around the CASE expressions, so I assume an aggregate function is needed. I tried using FIRST():

    select acc2tax_node.acc,
           FIRST(CASE tax_node.rank WHEN 'species' THEN tax_node.name ELSE NULL END) as sp,
           FIRST(CASE tax_node.rank WHEN 'phylum' THEN tax_node.name ELSE NULL END) as ph
    from tax_node, acc2tax_node
    where tax_node.taxid=acc2tax_node.taxid
      and acc2tax_node.acc='AJ012531'
    group by acc2tax_node.acc;

but get an error:

 ERROR: function first(character varying) does not exist 

Can anyone suggest any hints?

6 answers

Use MAX() or MIN(), not FIRST(). In this scenario, within each group every value in the column is NULL except for at most one non-NULL value. Because aggregate functions ignore NULLs, that one value is by definition both the MIN and the MAX of the set.
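
To see the NULL-skipping behaviour in isolation, here is a minimal sketch that needs no tables. The inline VALUES list and the alias v are made up purely for illustration; they are not part of the question's schema.

```sql
-- Hypothetical inline data: two NULLs and a single non-NULL value,
-- mimicking what one CASE column looks like inside a single group.
SELECT MAX(v) AS max_v,
       MIN(v) AS min_v
FROM (VALUES (NULL::text), ('Platyhelminthes'), (NULL)) AS t(v);
-- Both columns hold 'Platyhelminthes', because the NULLs are ignored.
```

Either aggregate works here; the choice only matters if a group could contain more than one non-NULL value.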


PostgreSQL has several functions for pivot (crosstab) queries; see this postgresonline article. These functions can be found in contrib (the tablefunc module).
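
As a rough sketch of what that can look like, the example below uses the two-parameter form of crosstab() from tablefunc. It assumes the module can be installed on your server, and it runs against small temporary tables (tax_node_demo and acc2tax_node_demo, with made-up taxid values) rather than the real tables from the question.

```sql
-- Assumes the tablefunc contrib module is available on the server.
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Hypothetical miniature stand-ins for the question's tables;
-- the taxid values 1..3 are invented for this demo.
CREATE TEMP TABLE tax_node_demo (taxid int, name text, rank text);
CREATE TEMP TABLE acc2tax_node_demo (acc text, taxid int);

INSERT INTO tax_node_demo VALUES
    (1, 'Paromalostomum fusculum', 'species'),
    (2, 'Paromalostomum',          'genus'),
    (3, 'Platyhelminthes',         'phylum');
INSERT INTO acc2tax_node_demo VALUES
    ('AJ012531', 1), ('AJ012531', 2), ('AJ012531', 3);

SELECT *
FROM crosstab(
    -- Source query: row name, category, value, ordered by row name.
    $$ SELECT an.acc, tn.rank, tn.name
       FROM tax_node_demo tn
       JOIN acc2tax_node_demo an ON tn.taxid = an.taxid
       ORDER BY 1 $$,
    -- Category query: one row per output column, in column order.
    -- Source rows whose rank is not listed here (genus) are skipped.
    $$ VALUES ('species'::text), ('phylum'::text) $$
) AS report(acc text, species text, phylum text);
```

The second argument pins each rank to a specific output column, so the result does not depend on the order in which the ranks come back from the source query.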

    SELECT atn.acc, ts.name AS species, tp.name AS phylum
    FROM acc2tax_node atn
    LEFT JOIN tax_node ts ON ts.taxid = atn.taxid AND ts.rank = 'species'
    LEFT JOIN tax_node tp ON tp.taxid = atn.taxid AND tp.rank = 'phylum'
    WHERE atn.acc = 'AJ012531'

Additional information as requested (posted as an answer rather than a comment, for better formatting):

    SELECT * FROM acc2tax_node WHERE acc = 'AJ012531';

       acc    | taxid
    ----------+--------
     AJ012531 |  66400
     AJ012531 |  66399
     AJ012531 |  39216
     AJ012531 |  39215
     AJ012531 | 166235
     AJ012531 | 166384
     AJ012531 |   6157
     AJ012531 |  33214
     AJ012531 |  33213
     AJ012531 |   6072
     AJ012531 |  33208
     AJ012531 |  33154
     AJ012531 |   2759
     AJ012531 | 131567

Execute:

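    -- crosstab() takes the source query as a string (no trailing semicolon).
    -- The source must return row name, category, value; the one-parameter
    -- form fills the output columns positionally, in the order rows arrive.
    SELECT report.*
    FROM crosstab($$
        select acc2tax_node.acc::text, tax_node.rank::text, tax_node.name::text
        from tax_node, acc2tax_node
        where tax_node.taxid=acc2tax_node.taxid
          and acc2tax_node.acc='AJ012531'
        order by 1
    $$) AS report(acc text, species text, genus text, family text, ...)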

As Matthew Wood pointed out, use MIN() or MAX(), not FIRST():

    SELECT an.acc,
           MAX(CASE tn.rank WHEN 'species' THEN tn.name ELSE NULL END) AS species,
           MAX(CASE tn.rank WHEN 'phylum' THEN tn.name ELSE NULL END) AS phylum
    FROM tax_node tn, acc2tax_node an
    WHERE tn.taxid = an.taxid
      AND an.acc = 'AJ012531'
    GROUP BY an.acc;
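
On PostgreSQL 9.4 or later, the same aggregation can also be written with the FILTER clause instead of CASE. The sketch below is one possible spelling, not the only one; it uses temporary tables (tax_node_sample and acc2tax_node_sample, with invented taxid values) so it can be tried without touching the real data.

```sql
-- Hypothetical miniature stand-ins for the question's tables;
-- the taxid values 1..3 are invented for this demo.
CREATE TEMP TABLE tax_node_sample (taxid int, name text, rank text);
CREATE TEMP TABLE acc2tax_node_sample (acc text, taxid int);

INSERT INTO tax_node_sample VALUES
    (1, 'Paromalostomum fusculum', 'species'),
    (2, 'Paromalostomum',          'genus'),
    (3, 'Platyhelminthes',         'phylum');
INSERT INTO acc2tax_node_sample VALUES
    ('AJ012531', 1), ('AJ012531', 2), ('AJ012531', 3);

-- FILTER restricts which rows each aggregate sees, which plays the
-- same role as the CASE ... ELSE NULL END wrapper.
SELECT an.acc,
       MAX(tn.name) FILTER (WHERE tn.rank = 'species') AS species,
       MAX(tn.name) FILTER (WHERE tn.rank = 'phylum')  AS phylum
FROM tax_node_sample tn
JOIN acc2tax_node_sample an ON tn.taxid = an.taxid
WHERE an.acc = 'AJ012531'
GROUP BY an.acc;
```

With this sample data the query returns a single row for AJ012531 with the species and phylum names side by side.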