How to generate empty aggregated results in SQL

I am trying to refine a SQL query so that my reports look better. The query reads data from one table, groups by several columns and calculates some aggregates (counts and sums).

SELECT A, B, C, COUNT(*), SUM(D) FROM T GROUP BY A, B, C ORDER BY A, B, C 

Now suppose that columns B and C can only take values from a fixed set, for example, B can be 'B1' or 'B2' and C can be 'C1' or 'C2'. An example result set:

 A  | B  | C  | COUNT(*) | SUM(D)
 --------------------------------
 A1 | B1 | C1 |       34 |   1752
 A1 | B1 | C2 |        4 |    183
 A1 | B2 | C1 |      199 |   8926
 A1 | B2 | C2 |       56 |   2511
 A2 | B1 | C2 |        6 |     89
 A2 | B2 | C2 |       12 |    231
 A3 | B1 | C1 |       89 |    552
 ...

As you can see, for 'A1' I have all four possible (B, C) combinations, but this is not the case for 'A2'. My question is: how can I also generate summary rows for the (B, C) combinations that are not actually present in the table? That is, how can I also output, for example, these rows:

 A  | B  | C  | COUNT(*) | SUM(D)
 --------------------------------
 A2 | B1 | C1 |        0 |      0
 A2 | B2 | C1 |        0 |      0

The only solution I see is to create an auxiliary table holding all (B, C) values and then do a RIGHT OUTER JOIN against it. But I'm looking for a cleaner way ...

Thanks to everyone.

3 answers

The auxiliary table does not have to be a real table; it can be a common table expression, at least if you can get all the possible values (or the ones you are interested in) from the table itself. Using @Bob Jarvis's query to create all possible combinations, you can do something like:

 WITH CTE AS (
   SELECT *
   FROM (SELECT DISTINCT a FROM T)
   JOIN (SELECT DISTINCT b, c FROM T) ON (1 = 1)
 )
 SELECT CTE.A, CTE.B, CTE.C,
        SUM(CASE WHEN T.A IS NULL THEN 0 ELSE 1 END),
        NVL(SUM(T.D), 0)
 FROM CTE
 LEFT JOIN T ON T.A = CTE.A AND T.B = CTE.B AND T.C = CTE.C
 GROUP BY CTE.A, CTE.B, CTE.C
 ORDER BY CTE.A, CTE.B, CTE.C;

If you have fixed values that may not appear in the table at all, then it is a little more complicated (or at least uglier, and it gets worse as the number of possible values grows):

 WITH CTE AS (
   SELECT *
   FROM (SELECT DISTINCT a FROM T)
   JOIN (SELECT 'B1' AS B FROM DUAL UNION ALL SELECT 'B2' FROM DUAL) ON (1 = 1)
   JOIN (SELECT 'C1' AS C FROM DUAL UNION ALL SELECT 'C2' FROM DUAL) ON (1 = 1)
 )
 SELECT CTE.A, CTE.B, CTE.C,
        SUM(CASE WHEN T.A IS NULL THEN 0 ELSE 1 END),
        NVL(SUM(T.D), 0)
 FROM CTE
 LEFT JOIN T ON T.A = CTE.A AND T.B = CTE.B AND T.C = CTE.C
 GROUP BY CTE.A, CTE.B, CTE.C
 ORDER BY CTE.A, CTE.B, CTE.C;

Either way, you have to join against something that knows about the "missing" values. If the same logic is needed elsewhere and the values are fixed, a persistent table may be cleaner; some maintenance will be needed whichever way you go. You could also consider a pipelined function that acts as a surrogate table, though whether that is appropriate may depend on data volumes.
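
For what it's worth, the persistent-table variant could look roughly like the sketch below. The lookup tables `b_values` and `c_values`, their column types, and the output aliases are my own assumptions, not something from the question; `COUNT(t.a)` is used as a shorter way of counting only the rows that actually matched.

```sql
-- Hypothetical lookup tables holding the fixed values (names and types assumed).
CREATE TABLE b_values (b VARCHAR2(10) PRIMARY KEY);
CREATE TABLE c_values (c VARCHAR2(10) PRIMARY KEY);

INSERT INTO b_values (b) VALUES ('B1');
INSERT INTO b_values (b) VALUES ('B2');
INSERT INTO c_values (c) VALUES ('C1');
INSERT INTO c_values (c) VALUES ('C2');

-- Every A present in T, crossed with every allowed B and C,
-- then outer-joined back to T so missing combinations yield 0 / 0.
SELECT combos.a, combos.b, combos.c,
       COUNT(t.a)       AS row_count,  -- counts matched rows only
       NVL(SUM(t.d), 0) AS sum_d
FROM (SELECT x.a, bv.b, cv.c
      FROM (SELECT DISTINCT a FROM t) x
      CROSS JOIN b_values bv
      CROSS JOIN c_values cv) combos
LEFT JOIN t
       ON t.a = combos.a AND t.b = combos.b AND t.c = combos.c
GROUP BY combos.a, combos.b, combos.c
ORDER BY combos.a, combos.b, combos.c;
```

The upside is that adding a new allowed value becomes a single INSERT rather than an edit to every query that hardcodes the list.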


The thing is, if a particular combination does not exist in your database, how could the engine include it in the results? To get all combinations in the results, all combinations must be available somewhere, either in the main table or in some other table used for the join. For example, you could create another table R with the following data:

 A  | B  | C
 ------------
 A1 | B1 | C1
 A1 | B1 | C2
 A1 | B2 | C1
 A1 | B2 | C2
 A2 | B1 | C1
 A2 | B1 | C2
 A2 | B2 | C1
 A2 | B2 | C2
 A3 | B1 | C1
 A3 | B1 | C2
 A3 | B2 | C1
 A3 | B2 | C2
 ...

And then your query will look like this:

 SELECT r.a, r.b, r.c, COUNT(t.d), COALESCE(SUM(t.d), 0)
 FROM r
 LEFT OUTER JOIN t ON (r.a = t.a AND r.b = t.b AND r.c = t.c)
 GROUP BY r.a, r.b, r.c
 ORDER BY r.a, r.b, r.c

This returns the set you want, with 0 | 0 for every combination that does not exist in the main table. Note that this only works if you know all the combinations you want to include, which may not always be the case.
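
If the allowed B and C values are fixed, one way to fill such a table without typing every row by hand might be a CREATE TABLE ... AS SELECT over a cross join. This is only a sketch: it assumes R does not exist yet, takes the A values from T itself, and hardcodes the B and C lists from the question. The aliases x, y and z are arbitrary.

```sql
-- Sketch: build helper table R as every A found in T crossed with
-- the fixed B and C lists (values taken from the question's example).
-- Column types are derived from the SELECT, so declare the table
-- explicitly first if you need specific types.
CREATE TABLE r AS
SELECT x.a, y.b, z.c
FROM (SELECT DISTINCT a FROM t) x
CROSS JOIN (SELECT 'B1' AS b FROM DUAL UNION ALL SELECT 'B2' FROM DUAL) y
CROSS JOIN (SELECT 'C1' AS c FROM DUAL UNION ALL SELECT 'C2' FROM DUAL) z;
```

Keep in mind that R built this way is a snapshot: it would need to be refreshed whenever new A values show up in T.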

If, on the other hand, your A, B, C are numeric values and you just want to include every number in a range, there may be another way to handle this, something like:

 -- start_a/end_a, start_b/end_b, start_c/end_c stand for the range bounds
 SELECT a.n, b.n, c.n, COUNT(t.d), COALESCE(SUM(t.d), 0)
 FROM (SELECT LEVEL AS n FROM DUAL WHERE LEVEL >= start_a CONNECT BY LEVEL <= end_a) a
 CROSS JOIN (SELECT LEVEL AS n FROM DUAL WHERE LEVEL >= start_b CONNECT BY LEVEL <= end_b) b
 CROSS JOIN (SELECT LEVEL AS n FROM DUAL WHERE LEVEL >= start_c CONNECT BY LEVEL <= end_c) c
 LEFT OUTER JOIN t ON (t.a = a.n AND t.b = b.n AND t.c = c.n)
 GROUP BY a.n, b.n, c.n
 ORDER BY a.n, b.n, c.n

(I do not have an Oracle instance to test this, so this is more of an educated guess than anything else.)

The bottom line is that the engine must know what to include in the final results - one way or another.


There are probably nicer ways to do this, but the following should help you get started with what you want:

 SELECT *
 FROM (SELECT DISTINCT a FROM T)
 JOIN (SELECT DISTINCT b, c FROM T) ON (1 = 1)
 ORDER BY a, b, c

This gives you every (B, C) combination that exists in T, paired with every A that exists, like:

 A1 B1 C1
 A1 B1 C2
 A1 B2 C1
 A1 B2 C2
 A2 B1 C1
 A2 B1 C2
 A2 B2 C1
 A2 B2 C2
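
One thing to watch: `SELECT DISTINCT b, c` only returns (B, C) pairs that occur together somewhere in T, so a pair that never appears for any A will not be generated. A possible variant that takes the B and C values separately, so that the full Cartesian product comes out, might look like this. CROSS JOIN is just the explicit spelling of JOIN ... ON (1 = 1), and the aliases x, y and z are mine.

```sql
-- Every distinct A, crossed with every distinct B and every distinct C,
-- regardless of whether a given (B, C) pair ever occurs together in T.
SELECT x.a, y.b, z.c
FROM (SELECT DISTINCT a FROM T) x
CROSS JOIN (SELECT DISTINCT b FROM T) y
CROSS JOIN (SELECT DISTINCT c FROM T) z
ORDER BY x.a, y.b, z.c;
```

This still cannot produce a B or C value that is absent from T altogether; for that you need the values listed somewhere else.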

Share and enjoy.
