Oracle SQL: how to select N records for each "group" / "cluster"

I have a big_table with 4 million records, grouped into 40 groups by a column called process_type_cod. The list of values this column can take is in a second table; let me call it small_table.

So we have a big_table with a NOT NULL FK called process_type_cod that points to small_table (suppose the column name is the same in both tables).

I need N records (e.g. 10) from big_table for each record in small_table.

That is, 10 records from big_table related to the first record of small_table, UNION 10 different records from big_table related to the second record of small_table, and so on.

Is it possible to get this with a single SQL statement?

1 answer

I recommend an analytic function such as rank() or row_number(). You can do this with hard-coded joins, but the analytic function does all the hard work for you.

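    select *
    from (
        select bt.col_a,
               bt.col_b,
               bt.process_type_cod,
               -- number the rows within each process_type_cod, ordered by col_a;
               -- columns are qualified with bt because both tables have process_type_cod
               row_number() over (
                   partition by bt.process_type_cod
                   order by bt.col_a nulls last
               ) rn
        from small_table st
        inner join big_table bt
            on st.process_type_cod = bt.process_type_cod
    )
    where rn < 11;  -- keep the first 10 rows of each group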

You may not even need the join, since big_table already has all the types you care about. In that case, just change the "from" clause to use big_table alone and drop the join.
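A sketch of that join-free variant might look like the following. It assumes the same placeholder columns as above (col_a and col_b stand in for whatever columns you actually need), and the alias rn is just an illustrative name for the row number.

```sql
-- Sketch only: col_a / col_b are placeholder columns, rn is an arbitrary alias.
select *
from (
    select bt.col_a,
           bt.col_b,
           bt.process_type_cod,
           row_number() over (
               partition by bt.process_type_cod
               order by bt.col_a nulls last
           ) rn
    from big_table bt
)
where rn <= 10;  -- at most 10 rows per process_type_cod
```

Because the FK is NOT NULL, every row of big_table already belongs to exactly one type, so dropping the join should not change which rows come back. A type that has no rows in big_table produces no output either way.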

What this does is run the query and then sort the records within each partition using the "order by" in the over clause. Within each partition (here we partition by process_type_cod and order by col_a), a row number (1, 2, 3, ...) is assigned to each record sequentially. In the outer where clause, simply filter for the records whose number is at most N.
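To see the numbering in action without touching the real tables, here is a small self-contained sketch. The demo data is made up for illustration and is built inline from dual; N is 2 here to keep the output short.

```sql
-- Hypothetical demo data: two process types with 3 and 2 rows respectively.
with demo as (
    select 1 process_type_cod, 'a' col_a from dual union all
    select 1, 'b' from dual union all
    select 1, 'c' from dual union all
    select 2, 'x' from dual union all
    select 2, 'y' from dual
)
select process_type_cod, col_a, rn
from (
    select d.process_type_cod,
           d.col_a,
           -- numbering restarts at 1 for every process_type_cod
           row_number() over (
               partition by d.process_type_cod
               order by d.col_a nulls last
           ) rn
    from demo d
)
where rn <= 2
order by process_type_cod, rn;

-- Result:
-- PROCESS_TYPE_COD  COL_A  RN
--                1  a       1
--                1  b       2
--                2  x       1
--                2  y       2
```

Note that row_number() always gives distinct numbers within a partition, so you get at most N rows per group. With rank(), rows that tie on the ordering column share the same number, so the same filter can return more than N rows for a group.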
