Update ordered row with last non-null value

Consider a type table with the following data

column_a (boolean) | column_order (integer) TRUE | 1 NULL | 2 NULL | 3 TRUE | 4 NULL | 5 FALSE | 6 NULL | 7 

I would like to write queries that replace each NULL value in column_a last non- NULL from the previous column values ​​in the order specified by column_order . The result should look like this:

 column_a (boolean) | column_order (integer) TRUE | 1 TRUE | 2 TRUE | 3 TRUE | 4 TRUE | 5 FALSE | 6 FALSE | 7 

For simplicity, we can assume that the first value is never zero. The following works if there is at most one NULL value:

 SELECT COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order)) FROM test_table ORDER BY column_order; 

However, the above does not work for an arbitrary number of consecutive NULL values. What is a Postgres query that can achieve the results above? Is there an efficient query that scales well for a large number of rows?

+5
source share
5 answers

You can use a convenient trick where you sum over case to create partitions based on divisions between zero and nonzero rows, then first_value to bring them forward.

eg.

 select *, sum(case when column_a is not null then 1 else 0 end) OVER (order by column_order) as partition from table1; column_a | column_order | partition ----------+--------------+----------- t | 1 | 1 | 2 | 1 | 3 | 1 t | 4 | 2 | 5 | 2 f | 6 | 3 | 7 | 3 (7 rows) 

then

 select first_value(column_a) OVER (PARTITION BY partition ORDER BY column_order), column_order from ( select *, sum(case when column_a is not null then 1 else 0 end) OVER (order by column_order) as partition from table1 ) partitioned; 

gives you:

  first_value | column_order -------------+-------------- t | 1 t | 2 t | 3 t | 4 t | 5 f | 6 f | 7 (7 rows) 
+2
source

Not sure if Postgresql supports this, but try:

 SELECT COALESCE(column_a, (select t2.column_a from test_table t2 where t2.column_order < t1.column_order and t2.column_a is not null order by t2.column_order desc fetch first 1 row only)) FROM test_table t1 ORDER BY column_order; 
+1
source

I am more familiar with SqlServer, but this should do what you need.

 update tableA as a2 set column_a = b2.column_a from ( select a.column_order, max(b.column_order) from tableA as a inner join tableA as b on a.column_order > b.column_order and b.column_a is not null where a.column_a is null group by a.column_order ) as junx inner join tableA as b2 on junx.max =b2.column_order where a2.column_order = junx.column_order 

SQL Fiddle

+1
source

Use this:

 select case when column_a is null then (select top 1 column_a from myTable where column_order < mt.column_order AND column_a is not null order by column_order desc) else column_a end, column_order from myTable mt 

here is a fiddle for it.

0
source

The below query is working on Postgresql. You should consider adding an index to column_order, otherwise the execution can be very slow.

EDIT: in a table of 100,000 records it took 3661 ms, while "select * from test_table" took 2511 ms. Performance is very good when you add an index to column a. (statistics are taken on the client side)

 select case when a.column_a is null then b.column_a else a.column_a end, a.column_order from test_table a left join test_table b on b.column_order = ( select max(column_order) from test_table c where c.column_order < a.column_order and c.column_a is not null) order by column_order 
0
source

All Articles