Thread: Window function for get the last value to extend missing rows

Window function for get the last value to extend missing rows

From
Durumdara
Date:
Dear Members!

I have a table with temperature measures.
The data is coming from the PLC, but sometimes the period is "slipping", so the values are not correctly minute based.

03:00     10
03:02     12
03:03     11
03:05     13

I have to make a virtual table which is minute based.

I thought I would make a generated temp table (generate_series) and then join these values based on minue.

03:00     10
03:01     NULL
03:02     12
03:03     11
03:04     NULL
03:05     13

I need a code to replace the value to the last value on NULL.

03:00     10
03:01     10 <
03:02     12
03:03     11
03:04     11 <
03:05     13

The LAG function seems to be ok, but how to handle if more than two periods are missing?

03:00     10
03:01     NULL
03:02     NULL
03:03     NULL
03:04     11
03:05     13

select *, coalesce(value, prev_value) as value from (
select mmin, value,
lag(value, 1) over (order by mmin) as prev_value
from test_table) t


or

select *, coalesce(value, prev_value) as value from (
select mmin, value,
coalesce(lag(value, 1) over (order by mmin),
lag(value, 2) over (order by mmin)) as prev_value
from tmp_test_table) t

The test data:

create table tmp_test_table (mmin int, value int);
insert into tmp_test_table values
(1, 1),
(2, 1),
(3, NULL),
(4, NULL),
(5, 2),
(6, NULL),
(7, NULL),
(10, 10),
(11, NULL),
(12, NULL),
(13, NULL),
(14, NULL);

The result is:

11NULL1
2111
3NULL11
4NULL11
52NULL2
6NULL22
7NULL22
1010NULL10
11NULL1010
12NULL1010
13NULLNULLNULL
14NULLNULLNULL

So you can see, the last values are NULL because the LAG can't use the last calculated value.

Do you have any idea how to get the last value, doesn't matter how many NULL-s are in the set?

(15, NULLx20, 10) => 15x21, 10

Thanks

Best regards
dd

On Fri, 12 May 2023 at 13:04, Durumdara <durumdara@gmail.com> wrote:
Dear Members!

I have a table with temperature measures.
The data is coming from the PLC, but sometimes the period is "slipping", so the values are not correctly minute based.

03:00     10
03:02     12
03:03     11
03:05     13

I have to make a virtual table which is minute based.

I thought I would make a generated temp table (generate_series) and then join these values based on minue.

03:00     10
03:01     NULL
03:02     12
03:03     11
03:04     NULL
03:05     13

I need a code to replace the value to the last value on NULL.

03:00     10
03:01     10 <
03:02     12
03:03     11
03:04     11 <
03:05     13

"The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead, lag, first_value, last_value, and nth_value. This is not implemented in PostgreSQL: the behavior is always the same as the standard's default, namely RESPECT NULLS".

So, I'd keep the incoming data as is, i.e. with no nulls in values, and densify it with some generate_series magic:
select gmin as mmin, d.value
from
    ( select mmin, lead(mmin) over (order by mmin) nextmin, value from test_table ) d,
    generate_series(d.mmin, nextmin - interval'1 minute') gmin

(I assumed a time representation for mmin, but adapt the generate_series call to whatever your representation is)
Best,
g

Re: Window function for get the last value to extend missing rows

From
Thorsten Glaser
Date:
On Fri, 12 May 2023, GF wrote:

>"The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead,
>lag, first_value, last_value, and nth_value. This is not implemented in
>PostgreSQL: the behavior is always the same as the standard's default,
>namely RESPECT NULLS".

Yeah, THAT caused no small amount of cursing, earlier this year,
I’d have also used IGNORE NULLS somewhere…

bye,
//mirabilos
--
15:41⎜<Lo-lan-do:#fusionforge> Somebody write a testsuite for helloworld :-)



Re: Window function for get the last value to extend missing rows

From
Andrew Gierth
Date:
>>>>> "Durumdara" == Durumdara  <durumdara@gmail.com> writes:

 Durumdara> I have to make a virtual table which is minute based.

 Durumdara> I thought I would make a generated temp table
 Durumdara> (generate_series) and then join these values based on minue.

 Durumdara> 03:00     10
 Durumdara> 03:01     NULL
 Durumdara> 03:02     12
 Durumdara> 03:03     11
 Durumdara> 03:04     NULL
 Durumdara> 03:05     13

 Durumdara> I need a code to replace the value to the last value on
 Durumdara> NULL.

It is possible to do this, even without the IGNORE NULLS option for
window functions, but it's somewhat awkward.

Someone else already mentioned using generate_series in lateral position
to fill in values; in most cases this is probably the best approach,
though it requires a little care.

There is also a window function approach based on using the non-null
values to delimit partitions:

create table tmp_test_table(mmin,val)
   as select o, v
        from unnest(array[1,5,NULL,3,NULL,NULL,10,7,NULL,NULL,NULL,4])
               with ordinality as u(v,o);
select * from tmp_test_table order by mmin;
 mmin | val 
------+-----
    1 |   1
    2 |   5
    3 |    
    4 |   3
    5 |    
    6 |    
    7 |  10
    8 |   7
    9 |    
   10 |    
   11 |    
   12 |   4
(12 rows)

First we take advantage of the fact that mmin is increasing to generate
a distinguishing value for each block of nulls:

select *,
       max(case when val is not null then mmin end)
         over (order by mmin) as grp
  from tmp_test_table
 order by mmin;
 mmin | val | grp 
------+-----+-----
    1 |   1 |   1
    2 |   5 |   2
    3 |     |   2
    4 |   3 |   4
    5 |     |   4
    6 |     |   4
    7 |  10 |   7
    8 |   7 |   8
    9 |     |   8
   10 |     |   8
   11 |     |   8
   12 |   4 |  12
(12 rows)

Then we can fill in the missing vals by using the fact that there is
at most one non-null val in each group:

select *,
       max(val) over (partition by grp) as val2
  from (select *,
               max(case when val is not null then mmin end)
                 over (order by mmin) as grp
          from tmp_test_table) s
 order by mmin;
 mmin | val | grp | val2 
------+-----+-----+------
    1 |   1 |   1 |    1
    2 |   5 |   2 |    5
    3 |     |   2 |    5
    4 |   3 |   4 |    3
    5 |     |   4 |    3
    6 |     |   4 |    3
    7 |  10 |   7 |   10
    8 |   7 |   8 |    7
    9 |     |   8 |    7
   10 |     |   8 |    7
   11 |     |   8 |    7
   12 |   4 |  12 |    4
(12 rows)

This _really_ isn't efficient, though; you end up with typically three
sorts of the data. For a one-off operation or for generating a
materialized view it might be acceptable.

-- 
Andrew (irc:RhodiumToad)



Re: Window function for get the last value to extend missing rows

From
Kirk Wolak
Date:
On Sat, May 13, 2023 at 2:18 AM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>> "Durumdara" == Durumdara  <durumdara@gmail.com> writes:

create table tmp_test_table(mmin,val)
   as select o, v
        from unnest(array[1,5,NULL,3,NULL,NULL,10,7,NULL,NULL,NULL,4])
               with ordinality as u(v,o);
select * from tmp_test_table order by mmin;
 
 
That seems like a lot of work. 
If you have ALL the values (no missing values) a simple CTE handles this:


WITH RECURSIVE rec_cte(mmin, value) AS (
    SELECT mmin, value from tmp_test_table where mmin=1
    UNION ALL
    SELECT t.mmin, COALESCE(t.value,r.value)
      FROM tmp_test_table t, rec_cte r WHERE r.mmin=(t.mmin-1)
)
SELECT * from rec_cte order by mmin;


 

Re: Window function for get the last value to extend missing rows

From
postmaster@heinz-it.de
Date:
Am 12.05.23 um 13:04 schrieb Durumdara:
> [...]
>
> The LAG function seems to be ok, but how to handle if more than two periods
> are missing?
>
> 03:00     10
> 03:01     NULL
> 03:02     NULL
> 03:03     NULL
> 03:04     11
> 03:05     13
>
> [...]
and how do you think about NULL in first(and second/third) row?