lag() default value ignored for some window partition depending on table records count? - Mailing list pgsql-bugs

From Loïc Revest
Subject lag() default value ignored for some window partition depending on table records count?
Date
Msg-id CABkOrLve_F5PyQNL+BhnUCT1FRVF5QV6mfVAuhrharF_Ka+9eQ@mail.gmail.com
Whole thread Raw
Responses Re: lag() default value ignored for some window partition depending on table records count?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: lag() default value ignored for some window partition depending on table records count?  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-bugs
Hello,

Long story short:
- We track history data in a partitioned table which boils down to  "key fields + a JSONB data column", the latter being a list of date/numeric values pair ;
- For every of this date/numeric couple within their "window partitioning", we need to determine the preceding value, thus the use of lag(<value>, 1, 0::numeric), since it's relevant for us here to get the very first record of the partition having "0.00" as its preceding value;
- Except that the "first" row of some partitions gets NULL instead of 0::numeric as the result of lag(<value>, 1, 0::numeric), while other get 0::numeric as expected;
- Things get stranger when the table containing the data get "purged" from every record except those corresponding to the "window partition" whose first record gets NULL as lag() output: now it also gets 0::numeric...

More details:
- Behavior first observed on 12.10 (Debian, through pgdg repo), then reproduced on 14.2 (OpenBSD, through packages);
- "Odd" behavior may be seen for particular window partition when row count ∼ 600k, but not when only relevant rows (58 records) are kept;
- EXPLAIN ANALYZE when getting the "odd" behavior: https://explain.depesz.com/s/4kEt
- Output (first rows) to illustrate (\pset null 'NULL'):
lag_return=> SELECT * FROM points_timeserie_full WHERE peer_id = 10043;
peer_id |             date              | points_count | previous_points_count  
---------+-------------------------------+--------------+-----------------------
  10043 | 2019-11-13 13:59:50+01        |       770.00 |                  NULL
  10043 | 2019-11-14 15:17:15+01        |       480.00 |                770.00
  10043 | 2019-12-17 13:23:20+01        |       770.00 |                480.00

While for another "peer_id":
peer_id |          date                 | points_count | previous_points_count  
---------+-------------------------------+--------------+-----------------------
  10015 | 2019-11-15 12:39:34+01        |       840.00 |                  0.00
  10015 | 2019-11-19 14:12:26+01        |      1165.00 |                840.00
  10015 | 2019-11-21 15:51:52+01        |      1165.00 |               1165.00

- EXPLAIN ANALYZE when keeping relevant records only: https://explain.depesz.com/s/pZKFL
- Output (first rows) to illustrate (\pset null 'NULL'):
lag_return=> SELECT * FROM points_timeserie_10043 WHERE peer_id = 10043;     
peer_id |             date              | points_count | previous_points_count  
---------+-------------------------------+--------------+-----------------------
  10043 | 2019-11-13 13:59:50+01        |       770.00 |                     0
  10043 | 2019-11-14 15:17:15+01        |       480.00 |                770.00
  10043 | 2019-12-17 13:23:20+01        |       770.00 |                480.00

(Note: points_timeserie_full and points_timeserie_10043 are strictly identical views structure-wise, the only difference being in the dataset of their respective tables (themselves identical structure-wise), as explained above).

Attached is the test case database script we've been able to reproduce the "issue" one hundred percent of the time on, but without the data - even after compression this weights ∼8.5Mb. Data that I'd happily share would anyone points me towards the customary way to do it here with such "big" a file.

Regards,
L. Revest
Attachment

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: R: 14.1 immutable function, bad performance if check number = 'NaN'
Next
From: Tom Lane
Date:
Subject: Re: lag() default value ignored for some window partition depending on table records count?