Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options - Mailing list pgsql-hackers

From Oliver Ford
Subject Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date
Msg-id CAGMVOdvCwq-jUYgBA1H_3oeHR7HOYTSGtwPoFsJPkyRfjWRVTQ@mail.gmail.com
Whole thread Raw
In response to Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options  (Tatsuo Ishii <ishii@postgresql.org>)
List pgsql-hackers
On Fri, Feb 28, 2025 at 11:49 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
>> BTW, I noticed that in the code path where
>> ignorenulls_getfuncarginframe() is called, WinSetMarkPosition() is
>> never called?
>>
>> Attached version uses the mark_pos at the end.

I did simple performance test against v8.

EXPLAIN ANALYZE
SELECT
  x,
nth_value(x,2) IGNORE NULLS OVER w
FROM generate_series(1,$i) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);

I changed $i = 1k, 2k, 3k, 4k, 5k... 10k and got this:

Number  Time (ms)
of rows
----------------
1000    28.977
2000    96.556
3000    212.019
4000    383.615
5000    587.05
6000    843.23
7000    1196.177
8000    1508.52
9000    1920.593
10000   2514.069

As you can see, when the number of rows = 1k, it took 28 ms. For 10k
rows, it took 2514 ms, which is 86 times slower than the 1k case. Can
we enhance this?


Attached version removes the non-nulls array. That seems to speed everything up.  Running the above query with 1 million rows averages 450ms, similar when using lead/lag.
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Separate GUC for replication origins
Next
From: Amit Kapila
Date:
Subject: Re: Selectively invalidate caches in pgoutput module