Thread: Re: pgsql: Add memory/disk usage for Window aggregate nodes in EXPLAIN.

Re: Tatsuo Ishii
> Add memory/disk usage for Window aggregate nodes in EXPLAIN.

This is failing for PG18 on Debian unstable on 32-bit i386:

******** build/src/test/regress/regression.diffs ********
diff -U3 /build/reproducible-path/postgresql-18-18~~devel.20250331/src/test/regress/expected/explain.out /build/reproducible-path/postgresql-18-18~~devel.20250331/build/src/test/regress/results/explain.out
--- /build/reproducible-path/postgresql-18-18~~devel.20250331/src/test/regress/expected/explain.out    2025-03-31 06:17:21.000000000 +0000
+++ /build/reproducible-path/postgresql-18-18~~devel.20250331/build/src/test/regress/results/explain.out    2025-03-31 15:18:19.914783369 +0000
@@ -792,7 +792,7 @@
 ----------------------------------------------------------------------------------
  WindowAgg (actual time=N.N..N.N rows=N.N loops=N)
    Window: w1 AS ()
-   Storage: Disk  Maximum Storage: NkB
+   Storage: Memory  Maximum Storage: NkB
    ->  Function Scan on generate_series a (actual time=N.N..N.N rows=N.N loops=N)
  Planning Time: N.N ms
  Execution Time: N.N ms

This is the first build of PG18 on i386, so I cannot say whether it broke with that commit or with a later one.

Full log:
https://buildd.debian.org/status/fetch.php?pkg=postgresql-18&arch=i386&ver=18%7E%7Edevel.20250331-1&stamp=1743434305&raw=0

Christoph



On Tue, 1 Apr 2025 at 04:40, Christoph Berg <myon@debian.org> wrote:
> -   Storage: Disk  Maximum Storage: NkB
> +   Storage: Memory  Maximum Storage: NkB
>     ->  Function Scan on generate_series a (actual time=N.N..N.N rows=N.N loops=N)

We'll probably just need to bump that 2000 row count to something a
bit higher for 32-bit.

Any chance you could share the output of:

explain (analyze,buffers off,costs off) select sum(n) over() from
generate_series(1,2000) a(n);

Could you maybe also do a binary search for the number of rows at
which it goes to disk, by adjusting the 2000 upwards in increments
until the Storage method is Disk? (Not that I think we should set it
to the minimum, but it would be good not to set it much higher than we
need to.)
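
Roughly something like this, just as a sketch (the step sizes here are
arbitrary; it assumes the test's `set work_mem = 64;` setting):

set work_mem = 64;
-- increase the row count until the plan reports "Storage: Disk" ...
explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2100) a(n);
explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2200) a(n);
-- ... then narrow down between the last Memory and the first Disk result.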

David



Re: David Rowley
> Any chance you could share the output of:
> 
> explain (analyze,buffers off,costs off) select sum(n) over() from
> generate_series(1,2000) a(n);

PostgreSQL 18devel on x86-linux, compiled by gcc-14.2.0, 32-bit

=# explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2000) a(n);
                                        QUERY PLAN                                        
──────────────────────────────────────────────────────────────────────────────────────────
 WindowAgg (actual time=1.248..1.731 rows=2000.00 loops=1)
   Window: w1 AS ()
   Storage: Memory  Maximum Storage: 63kB
   ->  Function Scan on generate_series a (actual time=0.301..0.536 rows=2000.00 loops=1)
 Planning Time: 0.066 ms
 Execution Time: 1.913 ms
(6 rows)


> Could you maybe also do a binary search for the number of rows where
> it goes to disk by adjusting the 2000 up in some increments until the
> Storage method is disk? (Not that I think we should set it to the
> minimum, but it would be good to not set it too much higher than we
> need to)

The test has a `set work_mem = 64;`, which I also used here:

=# explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2047) a(n);
                                        QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────
 WindowAgg (actual time=1.037..1.429 rows=2047.00 loops=1)
   Window: w1 AS ()
   Storage: Memory  Maximum Storage: 64kB
   ->  Function Scan on generate_series a (actual time=0.262..0.457 rows=2047.00 loops=1)
 Planning Time: 0.058 ms
 Execution Time: 1.594 ms
(6 rows)

=# explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2048) a(n);
                                        QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────
 WindowAgg (actual time=2.073..2.686 rows=2048.00 loops=1)
   Window: w1 AS ()
   Storage: Disk  Maximum Storage: 65kB
   ->  Function Scan on generate_series a (actual time=0.624..1.064 rows=2048.00 loops=1)
 Planning Time: 0.064 ms
 Execution Time: 2.934 ms
(6 rows)

(With the default work_mem, the tipping point is around 149500 rows.)
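
(Rough sanity check on that number: 2047 rows peaked at 64kB, i.e. about
32 bytes per row, and the default work_mem is 4MB, so one would naively
expect the spill point somewhere around 4096kB / 32 bytes ≈ 131,000 rows.
The observed ~149,500 is in the same ballpark; the per-row accounting
overhead presumably amortises a little as the store grows, hence the
difference.)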

Christoph



On Tue, 1 Apr 2025 at 09:40, Christoph Berg <myon@debian.org> wrote:
> =# explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2048) a(n);
>                                         QUERY PLAN
> ──────────────────────────────────────────────────────────────────────────────────────────
>  WindowAgg (actual time=2.073..2.686 rows=2048.00 loops=1)
>    Window: w1 AS ()
>    Storage: Disk  Maximum Storage: 65kB

Thank you for testing that. I've just pushed a patch to bump it up to 2500.
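
For reference, the adjusted test should now exercise roughly the
following (the committed test isn't quoted here, so take this as the
shape of the change rather than its exact contents):

set work_mem = 64;
-- 2500 rows exceeds the 64kB work_mem limit even with the smaller
-- per-row footprint on 32-bit builds, so the tuplestore spills and the
-- plan reports "Storage: Disk" as the expected output requires
explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2500) a(n);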

I suspect the buildfarm didn't catch this due to the tuplestore
consuming enough memory in MEMORY_CONTEXT_CHECKING builds.

David

From: David Rowley <dgrowleyml@gmail.com>
Subject: Re: pgsql: Add memory/disk usage for Window aggregate nodes in EXPLAIN.
Date: Tue, 1 Apr 2025 11:09:11 +1300
Message-ID: <CAApHDvoOHfFYXUryAymxiZjvyvhEt0ueeBOJRUOJWn1W7e3eyA@mail.gmail.com>

> On Tue, 1 Apr 2025 at 09:40, Christoph Berg <myon@debian.org> wrote:
>> =# explain (analyze,buffers off,costs off) select sum(n) over() from generate_series(1,2048) a(n);
>>                                         QUERY PLAN
>> ──────────────────────────────────────────────────────────────────────────────────────────
>>  WindowAgg (actual time=2.073..2.686 rows=2048.00 loops=1)
>>    Window: w1 AS ()
>>    Storage: Disk  Maximum Storage: 65kB
> 
> Thank you for testing that. I've just pushed a patch to bump it up to 2500.
> 
> I suspect the buildfarm didn't catch this due to the tuplestore
> consuming enough memory in MEMORY_CONTEXT_CHECKING builds.

David,
Christoph,

Thank you for fixing this!
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp