Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN
Date
Msg-id CAExHW5uwOz8aySLQHb1VRmfccf0-JMydvcS2A9_O-j=Kduqkyw@mail.gmail.com
Whole thread Raw
In response to Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN  (Tatsuo Ishii <ishii@postgresql.org>)
Responses Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN
List pgsql-hackers
On Sat, Sep 14, 2024 at 1:42 PM Tatsuo Ishii <ishii@postgresql.org> wrote:
>
> >> > or the case when the last usage fit in memory but an earlier
> >> > usage spilled to disk.
> >>
> >> In my understanding once tuplestore changes the storage type to disk,
> >> it never returns to the memory storage type in terms of
> >> tuplestore_get_stats.  i.e. once state->usedDisk is set to true, it
> >> never goes back to false. So the test case is not necessary.
> >> David, am I correct?
> >
> > I understand that. I am requesting a testcase to test that same logic.
>
> Maybe something like this? In the example below there are 2
> partitions. the first one has 1998 rows and the second one has 2
> rows. Assuming that work_mem is 64kB, the first one does not fit the
> memory and spills to disk. The second partition fits memory. However as
> state->usedDisk remains true, explain shows "Storage: Disk".
>
> test=# explain (analyze,costs off) select sum(n) over(partition by m) from (SELECT n < 3 as m, n from
generate_series(1,2000)a(n)); 
> n                                         QUERY PLAN
>
> --------------------------------------------------------------------------------
> -------------
>  WindowAgg (actual time=1.958..473328.589 rows=2000 loops=1)
>    Storage: Disk  Maximum Storage: 65kB
>    ->  Sort (actual time=1.008..1.277 rows=2000 loops=1)
>          Sort Key: ((a.n < 3))
>          Sort Method: external merge  Disk: 48kB
>          ->  Function Scan on generate_series a (actual time=0.300..0.633 rows=2
> 000 loops=1)
>  Planning Time: 0.069 ms
>  Execution Time: 474515.476 ms
> (8 rows)

Thanks. This will do. Is there a way to force the larger partition to
be computed first? That way we definitely know that the last
computation was done when all the tuples in the tuplestore were in
memory.

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Jim Jones
Date:
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Next
From: Amit Kapila
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation