On Sat, Sep 14, 2024 at 1:42 PM Tatsuo Ishii <ishii@postgresql.org> wrote:
>
> >> > or the case when the last usage fit in memory but an earlier
> >> > usage spilled to disk.
> >>
> >> In my understanding once tuplestore changes the storage type to disk,
> >> it never returns to the memory storage type in terms of
> >> tuplestore_get_stats. i.e. once state->usedDisk is set to true, it
> >> never goes back to false. So the test case is not necessary.
> >> David, am I correct?
> >
> > I understand that. I am requesting a testcase to test that same logic.
>
> Maybe something like this? In the example below there are 2
> partitions. the first one has 1998 rows and the second one has 2
> rows. Assuming that work_mem is 64kB, the first one does not fit the
> memory and spills to disk. The second partition fits memory. However as
> state->usedDisk remains true, explain shows "Storage: Disk".
>
> test=# explain (analyze,costs off) select sum(n) over(partition by m) from (SELECT n < 3 as m, n from
generate_series(1,2000)a(n));
> n QUERY PLAN
>
> --------------------------------------------------------------------------------
> -------------
> WindowAgg (actual time=1.958..473328.589 rows=2000 loops=1)
> Storage: Disk Maximum Storage: 65kB
> -> Sort (actual time=1.008..1.277 rows=2000 loops=1)
> Sort Key: ((a.n < 3))
> Sort Method: external merge Disk: 48kB
> -> Function Scan on generate_series a (actual time=0.300..0.633 rows=2
> 000 loops=1)
> Planning Time: 0.069 ms
> Execution Time: 474515.476 ms
> (8 rows)
Thanks. This will do. Is there a way to force the larger partition to
be computed first? That way we definitely know that the last
computation was done when all the tuples in the tuplestore were in
memory.
--
Best Wishes,
Ashutosh Bapat