Home > mailing lists

Excessive disk usage in WindowAgg - Mailing list pgsql-hackers

From	Andrew Gierth
Subject	Excessive disk usage in WindowAgg
Date	September 24, 2019 11:49:54
Msg-id	87a7ator8n.fsf@news-spur.riddles.org.uk Whole thread Raw
Responses	Re: Excessive disk usage in WindowAgg
List	pgsql-hackers

Tree view

This one just came up on IRC:

create table tltest(a integer, b text, c text, d text);
insert into tltest
  select i, repeat('foo',100), repeat('foo',100), repeat('foo',100)
    from generate_series(1,100000) i;
set log_temp_files=0;
set client_min_messages=log;

select count(a+c) from (select a, count(*) over () as c from tltest s1) s;
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp82513.3", size 92600000

Using 92MB of disk for one integer seems excessive; the reason is clear
from the explain:

                                                            QUERY PLAN
         
 

----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=16250.00..16250.01 rows=1 width=8) (actual time=1236.260..1236.260 rows=1 loops=1)
   Output: count((tltest.a + (count(*) OVER (?))))
   ->  WindowAgg  (cost=0.00..14750.00 rows=100000 width=12) (actual time=1193.846..1231.216 rows=100000 loops=1)
         Output: tltest.a, count(*) OVER (?)
         ->  Seq Scan on public.tltest  (cost=0.00..13500.00 rows=100000 width=4) (actual time=0.006..14.361
rows=100000loops=1)
 
               Output: tltest.a, tltest.b, tltest.c, tltest.d

so the whole width of the table is being stored in the tuplestore used
by the windowagg.

In create_windowagg_plan, we have:

    /*
     * WindowAgg can project, so no need to be terribly picky about child
     * tlist, but we do need grouping columns to be available
     */
    subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);

Obviously we _do_ need to be more picky about this; it seems clear that
using CP_SMALL_TLIST | CP_LABEL_TLIST would be a win in many cases.
Opinions?

-- 
Andrew (irc:RhodiumToad)

pgsql-hackers by date:

From: rmrodriguez@carto.com
Date: 24 September 2019, 11:39:20
Subject: Re: Optimze usage of immutable functions as relation

From: Tomas Vondra
Date: 24 September 2019, 12:05:51
Subject: Re: Memory Accounting

Excessive disk usage in WindowAgg - Mailing list pgsql-hackers

Previous

Next