Home > mailing lists

Re: explain HashAggregate to report bucket and memory stats - Mailing list pgsql-hackers

From	Justin Pryzby
Subject	Re: explain HashAggregate to report bucket and memory stats
Date	February 16, 2020 00:02:20
Msg-id	20200216000220.GF31889@telsasoft.com Whole thread Raw
In response to	Re: explain HashAggregate to report bucket and memory stats (Andres Freund <andres@anarazel.de>)
Responses	Re: explain HashAggregate to report bucket and memory stats
List	pgsql-hackers

Tree view

On Mon, Feb 03, 2020 at 06:53:01AM -0800, Andres Freund wrote:
> On 2020-01-03 10:19:26 -0600, Justin Pryzby wrote:
> > On Sun, Feb 17, 2019 at 11:29:56AM -0500, Jeff Janes wrote:
> > https://www.postgresql.org/message-id/CAMkU%3D1zBJNVo2DGYBgLJqpu8fyjCE_ys%2Bmsr6pOEoiwA7y5jrA%40mail.gmail.com
> > > What would I find very useful is [...] if the HashAggregate node under
> > > "explain analyze" would report memory and bucket stats; and if the Aggregate
> > > node would report...anything.
> 
> Yea, that'd be amazing. It probably should be something every
> execGrouping.c using node can opt into.

Do you think it should be implemented in execGrouping/TupleHashTableData (as I
did) ?  I also did an experiment moving into the higher level nodes, but I
guess that's not actually desirable.  There's currently different output from
tests between the implementation using execGrouping.c and the one outside it,
so there's at least an issue with grouping sets.

> > +    hashtable->hinstrument.nbuckets_original = nbuckets;
> > +    hashtable->hinstrument.nbuckets = nbuckets;
> > +    hashtable->hinstrument.space_peak = entrysize * hashtable->hashtab->size;
> 
> That's not actually an accurate accounting of memory, because for filled
> entries a lot of memory is used to store actual tuples:

Thanks - I think I finally understood this.

I updated some existing tests to show the new output.  I imagine that's a
throwaway commit, and should eventually add new tests for each of these node
types under explain analyze.

I've been testing the various nodes like:

--heapscan:
DROP TABLE t; CREATE TABLE t (i int unique) WITH(autovacuum_enabled=off); INSERT INTO t SELECT
generate_series(1,99999);SET enable_seqscan=off; SET parallel_tuple_cost=0; SET parallel_setup_cost=0; SET
enable_indexonlyscan=off;explain analyze verbose SELECT * FROM t WHERE i BETWEEN 999 and 99999999;
 

--setop:
explain( analyze,verbose) SELECT * FROM generate_series(1,999) EXCEPT (SELECT NULL UNION ALL SELECT * FROM
generate_series(1,99999));
   Buckets: 2048 (originally 256)  Memory Usage: hashtable: 48kB, tuples: 8Kb

--recursive union:
explain analyze verbose WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION SELECT n || ' bar' FROM t WHERE length(n) < 9999)
SELECTn, n IS OF (text) AS is_text FROM t;
 

--subplan
explain analyze verbose SELECT i FROM generate_series(1,999)i WHERE (i,i) NOT IN (SELECT 1,1 UNION ALL SELECT j,j FROM
generate_series(1,99999)j);
   Buckets: 262144 (originally 131072)  Memory Usage: hashtable: 6144kB, tuples: 782Kb
explain analyze verbose select i FROM generate_series(1,999)i WHERE(1,i) NOT in (select i,null::int from t) ;

--Agg:
explain (analyze,verbose) SELECT A,COUNT(1) FROM generate_series(1,99999)a GROUP BY 1;
   Buckets: 262144 (originally 256)  Memory Usage: hashtable: 6144kB, tuples: 782Kb

explain (analyze, verbose) select i FROM generate_series(1,999)i WHERE(1,1) not in (select a,null from (SELECT
generate_series(1,99999)a)x) ;
 

explain analyze verbose select * from (SELECT a FROM generate_series(1,99)a)v left join lateral (select v.a, four, ten,
count(*)from (SELECT b four, 2 ten, b FROM generate_series(1,999)b)x group by cube(four,ten)) s on true order by
v.a,four,ten;

--Grouping sets:
explain analyze verbose   select unique1,
         count(two), count(four), count(ten),
         count(hundred), count(thousand), count(twothousand),
         count(*)
    from tenk1 group by grouping sets (unique1,twothousand,thousand,hundred,ten,four,two);

-- 
Justin

Attachment

pgsql-hackers by date:

From: Andrew Dunstan
Date: 15 February 2020, 22:40:02
Subject: Re: Just for fun: Postgres 20?

From: Dave Cramer
Date: 16 February 2020, 00:50:40
Subject: Re: New messages from Priscilla Ip

Re: explain HashAggregate to report bucket and memory stats - Mailing list pgsql-hackers

Attachment

Previous

Next