On Mon, Feb 03, 2020 at 06:53:01AM -0800, Andres Freund wrote:
> On 2020-01-03 10:19:26 -0600, Justin Pryzby wrote:
> > On Sun, Feb 17, 2019 at 11:29:56AM -0500, Jeff Janes wrote:
> > https://www.postgresql.org/message-id/CAMkU%3D1zBJNVo2DGYBgLJqpu8fyjCE_ys%2Bmsr6pOEoiwA7y5jrA%40mail.gmail.com
> > > What would I find very useful is [...] if the HashAggregate node under
> > > "explain analyze" would report memory and bucket stats; and if the Aggregate
> > > node would report...anything.
>
> Yea, that'd be amazing. It probably should be something every
> execGrouping.c using node can opt into.
Do you think it should be implemented in execGrouping/TupleHashTableData (as I
did) ? I also did an experiment moving into the higher level nodes, but I
guess that's not actually desirable. There's currently different output from
tests between the implementation using execGrouping.c and the one outside it,
so there's at least an issue with grouping sets.
> > + hashtable->hinstrument.nbuckets_original = nbuckets;
> > + hashtable->hinstrument.nbuckets = nbuckets;
> > + hashtable->hinstrument.space_peak = entrysize * hashtable->hashtab->size;
>
> That's not actually an accurate accounting of memory, because for filled
> entries a lot of memory is used to store actual tuples:
Thanks - I think I finally understood this.
I updated some existing tests to show the new output. I imagine that's a
throwaway commit, and should eventually add new tests for each of these node
types under explain analyze.
I've been testing the various nodes like:
--heapscan:
DROP TABLE t; CREATE TABLE t (i int unique) WITH(autovacuum_enabled=off); INSERT INTO t SELECT
generate_series(1,99999);SET enable_seqscan=off; SET parallel_tuple_cost=0; SET parallel_setup_cost=0; SET
enable_indexonlyscan=off;explain analyze verbose SELECT * FROM t WHERE i BETWEEN 999 and 99999999;
--setop:
explain( analyze,verbose) SELECT * FROM generate_series(1,999) EXCEPT (SELECT NULL UNION ALL SELECT * FROM
generate_series(1,99999));
Buckets: 2048 (originally 256) Memory Usage: hashtable: 48kB, tuples: 8Kb
--recursive union:
explain analyze verbose WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION SELECT n || ' bar' FROM t WHERE length(n) < 9999)
SELECTn, n IS OF (text) AS is_text FROM t;
--subplan
explain analyze verbose SELECT i FROM generate_series(1,999)i WHERE (i,i) NOT IN (SELECT 1,1 UNION ALL SELECT j,j FROM
generate_series(1,99999)j);
Buckets: 262144 (originally 131072) Memory Usage: hashtable: 6144kB, tuples: 782Kb
explain analyze verbose select i FROM generate_series(1,999)i WHERE(1,i) NOT in (select i,null::int from t) ;
--Agg:
explain (analyze,verbose) SELECT A,COUNT(1) FROM generate_series(1,99999)a GROUP BY 1;
Buckets: 262144 (originally 256) Memory Usage: hashtable: 6144kB, tuples: 782Kb
explain (analyze, verbose) select i FROM generate_series(1,999)i WHERE(1,1) not in (select a,null from (SELECT
generate_series(1,99999)a)x) ;
explain analyze verbose select * from (SELECT a FROM generate_series(1,99)a)v left join lateral (select v.a, four, ten,
count(*)from (SELECT b four, 2 ten, b FROM generate_series(1,999)b)x group by cube(four,ten)) s on true order by
v.a,four,ten;
--Grouping sets:
explain analyze verbose select unique1,
count(two), count(four), count(ten),
count(hundred), count(thousand), count(twothousand),
count(*)
from tenk1 group by grouping sets (unique1,twothousand,thousand,hundred,ten,four,two);
--
Justin