Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. |
Date | |
Msg-id | CAApHDvqTvD3bXBiC5cmuoATFYeP+RSkv5WTgfyuBb6JpwKmSoQ@mail.gmail.com Whole thread Raw |
In response to | Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1.
|
List | pgsql-hackers |
On Wed, 1 Jul 2020 at 18:46, Jeff Davis <pgsql@j-davis.com> wrote: > > On Tue, Jun 30, 2020, 7:04 PM David Rowley <dgrowleyml@gmail.com> wrote: >> >> Does anyone have any objections to that being changed? > > That's OK with me. By the way, I'm on vacation and will catch up on these HashAgg threads next week. (Adding Justin as I know he's expressed interest in the EXPLAIN output of HashAgg before) I've written a patch to bring the HashAgg EXPLAIN ANALYZE output to be more aligned to the Hash Join output. Couple of things I observed about Hash Join EXPLAIN ANALYZE: 1. The number of batches starts at 1. 2. We always display the number of batches. 3. We write "Batches" for text format and "Hash Batches" for non-text formats. 4. We write "Memory Usage" for text format and "Peak Memory Usage" for non-text formats. 5. "Batches" comes before memory usage. Before this patch, HashAgg EXPLAIN ANALYZE output would: 1. Start the number of batches at 0. 2. Only display "Hash Batches" when batches > 0. 3. Used the words "HashAgg Batches" for text and non-text formats. 4. Used the words "Peak Memory Usage" for text and non-text formats. 5. "Hash Batches" was written after memory usage. In the attached patch I've changed HashAgg to be aligned to Hash Join on each of the points above. e.g. Before: postgres=# explain analyze select c.relname,count(*) from pg_class c inner join pg_Attribute a on c.oid = a.attrelid group by c.relname; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=138.37..142.23 rows=386 width=72) (actual time=3.121..3.201 rows=427 loops=1) Group Key: c.relname Peak Memory Usage: 109kB -> Hash Join (cost=21.68..124.10 rows=2855 width=64) (actual time=0.298..1.768 rows=3153 loops=1) Hash Cond: (a.attrelid = c.oid) -> Seq Scan on pg_attribute a (cost=0.00..93.95 rows=3195 width=4) (actual time=0.011..0.353 rows=3153 loops=1) -> Hash (cost=16.86..16.86 rows=386 width=68) (actual time=0.279..0.279 rows=427 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 50kB -> Seq Scan on pg_class c (cost=0.00..16.86 rows=386 width=68) (actual time=0.007..0.112 rows=427 loops=1) Planning Time: 0.421 ms Execution Time: 3.294 ms (11 rows) After: postgres=# explain analyze select c.relname,count(*) from pg_class c inner join pg_Attribute a on c.oid = a.attrelid group by c.relname; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=566.03..580.00 rows=1397 width=72) (actual time=13.097..13.430 rows=1397 loops=1) Group Key: c.relname Batches: 1 Memory Usage: 321kB -> Hash Join (cost=64.43..496.10 rows=13985 width=64) (actual time=0.838..7.546 rows=13985 loops=1) Hash Cond: (a.attrelid = c.oid) -> Seq Scan on pg_attribute a (cost=0.00..394.85 rows=13985 width=4) (actual time=0.010..1.462 rows=13985 loops=1) -> Hash (cost=46.97..46.97 rows=1397 width=68) (actual time=0.820..0.821 rows=1397 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 153kB -> Seq Scan on pg_class c (cost=0.00..46.97 rows=1397 width=68) (actual time=0.009..0.362 rows=1397 loops=1) Planning Time: 0.440 ms Execution Time: 13.634 ms (11 rows) (ignore the change in memory consumption. That was due to adding records for testing) Any objections to this change? David
Attachment
pgsql-hackers by date: