Re: Show hash / bitmap sizes in EXPLAIN ANALYZE? - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Show hash / bitmap sizes in EXPLAIN ANALYZE?
Date
Msg-id b189ab4f-a7c2-efa4-56b9-5fb50d9e6afa@2ndquadrant.com
In response to Show hash / bitmap sizes in EXPLAIN ANALYZE?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 10/01/2016 01:37 AM, Andres Freund wrote:
> Hi,
>
> At the moment in-memory sort and hash nodes show their memory usage in
> explain:
> │   ->  Sort  (cost=59.83..62.33 rows=1000 width=4) (actual time=0.512..0.632 rows=1000 loops=1)
> │         Sort Key: a.a
> │         Sort Method: quicksort  Memory: 71kB
> │         ->  Function Scan on generate_series a  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.165..0.305 rows=1000 loops=1)
> and
> │   ->  Hash  (cost=10.00..10.00 rows=1000 width=4) (actual time=0.581..0.581 rows=1000 loops=1)
> │         Buckets: 1024  Batches: 1  Memory Usage: 44kB
>
> I think we should show something similar for bitmap scans, and for
> some execGrouping.c users (at least hash aggregates, subplans and
> setop seem good candidates too).
>

+1 to improve this

> For both categories it's useful to see how close to work_mem a
> scan ended up (to understand how high to set it, and how much
> the data can grow before work_mem is exceeded), and for execGrouping.c
> users it's also very interesting to see the actual memory usage,
> because the limit is only a very soft one.
>
> Does anybody see a reason not to add that?
>

Well, the obvious problem with execGrouping.c is that we don't have
information about memory usage - we don't know how large the aggregate
state is. It's trivial to compute for aggregates that use fixed-length
data types, but for aggregates with varlena or internal transition state
that's not going to work.
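
To illustrate the distinction (a minimal sketch, not code from any actual
patch - the helper and its signature are made up):

/*
 * Hypothetical sketch, not actual PostgreSQL source: estimate the size of
 * a single aggregate transition value from its type's typlen/typbyval.
 *
 * Note that an "internal" transition type (array_agg, string_agg, ...) is
 * declared pass-by-value and is really just a pointer, so this reports only
 * sizeof(Datum) for it - the state behind the pointer stays invisible,
 * which is exactly the problem described above.
 */
#include "postgres.h"

static Size
trans_value_size(Datum value, bool isnull, int16 typlen, bool typbyval)
{
    if (isnull)
        return 0;
    if (typbyval)
        return sizeof(Datum);                       /* fits in the Datum itself */
    if (typlen > 0)
        return (Size) typlen;                       /* fixed-length, by reference */
    if (typlen == -1)
        return VARSIZE_ANY(DatumGetPointer(value)); /* varlena: header knows its size */
    return strlen(DatumGetCString(value)) + 1;      /* typlen == -2: cstring */
}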

This is actually the same problem Jeff Davis ran into when trying to
implement memory-bounded HashAgg ~2 years ago, which also needs this
information. Back then there was a lot of discussion about whether the
~1% penalty measured was an acceptable price for the accounting, which
kinda killed the whole patch.

I plan to revisit that hashagg patch, or rather write a new patch with
the same goal - now that we have serial/deserial functions for
aggregates, we should be able to implement a much nicer spill-to-disk
method. But that'll need the memory accounting, so if you want to look
into it, you're welcome.
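
Very roughly, the idea looks something like this (a hypothetical sketch
only - every type and function below is made up for illustration, none of
it exists in the tree):

/*
 * Hypothetical sketch of memory-bounded HashAgg using the serial/deserial
 * support: when the hash table grows past the limit, serialize transition
 * states of some groups (via the aggregate's serialfn) and spill them to a
 * temp file; later read them back, deserialize, and merge with the
 * combinefn, batch by batch.
 */
typedef struct GroupEntry GroupEntry;   /* one group in the hash table */
typedef struct AggSpill AggSpill;       /* temp-file writer for spilled groups */

extern long  hashagg_memory_used(void);           /* needs the memory accounting */
extern GroupEntry *pick_group_to_evict(void);
extern void *serialize_state(GroupEntry *g);      /* would call the aggserialfn */
extern void  spill_write(AggSpill *s, GroupEntry *g, void *blob);
extern void  drop_group(GroupEntry *g);

static void
spill_until_under_limit(AggSpill *spill, long limit_bytes)
{
    while (hashagg_memory_used() > limit_bytes)
    {
        GroupEntry *g = pick_group_to_evict();

        /* push (key, serialized transition state) to the temp file */
        spill_write(spill, g, serialize_state(g));
        drop_group(g);
    }
}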

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


