Re: Memory-Bounded Hash Aggregation - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Memory-Bounded Hash Aggregation
Date
Msg-id 2370ca86d4c3b46befc031d8421de02a58df20f1.camel@j-davis.com
In response to Re: Memory-Bounded Hash Aggregation  (Adam Lee <ali@pivotal.io>)
Responses Re: Memory-Bounded Hash Aggregation  (Adam Lee <ali@pivotal.io>)
List pgsql-hackers
On Tue, 2019-12-10 at 13:34 -0800, Adam Lee wrote:
> Melanie and I tried this and have a patch that passes installcheck. We
> verified it by composing a wide table with long unnecessary text
> columns, then checking the size it writes on every iteration.
> 
> Please check out the attachment, it's based on your 1204 version.

Thank you. Attached is a new patch that incorporates your projection work.

A few comments:

* You are only nulling out up to tts_nvalid, which means that you can
still end up storing more on disk if the wide column comes at the end
of the table and hasn't been deserialized yet. I fixed this by copying
needed attributes to the hash_spill_slot and making it virtual.

* aggregated_columns does not need to be a member of AggState; nor does
it need to be computed inside of the perhash loop. Aside: if adding a
field to AggState is necessary, you need to bump the field numbers of
later fields that are labeled for JIT use, otherwise it will break JIT.

* I used an array rather than a bitmapset. It makes it easier to find
the highest column (to do a slot_getsomeattrs), and it might be a
little more efficient for wide tables with mostly useless columns.

* Style nitpick: don't mix code and declarations.
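
To illustrate the first and third comments, here is a minimal, self-contained sketch of the idea. It is a toy model, not the code in the attached patch: ToySlot, toy_getsomeattrs and toy_project_for_spill are hypothetical names standing in for the real slot machinery, and the needed columns are assumed to be given as a sorted array of 0-based attribute indexes. The last array element is the highest needed column, so the input is deformed only that far, and every other column in the spill slot is NULL even if it was never deformed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ATTRS 8

/* Toy stand-in for a tuple slot; NOT PostgreSQL's TupleTableSlot. */
typedef struct ToySlot
{
    int         natts;              /* number of attributes in the tuple */
    int         nvalid;             /* attributes deformed so far */
    const char *values[MAX_ATTRS];
    bool        isnull[MAX_ATTRS];
} ToySlot;

/* Hypothetical analogue of slot_getsomeattrs: deform the first n columns. */
static void
toy_getsomeattrs(ToySlot *slot, const char *const *raw, int n)
{
    int         i;

    assert(n <= slot->natts && n <= MAX_ATTRS);
    for (i = slot->nvalid; i < n; i++)
    {
        slot->values[i] = raw[i];
        slot->isnull[i] = (raw[i] == NULL);
    }
    if (n > slot->nvalid)
        slot->nvalid = n;
}

/*
 * Fill "spill" as a virtual-style slot: every column starts out NULL and
 * only the needed columns are copied from the input. needed[] is assumed
 * to be sorted ascending, so its last element is the highest column.
 */
static void
toy_project_for_spill(ToySlot *input, const char *const *raw,
                      const int *needed, int nneeded, ToySlot *spill)
{
    int         i;

    spill->natts = input->natts;
    for (i = 0; i < spill->natts; i++)
    {
        spill->values[i] = NULL;
        spill->isnull[i] = true;
    }
    spill->nvalid = spill->natts;

    if (nneeded == 0)
        return;

    /* Deform the input only as far as the highest needed column. */
    toy_getsomeattrs(input, raw, needed[nneeded - 1] + 1);

    for (i = 0; i < nneeded; i++)
    {
        int         att = needed[i];

        spill->values[att] = input->values[att];
        spill->isnull[att] = input->isnull[att];
    }
}
```

With needed columns {0, 2} on a four-column tuple, the input is deformed through column 2 only, and the wide trailing column is NULL in the spill slot without ever being touched.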

The updated patch also saves the transitionSpace calculation in the Agg
node so that the hash table size can be estimated more accurately. The
saved value is a good way to choose an initial number of buckets for the
hash table, and also to cap the number of groups we permit in the hash
table when we expect the groups to grow.
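
As a rough illustration of how a per-group transition space estimate could feed both decisions, here is a small sketch. The formulas and names (toy_entry_size, toy_ngroups_limit, toy_initial_nbuckets, the fixed entry overhead, the power-of-two bucket count) are assumptions made for the example and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-group cost: fixed entry overhead plus transition space. */
static size_t
toy_entry_size(size_t entry_overhead, size_t transition_space)
{
    return entry_overhead + transition_space;
}

/* Cap on the number of groups that fit in mem_limit bytes (at least 1). */
static long
toy_ngroups_limit(size_t mem_limit, size_t entry_size)
{
    long        limit;

    assert(entry_size > 0);
    limit = (long) (mem_limit / entry_size);
    return limit < 1 ? 1 : limit;
}

/*
 * Initial bucket count: the smallest power of two that covers the
 * planner's group estimate, but never more than the memory-based cap
 * would justify. Power-of-two sizing is an assumption of this sketch.
 */
static long
toy_initial_nbuckets(long est_groups, long ngroups_limit)
{
    long        target = est_groups < ngroups_limit ? est_groups : ngroups_limit;
    long        nbuckets = 1;

    while (nbuckets < target)
        nbuckets <<= 1;
    return nbuckets;
}
```

For example, with a 4 MB budget and an assumed 256 bytes per group, the cap is 16384 groups; an estimate of 1000 groups gives 1024 initial buckets, while an estimate of one million groups is clamped to 16384.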

Regards,
    Jeff Davis
