Re: Speed up transaction completion faster after many relations are accessed in a transaction - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date
Msg-id 31178.1554529157@sss.pgh.pa.us
In response to Re: Speed up transaction completion faster after many relations are accessed in a transaction  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> I wonder if one approach to solve this wouldn't be to just make the
> hashtable drastically smaller. Right now we'll often have lots of
> empty entries that are 72 bytes + dynahash overhead. That's plenty of
> memory that needs to be skipped over.   I wonder if we instead should
> have an array of held locallocks, and a hashtable with {hashcode,
> offset_in_array} + custom comparator for lookups.

Well, that's not going to work all that well for retail lock releases;
you'll end up with holes in the array, maybe a lot of them.

However, it led me to think of another way we might approach the general
hashtable problem: right now, we are not making any use of the fact that
the hashtable's entries are laid out in big slabs (arrays).  What if we
tried to ensure that the live entries are allocated fairly compactly in
those arrays, and then implemented hash_seq_search as a scan over the
arrays, ignoring the hash bucket structure per se?

We'd need a way to reliably tell a live entry from a free entry, but
again, there's plenty of space for a flag bit or two.

This might perform poorly if you allocated a bunch of entries,
freed most-but-not-all, and then wanted to seqscan the remainder;
you'd end up with the same problem I complained of above: you're
iterating over an array that's mostly uninteresting.
In principle we could keep count of the live vs free entries and
dynamically decide to scan via the hash bucket structure instead of
searching the storage array when the array is too sparse; but that
might be overly complicated.

I haven't tried to work this out in detail; it's just a late
night brainstorm.  But, again, I'd much rather solve this in
dynahash.c than by layering some kind of hack on top of it.

            regards, tom lane
