Re: Next Steps with Hash Indexes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Next Steps with Hash Indexes
Msg-id CA+Tgmoanvcdz4RfpqCb-1pNcf8mGd=3Uw6T68YcSYKnEJWCrfw@mail.gmail.com
In response to Re: Next Steps with Hash Indexes  (Simon Riggs <simon.riggs@enterprisedb.com>)
Responses Re: Next Steps with Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Oct 5, 2021 at 6:50 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> With unique data, starting at 1 and monotonically ascending, hash
> indexes will grow very nicely from 0 to 10E7 rows without causing >1
> overflow block to be allocated for any bucket. This keeps the search
> time for such data to just 2 blocks (bucket plus, if present, 1
> overflow block). The small number of overflow blocks is because of the
> regular and smooth way that splits occur, which works very nicely
> without significant extra latency.

It is my impression that with non-unique data things degrade rather
badly. There's no way to split the buckets that are overflowing
without also splitting the buckets that are completely empty or, in
any event, not full enough to need any overflow pages. I think that's
rather awful.
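To make the skew concrete: every duplicate of a key has the same hash value, so all of them land in the same bucket no matter how many times the index splits. The following is an illustrative Python sketch (not PostgreSQL code; the bucket capacity and helper names are made up) showing that doubling the bucket count never shortens the hot bucket's overflow chain, it only manufactures empty buckets:

```python
# Illustrative sketch: splitting cannot relieve a bucket overflowing
# with duplicates of a single key, because duplicates share one hash.
BUCKET_CAPACITY = 4  # items per primary bucket page (made-up number)

def chain_length(items_in_bucket):
    """Overflow pages needed beyond the primary bucket page."""
    return max(0, -(-items_in_bucket // BUCKET_CAPACITY) - 1)

def distribute(hashes, nbuckets):
    """Count how many items fall into each of nbuckets buckets."""
    counts = [0] * nbuckets
    for h in hashes:
        counts[h % nbuckets] += 1
    return counts

# 1000 duplicates of one key: 1000 identical hash values.
dup_hashes = [hash("same-key")] * 1000

for nbuckets in (4, 8, 16, 32):
    counts = distribute(dup_hashes, nbuckets)
    # The hot bucket keeps all 1000 items (249 overflow pages at
    # capacity 4) at every split level; only the empty-bucket count grows.
    print(nbuckets, chain_length(max(counts)), counts.count(0))
```

Splitting redistributes items by consuming one more bit of the hash value, and identical hash values have identical bits, so the duplicates can never be separated.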

> The probability of bucket collision while we hold the lock is fairly
> low. This is because even with adjacent data values the hash values
> would be spread across multiple buckets, so we would expect the
> contention to be less than we would get on a monotonically increasing
> btree.
>
> So I don't now see any problem from holding the buffer lwlock on the
> bucket while we do multi-buffer operations.

I don't think that contention is the primary concern here. I think one
major concern is interruptibility: a process must be careful not to
hold lwlocks across long stretches of code, because it cannot be
cancelled while it holds them. Even if the code is bug-free and the
database has no corruption, long pauses before cancels take effect can
be inconvenient. But as soon as you add bugs or corruption to the
picture, things get much worse. Imagine a hash index that is corrupted so that there
get much worse. Imagine a hash index that is corrupted so that there
is a loop in the list of overflow pages. No matter what, we're going
to go into an infinite loop scanning that bucket, but if we're holding
a buffer lock while we do it, there's no way to escape short of
bouncing the entire server. That's pretty bad.
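For illustration, here is a Python sketch (not PostgreSQL code; page ids and the chain representation are made up) of how a scan that remembers visited overflow pages can detect such a loop and error out instead of spinning forever. A naive scan that simply follows next-pointers would never terminate on the corrupt chain:

```python
# Illustrative sketch: a loop in an overflow-page chain traps a naive
# scan forever; tracking visited pages turns it into a detectable error.
def scan_chain(next_page, start):
    """Follow a chain of overflow pages, raising on a cycle.

    next_page maps page id -> successor page id (None ends the chain).
    """
    visited = set()
    pages = []
    page = start
    while page is not None:
        if page in visited:
            raise RuntimeError("corrupt overflow chain: cycle at page %d" % page)
        visited.add(page)
        pages.append(page)
        page = next_page.get(page)
    return pages

# A healthy chain terminates normally.
healthy = {10: 11, 11: 12, 12: None}
print(scan_chain(healthy, 10))  # [10, 11, 12]

# A corrupted chain loops back: 20 -> 21 -> 22 -> 20.
corrupt = {20: 21, 21: 22, 22: 20}
```

Of course, the real problem in the scenario above is not detecting the loop but doing so while holding a buffer lock, where even raising an error requires the interrupt machinery that the lock holds off.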

Undetected deadlock is something to think about, too.
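Lwlocks, unlike heavyweight locks, are not covered by the deadlock detector, so a wait cycle among lwlock holders hangs the processes involved until the server is restarted. The standard defense is a fixed global acquisition order. Here is a Python threading sketch of that discipline (illustrative only; the ordering key and function names are made up):

```python
# Illustrative sketch: two workers that each need the same pair of
# locks. Acquiring them in a fixed global order (here, by id) makes a
# wait cycle -- and hence deadlock -- impossible, regardless of the
# order in which callers name the locks.
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()

def with_both(first, second, results, name):
    lo, hi = sorted((first, second), key=id)  # global ordering key
    with lo:
        with hi:
            results.append(name)

results = []
t1 = threading.Thread(target=with_both, args=(lock_a, lock_b, results, "t1"))
t2 = threading.Thread(target=with_both, args=(lock_b, lock_a, results, "t2"))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # both threads finish; no deadlock
```

Without the sort, t1 holding lock_a while waiting for lock_b, and t2 holding lock_b while waiting for lock_a, would wait on each other forever with nothing to detect it.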

-- 
Robert Haas
EDB: http://www.enterprisedb.com


