Nasty problem in hash indexes - Mailing list pgsql-hackers

From	Tom Lane
Subject	Nasty problem in hash indexes
Date	August 28, 2003 14:57:21
Msg-id	3633.1062093409@sss.pgh.pa.us
Responses	Re: Nasty problem in hash indexes
List	pgsql-hackers

I've traced through the failure reported here by Markus Kräutner:
http://archives.postgresql.org/pgsql-hackers/2003-08/msg01132.php

What is happening is that as the UPDATE adds tuples (all with the same
hash key value) to the table, the hash bucket being filled eventually
requires more pages, and this results in a _hash_splitpage() operation
(which is misnamed; it should be _hash_splitbucket).  By chance, the
bucket that is selected to be split is the one containing the older key
values, all of which get relocated to the new bucket.  So when control
returns to the indexscan that is sourcing the tuples for the UPDATE,
there are no tuples remaining in the bucket it is looking at, and it
exits thinking it's done.
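
A toy model makes the failure sequence concrete.  The sketch below is
not PostgreSQL code: ToyHashIndex, ToyScan, run_update, the two hash
values and the one-entry bucket capacity are all hypothetical, and the
"split" is a simplified linear-hashing split that assumes the bucket
being split is the next one in split order.  It only illustrates how a
scan that remembers its bucket can come up empty after a split.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.
OLD_HASH = 2  # hash of the key values being scanned (starts in bucket 0)
NEW_HASH = 1  # hash shared by every updated key value (bucket 1)


class ToyHashIndex:
    def __init__(self):
        self.buckets = [[], []]  # each bucket is a list of (hash, tid)

    def bucket_for(self, h):
        """Low bits of h, folded back if that bucket does not exist yet."""
        nbuckets = len(self.buckets)
        mask = 1
        while mask < nbuckets:
            mask <<= 1
        b = h & (mask - 1)
        return b if b < nbuckets else h & (mask // 2 - 1)

    def insert(self, h, tid):
        self.buckets[self.bucket_for(h)].append((h, tid))

    def split(self, old):
        """Split bucket `old` (assumed to be next in split order)."""
        entries = self.buckets[old]
        self.buckets[old] = []
        self.buckets.append([])
        for h, tid in entries:
            self.insert(h, tid)  # re-address with one more hash bit


class ToyScan:
    def __init__(self, index, h):
        self.index, self.h = index, h
        self.bucket = index.bucket_for(h)  # chosen once, at scan start
        self.pos = 0

    def next(self):
        entries = self.index.buckets[self.bucket]
        while self.pos < len(entries):
            h, tid = entries[self.pos]
            self.pos += 1
            if h == self.h:
                return tid
        return None  # scan believes it is done


def run_update(capacity=1):
    index = ToyHashIndex()
    for tid in range(4):
        index.insert(OLD_HASH, tid)
    scan = ToyScan(index, OLD_HASH)
    updated = []
    while (tid := scan.next()) is not None:
        updated.append(tid)
        index.insert(NEW_HASH, 100 + tid)  # new version, new key value
        target = index.buckets[index.bucket_for(NEW_HASH)]
        if len(target) > capacity and len(index.buckets) == 2:
            index.split(0)  # the bucket the scan is sitting in
    return updated, index


updated, index = run_update()
print(updated)                # only tids 0 and 1 were visited
print(len(index.buckets[2]))  # all four old entries now live here
```

In this model the scan visits two of the four tuples and then stops,
because every old entry was moved out from under it into the new
bucket.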

I'm not sure how many variants on this problem there might be, but
clearly the fundamental bug is that a hash bucket split takes no account
of preserving the state of concurrent index scans.

This is likely to be messy to fix :-(.  A brute-force solution may be
possible by generalizing hash_adjscans so that it can update indexscans
of our own backend for bucket-split operations; we'd have to rely on
page locking to prevent problems against scans of other backends.  The
locking aspect is particularly unattractive because of the possibility
of deadlocks.  If a bucket split fails because of deadlock, we're
probably left with a corrupt hash index.
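
As a rough illustration of the brute-force idea, here is a toy sketch
of a split that also repositions scans registered by the same backend.
It is not hash_adjscans and not PostgreSQL code: Scan,
split_and_adjust and the bucket layout are hypothetical, it assumes
the split preserves the relative order of entries, and it says
nothing about the locking needed against scans of other backends or
about deadlock.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def bucket_for(h, nbuckets):
    """Low bits of h, folded back if that bucket does not exist yet."""
    mask = 1
    while mask < nbuckets:
        mask <<= 1
    b = h & (mask - 1)
    return b if b < nbuckets else h & (mask // 2 - 1)


class Scan:
    def __init__(self, buckets, h):
        self.buckets, self.h = buckets, h
        self.bucket = bucket_for(h, len(buckets))
        self.pos = 0  # entries before pos have already been visited

    def next(self):
        entries = self.buckets[self.bucket]
        while self.pos < len(entries):
            h, tid = entries[self.pos]
            self.pos += 1
            if h == self.h:
                return tid
        return None


def split_and_adjust(buckets, old, active_scans):
    """Split bucket `old` (assumed next in split order), then repoint
    this backend's scans that were positioned in it."""
    entries = buckets[old]
    buckets.append([])
    new = len(buckets) - 1
    dest = [bucket_for(h, len(buckets)) for h, _ in entries]
    buckets[old] = [e for e, d in zip(entries, dest) if d == old]
    buckets[new] = [e for e, d in zip(entries, dest) if d == new]
    for scan in active_scans:
        if scan.bucket != old:
            continue
        target = bucket_for(scan.h, len(buckets))
        # Order is preserved, so the new position is the number of
        # already-visited entries that landed in the target bucket.
        scan.pos = sum(1 for d in dest[:scan.pos] if d == target)
        scan.bucket = target


buckets = [[(2, tid) for tid in range(4)], []]
scan = Scan(buckets, 2)
seen = [scan.next(), scan.next()]
split_and_adjust(buckets, 0, [scan])  # split happens mid-scan
while (tid := scan.next()) is not None:
    seen.append(tid)
print(seen)
```

Here the scan follows its entries into the new bucket and visits each
tuple exactly once.  The sketch covers only scans known to the
splitting backend, which is exactly why the locking question above
remains open.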

Does anyone see a better way?

Does anyone want to vote to jettison the hash index code entirely?
Personally I'm not eager to put a lot of work into fixing it.
        regards, tom lane
