Thread: BUG #5157: Hash index not concurrency safe

BUG #5157: Hash index not concurrency safe

From
"Jeff Janes"
Date:
The following bug has been logged online:

Bug reference:      5157
Logged by:          Jeff Janes
Email address:      jeff.janes@gmail.com
PostgreSQL version: 8.4.1
Operating system:   Linux
Description:        Hash index not concurrency safe
Details:

Hash index is not concurrency safe, starting in REL8_4_0 and up to HEAD.

T1:

create table foo (id int, x text);
create index asdlfkjsdf on foo using hash (id);
insert into foo select 1, 'xxxxxxxxxxx' from generate_series(1,100);
set enable_seqscan =off;
set enable_bitmapscan =off;
\timing on
select count(pg_sleep(.3)) from foo where id=1;
 count
-------
   100
(1 row)

Time: 30897.835 ms

select count(pg_sleep(.3)) from foo where id=1;

While that is running switch to T2:
insert into foo select generate_series, 'xxxxxxxxxxx' from
generate_series(1,100000);

Back in T1:

 count
-------
     8
(1 row)

Time: 2474.709 ms


The pg_sleep is simply there to give me time to force the two transactions
to overlap.

The problem is that hashgettuple releases the buffer content share lock when
it returns a tuple, so when it comes back to get another tuple the block may
have been rearranged by concurrent inserts.  But the state of the scan
object, specifically so->hashso_curpos, makes no attempt to detect or
correct for this rearragement.

Re: BUG #5157: Hash index not concurrency safe

From
Tom Lane
Date:
"Jeff Janes" <jeff.janes@gmail.com> writes:
> Hash index is not concurrency safe, starting in REL8_4_0 and up to HEAD.

Ouch.  This used to be okay, because adding new entries to a hash page
always added them at the end.  The 8.4 changes to keep individual hash
pages sorted by hashcode broke it :-(.

I think we could recover by having the hashgettuple code path
re-synchronize by looking for the heap TID it previously returned.
That must be at the same or higher index TID as we had stopped at.
(Deletions are not possible, so we only have to search forward,
and the TID must be there someplace.)

            regards, tom lane

Re: BUG #5157: Hash index not concurrency safe

From
Jeff Janes
Date:
On Sun, Nov 1, 2009 at 8:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jeff Janes" <jeff.janes@gmail.com> writes:
>> Hash index is not concurrency safe, starting in REL8_4_0 and up to HEAD.
>
> Ouch. =A0This used to be okay, because adding new entries to a hash page
> always added them at the end. =A0The 8.4 changes to keep individual hash
> pages sorted by hashcode broke it :-(.
>
> I think we could recover by having the hashgettuple code path
> re-synchronize by looking for the heap TID it previously returned.
> That must be at the same or higher index TID as we had stopped at.
> (Deletions are not possible, so we only have to search forward,
> and the TID must be there someplace.)

Can it get pushed to another page (an overflow page)?  My quick
reading of the code suggests it can't get pushed, which makes the fix
easier.

I'll work on a fix for it.  But if 8.4.2 is coming out in the next
couple of weeks and we want the fix to be in it, then we might want
someone more proficient than me to work on it.

Cheers,

Jeff

Re: BUG #5157: Hash index not concurrency safe

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Sun, Nov 1, 2009 at 8:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think we could recover by having the hashgettuple code path
>> re-synchronize by looking for the heap TID it previously returned.

> Can it get pushed to another page (an overflow page)?  My quick
> reading of the code suggests it can't get pushed, which makes the fix
> easier.

It can't, which is an important part of the reason why this fix is okay.

> I'll work on a fix for it.  But if 8.4.2 is coming out in the next
> couple of weeks and we want the fix to be in it, then we might want
> someone more proficient than me to work on it.

I'm working on it already ... I figure it's my fault for not catching
this while reviewing the patch :-(.

There's no timetable as yet for 8.4.2.

            regards, tom lane

Re: BUG #5157: Hash index not concurrency safe

From
Tom Lane
Date:
I wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> Hash index is not concurrency safe, starting in REL8_4_0 and up to HEAD.

> Ouch.  This used to be okay, because adding new entries to a hash page
> always added them at the end.  The 8.4 changes to keep individual hash
> pages sorted by hashcode broke it :-(.

Actually, now that I am looking at it, that patch COMPLETELY destroyed
hash indexes.  The search logic requires that index entries within a
page are ordered by hash value.  Although the insertion code preserves
that property, neither _hash_splitbucket nor _hash_squeezebucket make
any attempt to do so.  So it's not just a transient concurrency issue,
you can easily get corruption of a hash index leading to permanent
search failures.

Ugh.  Mea culpa for letting this one through.

            regards, tom lane

Re: BUG #5157: Hash index not concurrency safe

From
Tom Lane
Date:
I wrote:
> Ugh.  Mea culpa for letting this one through.

I've committed patches to fix this.

            regards, tom lane