Thread: Gin page deletion bug

Gin page deletion bug

From
Heikki Linnakangas
Date:
Gin page deletion fails to take into account that there might be a 
search in-flight to the page that is deleted. If the page is reused for 
something else, the search can get very confused.

That's pretty difficult to reproduce in a real system, as the window 
between releasing a lock on page and following its right-link is very 
tight, but by setting a breakpoint with a debugger it's easy. Here's how 
I reproduced it:

-----------
1. Put a breakpoint or sleep in entryGetNextItem() function, where it 
has released lock on one page and is about to read the next one. I used 
this patch:

--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -574,6 +574,9 @@ entryGetNextItem(GinState *ginstate, GinScanEntry entry)                 return;             }

+            elog(NOTICE, "about to move right to page %u", blkno);
+            sleep(5);
+             entry->buffer = ReleaseAndReadBuffer(entry->buffer,
ginstate->index,                                                 blkno);
 

2. Initialize a page with a gin index in suitable state:

create extension btree_gin;
create table foo (i int4);
create index i_gin_foo on foo using gin (i) with (fastupdate = off);

insert into foo select 1 from generate_series(1, 5000);
insert into foo select 2 from generate_series(1, 5000);
set enable_bitmapscan=off; set enable_seqscan=on;
delete from foo where i = 1;

3. Start a query, it will sleep between every page:

set enable_bitmapscan=on; set enable_seqscan=off;
select * from foo where i = 1;
postgres=# select * from foo where i = 1;
NOTICE:  about to move right to page 3
NOTICE:  about to move right to page 5
...

4. In another session, delete and reuse the pages:

vacuum foo;
insert into foo select 2 from generate_series(1, 10000) g

5. Let the query run to completion. It will return a lot of tuples with 
i=2, which should not have matched:

...
NOTICE:  about to move right to page 24
NOTICE:  about to move right to page 25 i
--- 2 2 2
...

-----------

The regular b-tree code solves this by stamping deleted pages with the 
current XID, and only allowing them to be reused once that XID becomes 
old enough (< RecentGlobalXmin). Another approach might be to grab a 
cleanup-strength lock on the left and parent pages when deleting a page, 
and requiring search to keep the pin on the page its coming from, until 
it has locked the next page.

- Heikki



Re: Gin page deletion bug

From
Heikki Linnakangas
Date:
On 07.11.2013 22:49, Heikki Linnakangas wrote:
> Gin page deletion fails to take into account that there might be a
> search in-flight to the page that is deleted. If the page is reused for
> something else, the search can get very confused.
>
> ...
>
> The regular b-tree code solves this by stamping deleted pages with the
> current XID, and only allowing them to be reused once that XID becomes
> old enough (< RecentGlobalXmin). Another approach might be to grab a
> cleanup-strength lock on the left and parent pages when deleting a page,
> and requiring search to keep the pin on the page its coming from, until
> it has locked the next page.

I came up with the attached fix. In a nutshell, when walking along a
right-link, the new page is locked before releasing the lock on the old
one. Also, never delete the leftmost branch of a posting tree. I believe
these changes are sufficient to fix the problem, because of the way the
posting tree is searched:

All searches on a posting tree are "full searches", scanning the whole
tree from left to right. Insertions do seek to the middle of the tree,
but they are locked out during vacuum by holding a cleanup lock on the
posting tree root (even without this patch). Thanks to this property, a
search cannot step on a page that's being deleted by following a
downlink, if we refrain from deleting the leftmost page on a level, as
searches only descend down the leftmost branch.

It would be nice to improve that in master - holding an exclusive lock
on the root page is pretty horrible - but this is a nice back-patchable
patch. I'm not worried about the loss of concurrency because we now have
to hold the lock on previous page when stepping to next page. In
searches, it's only a share-lock, and in insertions, it's very rare that
you have to step right.

The patch also adds some sanity checks to stepping right: the next page
should be of the same type as the current page, e.g stepping right
should not go from leaf to non-leaf or vice versa.

- Heikki

Attachment

Re: Gin page deletion bug

From
Tom Lane
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> I came up with the attached fix. In a nutshell, when walking along a 
> right-link, the new page is locked before releasing the lock on the old 
> one. Also, never delete the leftmost branch of a posting tree. I believe 
> these changes are sufficient to fix the problem, because of the way the 
> posting tree is searched:

This seems reasonable, but I wonder whether the concurrency discussion
shouldn't be in gist/README, rather than buried in a comment inside
ginDeletePage (not exactly the first place one would think to look IMO).
        regards, tom lane



Re: Gin page deletion bug

From
Heikki Linnakangas
Date:
On 08.11.2013 19:14, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
>> I came up with the attached fix. In a nutshell, when walking along a
>> right-link, the new page is locked before releasing the lock on the old
>> one. Also, never delete the leftmost branch of a posting tree. I believe
>> these changes are sufficient to fix the problem, because of the way the
>> posting tree is searched:
>
> This seems reasonable, but I wonder whether the concurrency discussion
> shouldn't be in gist/README, rather than buried in a comment inside
> ginDeletePage (not exactly the first place one would think to look IMO).

Yeah, README is probably better. There's zero explanation of concurrency 
and locking issues there at the moment, so I added a new section there 
explaining why the page deletion is (now) safe. There's a lot more to 
say about locking in GIN, but I'll leave that for another day.

- Heikki