Gin page deletion bug - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Gin page deletion bug
Msg-id 527BFCCA.9000000@vmware.com
Responses Re: Gin page deletion bug  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Gin page deletion fails to take into account that there might be a 
search in-flight to the page that is deleted. If the page is reused for 
something else, the search can get very confused.

That's pretty difficult to reproduce in a real system, as the window 
between releasing the lock on a page and following its right-link is very 
tight, but it's easy if you set a breakpoint with a debugger. Here's how 
I reproduced it:

-----------
1. Put a breakpoint or sleep in the entryGetNextItem() function, at the 
point where it has released the lock on one page and is about to read the 
next one. I used this patch:

--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -574,6 +574,9 @@ entryGetNextItem(GinState *ginstate, GinScanEntry entry)
                 return;
             }
 
+            elog(NOTICE, "about to move right to page %u", blkno);
+            sleep(5);
+
             entry->buffer = ReleaseAndReadBuffer(entry->buffer,
                                                  ginstate->index,
                                                  blkno);

2. Initialize a table with a gin index in a suitable state:

create extension btree_gin;
create table foo (i int4);
create index i_gin_foo on foo using gin (i) with (fastupdate = off);

insert into foo select 1 from generate_series(1, 5000);
insert into foo select 2 from generate_series(1, 5000);
set enable_bitmapscan=off; set enable_seqscan=on;
delete from foo where i = 1;

3. Start a query; it will sleep between every page:

set enable_bitmapscan=on; set enable_seqscan=off;
select * from foo where i = 1;
postgres=# select * from foo where i = 1;
NOTICE:  about to move right to page 3
NOTICE:  about to move right to page 5
...

4. In another session, delete and reuse the pages:

vacuum foo;
insert into foo select 2 from generate_series(1, 10000) g;

5. Let the query run to completion. It will return a lot of tuples with 
i=2, which should not have matched:

...
NOTICE:  about to move right to page 24
NOTICE:  about to move right to page 25
 i
---
 2
 2
 2
...

-----------
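
The interleaving in the steps above can be sketched as a small toy model in C. This is not PostgreSQL code: the ToyPage and ToyScan types and all toy_* functions are hypothetical, and locking is reduced to the order of the calls. It only shows why a block number remembered before the lock was released can point at unrelated data afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in PostgreSQL. */
#define NPAGES 4
#define INVALID_BLOCK UINT32_MAX

typedef struct ToyPage {
    int      key;        /* key whose items this posting page holds */
    uint32_t rightlink;  /* next page in the chain, or INVALID_BLOCK */
    bool     deleted;    /* set by the toy vacuum */
} ToyPage;

typedef struct ToyScan {
    int      key;        /* key the scan is looking for */
    uint32_t next_blkno; /* right-link remembered after "unlocking" */
} ToyScan;

/* Scan, first half: remember the right-link, then let go of the page. */
static void toy_scan_leave_page(ToyScan *scan, const ToyPage *pages,
                                uint32_t blkno)
{
    assert(blkno < NPAGES);
    scan->next_blkno = pages[blkno].rightlink;
}

/* Vacuum: unlink the victim from its left sibling and mark it deleted. */
static void toy_delete_page(ToyPage *pages, uint32_t left, uint32_t victim)
{
    pages[left].rightlink = pages[victim].rightlink;
    pages[victim].deleted = true;
}

/* Insert: recycle a deleted page for a different key. */
static void toy_reuse_page(ToyPage *pages, uint32_t blkno, int new_key)
{
    pages[blkno].key = new_key;
    pages[blkno].rightlink = INVALID_BLOCK;
    pages[blkno].deleted = false;
}

/* Scan, second half: follow the remembered link and trust what is there. */
static int toy_scan_arrive(const ToyScan *scan, const ToyPage *pages)
{
    assert(scan->next_blkno < NPAGES);
    return pages[scan->next_blkno].key;
}

/* Runs the whole interleaving; true means the scan saw a foreign key. */
static bool toy_race_returns_wrong_key(void)
{
    ToyPage pages[NPAGES] = {0};
    pages[0] = (ToyPage){ .key = 1, .rightlink = 1, .deleted = false };
    pages[1] = (ToyPage){ .key = 1, .rightlink = 2, .deleted = false };
    pages[2] = (ToyPage){ .key = 1, .rightlink = INVALID_BLOCK,
                          .deleted = false };

    ToyScan scan = { .key = 1, .next_blkno = INVALID_BLOCK };

    toy_scan_leave_page(&scan, pages, 0); /* remembers block 1, "sleeps" */
    toy_delete_page(pages, 0, 1);         /* step 4: vacuum deletes block 1 */
    toy_reuse_page(pages, 1, 2);          /* step 4: insert reuses it for i=2 */

    return toy_scan_arrive(&scan, pages) != scan.key;
}
```

Calling toy_race_returns_wrong_key() from a main() reports that the scan for key 1 ends up on a page that now holds key 2, which is the toy equivalent of the stray i=2 rows above.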

The regular b-tree code solves this by stamping deleted pages with the 
current XID, and only allowing them to be reused once that XID becomes 
old enough (< RecentGlobalXmin). Another approach might be to grab a 
cleanup-strength lock on the left and parent pages when deleting a page, 
and to require a search to keep its pin on the page it's coming from until 
it has locked the next page.
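
A minimal sketch of the first idea, stamping a deleted page with an XID and refusing to recycle it until that XID is old enough, might look like the following. It is a simplified model and not the actual b-tree code: ToyXid, ToyDeletedPage and the toy_* functions are hypothetical names, special transaction IDs are ignored, and the oldest_visible_xid parameter stands in for RecentGlobalXmin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t ToyXid;

/*
 * Wraparound-aware "a is older than b": compare modulo 2^32 by looking at
 * the sign of the difference. The unsigned-to-signed conversion assumes the
 * usual two's-complement behaviour.
 */
static bool toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

typedef struct ToyDeletedPage {
    bool   deleted;     /* page has been unlinked from the tree */
    ToyXid delete_xid;  /* XID stamped on the page at deletion time */
} ToyDeletedPage;

/* Deletion: mark the page and stamp it with the current XID. */
static void toy_mark_deleted(ToyDeletedPage *page, ToyXid current_xid)
{
    assert(page != NULL);
    page->deleted = true;
    page->delete_xid = current_xid;
}

/*
 * Reuse check: the page may be recycled only once its stamp is older than
 * the oldest XID any running search could still care about.
 */
static bool toy_page_is_recyclable(const ToyDeletedPage *page,
                                   ToyXid oldest_visible_xid)
{
    assert(page != NULL);
    return page->deleted &&
           toy_xid_precedes(page->delete_xid, oldest_visible_xid);
}
```

With this check, a page deleted at XID 100 stays off-limits while the oldest visible XID is still 100 or lower, so a search that started before the deletion cannot land on a recycled page.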

- Heikki


