While experimenting with bucket splits ([1]), I noticed a comment saying that
_hash_getnewbuf should not be called concurrently. However, HEAD does not
synchronize the calls made from _hash_splitbucket. I was able to reproduce
such concurrent calls using gdb (with multiple bucket splits in progress at
the same time).
When the function is called from _hash_getovflpage, the content lock of the
metapage buffer seems to be (mis)used to synchronize the calls:
/*
* Fetch the page with _hash_getnewbuf to ensure smgr's idea of the
* relation length stays in sync with ours. XXX It's annoying to do this
* with metapage write lock held; would be better to use a lock that
* doesn't block incoming searches.
*/
newbuf = _hash_getnewbuf(rel, blkno, MAIN_FORKNUM);
I think that would also be the easiest fix for _hash_splitbucket. Or should a
separate ("regular") lock be introduced and used in both cases?
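To illustrate why the calls need to be serialized, here is a minimal toy
model, not PostgreSQL code. All names in it (ToyRel, toy_getnewbuf,
toy_getnewbuf_locked) are hypothetical, and the pthread mutex merely stands in
for whichever lock ends up being used (the metapage content lock or a separate
one). It assumes only that the function checks the relation length and then
extends it, which is not atomic unless the caller holds a lock.

```c
#include <assert.h>
#include <pthread.h>

/* Toy stand-in for a relation: it only tracks its length in blocks. */
typedef struct ToyRel
{
	pthread_mutex_t meta_lock;	/* hypothetical stand-in for the lock */
	unsigned	nblocks;		/* current relation length */
} ToyRel;

/*
 * Check-then-extend. Two unsynchronized callers could both read the same
 * nblocks value, so this must not run concurrently.
 */
static unsigned
toy_getnewbuf(ToyRel *rel, unsigned blkno)
{
	/* The requested block must not leave a gap past the current end. */
	assert(blkno <= rel->nblocks);

	if (blkno == rel->nblocks)
		rel->nblocks++;			/* extend the relation by one block */

	return blkno;
}

/* Serialized variant: hold the lock across the check and the extension. */
static unsigned
toy_getnewbuf_locked(ToyRel *rel, unsigned blkno)
{
	unsigned	got;

	pthread_mutex_lock(&rel->meta_lock);
	got = toy_getnewbuf(rel, blkno);
	pthread_mutex_unlock(&rel->meta_lock);

	return got;
}
```

With every caller going through toy_getnewbuf_locked, the length check and the
extension always happen as one step, at the cost of blocking anyone else who
needs the same lock.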
[1] http://www.postgresql.org/message-id/32423.1427413442@localhost
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
new file mode 100644
index 46c6c96..25c1dd1
*** a/src/backend/access/hash/hashpage.c
--- b/src/backend/access/hash/hashpage.c
*************** _hash_splitbucket(Relation rel,
*** 765,771 ****
--- 765,773 ----
oopaque = (HashPageOpaque) PageGetSpecialPointer(opage);
nblkno = start_nblkno;
+ _hash_chgbufaccess(rel, metabuf, HASH_NOLOCK, HASH_WRITE);
nbuf = _hash_getnewbuf(rel, nblkno, MAIN_FORKNUM);
+ _hash_chgbufaccess(rel, metabuf, HASH_WRITE, HASH_NOLOCK);
npage = BufferGetPage(nbuf);
/* initialize the new bucket's primary page */