Re: Relation extension scalability - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Relation extension scalability
Date
Msg-id 20150719160739.GJ25610@awork2.anarazel.de
In response to Re: Relation extension scalability  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Relation extension scalability  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On 2015-07-19 11:56:47 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2015-07-19 11:28:25 -0400, Tom Lane wrote:
> >> At this point session 1 will go and create page 44, won't it, and you
> >> just wasted a page.
> 
> > My local code now recognizes that case and uses the page. We just need
> > to do a PageIsNew() check.
> 
> Er, what?  How can you tell whether an all-zero page was or was not
> just written by another session?

The check is only done while holding the IO lock on the relevant page
(which we have to hold anyway), after reading it in ourselves, just
before setting BM_VALID. As we can only get to that point when there
wasn't any other entry for the page in the buffer table, that guarantees
that no other backend is currently extending into that page. Others
might wait to read it, but those will wait behind the IO lock.


The situation the read() protects us against is that two backends try to
extend to the same block, but after one of them has succeeded the buffer
is written out and reused for an independent page. So there is no
in-memory state telling the slower backend that the page has already
been used.
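
To make the ordering concrete, here is a minimal toy sketch of the idea,
not the actual patch. Everything prefixed with toy_ or Toy (the in-memory
"disk", the reduced page header, the helper functions) is hypothetical
and only stands in for the real buffer manager; the one detail taken from
PostgreSQL is that an uninitialized page is recognized by pd_upper == 0,
which is what PageIsNew() tests. The IO lock is not modelled; the comment
marks where it would be held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ      8192
#define TOY_NBLOCKS 64      /* hypothetical size of the toy relation file */

/* Toy stand-in for the on-disk relation; an all-zero block was never used. */
static unsigned char toy_disk[TOY_NBLOCKS][BLCKSZ];

/* Hypothetical reduced page header: only what the check needs. */
typedef struct ToyPageHeader
{
    uint16_t pd_lower;
    uint16_t pd_upper;
} ToyPageHeader;

/* Same idea as PageIsNew(): an uninitialized page has pd_upper == 0. */
static bool
toy_page_is_new(const unsigned char *page)
{
    ToyPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.pd_upper == 0;
}

/* A backend initializes the page and it gets written out ("evicted"). */
static void
toy_init_and_flush(unsigned blkno)
{
    ToyPageHeader hdr = { sizeof(ToyPageHeader), BLCKSZ };

    assert(blkno < TOY_NBLOCKS);
    memcpy(toy_disk[blkno], &hdr, sizeof(hdr));
}

/*
 * The backend that wants to extend into blkno. In the real system this
 * would run while holding the IO lock on the buffer, just before the
 * buffer is marked valid. Returns true if the page was still new and has
 * now been claimed, false if somebody else already used it.
 */
static bool
toy_try_claim_block(unsigned blkno)
{
    unsigned char buf[BLCKSZ];

    assert(blkno < TOY_NBLOCKS);
    memcpy(buf, toy_disk[blkno], BLCKSZ);   /* stands in for read() */

    if (!toy_page_is_new(buf))
        return false;       /* already used; the caller must pick another block */

    toy_init_and_flush(blkno);
    return true;
}
```

In this toy, the first toy_try_claim_block(43) returns true and a second
call for the same block returns false: the only evidence that the block
was taken is what the read brings back from disk, which is the case
described above where no in-memory state survives.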

Andres


