Re: Patent issues and 8.1 - Mailing list pgsql-hackers

From Neil Conway
Subject Re: Patent issues and 8.1
Date
Msg-id 41F74229.1000506@samurai.com
Whole thread Raw
In response to Re: Patent issues and 8.1  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Patent issues and 8.1
Re: Patent issues and 8.1
List pgsql-hackers
Tom Lane wrote:
> I've already pointed out a couple reasons why I don't have any
> confidence in its correctness.

Well, you've suggested that I should try and reduce the API churn caused 
by the patch. As I said on -patches, I don't really see this as an issue 
if we just apply the patch to REL8_0_STABLE. You never replied to my 
-patches mail -- AFAIK that's where the issue stands.

I think the biggest area of concern with the LRU patch is how it changes 
bgwriter behavior. To review, the bgwriter currently does the following 
each time it is invoked:

(1) acquire BufMgrLock
(2) sweep through the entire buffer pool in LRU order, adding dirty 
pages to a list
(3) examine a subset of the computed list according to bgwriter_maxpages 
and bgwriter_percent
(4) for each element in the subset:    (a) drop BufMgrLock    (b) flush page to disk    (c) reacquire BufMgrLock

This is tricky to implement with LRU; we only record the recency of last 
access for unpinned pages, so we can't get the list in #2 (we can either 
get all dirty pages, as required for checkpoint, or all the unpinned 
dirty pages in LRU order). Besides which, the ARC behavior has some 
issues that were raised prior to the 8.0 release: scanning the whole 
buffer pool with the BufMgrLock held is bad.

So the LRU patch changes the bgwriter to:

(1) acquire the BufMgrLock
(2) sweep through the free list (the list of unpinned buffers) in LRU 
order, adding up to bgwriter_maxpages dirty pages to the list
3) for each element in the list, write it out as in #4 above

That means we don't hold the BufMgrLock for as long and the bgwriter 
doesn't consider flushing pinned pages (both a good idea IMHO), but it 
also means there's no consideration of bgwriter_percent. I'm not at all 
sure that this is the best compromise, however, so some comments would 
be welcome.

Perhaps it would be best to keep bgwriter_percent, but redefine it to 
mean "the % of the buffer pool scanned by the bgwriter, at most." (I 
think this was Simon's idea originally.)

> This may be the right path to go for
> 8.0.* ... but we must NOT suppose that we can just push it out without
> a full beta test cycle.

Yeah, I think a beta period would be a good idea (not nearly as long as 
the 8.0 beta period was, though).

I think it would be better to have a few weeks of beta prior to 8.0.2 
and resolve the problem here and now, rather than crippling the 8.1 
cycle (as the no-initdb policy would) or waiting for the problem to 
*really* become serious (as the "no action needed now" policy would).

-Neil


pgsql-hackers by date:

Previous
From: Neil Conway
Date:
Subject: Re: Status of PostgreSQL for 4+ processors
Next
From: Simon Riggs
Date:
Subject: Re: Concurrent free-lock