Re: CLOG extension - Mailing list pgsql-hackers

From Robert Haas
Subject Re: CLOG extension
Msg-id CA+TgmoaCtq5yd8yR3GkMOj=g8xeAqQpjdy7VWCBrmeV4u7XfCA@mail.gmail.com
In response to Re: CLOG extension  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: CLOG extension
List pgsql-hackers
On Fri, May 4, 2012 at 3:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, May 3, 2012 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, May 3, 2012 at 3:20 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Your two paragraphs have roughly opposite arguments...
>>>
>>> Doing it every 32 pages would give you 30 seconds to complete the
>>> fsync, if you kicked it off when half way through the previous file -
>>> at current maximum rates. So there is utility in doing it in larger
>>> chunks.
>>
>> Maybe, but I'd like to try changing one thing at a time.  If we change
>> too much at once, it's likely to be hard to figure out where the
>> improvement is coming from.  Moving the task to a background process
>> is one improvement; doing it in larger chunks is another.  Those
>> deserve independent testing.
>
> You gave a good argument why background pre-allocation wouldn't work
> very well if we do it a page at a time. I believe you.

Your confidence is sort of gratifying, but in this case I believe it's
misplaced.  On more careful analysis, it seems that ExtendCLOG() does
just two things: (1) evict a CLOG buffer and replace it with a zeroed
page representing the new page and (2) write an XLOG record for the
change.  Apparently, "extending" CLOG doesn't actually involve
extending anything on disk at all.  We rely on the future buffer
eviction to do that, which is surprisingly different from the way
relation extension is handled.
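
The behavior described above can be sketched with a toy model. This is
Python rather than PostgreSQL's C, and every name in it (ToyClog, extend,
_evict_one, the "zero_page" record) is hypothetical and chosen for
illustration; it is not the real SLRU code. It assumes the freshly zeroed
page is marked dirty in memory and that a page reaches disk only when its
buffer is later evicted.

```python
# Toy model only: names are illustrative, not PostgreSQL's actual API.
PAGE_SIZE = 8192  # assumed page size in bytes


class ToyClog:
    def __init__(self, nbuffers):
        self.nbuffers = nbuffers
        self.buffers = {}   # page number -> in-memory page (insertion-ordered)
        self.dirty = set()  # page numbers modified but not yet on disk
        self.disk = {}      # page number -> bytes actually written out
        self.wal = []       # stand-in for XLOG records

    def _evict_one(self):
        # The oldest-loaded page stands in for the real LRU choice.
        victim = next(iter(self.buffers))
        if victim in self.dirty:
            # Only here does anything reach "disk".
            self.disk[victim] = bytes(self.buffers[victim])
            self.dirty.discard(victim)
        del self.buffers[victim]

    def extend(self, pageno):
        # (1) Evict a buffer if needed and replace it with a zeroed page.
        if len(self.buffers) >= self.nbuffers:
            self._evict_one()
        self.buffers[pageno] = bytearray(PAGE_SIZE)
        self.dirty.add(pageno)
        # (2) Log the change. Note that no file is extended here.
        self.wal.append(("zero_page", pageno))


clog = ToyClog(nbuffers=2)
clog.extend(0)
clog.extend(1)
pages_on_disk_after_extend = sorted(clog.disk)    # still empty
clog.extend(2)                                    # evicts dirty page 0
pages_on_disk_after_eviction = sorted(clog.disk)  # page 0 written only now
```

In this model, "extending" twice leaves nothing on disk; page 0 is written
only when the third extension has to reuse its buffer.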

So CLOG extension is normally fast, but occasionally something goes
wrong.  So far I see two ways that can happen: (1) the WAL insertion
stalls because wal_buffers are full, and we're forced to wait for WAL
to be written (and perhaps fsync'd, since both are covered by the same
lock) or (2) the page we choose to evict happens to be dirty, and we
have to write+fsync it before repurposing it.
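
As a rough illustration of the two cases above, the sketch below lists
which waits a single extension would hit. The function name and its two
boolean parameters are hypothetical; it models only the two causes named
here and is not derived from the actual code paths.

```python
def extension_stalls(wal_buffers_full, victim_is_dirty):
    """Return the waits one CLOG extension would incur in this toy model."""
    stalls = []
    if wal_buffers_full:
        # Case (1): WAL insertion has to wait for WAL to be flushed out.
        stalls.append("wait for WAL write (and perhaps fsync)")
    if victim_is_dirty:
        # Case (2): the buffer being reused must be written out first.
        stalls.append("write+fsync evicted CLOG page")
    return stalls


fast_path = extension_stalls(wal_buffers_full=False, victim_is_dirty=False)
worst_case = extension_stalls(wal_buffers_full=True, victim_is_dirty=True)
```

With neither condition present the list is empty, which corresponds to the
normally fast case; either condition alone is enough to make the caller wait.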

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

