Thread: log_newpage header comment

log_newpage header comment

From

Robert Haas

Date:

07 June 2012, 20:24:54

It seems that in implementing ginbuildempty(), I falsified the first
"note" in the header comment for log_newpage():
* Note: all current callers build pages in private memory and write them* directly to smgr, rather than using bufmgr.
Thereforethere is no need* to pass a buffer ID to XLogInsert, nor to perform MarkBufferDirty within* the critical
section.

Mea culpa, mea culpa, mea maxima culpa.

So, this leads to a couple of questions:

1. Considering that we're logging the entire page, is it necessary (or
even desirable) to include the buffer ID in the rdata structure?  If
so, why?  To put that another way, does my abuse of log_newpage
constitute a bug in gistbuildempty()?

2. Should we add a new function that does the same thing as
log_newpage for a shared buffer?  I'm imagining that the signature
would be:

XLogRecPtr log_newpage_buffer(RelFileNode *rnode, Buffer buffer);

Admittedly, it may seem that we don't quite need such a function to
cater to just one caller, but I think we might have more in the
future.  Among other things, it occurs to me to wonder whether I ought
to rewrite all the ambuildempty routines in the style of
ginbuildempty(), to avoid having to fsync in the foreground.  Even if
not, I think there might be other applications for this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Tom Lane

Date:

07 June 2012, 21:07:00

Robert Haas <robertmhaas@gmail.com> writes:
> It seems that in implementing ginbuildempty(), I falsified the first
> "note" in the header comment for log_newpage():

>  * Note: all current callers build pages in private memory and write them
>  * directly to smgr, rather than using bufmgr.  Therefore there is no need
>  * to pass a buffer ID to XLogInsert, nor to perform MarkBufferDirty within
>  * the critical section.

> 1. Considering that we're logging the entire page, is it necessary (or
> even desirable) to include the buffer ID in the rdata structure?  If
> so, why?  To put that another way, does my abuse of log_newpage
> constitute a bug in gistbuildempty()?

AFAICS, not passing the buffer ID to XLogInsert is not an issue, since
we are logging the whole page in any case.  However, failing to perform
MarkBufferDirty within the critical section definitely is an issue.

> 2. Should we add a new function that does the same thing as
> log_newpage for a shared buffer?  I'm imagining that the signature
> would be:

Either that or rethink building this data in shared buffers.  What's the
point of that, exactly, for a page that we are most certainly not going
to use in normal operation?
        regards, tom lane

Re: log_newpage header comment

From

Robert Haas

Date:

08 June 2012, 09:03:37

On Thu, Jun 7, 2012 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> It seems that in implementing ginbuildempty(), I falsified the first
>> "note" in the header comment for log_newpage():
>
>>  * Note: all current callers build pages in private memory and write them
>>  * directly to smgr, rather than using bufmgr.  Therefore there is no need
>>  * to pass a buffer ID to XLogInsert, nor to perform MarkBufferDirty within
>>  * the critical section.
>
>> 1. Considering that we're logging the entire page, is it necessary (or
>> even desirable) to include the buffer ID in the rdata structure?  If
>> so, why?  To put that another way, does my abuse of log_newpage
>> constitute a bug in gistbuildempty()?
>
> AFAICS, not passing the buffer ID to XLogInsert is not an issue, since
> we are logging the whole page in any case.  However, failing to perform
> MarkBufferDirty within the critical section definitely is an issue.

However, I'm not failing to do that: there's an enclosing critical section.

>> 2. Should we add a new function that does the same thing as
>> log_newpage for a shared buffer?  I'm imagining that the signature
>> would be:
>
> Either that or rethink building this data in shared buffers.  What's the
> point of that, exactly, for a page that we are most certainly not going
> to use in normal operation?

Well, right.  I mean, I think the current implementation is mostly
design-by-accident, and my first thought was the same as yours: don't
clutter shared_buffers with pages we don't actually need for anything.But there is a down side to doing it the other
way,too.  Look at 
what btbuildempty does:
       /* Write the page.      If archiving/streaming, XLOG it. */       smgrwrite(index->rd_smgr, INIT_FORKNUM,
BTREE_METAPAGE,                        (char *) metapage, true);       if (XLogIsNeeded())
log_newpage(&index->rd_smgr->smgr_rnode.node,INIT_FORKNUM,                                       BTREE_METAPAGE,
metapage);
       /*        * An immediate sync is require even if we xlog'd the page, because the        * write did not go
throughshared_buffers and therefore a concurrent        * checkpoint may have move the redo pointer past our xlog
record.       */       smgrimmedsync(index->rd_smgr, INIT_FORKNUM); 

So we have to write the page out immediately, then we have to XLOG it,
and then even though we've XLOG'd it, we still have to fsync it
immediately.  It might be better to go through shared_buffers, which
would allow the write and fsync to happen in the background.  The
cache-poisoning effect is probably trivial  - even on a system with
32MB of shared_buffers there are thousands of pages, and init forks
are never going to consume more than a handful unless you're creating
an awful lot of unlogged relations very quickly - in which case maybe
avoiding the immediate fsyncs is a more pressing concern.  On the
other hand, we haven't had any complaints about the way it works now,
either.  What's your thought?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Tom Lane

Date:

08 June 2012, 10:33:50

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jun 7, 2012 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> AFAICS, not passing the buffer ID to XLogInsert is not an issue, since
>> we are logging the whole page in any case. �However, failing to perform
>> MarkBufferDirty within the critical section definitely is an issue.

> However, I'm not failing to do that: there's an enclosing critical section.

Mph.  But is it being done in the right order relative to the other XLOG
related steps?  See the code sketch in transam/README.

> So we have to write the page out immediately, then we have to XLOG it,
> and then even though we've XLOG'd it, we still have to fsync it
> immediately.  It might be better to go through shared_buffers, which
> would allow the write and fsync to happen in the background.

Well, that's a fair point, but on the other hand we've not had any
complaints traceable to the cost of making init forks.

On the whole, I don't care for the idea that the
modify-and-xlog-a-buffer sequence is being split between log_newpage and
its caller.  That sounds like bugs waiting to happen anytime somebody
refactors XLOG operations.  It would probably be best to refactor this
as you're suggesting, so that that becomes less nonstandard.
        regards, tom lane

Re: log_newpage header comment

From

Robert Haas

Date:

08 June 2012, 10:56:29

On Fri, Jun 8, 2012 at 9:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Jun 7, 2012 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> AFAICS, not passing the buffer ID to XLogInsert is not an issue, since
>>> we are logging the whole page in any case.  However, failing to perform
>>> MarkBufferDirty within the critical section definitely is an issue.
>
>> However, I'm not failing to do that: there's an enclosing critical section.
>
> Mph.  But is it being done in the right order relative to the other XLOG
> related steps?  See the code sketch in transam/README.

It appears to me that it is being done in the right order.

>> So we have to write the page out immediately, then we have to XLOG it,
>> and then even though we've XLOG'd it, we still have to fsync it
>> immediately.  It might be better to go through shared_buffers, which
>> would allow the write and fsync to happen in the background.
>
> Well, that's a fair point, but on the other hand we've not had any
> complaints traceable to the cost of making init forks.
>
> On the whole, I don't care for the idea that the
> modify-and-xlog-a-buffer sequence is being split between log_newpage and
> its caller.  That sounds like bugs waiting to happen anytime somebody
> refactors XLOG operations.  It would probably be best to refactor this
> as you're suggesting, so that that becomes less nonstandard.

OK.  So what I'm thinking is that we should add a new function that
takes a relfilenode and a buffer and steps 4-6 of what's described in
transam/README: mark the buffer dirty, xlog it, and set the LSN and
TLI.  We might want to have this function assert that it is in a
critical section, for the avoidance of error.  Then anyone who wants
to use it can do steps 1-3, call the function, and then finish up with
steps 6-7.  I don't think we can cleanly encapsulate any more than
that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Robert Haas

Date:

08 June 2012, 14:20:34

On Fri, Jun 8, 2012 at 9:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> OK.  So what I'm thinking is that we should add a new function that
> takes a relfilenode and a buffer and steps 4-6 of what's described in
> transam/README: mark the buffer dirty, xlog it, and set the LSN and
> TLI.  We might want to have this function assert that it is in a
> critical section, for the avoidance of error.  Then anyone who wants
> to use it can do steps 1-3, call the function, and then finish up with
> steps 6-7.  I don't think we can cleanly encapsulate any more than
> that.

On further review, I think that we ought to make MarkBufferDirty() the
caller's job, because sometimes we may need to xlog only if
XLogIsNeeded(), but the buffer's got to get marked dirty either way.
So I think the new function should just do step 5 - emit XLOG and set
LSN/TLI.

Proposed patch attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

log-newpage-buffer-v1.patch

Re: log_newpage header comment

From

Robert Haas

Date:

08 June 2012, 15:08:16

On Fri, Jun 8, 2012 at 1:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jun 8, 2012 at 9:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> OK.  So what I'm thinking is that we should add a new function that
>> takes a relfilenode and a buffer and steps 4-6 of what's described in
>> transam/README: mark the buffer dirty, xlog it, and set the LSN and
>> TLI.  We might want to have this function assert that it is in a
>> critical section, for the avoidance of error.  Then anyone who wants
>> to use it can do steps 1-3, call the function, and then finish up with
>> steps 6-7.  I don't think we can cleanly encapsulate any more than
>> that.
>
> On further review, I think that we ought to make MarkBufferDirty() the
> caller's job, because sometimes we may need to xlog only if
> XLogIsNeeded(), but the buffer's got to get marked dirty either way.
> So I think the new function should just do step 5 - emit XLOG and set
> LSN/TLI.
>
> Proposed patch attached.

Whee, testing is fun.  Second try.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

log-newpage-buffer-v2.patch

Re: log_newpage header comment

From

Amit kapila

Date:

09 June 2012, 02:44:11

>On further review, I think that we ought to make MarkBufferDirty() the
>caller's job, because sometimes we may need to xlog only if
>XLogIsNeeded(), but the buffer's got to get marked dirty either way.

Incase the place where Xlog is not required, woudn't it fsync the data;
So in that case even MarkBufferDirty() will also be not required.

>So I think the new function should just do step 5 - emit XLOG and set
>LSN/TLI.

In the API log_newpage_buffer(), as buffer already contains the page to be logged, so can't it be assumed that the page
willbe initialized and no need to check
 
if PageIsNew before setting LSN/TLI.

________________________________________
From: pgsql-hackers-owner@postgresql.org [pgsql-hackers-owner@postgresql.org] on behalf of Robert Haas
[robertmhaas@gmail.com]
Sent: Friday, June 08, 2012 10:50 PM
To: Tom Lane
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] log_newpage header comment

On Fri, Jun 8, 2012 at 9:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> OK.  So what I'm thinking is that we should add a new function that
> takes a relfilenode and a buffer and steps 4-6 of what's described in
> transam/README: mark the buffer dirty, xlog it, and set the LSN and
> TLI.  We might want to have this function assert that it is in a
> critical section, for the avoidance of error.  Then anyone who wants
> to use it can do steps 1-3, call the function, and then finish up with
> steps 6-7.  I don't think we can cleanly encapsulate any more than
> that.

On further review, I think that we ought to make MarkBufferDirty() the
caller's job, because sometimes we may need to xlog only if
XLogIsNeeded(), but the buffer's got to get marked dirty either way.
So I think the new function should just do step 5 - emit XLOG and set
LSN/TLI.

Proposed patch attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Tom Lane

Date:

09 June 2012, 11:50:02

Robert Haas <robertmhaas@gmail.com> writes:
> Whee, testing is fun.  Second try.

I'm concerned by the fact that neither the original nor the new code
bother to test whether the relation is WAL-loggable.  It may be that
ginbuildempty cannot be invoked for temp tables, but it still seems
like an oversight waiting to bite you.  I'd be happier if there were
a RelationNeedsWAL test here.  Come to think of it, the other
foobuildempty functions aren't checking this either.

A related point is that most of the other existing calls to log_newpage
are covered by "if (XLogIsNeeded())" tests.  It's not immediately
obvious why these two shouldn't be.  After some reflection I think
that's correct, but probably the comments for log_newpage and
log_newpage_buffer need to explain the different WAL-is-needed tests
that apply to the two usages.

(I'm also thinking that the XLogIsNeeded macro is very poorly named,
as the situations it should be used in are far more narrow than the
macro name suggests.  Haven't consumed enough caffeine to think of
a better name, though.)
        regards, tom lane

Re: log_newpage header comment

From

Robert Haas

Date:

10 June 2012, 10:44:34

On Sat, Jun 9, 2012 at 1:43 AM, Amit kapila <amit.kapila@huawei.com> wrote:
>>On further review, I think that we ought to make MarkBufferDirty() the
>>caller's job, because sometimes we may need to xlog only if
>>XLogIsNeeded(), but the buffer's got to get marked dirty either way.
>
> Incase the place where Xlog is not required, woudn't it fsync the data;
> So in that case even MarkBufferDirty() will also be not required.

Uh... no.  The whole point of doing things in shared buffers is that
you don't have to write and fsync the buffers immediately.  Instead,
buffer evicting handles that stuff for you.

>>So I think the new function should just do step 5 - emit XLOG and set
>>LSN/TLI.
>
> In the API log_newpage_buffer(), as buffer already contains the page to be logged, so can't it be assumed that the
pagewill be initialized and no need to check

> if PageIsNew before setting LSN/TLI.

I don't see why it's any different from log_newpage() in that regard.
That data is initialized before being written, as well, but someone
contemplated the possible need to write a page of all zeros.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Robert Haas

Date:

10 June 2012, 10:55:24

On Sat, Jun 9, 2012 at 10:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Whee, testing is fun.  Second try.
>
> I'm concerned by the fact that neither the original nor the new code
> bother to test whether the relation is WAL-loggable.  It may be that
> ginbuildempty cannot be invoked for temp tables, but it still seems
> like an oversight waiting to bite you.  I'd be happier if there were
> a RelationNeedsWAL test here.  Come to think of it, the other
> foobuildempty functions aren't checking this either.

That's a good thing, because if they were, it would be wrong.
RelationNeedsWAL() tells you whether the main fork is WAL-logged, but
the buildempty routines exist for the purpose of constructing the init
fork, which should always be WAL-logged.

> A related point is that most of the other existing calls to log_newpage
> are covered by "if (XLogIsNeeded())" tests.  It's not immediately
> obvious why these two shouldn't be.  After some reflection I think
> that's correct, but probably the comments for log_newpage and
> log_newpage_buffer need to explain the different WAL-is-needed tests
> that apply to the two usages.
>
> (I'm also thinking that the XLogIsNeeded macro is very poorly named,
> as the situations it should be used in are far more narrow than the
> macro name suggests.  Haven't consumed enough caffeine to think of
> a better name, though.)

XLogIsNeeded() is, indeed, not a very good name: it's telling us
whether wal_level > minimal, and thus whether there might be a
standby.  When log_newpage() is used to rebuild a relation, we use WAL
bypass whenever possible, since the whole relation will be removed if
the transaction rolls back, but we must still log it if a standby
exists.  In contrast, the init forks need to make it to the standby
regardless.  I'm not sure that I agree that it's the job of
log_newpage() to explain to people calling it on what things they must
conditionalize WAL generation: after all, none of this is unique to
new-page records.  If we emit some other record while creating an
index on the primary, we can skip that when !XLogIsNeeded(), too.  Any
operation we perform on any relation fork other than the init fork can
be conditionalized on RelationNeedsWAL(), not just new-page records.
The charter of the function is just to avoid duplicating the
mumbo-jumbo that's required to emit the record.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Amit Kapila

Date:

11 June 2012, 00:43:26

>>Uh... no.  The whole point of doing things in shared buffers is that
>>you don't have to write and fsync the buffers immediately.  Instead,
>>buffer evicting handles that stuff for you.

So you mean to say that there exists operations where Xlog is not required
even though it marks the buffer as dirty for later eviction.

>>I don't see why it's any different from log_newpage() in that regard.
>>That data is initialized before being written, as well, but someone
>>contemplated the possible need to write a page of all zeros.

The comment above the code indicates that "the page is uninitialized".
Does the code consider page with all zero's as uninitialized or the comment
is not
appropriate.

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com] 
Sent: Sunday, June 10, 2012 7:14 PM
To: Amit kapila
Cc: Tom Lane; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] log_newpage header comment

On Sat, Jun 9, 2012 at 1:43 AM, Amit kapila <amit.kapila@huawei.com> wrote:
>>On further review, I think that we ought to make MarkBufferDirty() the
>>caller's job, because sometimes we may need to xlog only if
>>XLogIsNeeded(), but the buffer's got to get marked dirty either way.
>
> Incase the place where Xlog is not required, woudn't it fsync the data;
> So in that case even MarkBufferDirty() will also be not required.

Uh... no.  The whole point of doing things in shared buffers is that
you don't have to write and fsync the buffers immediately.  Instead,
buffer evicting handles that stuff for you.

>>So I think the new function should just do step 5 - emit XLOG and set
>>LSN/TLI.
>
> In the API log_newpage_buffer(), as buffer already contains the page to be
logged, so can't it be assumed that the page will be initialized and no need
to check
> if PageIsNew before setting LSN/TLI.

I don't see why it's any different from log_newpage() in that regard.
That data is initialized before being written, as well, but someone
contemplated the possible need to write a page of all zeros.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: log_newpage header comment

From

Robert Haas

Date:

11 June 2012, 12:05:40

On Sun, Jun 10, 2012 at 11:41 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>>Uh... no.  The whole point of doing things in shared buffers is that
>>>you don't have to write and fsync the buffers immediately.  Instead,
>>>buffer evicting handles that stuff for you.
>
> So you mean to say that there exists operations where Xlog is not required
> even though it marks the buffer as dirty for later eviction.

Correct.

>>>I don't see why it's any different from log_newpage() in that regard.
>>>That data is initialized before being written, as well, but someone
>>>contemplated the possible need to write a page of all zeros.
>
> The comment above the code indicates that "the page is uninitialized".
> Does the code consider page with all zero's as uninitialized or the comment
> is not
> appropriate.

Yes, all zeroes = uninitialized.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company