Thread: Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()

Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()

From
Heikki Linnakangas
Date:
(moving to pgsql-hackers)

Simon Riggs wrote:
> On Wed, 2010-04-07 at 06:12 +0000, Heikki Linnakangas wrote:
>> Log Message:
>> -----------
>> Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset() during
>> recovery. We might want to relax this in the future, but ThisTimeLineID
>> isn't currently correct in backends during recovery, so the filename
>> returned was wrong.
> 
> Any reason why we couldn't just do this:
> 
> if (RecoveryInProgress())
> {
>     volatile XLogCtlData *xlogctl = XLogCtl;
>     XLogFileName(xlogfilename, xlogctl->ThisTimeLineID, 
>             xlogid, xlogseg);
> }
> else
>     XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
> 
> 
> rather than preventing access to those functions completely?

Because if you do something like
pg_xlogfile_name(pg_last_xlog_receive_location()),
xloctl->ThisTimeLineId would not necessarily be the timeline
corresponding the last received location. Even with
pg_xlogfile_name(pg_last_xlog_replay_location()), there's a small race
condition between those calls; if a checkpoint record is replayed in
between that changes timeline, the returned filename doesn't correspond
the name of the file where the replayed WAL record was read from, as you
would expect.

This commit is a stop-gap solution until we figure out what exactly to
do about that. Masao-san wrote a patch that included the TLI in the
string returned by pg_last_xlog_receive/replay_location() (see
http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com
and
http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com),
but it still wasn't clear it did the right thing in corner-cases where
the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
pg_last_receive_location() seems bogus, at least.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


On Wed, 2010-04-07 at 13:23 +0300, Heikki Linnakangas wrote:
> (moving to pgsql-hackers)
> 
> Simon Riggs wrote:
> > On Wed, 2010-04-07 at 06:12 +0000, Heikki Linnakangas wrote:
> >> Log Message:
> >> -----------
> >> Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset() during
> >> recovery. We might want to relax this in the future, but ThisTimeLineID
> >> isn't currently correct in backends during recovery, so the filename
> >> returned was wrong.
> > 
> > Any reason why we couldn't just do this:
> > 
> > if (RecoveryInProgress())
> > {
> >     volatile XLogCtlData *xlogctl = XLogCtl;
> >     XLogFileName(xlogfilename, xlogctl->ThisTimeLineID, 
> >             xlogid, xlogseg);
> > }
> > else
> >     XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
> > 
> > 
> > rather than preventing access to those functions completely?
> 
> Because if you do something like
> pg_xlogfile_name(pg_last_xlog_receive_location()),
> xloctl->ThisTimeLineId would not necessarily be the timeline
> corresponding the last received location. Even with
> pg_xlogfile_name(pg_last_xlog_replay_location()), there's a small race
> condition between those calls; if a checkpoint record is replayed in
> between that changes timeline, the returned filename doesn't correspond
> the name of the file where the replayed WAL record was read from, as you
> would expect.

If timelineId changed in normal operation, I'd be inclined to agree this
was a problem. It hardly ever changes, and can only change on standby
when that server is not yet streaming.

I'd rather have a function with a rare and documented weirdness, than no
function at all.

> This commit is a stop-gap solution until we figure out what exactly to
> do about that. Masao-san wrote a patch that included the TLI in the
> string returned by pg_last_xlog_receive/replay_location() (see
> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com
> and
> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com),
> but it still wasn't clear it did the right thing in corner-cases where
> the TLI changes. 

> Using GetRecoveryTargetTLI() for the tli returned by
> pg_last_receive_location() seems bogus, at least.

Agree with that, using the current value makes most sense 
xlogctl->ThisTimeLineID

-- Simon Riggs           www.2ndQuadrant.com



On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> This commit is a stop-gap solution until we figure out what exactly to
> do about that. Masao-san wrote a patch that included the TLI in the
> string returned by pg_last_xlog_receive/replay_location() (see
> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com
> and
> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com),
> but it still wasn't clear it did the right thing in corner-cases where
> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
> pg_last_receive_location() seems bogus, at least.

Why? The tli of the last WAL record received is always the
recovery target tli currently. So using GetRecoveryTargetTLI()
for pg_last_xlog_receive_location() seems OK for me.
Am I missing something?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()

From
Heikki Linnakangas
Date:
Fujii Masao wrote:
> On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> This commit is a stop-gap solution until we figure out what exactly to
>> do about that. Masao-san wrote a patch that included the TLI in the
>> string returned by pg_last_xlog_receive/replay_location() (see
>> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com
>> and
>> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com),
>> but it still wasn't clear it did the right thing in corner-cases where
>> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
>> pg_last_receive_location() seems bogus, at least.
> 
> Why? The tli of the last WAL record received is always the
> recovery target tli currently.

True.

Hmm, currently pg_last_xlog_receive_location() returns the last locationstreamed via streaming replication. Should that
bechanged so that it
 
also advances when a WAL segment is restored from archive? It seems
strange that pg_last_xlog_receive_location() can be smaller than
pg_last_xlog_replay_location().

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


On Thu, 2010-04-08 at 09:54 +0300, Heikki Linnakangas wrote:
> Fujii Masao wrote:
> > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas
> > <heikki.linnakangas@enterprisedb.com> wrote:
> >> This commit is a stop-gap solution until we figure out what exactly to
> >> do about that. Masao-san wrote a patch that included the TLI in the
> >> string returned by pg_last_xlog_receive/replay_location() (see
> >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com
> >> and
> >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com),
> >> but it still wasn't clear it did the right thing in corner-cases where
> >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
> >> pg_last_receive_location() seems bogus, at least.
> > 
> > Why? The tli of the last WAL record received is always the
> > recovery target tli currently.
> 
> True.

Only in streaming mode. If you use the current TLI as I have suggested
then it will be correct in more cases.

> Hmm, currently pg_last_xlog_receive_location() returns the last location
>  streamed via streaming replication. Should that be changed so that it
> also advances when a WAL segment is restored from archive? It seems
> strange that pg_last_xlog_receive_location() can be smaller than
> pg_last_xlog_replay_location().

Yes, it should be changed.

-- Simon Riggs           www.2ndQuadrant.com



On Thu, Apr 8, 2010 at 4:06 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, 2010-04-08 at 09:54 +0300, Heikki Linnakangas wrote:
>> Fujii Masao wrote:
>> > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas
>> > <heikki.linnakangas@enterprisedb.com> wrote:
>> >> This commit is a stop-gap solution until we figure out what exactly to
>> >> do about that. Masao-san wrote a patch that included the TLI in the
>> >> string returned by pg_last_xlog_receive/replay_location() (see
>> >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com
>> >> and
>> >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com),
>> >> but it still wasn't clear it did the right thing in corner-cases where
>> >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
>> >> pg_last_receive_location() seems bogus, at least.
>> >
>> > Why? The tli of the last WAL record received is always the
>> > recovery target tli currently.
>>
>> True.
>
> Only in streaming mode. If you use the current TLI as I have suggested
> then it will be correct in more cases.

pg_last_xlog_receive_location() might be executed also after archive
recovery ends. In this case, using the current tli seems not correct
because it's always different from the recovery target tli after recovery.

>> Hmm, currently pg_last_xlog_receive_location() returns the last location
>>  streamed via streaming replication. Should that be changed so that it
>> also advances when a WAL segment is restored from archive? It seems
>> strange that pg_last_xlog_receive_location() can be smaller than
>> pg_last_xlog_replay_location().
>
> Yes, it should be changed.

Should it advance when WAL file in pg_xlog is read? If not,
pg_last_xlog_receive_location() can be smaller than
pg_last_xlog_replay_location().

And, how far should it advance when WAL file is
partially-filled for some reasons?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Thu, 2010-04-08 at 16:41 +0900, Fujii Masao wrote:
> >> > Why? The tli of the last WAL record received is always the
> >> > recovery target tli currently.
> >>
> >> True.
> >
> > Only in streaming mode. If you use the current TLI as I have suggested
> > then it will be correct in more cases.
> 
> pg_last_xlog_receive_location() might be executed also after archive
> recovery ends. In this case, using the current tli seems not correct
> because it's always different from the recovery target tli after recovery.

Which is why the code I write says "if (RecoveryInProgress())"

> >> Hmm, currently pg_last_xlog_receive_location() returns the last location
> >>  streamed via streaming replication. Should that be changed so that it
> >> also advances when a WAL segment is restored from archive? It seems
> >> strange that pg_last_xlog_receive_location() can be smaller than
> >> pg_last_xlog_replay_location().
> >
> > Yes, it should be changed.
> 
> Should it advance when WAL file in pg_xlog is read? If not,
> pg_last_xlog_receive_location() can be smaller than
> pg_last_xlog_replay_location().
> 
> And, how far should it advance when WAL file is
> partially-filled for some reasons?

Just make pg_last_xlog_receive_location() do exactly the same thing as
pg_last_xlog_replay_location() when working with files. No need to try
to keep two things exactly in sync.

-- Simon Riggs           www.2ndQuadrant.com