Thread: Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Heikki Linnakangas
Date:
(moving to pgsql-hackers) Simon Riggs wrote: > On Wed, 2010-04-07 at 06:12 +0000, Heikki Linnakangas wrote: >> Log Message: >> ----------- >> Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset() during >> recovery. We might want to relax this in the future, but ThisTimeLineID >> isn't currently correct in backends during recovery, so the filename >> returned was wrong. > > Any reason why we couldn't just do this: > > if (RecoveryInProgress()) > { > volatile XLogCtlData *xlogctl = XLogCtl; > XLogFileName(xlogfilename, xlogctl->ThisTimeLineID, > xlogid, xlogseg); > } > else > XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg); > > > rather than preventing access to those functions completely? Because if you do something like pg_xlogfile_name(pg_last_xlog_receive_location()), xloctl->ThisTimeLineId would not necessarily be the timeline corresponding the last received location. Even with pg_xlogfile_name(pg_last_xlog_replay_location()), there's a small race condition between those calls; if a checkpoint record is replayed in between that changes timeline, the returned filename doesn't correspond the name of the file where the replayed WAL record was read from, as you would expect. This commit is a stop-gap solution until we figure out what exactly to do about that. Masao-san wrote a patch that included the TLI in the string returned by pg_last_xlog_receive/replay_location() (see http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com and http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com), but it still wasn't clear it did the right thing in corner-cases where the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by pg_last_receive_location() seems bogus, at least. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Simon Riggs
Date:
On Wed, 2010-04-07 at 13:23 +0300, Heikki Linnakangas wrote: > (moving to pgsql-hackers) > > Simon Riggs wrote: > > On Wed, 2010-04-07 at 06:12 +0000, Heikki Linnakangas wrote: > >> Log Message: > >> ----------- > >> Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset() during > >> recovery. We might want to relax this in the future, but ThisTimeLineID > >> isn't currently correct in backends during recovery, so the filename > >> returned was wrong. > > > > Any reason why we couldn't just do this: > > > > if (RecoveryInProgress()) > > { > > volatile XLogCtlData *xlogctl = XLogCtl; > > XLogFileName(xlogfilename, xlogctl->ThisTimeLineID, > > xlogid, xlogseg); > > } > > else > > XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg); > > > > > > rather than preventing access to those functions completely? > > Because if you do something like > pg_xlogfile_name(pg_last_xlog_receive_location()), > xloctl->ThisTimeLineId would not necessarily be the timeline > corresponding the last received location. Even with > pg_xlogfile_name(pg_last_xlog_replay_location()), there's a small race > condition between those calls; if a checkpoint record is replayed in > between that changes timeline, the returned filename doesn't correspond > the name of the file where the replayed WAL record was read from, as you > would expect. If timelineId changed in normal operation, I'd be inclined to agree this was a problem. It hardly ever changes, and can only change on standby when that server is not yet streaming. I'd rather have a function with a rare and documented weirdness, than no function at all. > This commit is a stop-gap solution until we figure out what exactly to > do about that. Masao-san wrote a patch that included the TLI in the > string returned by pg_last_xlog_receive/replay_location() (see > http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com > and > http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com), > but it still wasn't clear it did the right thing in corner-cases where > the TLI changes. > Using GetRecoveryTargetTLI() for the tli returned by > pg_last_receive_location() seems bogus, at least. Agree with that, using the current value makes most sense xlogctl->ThisTimeLineID -- Simon Riggs www.2ndQuadrant.com
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Fujii Masao
Date:
On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > This commit is a stop-gap solution until we figure out what exactly to > do about that. Masao-san wrote a patch that included the TLI in the > string returned by pg_last_xlog_receive/replay_location() (see > http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com > and > http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com), > but it still wasn't clear it did the right thing in corner-cases where > the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by > pg_last_receive_location() seems bogus, at least. Why? The tli of the last WAL record received is always the recovery target tli currently. So using GetRecoveryTargetTLI() for pg_last_xlog_receive_location() seems OK for me. Am I missing something? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Heikki Linnakangas
Date:
Fujii Masao wrote: > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> This commit is a stop-gap solution until we figure out what exactly to >> do about that. Masao-san wrote a patch that included the TLI in the >> string returned by pg_last_xlog_receive/replay_location() (see >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com >> and >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com), >> but it still wasn't clear it did the right thing in corner-cases where >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by >> pg_last_receive_location() seems bogus, at least. > > Why? The tli of the last WAL record received is always the > recovery target tli currently. True. Hmm, currently pg_last_xlog_receive_location() returns the last locationstreamed via streaming replication. Should that bechanged so that it also advances when a WAL segment is restored from archive? It seems strange that pg_last_xlog_receive_location() can be smaller than pg_last_xlog_replay_location(). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Simon Riggs
Date:
On Thu, 2010-04-08 at 09:54 +0300, Heikki Linnakangas wrote: > Fujii Masao wrote: > > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas > > <heikki.linnakangas@enterprisedb.com> wrote: > >> This commit is a stop-gap solution until we figure out what exactly to > >> do about that. Masao-san wrote a patch that included the TLI in the > >> string returned by pg_last_xlog_receive/replay_location() (see > >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com > >> and > >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com), > >> but it still wasn't clear it did the right thing in corner-cases where > >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by > >> pg_last_receive_location() seems bogus, at least. > > > > Why? The tli of the last WAL record received is always the > > recovery target tli currently. > > True. Only in streaming mode. If you use the current TLI as I have suggested then it will be correct in more cases. > Hmm, currently pg_last_xlog_receive_location() returns the last location > streamed via streaming replication. Should that be changed so that it > also advances when a WAL segment is restored from archive? It seems > strange that pg_last_xlog_receive_location() can be smaller than > pg_last_xlog_replay_location(). Yes, it should be changed. -- Simon Riggs www.2ndQuadrant.com
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Fujii Masao
Date:
On Thu, Apr 8, 2010 at 4:06 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Thu, 2010-04-08 at 09:54 +0300, Heikki Linnakangas wrote: >> Fujii Masao wrote: >> > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas >> > <heikki.linnakangas@enterprisedb.com> wrote: >> >> This commit is a stop-gap solution until we figure out what exactly to >> >> do about that. Masao-san wrote a patch that included the TLI in the >> >> string returned by pg_last_xlog_receive/replay_location() (see >> >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa4249304ba@mail.gmail.com >> >> and >> >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf48db@mail.gmail.com), >> >> but it still wasn't clear it did the right thing in corner-cases where >> >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by >> >> pg_last_receive_location() seems bogus, at least. >> > >> > Why? The tli of the last WAL record received is always the >> > recovery target tli currently. >> >> True. > > Only in streaming mode. If you use the current TLI as I have suggested > then it will be correct in more cases. pg_last_xlog_receive_location() might be executed also after archive recovery ends. In this case, using the current tli seems not correct because it's always different from the recovery target tli after recovery. >> Hmm, currently pg_last_xlog_receive_location() returns the last location >> streamed via streaming replication. Should that be changed so that it >> also advances when a WAL segment is restored from archive? It seems >> strange that pg_last_xlog_receive_location() can be smaller than >> pg_last_xlog_replay_location(). > > Yes, it should be changed. Should it advance when WAL file in pg_xlog is read? If not, pg_last_xlog_receive_location() can be smaller than pg_last_xlog_replay_location(). And, how far should it advance when WAL file is partially-filled for some reasons? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
From
Simon Riggs
Date:
On Thu, 2010-04-08 at 16:41 +0900, Fujii Masao wrote: > >> > Why? The tli of the last WAL record received is always the > >> > recovery target tli currently. > >> > >> True. > > > > Only in streaming mode. If you use the current TLI as I have suggested > > then it will be correct in more cases. > > pg_last_xlog_receive_location() might be executed also after archive > recovery ends. In this case, using the current tli seems not correct > because it's always different from the recovery target tli after recovery. Which is why the code I write says "if (RecoveryInProgress())" > >> Hmm, currently pg_last_xlog_receive_location() returns the last location > >> streamed via streaming replication. Should that be changed so that it > >> also advances when a WAL segment is restored from archive? It seems > >> strange that pg_last_xlog_receive_location() can be smaller than > >> pg_last_xlog_replay_location(). > > > > Yes, it should be changed. > > Should it advance when WAL file in pg_xlog is read? If not, > pg_last_xlog_receive_location() can be smaller than > pg_last_xlog_replay_location(). > > And, how far should it advance when WAL file is > partially-filled for some reasons? Just make pg_last_xlog_receive_location() do exactly the same thing as pg_last_xlog_replay_location() when working with files. No need to try to keep two things exactly in sync. -- Simon Riggs www.2ndQuadrant.com