Thread: xlog filename formatting functions in recovery
Hello, I've noticed recently that I can't seem to use the convenient xlog filename formatting functions while I'm in a standby. I don't see an incredibly obvious reason why that is the case, so here's a patch that simply removes the ban on being able to call these formatting functions. Perhaps I'm missing something, but it feels like this is an oversight from the days when there was no useful function to call that formatted a X/X-type record on a standby. -- fdr
Attachment
On Tue, Jul 3, 2012 at 1:24 AM, Daniel Farina <daniel@heroku.com> wrote: > Hello, > > I've noticed recently that I can't seem to use the convenient xlog > filename formatting functions while I'm in a standby. I don't see an > incredibly obvious reason why that is the case, so here's a patch that > simply removes the ban on being able to call these formatting > functions. > > Perhaps I'm missing something, but it feels like this is an oversight > from the days when there was no useful function to call that formatted > a X/X-type record on a standby. (added to commitfest: https://commitfest.postgresql.org/action/patch_view?id=888) -- fdr
<div class="moz-cite-prefix">On 03/07/12 20:24, Daniel Farina wrote:<br /></div><blockquote cite="mid:CAAZKuFbuYViM4nsT1L9xw9CT75fMo4909=zz3cX0+oyGXhc4aQ@mail.gmail.com"type="cite"><pre wrap="">Hello, I've noticed recently that I can't seem to use the convenient xlog filename formatting functions while I'm in a standby. I don't see an incredibly obvious reason why that is the case, so here's a patch that simply removes the ban on being able to call these formatting functions. Perhaps I'm missing something, but it feels like this is an oversight from the days when there was no useful function to call that formatted a X/X-type record on a standby. </pre><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap=""> </pre></blockquote><br /> I must add a +1 to that, gotta be my pet peeve with this otherwise fine functionality. Also itwould be really nice to have a single function that gave the current receive or active wal offset so things like Nagiosdidn't have to know the if db is a standby or not to work out said offset...<br /><br /> Regards<br /><br /> Mark<br/>
-----Original Message----- From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Daniel Farina Sent: Tuesday, July 03, 2012 2:02 PM >(added to commitfest: > https://commitfest.postgresql.org/action/patch_view?id=888) It seems you have added it in current commit fest. Shouldn't it be added for next CF. With Regards, Amit Kapila.
On Tue, Jul 3, 2012 at 6:43 AM, Amit Kapila <amit.kapila@huawei.com> wrote: >>(added to commitfest: >> https://commitfest.postgresql.org/action/patch_view?id=888) > It seems you have added it in current commit fest. > Shouldn't it be added for next CF. Yep. The current CF has been closed to new submissions for two and a half weeks. On the substance of the patch, I believe the reason why this is currently disallowed is because the TLI is implicitly taken from the running system, and on the standby that might be the wrong value. I might be misremembering. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tuesday, July 3, 2012, Robert Haas wrote:
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jul 3, 2012 at 6:43 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>(added to commitfest:
>> https://commitfest.postgresql.org/action/patch_view?id=888)
> It seems you have added it in current commit fest.
> Shouldn't it be added for next CF.
Yep. The current CF has been closed to new submissions for two and a
half weeks.
Might this be something to consder for 9.2, though? It could be considered a bugfix.
On the substance of the patch, I believe the reason why this is
currently disallowed is because the TLI is implicitly taken from the
running system, and on the standby that might be the wrong value.
I might be misremembering.
That is, however, assuming that this part is not true. I don't recall for sure, but I have a feeling it might be correct. In which case a much bigger patch is needed, and definitely not something for 9.2...
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Jul 3, 2012 at 8:16 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Tuesday, July 3, 2012, Robert Haas wrote: >> On Tue, Jul 3, 2012 at 6:43 AM, Amit Kapila <amit.kapila@huawei.com> >> wrote: >> >>(added to commitfest: >> >> https://commitfest.postgresql.org/action/patch_view?id=888) >> > It seems you have added it in current commit fest. >> > Shouldn't it be added for next CF. >> >> Yep. The current CF has been closed to new submissions for two and a >> half weeks. > > Might this be something to consder for 9.2, though? It could be considered a > bugfix. > >> On the substance of the patch, I believe the reason why this is >> currently disallowed is because the TLI is implicitly taken from the >> running system, and on the standby that might be the wrong value. >> >> I might be misremembering. > > That is, however, assuming that this part is not true. I don't recall for > sure, but I have a feeling it might be correct. In which case a much bigger > patch is needed, and definitely not something for 9.2... Even if we were to conclude that the argument about TLIs is not valid, I'd be very reluctant to slip something like this into 9.2, because we have no time left to recant if it later turns out that there's some other reason why it's not a good idea. Removing error checks is one of those things that you want to try to get done early in the release cycle, because the consequences are sometimes difficult to foresee, and you may not find out why it was a bad idea until users start complaining. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Jul 3, 2012 at 5:30 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Jul 3, 2012 at 8:16 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Tuesday, July 3, 2012, Robert Haas wrote: >>> On Tue, Jul 3, 2012 at 6:43 AM, Amit Kapila <amit.kapila@huawei.com> >>> wrote: >>> >>(added to commitfest: >>> >> https://commitfest.postgresql.org/action/patch_view?id=888) >>> > It seems you have added it in current commit fest. >>> > Shouldn't it be added for next CF. >>> >>> Yep. The current CF has been closed to new submissions for two and a >>> half weeks. >> >> Might this be something to consder for 9.2, though? It could be considered a >> bugfix. >> >>> On the substance of the patch, I believe the reason why this is >>> currently disallowed is because the TLI is implicitly taken from the >>> running system, and on the standby that might be the wrong value. >>> >>> I might be misremembering. >> >> That is, however, assuming that this part is not true. I don't recall for >> sure, but I have a feeling it might be correct. In which case a much bigger >> patch is needed, and definitely not something for 9.2... > > Even if we were to conclude that the argument about TLIs is not valid, > I'd be very reluctant to slip something like this into 9.2, because we > have no time left to recant if it later turns out that there's some > other reason why it's not a good idea. Removing error checks is one > of those things that you want to try to get done early in the release > cycle, because the consequences are sometimes difficult to foresee, > and you may not find out why it was a bad idea until users start > complaining. Ah. The placement in that particular commitfest was an oversight. I'll move it. -- fdr
On 03.07.2012 15:13, Robert Haas wrote: > On the substance of the patch, I believe the reason why this is > currently disallowed is because the TLI is implicitly taken from the > running system, and on the standby that might be the wrong value. Yeah, I believe that's the reason. So the question is, what timeline should the functions use on a standby? With the patch as it is, they use 0: postgres=# select pg_xlogfile_name_offset('3/FF020000'); pg_xlogfile_name_offset ----------------------------------- (0000000000000003000000FF,131072) (1 row) There's a few different options: 1. current recovery_target_timeline (XLogCtl->recoveryTargetTLI) 2. current ThisTimeLineID, which is bumped every time a timeline-bumping checkpoint record is replayed. (this is not currently visible to backends, but we could easily add a shared memory variable for it) 3. curFileTLI. That is, the TLI of the current file that we're replaying. This is usually the same as ThisTimeLineID, except when replaying a WAL segment where the timeline changes 4. Something else? What do you use these functions for? Which option would make the most sense? - Heikki
On 21 September 2012 02:25, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 03.07.2012 15:13, Robert Haas wrote: >> >> On the substance of the patch, I believe the reason why this is >> currently disallowed is because the TLI is implicitly taken from the >> running system, and on the standby that might be the wrong value. > > > Yeah, I believe that's the reason. So the question is, what timeline should > the functions use on a standby? With the patch as it is, they use 0: > > postgres=# select pg_xlogfile_name_offset('3/FF020000'); > pg_xlogfile_name_offset > ----------------------------------- > (0000000000000003000000FF,131072) > (1 row) > > There's a few different options: > > 1. current recovery_target_timeline (XLogCtl->recoveryTargetTLI) > 2. current ThisTimeLineID, which is bumped every time a timeline-bumping > checkpoint record is replayed. (this is not currently visible to backends, > but we could easily add a shared memory variable for it) > 3. curFileTLI. That is, the TLI of the current file that we're replaying. > This is usually the same as ThisTimeLineID, except when replaying a WAL > segment where the timeline changes > 4. Something else? > > What do you use these functions for? Which option would make the most sense? I would say there is no sensible solution. So we keep pg_xlogfile_name_offset() banned in recovery, as it is now. We introduce pg_xlogfile_name_offset_timeline() where you have to manually specify the timeline, then introduce 3 functions that map onto the 3 options above, forcing the user to choose which one they mean. pg_recovery_target_timeline() pg_recovery_current_timeline() pg_reocvery_current_file_timeline() Usage would then be pg_xlogfile_name_offset_timeline( my_choice_of_timeline(), '3/FF020000') -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 25, 2012 at 11:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 21 September 2012 02:25, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >> On 03.07.2012 15:13, Robert Haas wrote: >>> >>> On the substance of the patch, I believe the reason why this is >>> currently disallowed is because the TLI is implicitly taken from the >>> running system, and on the standby that might be the wrong value. >> >> >> Yeah, I believe that's the reason. So the question is, what timeline should >> the functions use on a standby? With the patch as it is, they use 0: >> >> postgres=# select pg_xlogfile_name_offset('3/FF020000'); >> pg_xlogfile_name_offset >> ----------------------------------- >> (0000000000000003000000FF,131072) >> (1 row) >> >> There's a few different options: >> >> 1. current recovery_target_timeline (XLogCtl->recoveryTargetTLI) >> 2. current ThisTimeLineID, which is bumped every time a timeline-bumping >> checkpoint record is replayed. (this is not currently visible to backends, >> but we could easily add a shared memory variable for it) >> 3. curFileTLI. That is, the TLI of the current file that we're replaying. >> This is usually the same as ThisTimeLineID, except when replaying a WAL >> segment where the timeline changes >> 4. Something else? >> >> What do you use these functions for? Which option would make the most sense? > > I would say there is no sensible solution. > > So we keep pg_xlogfile_name_offset() banned in recovery, as it is now. > > We introduce pg_xlogfile_name_offset_timeline() where you have to > manually specify the timeline, I agree to introduce such function, but what about extending pg_xlogfile_name and pg_xlogfile_name_offset so that they accept the timeline ID as the second argument, instead of adding new functions? If the second argument is omitted, current timeline ID is used to calculate the WAL file name. > then introduce 3 functions that map > onto the 3 options above, forcing the user to choose which one they > mean. > pg_recovery_target_timeline() > pg_recovery_current_timeline() > pg_reocvery_current_file_timeline() It seems to me that it's not easy for a user to understand the difference of those functions. Since I think that pg_xlogfile_name is likely to be used together with pg_last_xlog_receive_location and pg_last_xlog_replay_location in the standby, it seems better to introduce something like pg_last_xlog_receive_timeline and pg_last_xlog_replay_timeline. That is, a user would execute, for example, SELECT pg_xlogfile_name(pg_last_xlog_replay_location(), pg_last_xlog_replay_timeline()); Regards, -- Fujii Masao
Daniel, I assume you are submitting an updated version based on the feedback that has been provided. I will mark this patch returned with feedback in the current CF; please submit the next version to CF3. Thanks to the (rather numerous!) reviewers. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Oct 18, 2012 at 9:22 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Daniel, I assume you are submitting an updated version based on the > feedback that has been provided. I will mark this patch returned with > feedback in the current CF; please submit the next version to CF3. Thank you for reminding me, so the approach that seems basically reasonable is to add an overload for these functions to accept a timeline number, if I read this feedback correctly? -- fdr