Thread: Re: minor bug

Re: minor bug

From
Laurenz Albe
Date:
On Mon, 2023-01-16 at 19:59 +0100, Torsten Förtsch wrote:
> not sure if this is known behavior.
>
> Server version is 14.6 (Debian 14.6-1.pgdg110+1).
>
> In a PITR setup I have these settings:
>
> recovery_target_xid = '852381'
> recovery_target_inclusive = 'false'
>
> In the log file I see this message:
>
> LOG:  recovery stopping before commit of transaction 852381, time 2000-01-01 00:00:00+00
>
> But:
>
> postgres=# select * from pg_last_committed_xact();
>   xid   |           timestamp           | roident
> --------+-------------------------------+---------
>  852380 | 2023-01-16 18:00:35.054495+00 |       0
>
> So, the timestamp displayed in the log message is certainly wrong.

Redirected to -hackers.

If recovery stops at a WAL record that has no timestamp, you get this
bogus recovery stop time.  I think we should show the recovery stop time
only if time was the target, as in the attached patch.

Yours,
Laurenz Albe

Attachment

Re: minor bug

From
Michael Paquier
Date:
On Tue, Jan 17, 2023 at 10:42:03AM +0100, Laurenz Albe wrote:
> If recovery stops at a WAL record that has no timestamp, you get this
> bogus recovery stop time.  I think we should show the recovery stop time
> only if time was the target, as in the attached patch.

Good catch!  I'll try to take a look.
--
Michael

Attachment

Re: minor bug

From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Mon, 2023-01-16 at 19:59 +0100, Torsten Förtsch wrote:
>> So, the timestamp displayed in the log message is certainly wrong.

> If recovery stops at a WAL record that has no timestamp, you get this
> bogus recovery stop time.  I think we should show the recovery stop time
> only if time was the target, as in the attached patch.

I don't think that is a tremendously useful definition: the user
already knows what recoveryStopTime is, or can find it out from
their settings easily enough.

I seem to recall that the original idea was to report the timestamp
of the commit/abort record we are stopping at.  Maybe my memory is
faulty, but I think that'd be significantly more useful than the
current behavior, *especially* when the replay stopping point is
defined by something other than time.

(Also, the wording of the log message suggests that that's what
the reported time is supposed to be.  I wonder if somebody messed
this up somewhere along the way.)

            regards, tom lane



Re: minor bug

From
Laurenz Albe
Date:
On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
> Laurenz Albe <laurenz.albe@cybertec.at> writes:
> > On Mon, 2023-01-16 at 19:59 +0100, Torsten Förtsch wrote:
> > > So, the timestamp displayed in the log message is certainly wrong.
>
> > If recovery stops at a WAL record that has no timestamp, you get this
> > bogus recovery stop time.  I think we should show the recovery stop time
> > only if time was the target, as in the attached patch.
>
> I don't think that is a tremendously useful definition: the user
> already knows what recoveryStopTime is, or can find it out from
> their settings easily enough.
>
> I seem to recall that the original idea was to report the timestamp
> of the commit/abort record we are stopping at.  Maybe my memory is
> faulty, but I think that'd be significantly more useful than the
> current behavior, *especially* when the replay stopping point is
> defined by something other than time.
>
> (Also, the wording of the log message suggests that that's what
> the reported time is supposed to be.  I wonder if somebody messed
> this up somewhere along the way.)

recoveryStopTime is set to recordXtime (the time of the xlog record)
a few lines above that patch, so this is useful information if it is
present.

I realized that my original patch might be a problem for translation;
here is an updated version that does not take any shortcuts.

Yours,
Laurenz Albe

Attachment

Re: minor bug

From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
>> I seem to recall that the original idea was to report the timestamp
>> of the commit/abort record we are stopping at.  Maybe my memory is
>> faulty, but I think that'd be significantly more useful than the
>> current behavior, *especially* when the replay stopping point is
>> defined by something other than time.
>> (Also, the wording of the log message suggests that that's what
>> the reported time is supposed to be.  I wonder if somebody messed
>> this up somewhere along the way.)

> recoveryStopTime is set to recordXtime (the time of the xlog record)
> a few lines above that patch, so this is useful information if it is
> present.

Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
Digging in the git history, I see that this did use to work as
I remember: we always extracted the record time before printing it.
That was accidentally broken in refactoring in c945af80c.  I think
the correct fix is more like the attached.

            regards, tom lane

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5e65785306..c14d1f3ef6 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2548,8 +2548,13 @@ recoveryStopsBefore(XLogReaderState *record)
         stopsHere = (recordXid == recoveryTargetXid);
     }

-    if (recoveryTarget == RECOVERY_TARGET_TIME &&
-        getRecordTimestamp(record, &recordXtime))
+    /*
+     * Note: we must fetch recordXtime regardless of recoveryTarget setting.
+     * We don't expect getRecordTimestamp ever to fail, since we already know
+     * this is a commit or abort record; but test its result anyway.
+     */
+    if (getRecordTimestamp(record, &recordXtime) &&
+        recoveryTarget == RECOVERY_TARGET_TIME)
     {
         /*
          * There can be many transactions that share the same commit time, so

Re: minor bug

From
Laurenz Albe
Date:
On Wed, 2023-01-18 at 15:03 -0500, Tom Lane wrote:
> Laurenz Albe <laurenz.albe@cybertec.at> writes:
> > On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
> > > I seem to recall that the original idea was to report the timestamp
> > > of the commit/abort record we are stopping at.  Maybe my memory is
> > > faulty, but I think that'd be significantly more useful than the
> > > current behavior, *especially* when the replay stopping point is
> > > defined by something other than time.
> > > (Also, the wording of the log message suggests that that's what
> > > the reported time is supposed to be.  I wonder if somebody messed
> > > this up somewhere along the way.)
>
> > recoveryStopTime is set to recordXtime (the time of the xlog record)
> > a few lines above that patch, so this is useful information if it is
> > present.
>
> Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
> Digging in the git history, I see that this did use to work as
> I remember: we always extracted the record time before printing it.
> That was accidentally broken in refactoring in c945af80c.  I think
> the correct fix is more like the attached.

Yes, you are right.  Your patch looks fine to me.

Yours,
Laurenz Albe



Re: minor bug

From
Torsten Förtsch
Date:
If we never expect getRecordTimestamp to fail, then why put it in the if-condition?

getRecordTimestamp can fail if the record is not a restore point nor a commit or abort record. A few lines before in the same function there is this:

 /* Otherwise we only consider stopping before COMMIT or ABORT records. */
if (XLogRecGetRmid(record) != RM_XACT_ID)
    return false;

I guess that make sure getRecordTimestamp can never fail.

The way it is written in your patch invites it to be optimized out again. The only thing that prevents it is the comment.

Why not

(void)getRecordTimestamp(record, &recordXtime);
if (recoveryTarget == RECOVERY_TARGET_TIME)
...




On Wed, Jan 18, 2023 at 9:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
>> I seem to recall that the original idea was to report the timestamp
>> of the commit/abort record we are stopping at.  Maybe my memory is
>> faulty, but I think that'd be significantly more useful than the
>> current behavior, *especially* when the replay stopping point is
>> defined by something other than time.
>> (Also, the wording of the log message suggests that that's what
>> the reported time is supposed to be.  I wonder if somebody messed
>> this up somewhere along the way.)

> recoveryStopTime is set to recordXtime (the time of the xlog record)
> a few lines above that patch, so this is useful information if it is
> present.

Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
Digging in the git history, I see that this did use to work as
I remember: we always extracted the record time before printing it.
That was accidentally broken in refactoring in c945af80c.  I think
the correct fix is more like the attached.

                        regards, tom lane

Re: minor bug

From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Wed, 2023-01-18 at 15:03 -0500, Tom Lane wrote:
>> Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
>> Digging in the git history, I see that this did use to work as
>> I remember: we always extracted the record time before printing it.
>> That was accidentally broken in refactoring in c945af80c.  I think
>> the correct fix is more like the attached.

> Yes, you are right.  Your patch looks fine to me.

Pushed then.  Thanks for the report!

            regards, tom lane



Re: minor bug

From
Tom Lane
Date:
=?UTF-8?Q?Torsten_F=C3=B6rtsch?= <tfoertsch123@gmail.com> writes:
> Why not

> (void)getRecordTimestamp(record, &recordXtime);
> if (recoveryTarget == RECOVERY_TARGET_TIME)
> ...

Could've done it like that, but I already pushed the other
version, and I don't think it's worth the trouble to change.

            regards, tom lane