Thread: BUG #16780: Inconsistent recovery_target_xid handling across platforms

BUG #16780: Inconsistent recovery_target_xid handling across platforms

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16780
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 13.1
Operating system:   Windows
Description:

While testing src/test/recovery with a non-zero epoch the inconsistency was
discovered.
The 003_recovery_targets.pl test with the slight modification passes on
Linux x86_64, but fails on Windows x64.
The modification is to append just after the $node_master->init:
command([ 'pg_resetwal', '--epoch=1', $node_master->data_dir ]);
And on Windows I get:
t/003_recovery_targets.pl ............ 2/9 Bailout called.  Further testing
stopped:  pg_ctl start failed
FAILED--Further testing stopped: pg_ctl start failed

regress_log_003_recovery_targets contains:
# pg_ctl start failed; logfile:
2020-12-18 12:25:11.597 MSK [1312] LOG:  invalid value for parameter
"recovery_target_xid": "4294967782"
2020-12-18 12:25:11.597 MSK [1312] FATAL:  configuration file
".../src/test/recovery/tmp_check/t_003_recovery_targets_standby_2_data/pgdata/postgresql.conf"
contains errors

So on Windows the recovery_target_xid parameter doesn't accept 64-bit
value.
As I see in check_recovery_target_xid() the strtoul() function is used to
check/convert the input value, that is later stored into 32-bit variable.
Thus, on platforms where unsigned long is 32-bit wide, you can't specify a
result of pg_current_xact_id() as recovery_target_xid, while on platforms
where unsigned long is 64-bit, you can, but high 32 bits are just ignored.


Re: BUG #16780: Inconsistent recovery_target_xid handling across platforms

From
Michael Paquier
Date:
On Fri, Dec 18, 2020 at 10:00:00AM +0000, PG Bug reporting form wrote:
> So on Windows the recovery_target_xid parameter doesn't accept 64-bit
> value.
> As I see in check_recovery_target_xid() the strtoul() function is used to
> check/convert the input value, that is later stored into 32-bit variable.
> Thus, on platforms where unsigned long is 32-bit wide, you can't specify a
> result of pg_current_xact_id() as recovery_target_xid, while on platforms
> where unsigned long is 64-bit, you can, but high 32 bits are just ignored.

In order to make the experience consistent across all platforms, we
can use pg_strtouint64() for the GUC parsing.  I agree that ignoring
the high bits can be confusing as recovery would stop once a XID
matches with a modulo of UINT32_MAX, but beginning to issue an error
on platforms where unsigned long is 8 bytes could also break existing
tools.  At the same time, WAL record headers and 2PC state data only
use 32-bit XIDs, so ignoring the high bits changes nothing at
recovery (XIDs don't go across more than one epoch in the WAL stream
AFAIK).  In short, I agree that there is room for improvement here and
that we should just allow the case to work by replacing the use of
strtoul() to pg_strtouint64().  I am ready to bet that a lot of users
setting a target XID rely on the return result of txid_current() or
pg_current_xact_id() similarly to the test failing here.  And once the
epoch increments, the problems begin.

Thoughts?
--
Michael

Attachment

Re: BUG #16780: Inconsistent recovery_target_xid handling across platforms

From
Michael Paquier
Date:
On Mon, Dec 21, 2020 at 04:13:17PM +0900, Michael Paquier wrote:
> In order to make the experience consistent across all platforms, we
> can use pg_strtouint64() for the GUC parsing.  I agree that ignoring
> the high bits can be confusing as recovery would stop once a XID
> matches with a modulo of UINT32_MAX, but beginning to issue an error
> on platforms where unsigned long is 8 bytes could also break existing
> tools.  At the same time, WAL record headers and 2PC state data only
> use 32-bit XIDs, so ignoring the high bits changes nothing at
> recovery (XIDs don't go across more than one epoch in the WAL stream
> AFAIK).  In short, I agree that there is room for improvement here and
> that we should just allow the case to work by replacing the use of
> strtoul() to pg_strtouint64().  I am ready to bet that a lot of users
> setting a target XID rely on the return result of txid_current() or
> pg_current_xact_id() similarly to the test failing here.  And once the
> epoch increments, the problems begin.

So, attached is a patch to make the logic more consistent that I would
like to apply down to 12.  Please let me know if there are any
objections or comments.
--
Michael

Attachment

Re: BUG #16780: Inconsistent recovery_target_xid handling across platforms

From
Michael Paquier
Date:
On Tue, Dec 22, 2020 at 02:02:46PM +0900, Michael Paquier wrote:
> So, attached is a patch to make the logic more consistent that I would
> like to apply down to 12.  Please let me know if there are any
> objections or comments.

I have looked at that again this morning, and applied a fix down to
9.6 with the change in the TAP tests based on pg_resetwal.  In 9.5,
pg_strtouint64() is not available, but this branch is old enough that
it did not seem worth the extra effort.
--
Michael

Attachment

Re: BUG #16780: Inconsistent recovery_target_xid handling across platforms

From
Alexander Lakhin
Date:
23.12.2020 07:44, Michael Paquier wrote:
> On Tue, Dec 22, 2020 at 02:02:46PM +0900, Michael Paquier wrote:
>> So, attached is a patch to make the logic more consistent that I would
>> like to apply down to 12.  Please let me know if there are any
>> objections or comments.
> I have looked at that again this morning, and applied a fix down to
> 9.6 with the change in the TAP tests based on pg_resetwal.  In 9.5,
> pg_strtouint64() is not available, but this branch is old enough that
> it did not seem worth the extra effort.
Thank you!
I think that with this change, if the check for epoch will be added
someday, it will work consistently too.

Best regards,
Alexander



Re: BUG #16780: Inconsistent recovery_target_xid handling across platforms

From
Michael Paquier
Date:
On Wed, Dec 23, 2020 at 09:00:00AM +0300, Alexander Lakhin wrote:
> I think that with this change, if the check for epoch will be added
> someday, it will work consistently too.

WAL record size has always been a sensitive topic, and this would add
4 extra bytes to each record.  So that's hard to say.
--
Michael

Attachment