On Thu, Oct 28, 2021 at 11:39:52AM -0400, Robert Haas wrote:
> I previously reported[1] that CreateReplicationSlot() was accessing
> the global variable ThisTimeLineID when it might not be initialized.
> Today I realized that the code in basebackup.c has the same problem.
> perform_base_backup() blithely uses the variable six times without
> doing anything at all to initialize it. In practice, it's always going
> to be initialized to some non-zero value, because pg_basebackup is
> always going to execute IDENTIFY_SYSTEM before it executes
> BASE_BACKUP, and that's going to set ThisTimeLineID as a side effect.
> But, if you hack pg_basebackup to not call IDENTIFY_SYSTEM before
> calling BASE_BACKUP, then you can reach this code with ThisTimeLineID
> == 0 using pg_basebackup -Xfetch. Even if you don't do that, you're
> only guaranteed that ThisTimeLineID is initialized to something, not
> that it's initialized to the correct thing. The timeline on the
> standby can change any time.
Yes, I agree that it is a good idea to cut the dependency of those
code paths with ThisTimeLineID, expecting IDENTIFY_SYSTEM to have done
the job beforehand. One argument in favor of your change, though I'd
like to think that nobody does so, is that users could run BASE_BACKUP
with a replication connection. So no need to hack pg_basebackup to be
able to finish with a WAL sender that has no TLI set in the backend.
Your patch seems correct to me.
--
Michael