Re: base backup client as auxiliary backend process - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: base backup client as auxiliary backend process
Date
Msg-id 20191107041630.GK1768@paquier.xyz
Whole thread Raw
In response to Re: base backup client as auxiliary backend process  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: base backup client as auxiliary backend process  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Mon, Oct 28, 2019 at 09:30:52AM +0100, Peter Eisentraut wrote:
> On 2019-09-18 10:31, Michael Paquier wrote:
>> -    * Verify XLOG status looks valid.
>> +    * Check that contents look valid.
>>       */
>> -   if (ControlFile->state < DB_SHUTDOWNED ||
>> -       ControlFile->state > DB_IN_PRODUCTION ||
>> -       !XRecOffIsValid(ControlFile->checkPoint))
>> +   if (!XRecOffIsValid(ControlFile->checkPoint))
>>               ereport(FATAL,
>> Doesn't seem like a good idea to me to remove this sanity check for
>> normal deployments, but actually you moved that down in StartupXLOG().
>> It seems to me tha this is unrelated and could be a separate patch so
>> as the errors produced are more verbose.  I think that we should also
>> change that code to use a switch/case on ControlFile->state.
>
> Done.  Yes, this was really a change made to get more precise error messaged
> during debugging.  It could be committed separately.

If you wish to do so now, that's fine by me.

>> The current defaults of pg_basebackup have been thought so as the
>> backups taken have a good stability and so as monitoring is eased
>> thanks to --wal-method=stream and the use of replication slots.
>> Shouldn't the use of a least a temporary replication slot be mandatory
>> for the stability of the copy?  It seems to me that there is a good
>> argument for having a second process which streams WAL on top of the
>> main backup process, and just use a WAL receiver for that.
>
> Is this something that the walreceiver should be doing independent of this
> patch?

There could be an argument for switching a WAL receiver to use a
temporary replication slot by default.  Still, it seems to me that
this backup solution suffers from the same set of problems we have
spent years to fix with pg_basebackup with missing WAL files caused by
concurrent checkpoints removing things needed while the copy of the
main data folder and other tablespaces happens.

>> One problem which is not tackled here is what to do for the tablespace
>> map.  pg_basebackup has its own specific trick for that, and with that
>> new feature we may want something equivalent?  Not something to
>> consider as a first stage of course.
>
> The updated has support for tablespaces without mapping.  I'm thinking about
> putting the mapping specification into a GUC list somehow. Shouldn't be too
> hard.

That may become ugly if there are many tablespaces to take care of.
Another idea I can come up with would be to pass the new mapping to
initdb, still this requires an extra intermediate step to store the
new map, and then compare it with the mapping received at BASE_BACKUP
time.  But perhaps you are looking for an experience different than
pg_basebackup.  The first version of the patch does not actually
require that anyway..

>> No more strtol with base 10 stuff please :)
>
> Hmm, why not?  What's the replacement?

I was referring to this patch:
https://commitfest.postgresql.org/25/2272/
It happens that all our calls of strtol in core use a base of 10.  But
please just ignore this part.

ReceiveAndUnpackTarFile() is in both libpqwalreceiver.c and
pg_basebackup.c.  It would be nice to refactor that.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Problem during Windows service start
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Using multiple extended statistics for estimates