Thread: pg_rewind enhancements

pg_rewind enhancements

From
RKN Sai Krishna
Date:
Hi,

While using pg_rewind, I found it somewhat difficult to work with: it seems to copy even the configuration files, and it removes files created on the old primary that are not present on the new primary. Similarly, it copies files under the data directory of the new primary that may not be needed or could be junk files.

I would propose adding a couple of new command-line arguments to pg_rewind. First, a comma-separated list of files that should be preserved on the old primary, in other words, files that shouldn't be overwritten from the new primary. Second, a comma-separated list of files that should be excluded while copying files from the new primary onto the old primary.

Would like to invite more thoughts from the hackers.

Regards,
RKN

Re: pg_rewind enhancements

From
Bharath Rupireddy
Date:
On Fri, Mar 4, 2022 at 7:50 PM RKN Sai Krishna
<rknsaiforpostgres@gmail.com> wrote:
>
> Hi,
>
> While using pg_rewind, I found that it is a bit difficult to use
> pg_rewind as it seems to copy even the configuration files and also
> remove some of the files created on the old primary which may not be
> present on the new primary. Similarly it copies files under the data
> directory of the new primary which may not be needed or which
> possibly could be junk files.

Postgres vendors may keep their own files/directories in the data
directory and may not want pg_rewind to overwrite them. Also, if the
source server is compromised for whatever reason (say, somebody put a
junk file there), nobody wants those files to pass over to the target
server.

> I would propose to have a couple of new command line arguments to
> pg_rewind. One, a comma separated list of files which should be
> preserved on the old primary, in other words which shouldn't be
> overwritten from the new primary.

+1 from my end to have a new pg_rewind option such as --skip-file-list
or --skip-list which is basically a list of files that pg_rewind will
not overwrite in the target directory.
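
To make the idea concrete, here is a minimal sketch of how a
comma-separated value for such an option could be parsed and consulted.
Nothing here exists in pg_rewind today: the SkipList type and the
skip_list_* functions are hypothetical names, and exact string matching
on relative paths is an assumption (a real patch would have to decide
on path canonicalization, directories, and patterns).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed value of a --skip-list option. */
typedef struct SkipList
{
    char  **paths;              /* individually allocated entries */
    size_t  count;              /* number of entries in paths */
} SkipList;

/*
 * Split a comma-separated string into entries, dropping empty ones.
 * Returns false on allocation failure; the caller should then call
 * skip_list_free() to release whatever was already stored.
 */
bool
skip_list_parse(const char *arg, SkipList *out)
{
    const char *start = arg;

    assert(arg != NULL && out != NULL);
    out->paths = NULL;
    out->count = 0;

    for (;;)
    {
        const char *end = strchr(start, ',');
        size_t      len = end ? (size_t) (end - start) : strlen(start);

        if (len > 0)
        {
            char  **grown = realloc(out->paths,
                                    (out->count + 1) * sizeof(char *));
            char   *entry;

            if (grown == NULL)
                return false;
            out->paths = grown;

            entry = malloc(len + 1);
            if (entry == NULL)
                return false;
            memcpy(entry, start, len);
            entry[len] = '\0';
            out->paths[out->count++] = entry;
        }

        if (end == NULL)
            break;
        start = end + 1;        /* continue after the comma */
    }
    return true;
}

/* True if relpath exactly matches one of the listed entries. */
bool
skip_list_contains(const SkipList *list, const char *relpath)
{
    assert(list != NULL && relpath != NULL);
    for (size_t i = 0; i < list->count; i++)
    {
        if (strcmp(list->paths[i], relpath) == 0)
            return true;
    }
    return false;
}

/* Release all entries and reset the list to empty. */
void
skip_list_free(SkipList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->paths[i]);
    free(list->paths);
    list->paths = NULL;
    list->count = 0;
}
```

With something like this, the code that decides what to do with each
file could call skip_list_contains() on the file's path relative to the
data directory and leave the target copy untouched when it returns
true.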

> Second, a comma separated list of files which should be excluded
> while copying files from the new primary onto the old primary.

I'm not sure how this differs from the --skip-file-list or --skip-list
option above.

Another idea I can think of is to be able to tell pg_rewind "don't
copy/bring in any non-postgres files/directories from source server to
target server". This requires pg_rewind to know which files/directories
are postgres ones and which are not. We could probably define a static
list of what constitutes postgres files/directories, but this creates
tight coupling with the core: whenever a new directory or configuration
file gets added to the core, the static list in pg_rewind needs to be
updated. Having said that, initdb.c already has this sort of list for
directories [1]; we would need a similar structure, and probably
another one for files (postgresql.auto.conf, postgresql.conf,
pg_ident.conf, pg_hba.conf, postmaster.opts, backup_label,
standby.signal, recovery.signal etc.).
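
As a rough illustration of the static-list approach, the sketch below
classifies a path relative to the data directory as postgres or
non-postgres. The function name is_known_postgres_path and both arrays
are hypothetical; the arrays contain only the names mentioned in this
mail, so they are deliberately incomplete and would misclassify many
real postgres paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Directories taken from the initdb.c excerpt [1]; the real list is longer. */
static const char *const known_dirs[] = {
    "global",
    "pg_wal/archive_status",
    "pg_commit_ts",
    "pg_dynshmem",
};

/* Top-level files named in this mail; also not exhaustive. */
static const char *const known_files[] = {
    "postgresql.auto.conf",
    "postgresql.conf",
    "pg_ident.conf",
    "pg_hba.conf",
    "postmaster.opts",
    "backup_label",
    "standby.signal",
    "recovery.signal",
};

#define lengthof(array) (sizeof(array) / sizeof((array)[0]))

/*
 * Return true if relpath (relative to the data directory) is one of the
 * known files, or is a known directory or something underneath it.
 */
bool
is_known_postgres_path(const char *relpath)
{
    assert(relpath != NULL);

    for (size_t i = 0; i < lengthof(known_files); i++)
    {
        if (strcmp(relpath, known_files[i]) == 0)
            return true;
    }

    for (size_t i = 0; i < lengthof(known_dirs); i++)
    {
        size_t      len = strlen(known_dirs[i]);

        /* Match the directory itself or any path below it. */
        if (strncmp(relpath, known_dirs[i], len) == 0 &&
            (relpath[len] == '\0' || relpath[len] == '/'))
            return true;
    }
    return false;
}
```

The maintenance cost is visible even at this size: every directory or
file added to the core would need a matching entry in these arrays.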

The above option seems like overkill, but --skip-file-list or
--skip-list is definitely an improvement IMO.

[1] static const char *const subdirs[] = {
    "global",
    "pg_wal/archive_status",
    "pg_commit_ts",
    "pg_dynshmem",

Regards,
Bharath Rupireddy.