Re: PATCH: Exclude unlogged tables from base backups - Mailing list pgsql-hackers

From: Stephen Frost
Subject: Re: PATCH: Exclude unlogged tables from base backups
Date:
Msg-id: 20171213014804.GH4628@tamriel.snowman.net
In response to: Re: PATCH: Exclude unlogged tables from base backups (Andres Freund <andres@anarazel.de>)
Responses: Re: PATCH: Exclude unlogged tables from base backups (David Steele <david@pgmasters.net>)
           Re: PATCH: Exclude unlogged tables from base backups (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Andres,

* Andres Freund (andres@anarazel.de) wrote:
> On 2017-12-12 18:04:44 -0500, David Steele wrote:
> > If the forks are written out of order (i.e. main before init), which is
> > definitely possible, then I think worst case is some files will be backed up
> > that don't need to be.  The main fork is unlikely to be very large at that
> > point so it doesn't seem like a big deal.
> >
> > I don't see this as any different than what happens during recovery. The
> > unlogged forks are cleaned / re-inited before replay starts which is the
> > same thing we are doing here.
>
> It's quite different - in the recovery case there's no other write
> activity going on. But on a normally running cluster the persistence of
> existing tables can get changed, and oids can get recycled.  What
> guarantees that between the time you checked for the init fork the table
> hasn't been dropped, the oid reused and now a permanent relation is in
> its place?

We *are* actually talking about the recovery case here: this is a
backup, and WAL replay will happen after pg_basebackup is done, the
backup has been restored somewhere, and PG has been started up again.

If the persistence is changed then the table will be written into the
WAL, no?  All of the WAL generated during a backup (which is what we're
talking about here) has to be replayed after the restore is done and
before the database is considered consistent, so none of this matters,
as far as I can see, because the DROP TABLE, the ALTER TABLE ... SET
LOGGED, or anything else will be in the WAL that ends up getting
replayed.

If that's not correct, then isn't there a live issue here with how
backups are happening today with unlogged tables and online backups?

I don't think there is, because, as David points out, the unlogged
tables are cleaned up first and then WAL replay happens during recovery.
The init fork will cause the relation to be overwritten, but the logged
'drop table' and the subsequent re-use of the relfilenode to create a
new table (or the persistence change) will all be in the WAL, will be
replayed over the top, and will take care of this.
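
To make the ordering argument concrete, here is a toy model in Python.
It is only a sketch and not PostgreSQL code: the dictionary of forks,
the record kinds ("drop", "create", "insert") and the function names are
all invented for illustration. It models a backup that saw an init fork
for relfilenode 16384 and skipped the main fork, after which the table
was dropped and the relfilenode was reused for a permanent table.

```python
def reset_unlogged(files):
    """Step 1 of recovery: re-initialize every relation that has an init fork."""
    for node, fork in list(files):
        if fork == "init":
            files[(node, "main")] = list(files[(node, "init")])


def replay(files, wal):
    """Step 2 of recovery: apply the WAL records in order."""
    for record in wal:
        kind, node = record[0], record[1]
        if kind == "drop":
            # Remove every fork that belongs to this relfilenode.
            for key in [k for k in files if k[0] == node]:
                del files[key]
        elif kind == "create":
            files[(node, "main")] = []
            if record[2] == "unlogged":
                files[(node, "init")] = []
        elif kind == "insert":
            files[(node, "main")].append(record[2])


def recover(backup, wal):
    """Restore a backup: clean up unlogged relations first, then replay WAL."""
    files = {key: list(rows) for key, rows in backup.items()}
    reset_unlogged(files)
    replay(files, wal)
    return files


# The backup holds only the init fork; the main fork was excluded.
backup = {(16384, "init"): []}

# WAL written during the backup: the drop, the reuse of the relfilenode
# for a permanent table, and the rows inserted into that new table.
wal = [
    ("drop", 16384),
    ("create", 16384, "permanent"),
    ("insert", 16384, "row1"),
    ("insert", 16384, "row2"),
]

result = recover(backup, wal)
print(result)
```

In this model the final state contains only the permanent table's main
fork with both rows, whether or not a stale main fork was copied into
the backup. The model assumes that every drop, create and insert for the
reused relfilenode is present in the replayed WAL, which is exactly the
premise of the argument above.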

Thanks!

Stephen
