Re: PATCH: Exclude additional directories in pg_basebackup - Mailing list pgsql-hackers

From Robert Haas
Subject Re: PATCH: Exclude additional directories in pg_basebackup
Date
Msg-id CA+TgmoZGX_oGLcacRMnAsfmwVxw6euXHTSD3qxbW3n_pAqav1g@mail.gmail.com
In response to Re: PATCH: Exclude additional directories in pg_basebackup  (David Steele <david@pgmasters.net>)
Responses Re: PATCH: Exclude additional directories in pg_basebackup  (David Steele <david@pgmasters.net>)
List pgsql-hackers
On Wed, Aug 17, 2016 at 2:50 PM, David Steele <david@pgmasters.net> wrote:
> Hi Robert,
>
> On 8/17/16 11:27 AM, Robert Haas wrote:
>> On Mon, Aug 15, 2016 at 3:39 PM, David Steele <david@pgmasters.net> wrote:
>>> Recently a hacker proposed a patch to add pg_dynshmem to the list of
>>> directories whose contents are excluded in pg_basebackup.  I wasn't able
>>> to find the original email despite several attempts.
>>>
>>> That patch got me thinking about what else could be excluded and after
>>> some investigation I found the following: pg_notify, pg_serial,
>>> pg_snapshots, pg_subtrans.  These directories are all cleaned, zeroed,
>>> or rebuilt on server start.
>>
>> Eh ... I doubt very much that it's safe to blow away the entire
>> contents of an SLRU between shutdown and startup, even if the data is
>> technically transient data that won't be needed again after the system
>> is reset.
>
> I've done pretty extensive testing in pgBackRest and haven't seen issues
> in any supported version (plus I audited each init() function for every
> version back to where it was introduced).  The patch also passes all the
> pg_basebackup TAP tests in master.
>
> If you are correct it may indicate a problem anyway. Consider a standby
> backup where the files in these directories may be incredibly stale
> since they are not replicated.  Once restored to a master should we
> trust anything in these files?
>
> pg_serial, pg_notify, pg_subtrans are not even fsync'd
> (SlruCtl->do_fsync = false).  It's hard to imagine there's anything of
> value in there or that it can be trusted if there is.

It's not just a question of whether the data has value; it's a
question of whether the SLRU code will handle the situation correctly
in all cases if the directory contains no files.  I don't think you
can draw a firm conclusion on that without reading the code.

> The files in pg_snapshots and pg_dynshmem are simply deleted on startup
> so that seems safe enough.

Agreed.
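
For illustration only, here is a minimal sketch of what "excluding the
contents" of those two directories could look like in a file-level copy.
This is not the patch's code and not how pg_basebackup is implemented; the
function name copy_excluding_contents and the EXCLUDE_CONTENTS set are
hypothetical, and the sketch assumes the excluded directories sit directly
under the data directory. The directories themselves are still created
empty in the copy, and only their contents are skipped.

```python
import os
import shutil

# Assumption: only the two directories agreed on in this thread, whose
# files are simply deleted on startup.
EXCLUDE_CONTENTS = {"pg_snapshots", "pg_dynshmem"}


def copy_excluding_contents(src, dst, exclude=EXCLUDE_CONTENTS):
    """Copy the tree at src to dst, keeping excluded directories but
    skipping everything inside them."""
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = dst if rel == "." else os.path.join(dst, rel)
        # Always create the directory itself, even when it is excluded.
        os.makedirs(target, exist_ok=True)
        if rel in exclude:
            dirs[:] = []  # do not descend into subdirectories
            continue      # and do not copy any files
        for name in files:
            shutil.copy2(os.path.join(root, name), os.path.join(target, name))
```

Something like copy_excluding_contents("/path/to/pgdata", "/path/to/copy")
would leave pg_snapshots and pg_dynshmem present but empty in the copy.
Note that this says nothing about the SLRU directories, which is exactly
the case that needs a reading of the code.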

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


