Re: PATCH: Exclude additional directories in pg_basebackup - Mailing list pgsql-hackers

From David Steele
Subject Re: PATCH: Exclude additional directories in pg_basebackup
Date
Msg-id 7cd6011b-8f10-8062-a6b0-a9a6035d0ac7@pgmasters.net
Whole thread Raw
In response to Re: PATCH: Exclude additional directories in pg_basebackup  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 8/17/16 2:53 PM, Robert Haas wrote:
> On Wed, Aug 17, 2016 at 2:50 PM, David Steele <david@pgmasters.net> wrote:
>> Hi Robert,
>>
>> On 8/17/16 11:27 AM, Robert Haas wrote:
>>> On Mon, Aug 15, 2016 at 3:39 PM, David Steele <david@pgmasters.net> wrote:
>>>> Recently a hacker proposed a patch to add pg_dynshmem to the list of
>>>> directories whose contents are excluded in pg_basebackup.  I wasn't able
>>>> to find the original email despite several attempts.
>>>>
>>>> That patch got me thinking about what else could be excluded and after
>>>> some investigation I found the following: pg_notify, pg_serial,
>>>> pg_snapshots, pg_subtrans.  These directories are all cleaned, zeroed,
>>>> or rebuilt on server start.
>>>
>>> Eh ... I doubt very much that it's safe to blow away the entire
>>> contents of an SLRU between shutdown and startup, even if the data is
>>> technically transient data that won't be needed again after the system
>>> is reset.
>>
>> If you are correct it may indicate a problem anyway. Consider a standby
>> backup where the files in these directories may be incredibly stale
>> since they are not replicated.  Once restored to a master should we
>> trust anything in these files?
>>
>> pg_serial, pg_notify, pg_subtrans are not even fsync'd
>> (SlruCtl->do_fsync = false).  It's hard to imagine there's anything of
>> value in there or that it can be trusted if there is.
> 
> It's not just a question of whether the data has value; it's a
> question of whether the SLRU code will handle the situation correctly
> in all cases if the directory contains no files.  I don't think you
> can draw a firm conclusion on that without reading the code.

A good point, Robert.  I read the code but I should have shared my
findings in the original post. I'll only cover the the cases we have not
already decided were safe (those that explicitly delete files in their
init routines, pg_snapshots and pg_dynshmem).

First, SlruPhysicalWritePage() will create a file/page that does not
exist and SlruPhysicalReadPage() will return zeros for a file/page that
does not exist during recovery.  Not that I think the latter is
important for pg_serial, pg_notify, or pg_subtrans since they are not
WAL-logged.  As far as I can see the slru system is very resilient overall.

For pg_notify, AsyncShmemInit() explicitly deletes all files on startup.

For pg_subtrans, StartupSUBTRANS() explicitly zeroes all required pages
but does not flush them to disk.  Since SlruPhysicalWritePage() is
tolerant of missing files/pages this should be fine.

For pg_serial, OldSerXidInit() doesn't zero anything out since it
ignores the old transactions and they get truncated as the transaction
horizon moves.  The old data hangs around for a little while so it could
theoretically be used for debugging as Kevin notes.  Since pg_serial
would only be cleared after a restore I don't see that as a big issue.

-- 
-David
david@pgmasters.net



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Why we lost Uber as a user
Next
From: Tom Lane
Date:
Subject: Re: regexp_match() returning text