Re: PATCH: Exclude unlogged tables from base backups - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: PATCH: Exclude unlogged tables from base backups
Date
Msg-id CAD21AoAT8WgHYt0xXo7erdt=hcvbJrYtZCd96djKS_vZ8UD2Pw@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: Exclude unlogged tables from base backups  (David Steele <david@pgmasters.net>)
Responses Re: PATCH: Exclude unlogged tables from base backups
Re: PATCH: Exclude unlogged tables from base backups
List pgsql-hackers
On Fri, Jan 26, 2018 at 4:58 AM, David Steele <david@pgmasters.net> wrote:
> On 1/25/18 12:31 AM, Masahiko Sawada wrote:
>> On Thu, Jan 25, 2018 at 3:25 AM, David Steele <david@pgmasters.net> wrote:
>>>>
>>>> Here is the first review comments.
>>>>
>>>> +       unloggedDelim = strrchr(path, '/');
>>>>
>>>> I think it doesn't work fine on windows. How about using
>>>> last_dir_separator() instead?
>>>
>>> I think this function is OK on Windows -- we use it quite a bit.
>>> However, last_dir_separator() is clearer so I have changed it.
>>
>> Thank you for updating this. I was concerned about a separator
>> character '/' might not work fine on windows.
>
> Ah yes, I see what you mean now.
>
>> On Thu, Jan 25, 2018 at 6:23 AM, David Steele <david@pgmasters.net> wrote:
>>> On 1/24/18 4:02 PM, Adam Brightwell wrote:
>>> Actually, I was talking to Stephen about this it seems like #3 would be
>>> more practical if we just stat'd the init fork for each relation file
>>> found.  I doubt the stat would add a lot of overhead and we can track
>>> each unlogged relation in a hash table to reduce overhead even more.
>>>
>>
>> Can the readdir handle files that are added during the loop? I think
>> that we still cannot exclude a new unlogged relation if the relation
>> is added after we execute readdir first time. To completely eliminate
>> it we need a sort of lock that prevents to create new unlogged
>> relation from current backends. Or we need to do readdir loop multiple
>> times to see if no new relations were added during sending files.
>
> As far as I know readdir() is platform-dependent in terms of how it
> scans the dir and if files created after the opendir() will appear.
>
> It shouldn't matter, though, since WAL replay will recreate those files.

Yea, agreed.

>
>> If you're updating the patch to implement #3, this patch should be
>> marked as "Waiting on Author". After updated I'll review it again.
> Attached is a new patch that uses stat() to determine if the init fork
> for a relation file exists.  I decided not to build a hash table as it
> could use considerable memory and I didn't think it would be much faster
> than a simple stat() call.
>
> The reinit.c refactor has been removed since it was no longer needed.
> I'll submit the tests I wrote for reinit.c as a separate patch for the
> next CF.
>

Thank you for updating the patch! The patch looks good to me. But I
have a question; can we exclude temp tables as well? The pg_basebackup
includes even temp tables. But I don't think that it's necessary for
backups.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Next
From: Simon Riggs
Date:
Subject: Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions