Re: pg_rewind failure by file deletion in source server - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: pg_rewind failure by file deletion in source server
Msg-id 55896B58.1000906@iki.fi
In response to Re: pg_rewind failure by file deletion in source server  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: pg_rewind failure by file deletion in source server  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On 06/23/2015 05:03 PM, Fujii Masao wrote:
> On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 06/23/2015 07:51 AM, Michael Paquier wrote:
>>>
>>> So... Attached is a set of patches dedicated to fixing this issue:
>>
>>
>> Thanks for working on this!
>>
>>> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if
>>> path does not exist
>>> - 0002, same with pg_stat_file, returning NULL if file does not exist
>>> - 0003, same with pg_read_*file. I added them to all the existing
>>> functions for consistency.
>>> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
>>> (thanks Robert for the naming!)
>>> - 0005, as things get complex, a set of regression tests aimed at
>>> covering those things. pg_tablespace_location is platform-dependent,
>>> so there are no tests for it.
>>> - 0006, the fix for pg_rewind, using what has been implemented before.
>>
>>
>> With these patches, pg_read_file() will return NULL for any failure to open
>> the file, which makes pg_rewind assume that the file doesn't exist in the
>> source server, and it will remove the file from the destination. That's
>> dangerous; those functions should check specifically for ENOENT.
>
> I'm wondering if using pg_read_file() to copy the file from source server
> is reasonable. ISTM that it has two problems as follows.
>
> 1. It cannot read a very large file, such as a 1GB file. So if such a large
>      file was created in the source server after failover, pg_rewind would
>      not be able to copy the file. No?

pg_read_binary_file() handles large files just fine. It cannot return 
more than 1GB in one call, but you can call it several times and 
retrieve the file in chunks. That's what pg_rewind does, except for 
reading the control file, which is known to be small.
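
As a rough illustration of the chunked approach (a sketch only, not
pg_rewind's actual code), the loop looks something like the following.
Here read_chunk is a hypothetical stand-in for running
SELECT pg_read_binary_file(path, offset, length) against the source
server, and the 1 MB chunk size is an arbitrary assumption:

```python
from typing import Callable, Iterator

# Hypothetical stand-in for executing
#   SELECT pg_read_binary_file(path, offset, length)
# on the source server. Takes (offset, length) and returns the bytes read.
ReadChunk = Callable[[int, int], bytes]


def fetch_in_chunks(read_chunk: ReadChunk, file_size: int,
                    chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield a file's contents piece by piece, never asking for more
    than chunk_size bytes in a single call."""
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        data = read_chunk(offset, length)
        if not data:
            # The file shrank on the source after its size was recorded.
            break
        yield data
        offset += len(data)


def make_fake_reader(content: bytes) -> ReadChunk:
    """In-memory reader used only to exercise the loop without a server."""
    return lambda offset, length: content[offset:offset + length]
```

Each call stays well under the 1GB per-call limit, so the total file
size is not bounded by it.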

> 2. Many users may not allow a remote client to connect to the
>      PostgreSQL server as a superuser for some security reasons. IOW,
>      there would be no entry in pg_hba.conf for such connection.
>      In this case, pg_rewind always fails because pg_read_file() needs
>      superuser privilege. No?
>
> I'm tempted to implement the replication command version of
> pg_read_file(). That is, it reads and sends the data like BASE_BACKUP
> replication command does...

Yeah, that would definitely be nice. Peter suggested it back in January 
(http://www.postgresql.org/message-id/54AC4801.7050300@gmx.net). I think 
it's way too late to do that for 9.5, however. I'm particularly worried 
that if we design the required API in a rush, we're not going to get it 
right, and will have to change it again soon. That might be difficult in 
a minor release. Using pg_read_file() and friends is quite flexible, 
even though we just found out that they're not quite flexible enough 
right now (the ENOENT problem).
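
To spell out the ENOENT distinction, here is a minimal sketch in Python
rather than the backend's C. read_if_exists is a hypothetical name, not
a function from the patches: only "no such file" maps to a NULL-like
result, and every other failure is still reported as an error.

```python
import errno
from typing import Optional


def read_if_exists(path: str) -> Optional[bytes]:
    """Return the file's contents, or None only if the file is missing.

    Any other failure (permissions, I/O error, path is a directory, ...)
    is re-raised, so the caller cannot mistake it for "file does not
    exist" and delete its own copy.
    """
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None  # missing file: the one case treated as "absent"
        raise            # anything else must stay a hard error
```

With that behaviour, a caller that sees None can treat the file as
deleted on the source, while a permission error aborts the operation
instead of causing a removal on the destination.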

- Heikki



