Re: pg_rewind failure by file deletion in source server - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pg_rewind failure by file deletion in source server
Date
Msg-id CAB7nPqRQiQCJU8NNK87tAfAeCWnWCo191QNVCjxhn07uh0d1Fw@mail.gmail.com
Whole thread Raw
In response to Re: pg_rewind failure by file deletion in source server  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: pg_rewind failure by file deletion in source server  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Wed, Jun 24, 2015 at 1:40 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Tue, Jun 23, 2015 at 11:21 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 06/23/2015 05:03 PM, Fujii Masao wrote:
>>>
>>> On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinnaka@iki.fi>
>>> wrote:
>>>>
>>>> On 06/23/2015 07:51 AM, Michael Paquier wrote:
>>>>>
>>>>>
>>>>> So... Attached are a set of patches dedicated at fixing this issue:
>>>>
>>>>
>>>>
>>>> Thanks for working on this!
>>>>
>>>>> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if
>>>>> path does not exist
>>>>> - 0002, same with pg_stat_file, returning NULL if file does not exist
>>>>> - 0003, same with pg_read_*file. I added them to all the existing
>>>>> functions for consistency.
>>>>> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
>>>>> (thanks Robert for the naming!)
>>>>> - 0005, as things get complex, a set of regression tests aimed to
>>>>> covering those things. pg_tablespace_location is platform-dependent,
>>>>> so there are no tests for it.
>>>>> - 0006, the fix for pg_rewind, using what has been implemented before.
>>>>
>>>>
>>>>
>>>> With thes patches, pg_read_file() will return NULL for any failure to
>>>> open
>>>> the file, which makes pg_rewind to assume that the file doesn't exist in
>>>> the
>>>> source server, and will remove the file from the destination. That's
>>>> dangerous, those functions should check specifically for ENOENT.
>>>
>>>
>>> I'm wondering if using pg_read_file() to copy the file from source server
>>> is reasonable. ISTM that it has two problems as follows.
>>>
>>> 1. It cannot read very large file like 1GB file. So if such large file was
>>>      created in source server after failover, pg_rewind would not be able
>>>      to copy the file. No?
>>
>>
>> pg_read_binary_file() handles large files just fine. It cannot return more
>> than 1GB in one call, but you can call it several times and retrieve the
>> file in chunks. That's what pg_rewind does, except for reading the control
>> file, which is known to be small.
>
> Yeah, you're right.
>
> I found that pg_rewind creates a temporary table to fetch the file in chunks.
> This would prevent pg_rewind from using the *hot standby* server as a source
> server at all. This is of course a limitation of pg_rewind, but we might want
> to alleviate it in the future.

This is something that a replication command could address properly.

>>> 2. Many users may not allow a remote client to connect to the
>>>      PostgreSQL server as a superuser for some security reasons. IOW,
>>>      there would be no entry in pg_hba.conf for such connection.
>>>      In this case, pg_rewind always fails because pg_read_file() needs
>>>      superuser privilege. No?
>>>
>>> I'm tempting to implement the replication command version of
>>> pg_read_file(). That is, it reads and sends the data like BASE_BACKUP
>>> replication command does...
>>
>>
>> Yeah, that would definitely be nice. Peter suggested it back in January
>> (http://www.postgresql.org/message-id/54AC4801.7050300@gmx.net). I think
>> it's way too late to do that for 9.5, however. I'm particularly worried that
>> if we design the required API in a rush, we're not going to get it right,
>> and will have to change it again soon. That might be difficult in a minor
>> release. Using pg_read_file() and friends is quite flexible, even though we
>> just find out that they're not quite flexible enough right now (the ENOENT
>> problem).
>
> I agree that it's too late to do what I said...
>
> But just using pg_read_file() cannot address the #2 problem that I pointed
> in my previous email. Also requiring a superuser privilege on pg_rewind
> really conflicts with the motivation why we added replication privilege.
>
> So we should change pg_read_file() so that even replication user can
> read the file?

From the security prospective, a replication user can take a base
backup so it can already retrieve easily the contents of PGDATA. Hence
I guess that it would be fine. However, what about cases where
pg_hba.conf authorizes access to a given replication user via psql and
blocks it for the replication protocol? We could say that OP should
not give out replication access that easily, but in this case the user
would have access to the content of PGDATA even if he should not. Is
that unrealistic?

> Or replication user version of pg_read_file() should be implemented?

You mean a new function? In what is it different from authorizing
pg_read_file usage for a replication user?

Honestly, I can live with this superuser restriction in 9.5. And come
back to the replication user restriction in 9.6 once things cool down
a bit.
-- 
Michael



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Should we back-patch SSL renegotiation fixes?
Next
From: Robert Haas
Date:
Subject: Re: Should we back-patch SSL renegotiation fixes?