Re: New trigger option of pg_standby - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: New trigger option of pg_standby
Date
Msg-id 49F01D79.8020805@enterprisedb.com
In response to Re: New trigger option of pg_standby  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: New trigger option of pg_standby  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
Fujii Masao wrote:
> On Wed, Apr 22, 2009 at 4:27 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Fujii Masao wrote:
>>> On Tue, Apr 14, 2009 at 2:41 PM, Fujii Masao <masao.fujii@gmail.com>
>>> wrote:
>>>> I'd like to propose another simple idea; pg_standby deletes the
>>>> trigger file *whenever* the nextWALfile is a timeline history file.
>>>> A timeline history file is restored at the end of recovery, so it's
>>>> guaranteed that the trigger file is deleted whether nextWALfile
>>>> exists or not.
>>>>
>>>> A timeline history file is restored also at the beginning of
>>>> recovery, so the accidentally remaining trigger file is deleted
>>>> in early warm-standby as a side-effect of this idea.
>>> Here is the revised patch as above.
>>>
>>> If you notice something, please feel free to comment.
>> Ok, looking at this in more detail now. A couple of small things:
>>
>> We mustn't remove the trigger file immediately even in fast mode. As noted
>> elsewhere in this thread, we have the same bug in fast mode where pg_standby
>> gets stuck if you copy WAL files directly into pg_xlog.
> 
> Yes, there is the same problem also in fast mode. But, in fast
> mode, the trigger file has to be deleted immediately if it's found.
> Otherwise, recovery may fail as follows.
> 
> 1. pg_standby finds the trigger file for fast mode, and returns
>     non-zero without deleting the trigger file.
> 2. the startup process tries to read the WAL file from pg_xlog,
>     but it's not found.
> 3. the startup process tries to restore the last applied WAL file
>     using pg_standby.
> 4. (Again) pg_standby finds the trigger file for fast mode, and
>     returns non-zero without deleting the trigger file.
> 5. the startup process tries to read the last applied WAL file,
>     but it's not found.
>     (though the last applied file was of course restored before,
>      the restored one cannot be read here)
> 6. recovery fails because the last applied WAL file cannot be
>     read.
> 
> On the other hand, if pg_standby returns 0 also in fast mode
> when the requested file and trigger file exist, ISTM that there
> is not much difference between fast and smart mode; also in
> fast mode, all the available WAL files would be applied.

Hmm, pg_standby could truncate the trigger file, so that it acts like a
smart trigger in subsequent pg_standby invocations. That assumes the
postgres user has write access to the file. It probably does, because it
can delete the file, but conceivably it has only read access on the file
and write access on the directory it's in.
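
To make the truncation idea concrete, here is a minimal sketch in Python (not pg_standby's actual C code). It assumes the trigger file's contents select the mode, that the string "fast" means a fast trigger, and that an empty file or any other content counts as a smart trigger. The function name check_trigger and its return values are hypothetical.

```python
import os


def check_trigger(trigger_path):
    """Return 'none', 'smart', or 'fast' for the given trigger file.

    Assumptions of this sketch: the file contents select the mode, and an
    empty file (or any content other than 'fast') means a smart trigger.
    """
    try:
        with open(trigger_path, "r") as f:
            mode = f.read().strip()
    except FileNotFoundError:
        return "none"

    if mode != "fast":
        return "smart"

    # Fast trigger found: truncate instead of deleting, so that later
    # invocations see an empty file and behave like a smart trigger.
    try:
        os.truncate(trigger_path, 0)
    except OSError:
        # No write access to the file (the case mentioned above): the file
        # is left as it is, so the next invocation sees "fast" again.
        pass
    return "fast"
```

With this behaviour the first invocation that sees a fast trigger reports "fast", and every later invocation reports "smart" until the file is removed. The read-only case is deliberately left unresolved here, since that is exactly the open question.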

This is getting complicated, though. I guess it would be enough to 
document that you mustn't copy any extra files into pg_xlog if you use a 
fast trigger.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

