Re: New trigger option of pg_standby - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: New trigger option of pg_standby
Date
Msg-id 3f0b79eb0904122252t1f4c5860v919d199f399ccf72@mail.gmail.com
In response to Re: New trigger option of pg_standby  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: New trigger option of pg_standby  (Guillaume Smet <guillaume.smet@gmail.com>)
Re: New trigger option of pg_standby  (Simon Riggs <simon@2ndQuadrant.com>)
Re: New trigger option of pg_standby  (Fujii Masao <masao.fujii@gmail.com>)
Re: New trigger option of pg_standby  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Hi,

On Sat, Apr 11, 2009 at 1:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> Fujii-san,
>
> I like the new patch using the content of the file to determine the
> mode. Much easier to use at failover time.
>
> On Fri, 2009-04-10 at 12:47 +0900, Fujii Masao wrote:
>
>> > One problem with this patch is that in smart mode, the trigger file is not
>> > deleted. That's different from current pg_standby behavior, and makes
>> > accidental failovers after one failover more likely.
>>
>> Yes, it's because pg_standby cannot be sure when the trigger file
>> can be removed in smart mode. If the trigger file is deleted as soon
>> as it's found, just like in fast mode, pg_standby may keep waiting
>> for WAL file again.
>
> My understanding of smart mode is fairly simple:
>
>   if (triggered)
>   {
>        if (smartMode && nextWALfile+1 exists)
>                exit(0);
>        else
>        {
>                delete trigger file
>                exit(1);
>        }
>   }
>
> If you perform a file lookahead (the +1) as shown above then you avoid
> the problem Heikki observes.

Thanks for the suggestion!

A lookahead at nextWALfile+1 may cause pg_standby to get stuck, as follows.
Am I missing something?

1. the trigger file containing "smart" is created.
2. pg_standby is executed.
   2-1. nextWALfile is restored.
   2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
3. the restored nextWALfile is applied.
4. pg_standby is executed again to restore nextWALfile+1.
5. pg_standby gets stuck because the trigger file and nextWALfile+1
   don't exist.

But a lookahead at nextWALfile itself (without the +1) seems to work fine.

if (triggered)
{
    if (smartMode && nextWALfile exists)
        exit(0)
    else
    {
        delete trigger file
        exit(1)
    }
}

1. the trigger file containing "smart" is created.
2. pg_standby is executed.
   2-1. nextWALfile is restored.
3. the restored nextWALfile is applied.
4. pg_standby is executed again to restore nextWALfile+1.
   4-1. the trigger file is deleted because nextWALfile+1 doesn't exist.
5. the startup process fails to read nextWALfile+1.
6. pg_standby is executed again to re-fetch nextWALfile.
   6-1. nextWALfile is restored.
   6-2. pg_standby doesn't get stuck because nextWALfile exists.
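
To make the two walk-throughs above easier to compare, here is a rough
simulation sketch. It is not pg_standby code. The names (struct sim,
run_pg_standby, failover_completes) are made up for illustration, WAL files
are modeled as plain segment numbers starting at 1, and I assume the startup
process re-fetches the previous segment once after a restore failure, as in
step 6. The lookahead argument is 1 for the "+1" variant and 0 for the
variant that checks nextWALfile itself.

```c
#include <stdbool.h>

/* Outcome of one (simulated) pg_standby invocation. */
enum result { RESTORED, FAILED, STUCK };

/* Hypothetical model: segments 1..last_archived exist in the archive. */
struct sim {
    int  last_archived;
    bool trigger_present;
};

/* One invocation asking for segment 'seg' in smart mode.
 * lookahead = 1 models "nextWALfile+1 exists",
 * lookahead = 0 models "nextWALfile exists". */
static enum result run_pg_standby(struct sim *s, int seg, int lookahead)
{
    bool have = seg <= s->last_archived;

    if (s->trigger_present) {
        if (seg + lookahead <= s->last_archived)
            return RESTORED;            /* exit(0), trigger file kept */
        s->trigger_present = false;     /* delete trigger file */
        /* As in step 2-1/2-2: the file is still restored if it exists. */
        return have ? RESTORED : FAILED;
    }
    /* No trigger: restore if possible, otherwise wait forever. */
    return have ? RESTORED : STUCK;
}

/* Simulated startup process; the trigger file exists from the start.
 * Expects last_archived >= 1. Returns true if recovery can end. */
static bool failover_completes(int last_archived, int lookahead)
{
    struct sim s = { last_archived, true };
    int seg = 1;

    for (;;) {
        enum result r = run_pg_standby(&s, seg, lookahead);
        if (r == RESTORED) {
            seg++;                      /* apply it, ask for the next one */
            continue;
        }
        if (r == STUCK)
            return false;
        /* FAILED: re-fetch the last applied segment (step 6). */
        return run_pg_standby(&s, seg - 1, lookahead) == RESTORED;
    }
}
```

Under these assumptions, failover_completes(1, 1) reports the stuck case and
failover_completes(1, 0) reports a completed failover.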

Furthermore, pg_standby may have to check whether nextWALfile exists
not only in archiveLocation but also in pg_xlog. This is because, when
pg_xlog of the primary server can still be read at failover, the WAL files
in it may be copied to pg_xlog of the standby server to be applied
(though I'm not sure whether it's better to copy such files to pg_xlog
or to archiveLocation in this case).
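
A minimal sketch of such a two-location check, assuming a POSIX system. The
function names and the fixed-size path buffer are only for illustration and
are not taken from pg_standby:

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns true if dir/name exists and is a regular file. */
static bool file_exists_in(const char *dir, const char *name)
{
    char        path[1024];
    struct stat st;
    int         n = snprintf(path, sizeof(path), "%s/%s", dir, name);

    /* Treat an encoding error or a truncated path as "not found". */
    if (n < 0 || (size_t) n >= sizeof(path))
        return false;
    return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

/* Hypothetical helper: look for the WAL file in the archive first,
 * then in the standby's pg_xlog directory. */
static bool next_wal_exists(const char *archive_dir,
                            const char *xlog_dir,
                            const char *wal_name)
{
    return file_exists_in(archive_dir, wal_name) ||
           file_exists_in(xlog_dir, wal_name);
}
```

Whether pg_standby should really look into pg_xlog itself is exactly the
open question above; this only shows what the check might look like.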

Comments?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

