On Mon, 2009-04-13 at 14:52 +0900, Fujii Masao wrote:
> A lookahead (the +1) may have pg_standby get stuck as follows.
> Am I missing something?
>
> 1. the trigger file containing "smart" is created.
> 2. pg_standby is executed.
> 2-1. nextWALfile is restored.
> 2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
> 3. the restored nextWALfile is applied.
> 4. pg_standby is executed again to restore nextWALfile+1.

This can't happen. (4) will never occur once (2-2) has occurred. A
non-zero exit code means the file is not available, which causes recovery
to end, and hence no requests for further WAL files are made.
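
To make that concrete, here is a toy model of the request loop as I
describe it above. It is only a sketch, not the server or pg_standby code:
restore_command() and run_recovery() are hypothetical names, and WAL
segments are modelled as plain integers purely for illustration.

```python
def restore_command(archive, segment):
    """Hypothetical stand-in for the restore command.

    Returns 0 if the requested segment is available, non-zero otherwise.
    """
    return 0 if segment in archive else 1


def run_recovery(archive, first_segment):
    """Request segments in order until the restore command fails."""
    requested, applied = [], []
    segment = first_segment
    while True:
        requested.append(segment)
        if restore_command(archive, segment) != 0:
            # File not available: recovery ends here and nothing
            # further is ever requested.
            break
        applied.append(segment)
        segment += 1
    return requested, applied


archive = {5, 6, 7}  # assumed archive contents, for illustration only
requested, applied = run_recovery(archive, 5)
print(requested)
print(applied)
```

With segments 5 to 7 in the archive, the model requests 5, 6, 7 and 8,
applies the first three and stops. A segment that shows up after the
failed request is never asked for.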

It does *seem* as if there is a race condition there, in that another WAL
file may arrive after we have decided that there are no more WAL files,
but it's not a problem. That could happen only if we issue the trigger
while the master is still up, which is a mistake - why would we do that?
If we only issue the trigger once we are happy the master is down then
we don't get a problem.

So let's do it the next+1 way, when triggered.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support