Re: 9.0beta2 - server crash when using HS + SR - Mailing list pgsql-hackers

From Greg Smith
Subject Re: 9.0beta2 - server crash when using HS + SR
Msg-id 4C1574F0.2070304@2ndquadrant.com
In response to Re: 9.0beta2 - server crash when using HS + SR  (Rafael Martinez <r.m.guerrero@usit.uio.no>)
Responses Re: 9.0beta2 - server crash when using HS + SR  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
Rafael Martinez wrote:
> A minimum and probably the only feasible thing for 9.0 will be to update
> the documentation. We need an entry in the hot-standby caveats section
> explaining that if you create a tablespace and the directory needed does
> not exist on the standby, the standby will shut itself down and will
> not be able to start until the directory is also created on the standby.
>   

This is not a Hot Standby problem, and it's been documented since at 
least http://www.postgresql.org/docs/8.2/static/warm-standby.html ; read 
25.2.1 "Planning" in the current 
http://developer.postgresql.org/pgdocs/postgres/warm-standby.html where 
it's spelled out quite clearly.

It's a mixed blessing that getting a replicated server up is now so much 
easier that people can skip reading that particular document carefully 
and still get something going.  But if there's a documentation change to 
be made, it should be 
highlighting the warning already in that section better; it's not 
something appropriate for the Hot Standby caveats.  Since this is 
clearly documented already, and there are bigger problems to worry about 
for the current release, the real minimum action here (and 
the only one I would consider reasonable) is to change nothing for 9.0 
at this point.  I'm sorry you missed where this was covered, but 
adding redundant documentation for basics like this invariably leads to 
the multiple copies becoming out of sync with one another as changes are 
made in the future.

> 1) PostgreSQL creates the directory needed for the tablespace if the
> user running postgres has privileges to do so at the OS level.
> 2) The standby discovers that the directory needed does not exist and
> pauses recovery (without shutting down the server) at the WAL
> record that creates the tablespace. The standby will check periodically
> if the directory is created before starting the recovery process again.
>   

Given that the idea behind a tablespace is that you want to relocate it 
to a specific storage path, which may not map in the same way on the 
standby, your first idea will never get implemented; it's not something 
you want the server to guess about.  As for the second, I would rather 
see the standby go down--and hopefully set off some serious alarms for 
the DBA who has screwed up here--than to stay up in a dysfunctional 
polling state.  The very serious mistake made is far more likely to be 
discovered the way it's built right now.
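
To illustrate the kind of check a DBA could run on the standby before 
issuing CREATE TABLESPACE on the primary, here is a minimal sketch.  It is 
not part of PostgreSQL; the function name missing_tablespace_dirs is 
hypothetical, and the list of paths is assumed to be supplied by hand.  It 
only tests that each path exists as a directory, not its ownership or 
whether it is empty.

```python
import os
from typing import Iterable, List


def missing_tablespace_dirs(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not exist as directories on this host.

    Hypothetical helper: `paths` is assumed to be the set of tablespace
    locations the primary uses (or is about to use), entered by the DBA.
    """
    return [p for p in paths if not os.path.isdir(p)]


def report(paths: Iterable[str]) -> bool:
    """Print one line per missing directory; return True if all are present."""
    missing = missing_tablespace_dirs(paths)
    for path in missing:
        print(f"missing tablespace directory on standby: {path}")
    return not missing
```

Run on the standby, report() returning False would be the cue to create 
the directory there before the primary's WAL asks for it.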

I wouldn't be averse to improving, in 9.1, the error messages the server 
emits when this happens, to make it more obvious what's gone wrong.  
That's the only genuine improvement I'd see value in here, to cut down 
on other people running into what you did and being as confused by it.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
