Re: Hot standby, slot ids and stuff - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Hot standby, slot ids and stuff
Date
Msg-id 1231500040.18005.350.camel@ebony.2ndQuadrant
In response to Re: Hot standby, slot ids and stuff  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Hot standby, slot ids and stuff  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Re: Hot standby, slot ids and stuff  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
> >> Simon Riggs <simon@2ndQuadrant.com> writes:
> >>> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> >>>> When a backend dies with FATAL, it writes an abort record before exiting.
> >>>>
> >>>> (I was under the impression it doesn't until few minutes ago myself, 
> >>>> when I actually read the shutdown code :-))
> >>> Not in all cases; keep reading :-)
> >> If it doesn't, that's a bug.  A FATAL exit is not supposed to leave the
> >> shared state corrupted, it's only supposed to be a forcible session
> >> termination.  Any open transaction should be rolled back.
> > 
> > Please look back at the earlier discussion we had on this exact point:
> > http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
> 
> I think the running-xacts list we dump to WAL at every checkpoint is 
> enough to handle that. Just treat the dead transaction as in-progress 
> until the next running-xacts record. It's presumably extremely rare to 
> have a process die with FATAL, and not write an abort record.

I agree, but I'll wait for Tom to speak further.
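
To make the proposal above concrete, here is a toy sketch in Python of how the standby could treat a transaction with no abort record as in-progress until the next running-xacts record arrives. This is not code from the patch: the class, the method names, and the idea that the record carries a `next_xid` bound are all assumptions made for illustration, and xid wraparound is ignored.

```python
class StandbyXactTracker:
    """Toy model of the standby's view of in-progress transactions.

    Hypothetical names throughout; not the hot standby patch's data structures.
    """

    def __init__(self):
        self.in_progress = set()

    def on_xact_seen(self, xid):
        # First WAL record seen for this xid: assume it is running.
        self.in_progress.add(xid)

    def on_commit_or_abort(self, xid):
        # Normal case: the primary wrote a commit or abort record.
        self.in_progress.discard(xid)

    def on_running_xacts(self, running_xids, next_xid):
        """Apply a running-xacts record written at a checkpoint.

        running_xids: xids the primary reported as still running.
        next_xid: assumed upper bound; xids at or above it started after
                  the list was taken and must not be pruned.
        Returns the xids dropped because they ended without an abort record.
        """
        running = set(running_xids)
        stale = {x for x in self.in_progress
                 if x < next_xid and x not in running}
        self.in_progress -= stale
        self.in_progress |= running
        return stale
```

In this model a backend that dies without writing an abort record just leaves its xid in `in_progress` a little longer, and the next running-xacts record clears it.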

> A related issue is that currently the recovery PANICs if it runs out of 
> recovery procs. I think that's not acceptable, regardless of whether we 
> use slotids or some other method to avoid it in normal operation, 
> because it means you can't recover at all if you set max_connections too 
> low in the standby (or in the primary, and you have to recover from 
> crash), or you run out of recovery procs because of an abort failed in 
> the primary like discussed on that thread. 

> The standby should just 
> fast-forward to the next running-xacts record in that case.

What do you mean by "fast forward"?

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support


