On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
> >> Simon Riggs <simon@2ndQuadrant.com> writes:
> >>> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> >>>> When a backend dies with FATAL, it writes an abort record before exiting.
> >>>>
> >>>> (I was under the impression it doesn't until a few minutes ago myself,
> >>>> when I actually read the shutdown code :-))
> >>> Not in all cases; keep reading :-)
> >> If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
> >> shared state corrupted, it's only supposed to be a forcible session
> >> termination. Any open transaction should be rolled back.
> >
> > Please look back at the earlier discussion we had on this exact point:
> > http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
>
> I think the running-xacts list we dump to WAL at every checkpoint is
> enough to handle that. Just treat the dead transaction as in-progress
> until the next running-xacts record. It's presumably extremely rare to
> have a process die with FATAL, and not write an abort record.

I agree, but I'll wait for Tom to speak further.

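To make sure we mean the same thing, here is a minimal sketch of that rule as I read it. It is a toy model in Python, not the actual recovery code: the record tuples, the StandbyXactTracker class and replay_all() are hypothetical names invented for illustration, and it assumes the running-xacts record can be taken as the authoritative list of in-progress xids at that point in WAL.

```python
# Toy model only: record shapes and names are hypothetical, not PostgreSQL's.

class StandbyXactTracker:
    """Tracks the xids the standby currently believes are in progress."""

    def __init__(self):
        self.in_progress = set()

    def replay(self, record):
        kind, payload = record
        if kind == "xact":
            # Any WAL record carrying an xid marks that xid as in progress.
            self.in_progress.add(payload)
        elif kind in ("commit", "abort"):
            # Normal case: the transaction's end is recorded in WAL.
            self.in_progress.discard(payload)
        elif kind == "running-xacts":
            # Assumed authoritative: anything we still track that is not
            # listed (e.g. a backend that died without writing an abort
            # record) stops being treated as in progress here.
            self.in_progress = set(payload)
        else:
            raise ValueError("unknown record kind: %r" % (kind,))


def replay_all(records):
    """Replay a sequence of records and return the final in-progress set."""
    tracker = StandbyXactTracker()
    for record in records:
        tracker.replay(record)
    return tracker.in_progress
```

With this, xid 101 in a stream like xact 100, xact 101, commit 100, xact 102, running-xacts [102] stays "in progress" on the standby only until the running-xacts record is replayed, which is the behaviour described above.
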
> A related issue is that recovery currently PANICs if it runs out of
> recovery procs. I think that's not acceptable, regardless of whether we
> use slotids or some other method to avoid it in normal operation,
> because it means you can't recover at all if you set max_connections too
> low in the standby (or in the primary, and you have to recover from
> crash), or you run out of recovery procs because an abort failed in
> the primary, as discussed on that thread.
> The standby should just fast-forward to the next running-xacts record
> in that case.

What do you mean by "fast forward"?

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support