Re: clog_redo causing very long recovery time - Mailing list pgsql-hackers

From	Joe Conway
Subject	Re: clog_redo causing very long recovery time
Date	May 6, 2011 03:41:20
Msg-id	4DC36DD6.8030405@joeconway.com
In response to	Re: clog_redo causing very long recovery time (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: clog_redo causing very long recovery time (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On 05/05/2011 08:22 PM, Tom Lane wrote:
> Joseph Conway <mail@joeconway.com> writes:
>> The attached fix-clogredo diff is my proposal for a fix for this.
>
> That seems pretty grotty :-(
>
> I think a more elegant fix might be to just swap the order of the
> ExtendCLOG and ExtendSUBTRANS calls in GetNewTransactionId.  The
> reason that would help is that pg_subtrans isn't WAL-logged, so if
> we succeed doing ExtendSUBTRANS and then fail in ExtendCLOG, we
> won't have written any XLOG entry, and thus repeated failures will not
> result in repeated XLOG entries.  I seem to recall having considered
> exactly that point when the clog WAL support was first done, but the
> scenario evidently wasn't considered when subtransactions were stuck
> in :-(.

Yes, that does seem much nicer :-)
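
To make the ordering argument concrete, here is a minimal toy model in C. It is not PostgreSQL source: every name in it (ToyState, toy_extend_clog, toy_extend_subtrans, toy_assign_xid, toy_wal_records_after_retries) is hypothetical, and failure is modelled as a false return value, whereas the real code raises an error. It assumes the pg_subtrans extension keeps failing (say, disk full) while the same XID is retried, and counts how many WAL records pile up under each call order.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef struct
{
    int  wal_records;     /* WAL records emitted so far */
    bool subtrans_fails;  /* simulate a persistent pg_subtrans failure */
} ToyState;

/* Stands in for extending pg_clog: WAL-logged, so each call emits a record. */
static bool toy_extend_clog(ToyState *s)
{
    s->wal_records++;
    return true;
}

/* Stands in for extending pg_subtrans: never WAL-logged, but may fail. */
static bool toy_extend_subtrans(ToyState *s)
{
    return !s->subtrans_fails;
}

/*
 * One attempt to assign an XID. Returns true only if both steps succeed,
 * i.e. only then would the XID counter advance. On failure the same XID
 * is retried by the next caller.
 */
static bool toy_assign_xid(ToyState *s, bool subtrans_first)
{
    if (subtrans_first)
        return toy_extend_subtrans(s) && toy_extend_clog(s);
    return toy_extend_clog(s) && toy_extend_subtrans(s);
}

/* Retry the same XID `attempts` times; return the WAL records written. */
int toy_wal_records_after_retries(int attempts, bool subtrans_first)
{
    ToyState s = { 0, true };

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++)
        (void) toy_assign_xid(&s, subtrans_first);
    return s.wal_records;
}
```

With the clog step first, toy_wal_records_after_retries(1000, false) yields one WAL record per failed attempt; with the subtrans step first, toy_wal_records_after_retries(1000, true) yields none, because the step that can fail runs before anything is logged.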

> It would probably also help to put in a comment admonishing people
> to not add stuff right there.  I see the SSI guys have fallen into
> the same trap.

Right -- I think another similar problem exists in GetNewMultiXactId
where ExtendMultiXactOffset could succeed and write an XLOG entry and
then ExtendMultiXactMember could fail before advancing nextMXact. The
problem in this case is that they both write XLOG entries, so a simple
reversal doesn't help.
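
A small toy model in C of why reordering alone is not enough here. Again, this is a sketch and not PostgreSQL source: ToyStep, toy_run_attempt and toy_mxact_wal_after_retries are hypothetical names, and the model assumes a failing step fails before it logs anything and that the same multixact ID is retried while the failure persists.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef struct
{
    bool writes_wal;  /* does a successful call emit a WAL record? */
    bool fails;       /* does this step fail on every attempt? */
} ToyStep;

/* Run the steps in order, stopping at the first failure.
 * Returns the number of WAL records this attempt emitted. */
static int toy_run_attempt(const ToyStep *steps, int nsteps)
{
    int wal = 0;

    for (int i = 0; i < nsteps; i++)
    {
        if (steps[i].fails)
            break;              /* assumed to fail before logging */
        if (steps[i].writes_wal)
            wal++;
    }
    return wal;
}

/*
 * Both steps are WAL-logged. Exactly one of them fails persistently:
 * the "members" step if members_fails is true, otherwise the "offsets" step.
 * Returns the WAL records accumulated over `attempts` retries.
 */
int toy_mxact_wal_after_retries(int attempts, bool members_first,
                                bool members_fails)
{
    const ToyStep offsets = { true, !members_fails };
    const ToyStep members = { true, members_fails };
    ToyStep order[2];
    int total = 0;

    assert(attempts >= 0);
    order[0] = members_first ? members : offsets;
    order[1] = members_first ? offsets : members;
    for (int i = 0; i < attempts; i++)
        total += toy_run_attempt(order, 2);
    return total;
}
```

Whichever step goes first, there is a failure case in which it succeeds, logs a record, and the second step then fails, so each order leaks one record per retry in one of the two scenarios.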

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
