Re: Hard limit on WAL space used (because PANIC sucks) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Hard limit on WAL space used (because PANIC sucks)
Msg-id 20140122154155.GJ21170@alap3.anarazel.de
In response to Re: Hard limit on WAL space used (because PANIC sucks)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Hard limit on WAL space used (because PANIC sucks)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2014-01-21 21:42:19 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-01-21 19:45:19 -0500, Tom Lane wrote:
> >> I don't think that's a comparable case.  Incomplete actions are actions
> >> to be taken immediately, and which the replayer then has to complete
> >> somehow if it doesn't find the rest of the action in the WAL sequence.
> >> The only thing to be done with the records I'm proposing is to remember
> >> their contents (in some fashion) until it's time to apply them.  If you
> >> hit end of WAL you don't really have to do anything.
> 
> > Would that work for the promotion case as well? Afair there's the
> > assumption that everything >= TransactionXmin can be looked up in
> > pg_subtrans or in the procarray - which afaics wouldn't be the case with
> > your scheme? And TransactionXmin could very well be below such an
> > "incomplete commit"'s xids afaics.
> 
> Uh, what?  The behavior I'm talking about is *exactly the same*
> as what happens now.  The only change is that the data sent to the
> WAL file is laid out a bit differently, and the replay logic has
> to work harder to reassemble it before it can apply the commit or
> abort action.  If anything outside replay can detect a difference
> at all, that would be a bug.
> 
> Once again: the replayer is not supposed to act immediately on the
> subsidiary records.  It's just supposed to remember their contents
> so it can reattach them to the eventual commit or abort record,
> and then do what it does today to replay the commit or abort.

I think I get what you want to do, but splitting the record like that
nonetheless opens up behaviour that previously wasn't there. Imagine we
promote in between replaying the list of subxacts (only storing it in
memory) and the main commit record. Either we have something like the
incomplete action stuff doing something with the in-memory data, or we
are in a situation where there can be xids bigger than TransactionXmin
that are not in pg_subtrans and not in the procarray. I don't think
that situation exists today, since we either read the commit record in
its entirety or not at all.

We'd also need to use the MyPgXact->delayChkpt mechanism to prevent
checkpoints from occurring in between those records, but we do that
already, so that seems rather uncontroversial.
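
To make the promotion window concrete, here is a rough toy model of the scenario described above. It is not PostgreSQL code: the class, its fields and every method name are made up for illustration, and the only assumption carried over from the discussion is that the subsidiary record's subxacts are kept in memory until the main commit record is replayed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplayState:
    """Toy model of replay-side bookkeeping; hypothetical, not PostgreSQL code."""
    transaction_xmin: int
    procarray: set = field(default_factory=set)      # xids visible as running
    pg_subtrans: dict = field(default_factory=dict)  # subxid -> parent xid
    committed: set = field(default_factory=set)      # xids whose commit was replayed
    pending: dict = field(default_factory=dict)      # top xid -> subxids held only in memory

    def replay_subxact_list(self, top_xid, subxids):
        # Hypothetical subsidiary record: remember the contents, apply nothing.
        self.pending.setdefault(top_xid, []).extend(subxids)

    def replay_commit(self, top_xid):
        # Main commit record: reattach the remembered subxids, apply as one unit.
        subxids = self.pending.pop(top_xid, [])
        self.committed.update([top_xid, *subxids])
        self.procarray.discard(top_xid)
        for subxid in subxids:
            self.procarray.discard(subxid)

    def untracked_on_promotion(self):
        # xids known only from in-memory subsidiary data that are at or above
        # TransactionXmin yet absent from both pg_subtrans and the procarray.
        return sorted(
            xid
            for subxids in self.pending.values()
            for xid in subxids
            if xid >= self.transaction_xmin
            and xid not in self.pg_subtrans
            and xid not in self.procarray
        )


def demo():
    state = ToyReplayState(transaction_xmin=100, procarray={105})
    state.replay_subxact_list(105, [106, 107])
    before_commit = state.untracked_on_promotion()  # promote inside the window
    state.replay_commit(105)
    after_commit = state.untracked_on_promotion()   # promote after the commit
    return before_commit, after_commit


if __name__ == "__main__":
    print(demo())
```

In this model, promoting inside the window leaves subxacts 106 and 107 untracked, while promoting after the commit record leaves none. With a single commit record the window does not exist at all.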

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
