Re: Rework the way multixact truncations work - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Rework the way multixact truncations work
Date
Msg-id 20150923184850.GK1573@awork2.anarazel.de
In response to Re: Rework the way multixact truncations work  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Rework the way multixact truncations work  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Rework the way multixact truncations work  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers
On 2015-09-23 15:03:05 -0300, Alvaro Herrera wrote:
> The comment on top of TrimMultiXact states that "no locks are needed
> here", but then goes on to grab a few locks.

Hm. Yea. Although that was the case before.

> It's a bit odd that SetMultiXactIdLimit has the "finishedStartup" test
> so low.  Why bother setting all those local variables only to bail
> out?

Hm. Doesn't seem to matter much to me, but I can change it.

> In MultiXactAdvanceOldest, the test for sawTruncationinCkptCycle seems
> reversed?
>         if (!MultiXactState->sawTruncationInCkptCycle)
> surely we should be doing truncation if it's set?

No, that's correct. If there was a checkpoint cycle where oldestMulti
advanced without seeing a truncation record, we need to perform a legacy
truncation.
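
Roughly, as a simplified model of that check (not the actual multixact.c
code: the struct, the function name, and the reset of the flag at the end
of the call are assumptions made for illustration, and the plain `>`
comparison ignores multixact ID wraparound):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant shared state; not the real struct. */
typedef struct
{
    unsigned int oldestMultiXactId;
    bool         sawTruncationInCkptCycle;
} SketchMultiXactState;

/*
 * Called during replay when a checkpoint record reports a new oldestMulti.
 * Returns true if a legacy truncation would have to be performed.
 */
static bool
sketch_advance_oldest(SketchMultiXactState *state, unsigned int newOldest)
{
    bool need_legacy = false;

    assert(state != NULL);

    /* Assumption: plain comparison, ignoring wraparound. */
    if (newOldest > state->oldestMultiXactId)
    {
        /*
         * oldestMulti advanced, but no truncation record was replayed in
         * this checkpoint cycle: the primary must predate WAL-logged
         * truncations, so the standby has to truncate on its own.
         */
        if (!state->sawTruncationInCkptCycle)
            need_legacy = true;

        state->oldestMultiXactId = newOldest;
    }

    /* Assumption: the flag is cleared so the next cycle starts fresh. */
    state->sawTruncationInCkptCycle = false;

    return need_legacy;
}
```

So the flag being set means a truncation record already did the work in
this cycle; only when it is unset do we fall back to the legacy path.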

> Honestly, I wonder whether this message
>             ereport(LOG,
>                     (errmsg("performing legacy multixact truncation"),
>                      errdetail("Legacy truncations are sometimes performed when replaying WAL from an older primary."),
>                      errhint("Upgrade the primary, it is susceptible to data corruption.")));
> shouldn't rather be a PANIC.  (The main reason not to, I think, is that
> once you see this, there is no way to put the standby in a working state
> without recloning).

Huh? The behaviour in that case is still better than what we have in
9.3+ today (not delayed till the restartpoint). Don't see why that
should be a panic. That'd imo make it pretty much impossible to upgrade
a primary/standby pair, where you normally upgrade the standby first?

This is all moot given Robert's objection to backpatching this to
9.3/4.

> If the find_multixact_start(oldestMulti) call in TruncateMultiXact
> fails, what recourse does the user have?  I wonder if the elog() should
> be a FATAL instead of just LOG.  It's not like it would work on a
> subsequent run, is it?

It currently only LOGs, and I don't want to change that. In the cases
where we currently know it's possible to hit this, it should be fixed by
the next set of emergency autovacuums (which we trigger).

Thanks for the look,

Andres


