Re: ERROR: found multixact XX from before relminmxid YY - Mailing list pgsql-general

From: Andres Freund
Subject: Re: ERROR: found multixact XX from before relminmxid YY
Msg-id: 20181231060750.ollda4qdvfhngeqt@alap3.anarazel.de
In response to: Re: ERROR: found multixact XX from before relminmxid YY (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-general

Hi,

On 2018-12-28 19:49:36 -0500, Tom Lane wrote:
> Mark Fletcher <markf@corp.groups.io> writes:
> > Starting yesterday morning, auto vacuuming of one of our postgresql 9.6.10
> > (CentOS 7) tables started failing:
> > ERROR:  found multixact 370350365 from before relminmxid 765860874
> > CONTEXT:  automatic vacuum of table "userdb.public.subs"
>
> Ugh.
>
> > Reading the various discussions about this error, the only solution I found
> > was here:
> > https://www.postgresql.org/message-id/CAGewt-ukbL6WL8cc-G%2BiN9AVvmMQkhA9i2TKP4-6wJr6YOQkzA%40mail.gmail.com
> > But no other reports of this solving the problem. Can someone verify that
> > if I do the mentioned fix (and I assume upgrade to 9.6.11) that will fix
> > the problem? And that it doesn't indicate table corruption?
>
> Yeah, SELECT FOR UPDATE should overwrite the broken xmax value and thereby
> fix it, I expect.

Right.
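
For anyone following along, the workaround can be spelled out roughly as below. This is only a sketch under assumptions, not a tested procedure: the helper name repair_statements is made up for illustration, the table name is taken from the report above, and the script only builds the SQL text rather than connecting to a server. It assumes that re-locking every row with SELECT ... FOR UPDATE rewrites the stale xmax, as expected above, and that a manual VACUUM afterwards is how you confirm the error is gone.

```python
from typing import List


def repair_statements(table: str) -> List[str]:
    """Build the SQL statements, in order, for the row re-locking workaround.

    `table` must be a trusted, already-qualified identifier (for example
    "public.subs"); it is interpolated as-is, with no quoting or escaping.
    """
    return [
        # Inspect the table's current multixact horizon for reference.
        f"SELECT relminmxid FROM pg_class WHERE oid = '{table}'::regclass;",
        # Re-lock every row so each tuple's xmax is overwritten.
        "BEGIN;",
        f"SELECT 1 FROM {table} FOR UPDATE;",
        "COMMIT;",
        # Re-run vacuum by hand to check whether the error still appears.
        f"VACUUM (VERBOSE) {table};",
    ]


if __name__ == "__main__":
    for stmt in repair_statements("public.subs"):
        print(stmt)
```

Locking all rows at once is the bluntest variant and will block concurrent writers on a busy table for the duration of the transaction; adding a WHERE clause to process the table in batches would be a reasonable refinement.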

> However, I don't see anything in the release notes
> suggesting that we've fixed any related bugs since 9.6.10, so if this
> just appeared then we've still got a problem :-(.  Did anything
> interesting happen since your last successful autovacuum on that table?
> Database crashes, WAL-related parameter changes, that sort of thing?

I think it's entirely conceivable that the damage happened under an earlier
version and only became visible now, as the global horizon advanced.

Greetings,

Andres Freund

