Re: multixacts woes - Mailing list pgsql-hackers

From Noah Misch
Subject Re: multixacts woes
Msg-id 20150510174012.GA3618689@tornado.leadboat.com
In response to multixacts woes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: multixacts woes
List pgsql-hackers
On Fri, May 08, 2015 at 02:15:44PM -0400, Robert Haas wrote:
> My colleague Thomas Munro and I have been working with Alvaro, and
> also with Kevin and Amit, to fix bug #12990, a multixact-related data
> corruption bug.

Thanks Alvaro, Amit, Kevin, Robert and Thomas for mobilizing to get this fixed.

> 1. I believe that there is still a narrow race condition that can cause
> the multixact code to go crazy and delete all of its data when
> operating very near the threshold for member space exhaustion. See
> http://www.postgresql.org/message-id/CA+TgmoZiHwybETx8NZzPtoSjprg2Kcr-NaWGajkzcLcbVJ1pKQ@mail.gmail.com
> for the scenario and proposed fix.

For anyone else following along, Thomas's subsequent test verified this threat
beyond reasonable doubt:

http://www.postgresql.org/message-id/CAEepm=3C32VPJLOo45y0c3-3KWXNV2xM4jaPTSVjCRD2VG0Qgg@mail.gmail.com

> 2. We have some logic that causes autovacuum to run in spite of
> autovacuum=off when wraparound threatens.  My commit
> 53bb309d2d5a9432d2602c93ed18e58bd2924e15 provided most of the
> anti-wraparound protections for multixact members that exist for
> multixact IDs and for regular XIDs, but this remains an outstanding
> issue.  I believe I know how to fix this, and will work up an
> appropriate patch based on some of Thomas's earlier work.

That would be good to have, and its implementation should be self-contained.

> 3. It seems to me that there is a danger that some users could see
> extremely frequent anti-mxid-member-wraparound vacuums as a result of
> this work.  Granted, that beats data corruption or errors, but it
> could still be pretty bad.  The default value of
> autovacuum_multixact_freeze_max_age is 400000000.
> Anti-mxid-member-wraparound vacuums kick in when you exceed 25% of the
> addressable space, or 1073741824 total members.  So, if your typical
> multixact has more than 1073741824/400000000 = ~2.68 members, you're
> going to see more autovacuum activity as a result of this change.
> We're effectively capping autovacuum_multixact_freeze_max_age at
> 1073741824/(average size of your multixacts).  If your multixacts are
> just a couple of members (like 3 or 4) this is probably not such a big
> deal.  If your multixacts typically run to 50 or so members, your
> effective freeze age is going to drop from 400m to ~21.4m.  At that
> point, I think it's possible that relminmxid advancement might start
> to force full-table scans more often than would be required for
> relfrozenxid advancement.  If so, that may be a problem for some
> users.
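
To make the arithmetic in the quoted paragraph concrete, here is a minimal
sketch. The two constants come straight from the quoted text; the function
name and the idea of modelling the cap as a simple min() are my own
illustration, not how the server computes anything.

```python
# Sketch only: reproduces the arithmetic from the quoted paragraph.
# effective_multixact_freeze_max_age() is a hypothetical helper, not a
# PostgreSQL function.

MEMBER_VACUUM_THRESHOLD = 2**30       # 1073741824 members, 25% of 2**32
DEFAULT_FREEZE_MAX_AGE = 400_000_000  # default autovacuum_multixact_freeze_max_age


def effective_multixact_freeze_max_age(avg_members, configured=DEFAULT_FREEZE_MAX_AGE):
    """Return the multixact age at which a vacuum is forced, assuming every
    multixact has avg_members members and the member threshold acts as a cap."""
    member_cap = int(MEMBER_VACUUM_THRESHOLD / avg_members)
    return min(configured, member_cap)


# Average multixact size above which the member threshold starts to dominate.
break_even = MEMBER_VACUUM_THRESHOLD / DEFAULT_FREEZE_MAX_AGE

if __name__ == "__main__":
    print(f"break-even average size: {break_even:.2f} members")
    for n in (2, 3, 4, 50):
        print(f"{n:>3} members -> effective freeze max age "
              f"{effective_multixact_freeze_max_age(n):,}")
```

With an average of 50 members this gives roughly 21.5 million, in line with
the figure quoted above.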

I don't know whether this deserves prompt remediation, but if it does, I would
look no further than the hard-coded 25% figure.  We permit users to operate
close to XID wraparound design limits.  GUC maximums force an anti-wraparound
vacuum at no later than 93.1% of design capacity.  XID assignment warns at
99.5%, then stops at 99.95%.  PostgreSQL mandates a larger cushion for
pg_multixact/offsets, with anti-wraparound VACUUM by 46.6% and a stop at
50.0%.  Commit 53bb309d2d5a9432d2602c93ed18e58bd2924e15 introduced the
bulkiest mandatory cushion yet, an anti-wraparound vacuum when
pg_multixact/members is just 25% full.  The pgsql-bugs thread driving that
patch did reject making it GUC-controlled, essentially on the expectation that
25% should be adequate for everyone:

http://www.postgresql.org/message-id/CA+Tgmoap6-o_5ESu5X2mBRVht_F+KNoY+oO12OvV_WekSA=ezQ@mail.gmail.com
http://www.postgresql.org/message-id/20150506143418.GT2523@alvh.no-ip.org
http://www.postgresql.org/message-id/1570859840.1241196.1430928954257.JavaMail.yahoo@mail.yahoo.com
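
For readers who want to check the percentages above, a rough sketch follows.
The constants are assumptions chosen to reproduce the figures: a 2^31 usable
distance for XIDs, a 2^32 space for pg_multixact offsets and members, a GUC
maximum of 2000000000 for the freeze-max-age settings, and warn/stop margins
of 11 million and 1 million XIDs before the wrap point. None of these are
stated in this message, so verify them against the source tree before relying
on them.

```python
# Sketch only: every constant here is an assumption made to reproduce the
# percentages in the paragraph above, not a value read from the server.

XID_DESIGN_CAPACITY = 2**31     # assumed usable XID distance before wraparound
MXACT_DESIGN_CAPACITY = 2**32   # assumed space for multixact offsets/members

FREEZE_MAX_AGE_GUC_MAX = 2_000_000_000  # assumed upper bound of both GUCs
XID_WARN_MARGIN = 11_000_000            # assumed: warnings start this far from wrap
XID_STOP_MARGIN = 1_000_000             # assumed: assignment stops this far from wrap


def pct(used, capacity):
    """Express used as a percentage of capacity."""
    return 100.0 * used / capacity


cushions = {
    "xid_forced_vacuum": pct(FREEZE_MAX_AGE_GUC_MAX, XID_DESIGN_CAPACITY),
    "xid_warn": pct(XID_DESIGN_CAPACITY - XID_WARN_MARGIN, XID_DESIGN_CAPACITY),
    "xid_stop": pct(XID_DESIGN_CAPACITY - XID_STOP_MARGIN, XID_DESIGN_CAPACITY),
    "offsets_forced_vacuum": pct(FREEZE_MAX_AGE_GUC_MAX, MXACT_DESIGN_CAPACITY),
    "offsets_stop": pct(2**31, MXACT_DESIGN_CAPACITY),
    "members_forced_vacuum": pct(2**30, MXACT_DESIGN_CAPACITY),
}

if __name__ == "__main__":
    for name, value in cushions.items():
        print(f"{name:<24} {value:6.2f}%")
```

Laid out this way, the members threshold is visibly the most conservative of
the six.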

> What can we do about this?  Alvaro proposed back-porting his fix for
> bug #8470, which avoids locking a row if a parent subtransaction
> already has the same lock.

Like Andres and yourself, I would not back-patch it.

> Another thought that occurs to me is that if we had a freeze map, it
> would radically decrease the severity of this problem, because
> freezing would become vastly cheaper.  I wonder if we ought to try to
> get that into 9.5, even if it means holding up 9.5.

Declaring that a release will wait for a particular feature has consistently
ended badly for PostgreSQL, and this feature is just in the planning stages.
If folks are ready to hit the ground running on the project, I suggest they do
so; a non-WIP submission to the first 9.6 CF would be a big accomplishment.
The time to contemplate slipping it into 9.5 comes after the patch is done.

If these aggressive ideas earn more than passing consideration, the 25%
threshold should become user-controllable after all.
