Re: ERROR: multixact X from before cutoff Y found to be still running - Mailing list pgsql-bugs
| From | Jeremy Schneider |
|---|---|
| Subject | Re: ERROR: multixact X from before cutoff Y found to be still running |
| Msg-id | 32134d69-1999-6518-e1d6-e268a24cb8aa@amazon.com |
| In response to | Re: ERROR: multixact X from before cutoff Y found to be still running (Thomas Munro <thomas.munro@gmail.com>) |
| Responses | Re: ERROR: multixact X from before cutoff Y found to be still running (×4) |
| List | pgsql-bugs |
On 9/4/19 21:01, Thomas Munro wrote:
> I suppose this requires you to run out of member space (for example
> many backends key sharing the same FK) or maybe just set
> autovacuum_multixact_freeze_max_age quite low, and then prolong the
> life of a multixact for longer.

On this particular production system, autovacuum_multixact_freeze_max_age is the default value of 400 million and it is not overridden for any tables. Looks to me like this was just workload driven. There are a number of FKs, and those seem like a likely candidate to me.
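For anyone who wants to see the FK key-sharing path in action, here's a minimal sketch (table names and values are made up, not from the affected system):

```sql
-- Two concurrent FK checks on the same parent row force the row's
-- xmax to become a multixact with both transactions as members.
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (id int REFERENCES parent (id));
INSERT INTO parent VALUES (1);

-- Session 1 (leave the transaction open):
BEGIN;
INSERT INTO child VALUES (1);  -- FK check takes FOR KEY SHARE on parent id=1

-- Session 2 (also left open):
BEGIN;
INSERT INTO child VALUES (1);  -- second KEY SHARE lock on the same row

-- Session 3: the parent row's xmax now reports a multixact ID.
SELECT xmax FROM parent WHERE id = 1;
```

Scale that up to many backends key sharing the same rows and member space gets consumed quickly.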
> Does the problem fix itself once you close the transaction that's in
> the oldest multixact, ie holding back GetOldestMultiXact() from
> advancing?

The really interesting thing about this case is that the only long-running connection was the autovacuum that had been running since Sunday. While we were investigating yesterday, the autovacuum process died without advancing relfrozenxid (logging on this system was poorly configured, so it's not known whether autovacuum hit an error or was killed by a user who logged on to the system). As soon as the autovacuum process died, we stopped getting the "multixact X from before cutoff Y" errors.
It really appears that the autovacuum process itself was holding open the oldest running multixact, which caused the errors on yesterday's attempts to vacuum other tables, even though I thought vacuum processes were ignored by that code. I'll have to take another look at some point.
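For what it's worth, I don't know of a SQL-level view of each backend's multixact horizon, but pg_stat_activity gives a rough proxy for spotting which backends, autovacuum workers included, are holding back the oldest horizon:

```sql
-- Rough proxy only (PG 10+ for backend_type): list backends by the age
-- of their xmin horizon; this does not directly expose the multixact
-- horizon, just the xid one.
SELECT pid, backend_type, xact_start, age(backend_xmin) AS xmin_age
  FROM pg_stat_activity
 WHERE backend_xmin IS NOT NULL
 ORDER BY age(backend_xmin) DESC;
```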
Vacuum cost parameters had been adjusted after Sunday, so the original autovacuum would have used default settings. Naturally, a new autovacuum process started up right away. This new process, definitely using the adjusted cost parameters, completed the vacuum of the large table in a couple of hours with 5 passes (index_vacuum_count). maintenance_work_mem was already at its maximum; many hundreds of millions of dead tuples still remained to be cleaned up.
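On 9.6 and later you can watch those passes as they happen; each increment of index_vacuum_count means the dead-tuple TID array (capped at 1 GB regardless of maintenance_work_mem) filled up and forced another round of index cleanup:

```sql
-- Progress of a running VACUUM (PG 9.6+). index_vacuum_count is the
-- number of index-cleanup passes completed so far.
SELECT pid, relid::regclass AS table_name, phase,
       index_vacuum_count, num_dead_tuples, max_dead_tuples
  FROM pg_stat_progress_vacuum;
```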
The size of the large table (heap only) was about 75% of the memory on the server, and the table had three indexes, each about half the size of the table. The storage was provisioned at just over 10,000 IOPS; at that rate, all three indexes could be read from storage one block at a time in about an hour. (And Linux should be reading more than a block at a time.)
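The arithmetic, using hypothetical sizes consistent with the description (say a ~190 GB table, so three ~95 GB indexes; only the ratios come from the actual system):

```sql
-- Back-of-envelope: 3 indexes × ~95 GB, 8 kB blocks, 10,000 IOPS.
SELECT round(3 * 95 * 1024.0 * 1024 * 1024 / 8192 / 10000 / 60)
       AS minutes_to_read_indexes_one_block_at_a_time;
-- => 62
```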
It is not known whether the original autovacuum failed to completely vacuum the large table in 3 days because of cost settings alone, or because another latent bug somewhere in the autovacuum code put it into some kind of loop (though if autovacuum had hit the error above, the PID would have terminated). We didn't manage to get a pstack.
> Since VACUUM errors out, we don't corrupt data, right? Everyone else
> is still going to see the multixact as running and do the right thing
> because vacuum never manages to (bogusly) freeze the tuple.

That's my take as well. I don't think there's any data corruption risk here.
If anyone else ever hits this in the future, I think it's safe to just kill the oldest open session. The error should go away and there shouldn't be any risk of damage to the database.
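Concretely, that remediation would look something like this (the pid is of course just a placeholder):

```sql
-- Find the session with the oldest open transaction...
SELECT pid, usename, xact_start, state
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
 ORDER BY xact_start
 LIMIT 1;

-- ...and, after double-checking it, terminate it (requires superuser
-- or pg_signal_backend membership):
SELECT pg_terminate_backend(12345);  -- replace 12345 with the pid above
```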
> Both patches prevent mxactLimit from being newer than the oldest
> running multixact. The v1 patch uses the most aggressive setting
> possible: the oldest running multi; the v2 uses the least aggressive
> of the 'safe' and oldest running multi. At first glance it seems like
> the second one is better: it only does something different if we're in
> the dangerous scenario you identified, but otherwise it sticks to the
> safe limit, which generates less IO.

Thanks for taking a look!
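To spell out the difference between the two cutoffs (illustrative SQL only; the real code is C and has to handle multixact wraparound, so it can't use a plain least()):

```sql
-- Hypothetical ids, chosen so the oldest running multi is older than
-- the 'safe' cutoff: the dangerous case both patches guard against.
WITH c(safe_cutoff, oldest_running_multi) AS (VALUES (1000, 900))
SELECT oldest_running_multi                     AS v1_cutoff,  -- always the oldest running
       least(safe_cutoff, oldest_running_multi) AS v2_cutoff   -- clamp only when needed
  FROM c;
-- Both return 900 here. When oldest_running_multi >= safe_cutoff, v2
-- sticks with safe_cutoff while v1 still advances all the way to the
-- oldest running multi (more freezing, more IO).
```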
-Jeremy
--
Jeremy Schneider
Database Engineer
Amazon Web Services