Re: pg_multixact not getting truncated - Mailing list pgsql-hackers
From | Jim Nasby |
---|---|
Subject | Re: pg_multixact not getting truncated |
Date | |
Msg-id | 545A6F29.7080704@BlueTreble.com Whole thread Raw |
In response to | Re: pg_multixact not getting truncated (Josh Berkus <josh@agliodbs.com>) |
List | pgsql-hackers |
On 11/3/14, 7:40 PM, Josh Berkus wrote: > On 11/03/2014 05:24 PM, Josh Berkus wrote: >> BTW, the reason I started poking into this was a report from a user that >> they have a pg_multixact directory which is 21GB in size, and is 2X the >> size of the database. >> >> Here's XID data: >> >> Latest checkpoint's NextXID: 0/1126461940 >> Latest checkpoint's NextOID: 371135838 >> Latest checkpoint's NextMultiXactId: 162092874 >> Latest checkpoint's NextMultiOffset: 778360290 >> Latest checkpoint's oldestXID: 945761490 >> Latest checkpoint's oldestXID's DB: 370038709 >> Latest checkpoint's oldestActiveXID: 1126461940 >> Latest checkpoint's oldestMultiXid: 123452201 >> Latest checkpoint's oldestMulti's DB: 370038709 >> >> Oldest mxid file is 29B2, newest is 3A13 >> >> No tables had a relminmxid of 1 (some of the system tables were 0, >> though), and the data from pg_class and pg_database is consistent. > > More tidbits: > > I just did a quick check on customer systems (5 of them). This only > seems to be happening on customer systems where I specifically know > there is a high number of FK lock waits (the system above gets around > 1000 per minute that we know of). Other systems with higher transaction > rates don't exhibit this issue; I checked a 9.3.5 database which > normally needs to do XID wraparound once every 10 days, and it's > pg_multixact is only 48K (it has one file, 0000). > > Note that pg_clog on the bad machine is only 64K in size. > > How many IDs are there per mxid file? #define MULTIXACT_OFFSETS_PER_PAGE (BLCKSZ / sizeof(MultiXactOffset)) So for 8k blocks, there are 2k offsets (really MultiXactIds) per page, 32 pages per SLRU segment. Your file names aren'tmaking sense to me. :( If I'm doing the math correctly, 29B2 is MXID 699 531 264 and 3A13 is 974 323 712. You're onlylooking in pg_multixact/members/, yes? Relevant code starts in vacuum.c/vac_update_datfrozenxid() If there's any rows in pg_class for tables/matviews/toast with either relfrozenxid > next XID or relminmxid > next MXID thenthe code *silently* pulls the plug right there. IMO we should at least issue a warning. That you see relminxid advancing tells me this isn't the case here. ForceTransactionIdLimitUpdate() is a bit suspect in that it only looks at xidVacLimit, but if it were breaking then you wouldn'tsee pg_database minmxid advancing. Looking through TruncateMultiXact, I don't see anything that could prevent truncation, unless the way we're handing MultiXactIDwraparound is broken (which I don't see any indication of). Can you post the contents of pg_multixact/members/? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
pgsql-hackers by date: