Re: pg_multixact not getting truncated - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: pg_multixact not getting truncated
Date
Msg-id 54582AB4.8060105@agliodbs.com
Whole thread Raw
In response to pg_multixact not getting truncated  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 11/03/2014 05:06 PM, Alvaro Herrera wrote:
> Josh Berkus wrote:
>> Hackers,
>>
>> I'm looking at a couple of high-transaction-rate and high-FK-conflict
>> rate servers where pg_multixact has grown to be more than 1GB in size.
>> One such server doesn't appear to be having any notable issues with
>> vacuuming, and the oldest mxid on the system is about 47m old. VACUUM
>> FREEZEing the oldest databases did not cause the pg_multixact dir to get
>> smaller --- it may have even caused it to get larger.
>>
>> Why would pg_multixact not be truncating?  Does it never truncate files
>> with aborted multixacts in them?  Might we have another multixact bug?
> 
> Have a look at the MultiXactId values in pg_controldata, datminmxid in
> pg_database, and relminmxid in pg_class.  They should advance as you
> VACUUM FREEZE.  If it's stuck at 1, you might be in pg_upgrade trouble
> if you used 9.3.4 or earlier to upgrade.

They're advancing.  I also know for a fact that this system was not
pg_upgraded, because I set it up.  It's always been on 9.3.5.

So after some vacuum freezing and a checkpoint, the oldestMultiXid
advanced by 9 million, about 20% of the gap with current multiXid.
However, that did not delete 20% of the files; before the Freeze there
were 4414 files, and now there are 4195.  So advancing 9m mxids resulted
in only deleting ~~130 files.  So I think we definitely have an issue here.

BTW, the reason I started poking into this was a report from a user that
they have a pg_multixact directory which is 21GB in size, and is 2X the
size of the database.

Here's XID data:

Latest checkpoint's NextXID:          0/1126461940
Latest checkpoint's NextOID:          371135838
Latest checkpoint's NextMultiXactId:  162092874
Latest checkpoint's NextMultiOffset:  778360290
Latest checkpoint's oldestXID:        945761490
Latest checkpoint's oldestXID's DB:   370038709
Latest checkpoint's oldestActiveXID:  1126461940
Latest checkpoint's oldestMultiXid:   123452201
Latest checkpoint's oldestMulti's DB: 370038709

Oldest mxid file is 29B2, newest is 3A13

No tables had a relminmxid of 1 (some of the system tables were 0,
though), and the data from pg_class and pg_database is consistent.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Pipelining executions to postgresql server
Next
From: Demai Ni
Date:
Subject: [PG-XC] how to pass a value(String) from coordinator node to all data nodes