Re: too many clog files - Mailing list pgsql-performance

From Matt Smiley
Subject Re: too many clog files
Date
Msg-id 48C6F7A5.D078.0028.0@rentrak.com
Whole thread Raw
In response to Re: too many clog files  ("Matt Smiley" <mss@rentrak.com>)
Responses Re: too many clog files
List pgsql-performance
Alvaro Herrera wrote:
> Move the old clog files back where they were, and run VACUUM FREEZE in
> all your databases.  That should clean up all the old pg_clog files, if
> you're really that desperate.

Has anyone actually seen a CLOG file get removed under 8.2 or 8.3?  How about 8.1?

I'm probably missing something, but looking at src/backend/commands/vacuum.c (under 8.2.9 and 8.3.3), it seems like
vac_truncate_clog()scans through *all* tuples of pg_database looking for the oldest datfrozenxid.  Won't that always be
template0,which as far as I know can never be vacuumed (or otherwise connected to)? 

postgres=# select datname, datfrozenxid, age(datfrozenxid), datallowconn from pg_database order by age(datfrozenxid),
datname; 
     datname      | datfrozenxid |   age    | datallowconn
------------------+--------------+----------+--------------
 template1        |     36347792 |     3859 | t
 postgres         |     36347733 |     3918 | t
 mss_test         |     36347436 |     4215 | t
 template0        |          526 | 36351125 | f
(4 rows)

I looked at several of my 8.2 databases' pg_clog directories, and they all have all the sequentially numbered segments
(0000through current segment).  Would it be reasonable for vac_truncate_clog() to skip databases where datallowconn is
false(i.e. template0)?  Looking back to the 8.1.13 code, it does exactly that: 
                if (!dbform->datallowconn)
                        continue;

Also, Duan, if you have lots of files under pg_clog, you may be burning through transactions faster than necessary.  Do
yourapplications leave autocommit turned on?  And since no one else mentioned it, as a work-around for a small
filesystemyou can potentially shutdown your database, move the pg_clog directory to a separate filesystem, and create a
symlinkto it under your PGDATA directory.  That's not a solution, just a mitigation. 



pgsql-performance by date:

Previous
From: "Scott Marlowe"
Date:
Subject: Re: How to measure IO performance?
Next
From: "Mark Wong"
Date:
Subject: Effects of setting linux block device readahead size