Re: Where art thou pg_clog? - Mailing list pgsql-general

From Casey Duncan
Subject Re: Where art thou pg_clog?
Date
Msg-id DBA1FE21-4783-499E-9065-4C646F93DF01@pandora.com
Whole thread Raw
In response to Re: Where art thou pg_clog?  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Where art thou pg_clog?  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-general
On Feb 15, 2007, at 5:21 PM, Alvaro Herrera wrote:

> Casey Duncan wrote:
>
>>>> Interestingly I can manually vacuum that table in all of the
>>>> databases on this machine without provoking the error.
>>>
>>> Except template0 I presume?  Is this autovacuum running in template0
>>> perchance?  I note that 800 million transactions have passed
>>> since the
>>> Xid in the error message was current.
>>
>> Wouldn't you know it! A little farther back up in the log file:
>>
>> 2007-02-15 14:20:48.480 PST LOG:  autovacuum: processing database
>> "template0"
>> 2007-02-15 14:20:48.480 PST DEBUG:  StartTransaction
>> 2007-02-15 14:20:48.480 PST DEBUG:  name: unnamed; blockState:
>> DEFAULT; state: INPROGR, xid/subid/cid: 3429052629/1/0, nestlvl: 1,
>> children: <>
>> 2007-02-15 14:20:48.481 PST DEBUG:  autovacuum: VACUUM FREEZE whole
>> database
>> 2007-02-15 14:20:48.481 PST DEBUG:  CommitTransaction
>> 2007-02-15 14:20:48.481 PST DEBUG:  name: unnamed; blockState:
>> STARTED; state: INPROGR, xid/subid/cid: 3429052629/1/0, nestlvl: 1,
>> children: <>
>
> This is a bug we fixed in 8.1.7.  I suggest you update to the
> latest of
> the 8.1 series, to get that fix among others.

ok, great.

> To fix the problem, set pg_database.datallowconn=true for template0,
> then connect to it and do a VACUUM FREEZE.  Then set
> datallowconn=false
> again.

Do you mean to do this after upgrading to 8.1.8? If I try than in
8.1.5, I get (unsurprisingly):

% psql -U postgres template0 -c "vacuum freeze"
ERROR:  could not access status of transaction 2565134864
DETAIL:  could not open file "pg_clog/098E": No such file or directory

>
>> I'm curious how template0 got stomped on. Certainly nothing's been
>> changing it. Of course it might just be some random bug so the fact
>> it landed on a file for template0 could be completely arbitrary.
>
> The problem is that all databases are vacuumed every so many
> transactions, to avoid Xid wraparound problems; even non connectable
> databases.  The problem is that a bug in autovacuum caused that vacuum
> operation to neglect using the FREEZE flag; this negligence makes it
> leave non-permanent Xids in the tables, leading to the problem you're
> seeing.

Ironically we were earlier bitten by the bug that autovacuum didn't
do the cluster-wide vacuum until too late. Now we got bitten by the
fact that did do the cluster-wide vacuum. Talk about damned-if-you-do-
and-damned-if-you-don't! 8^)

ok, this is a much better sounding explanation than "random data
corruption" ;^)

Thanks!

-Casey

pgsql-general by date:

Previous
From: vanessa
Date:
Subject: How to use slash commands in a function
Next
From: Alvaro Herrera
Date:
Subject: Re: Where art thou pg_clog?