Re: Template zero xid issue - Mailing list pgsql-general

From Keaton Adams
Subject Re: Template zero xid issue
Date
Msg-id 0B34A6972BF39E4CB465A64DBBAD2BB902261DCB@mxlhq-exch01.corp.mxlogic.com
Whole thread Raw
In response to Re: Template zero xid issue  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Template zero xid issue
List pgsql-general
We needed the DB up right away; one of the operations guys connected
through a standalone backend to vacuum template0 and now the DB is up
again:

"sudo -u postgres /mxl/var/pgsql/bin/postgres -D /mxl/var/pgsql/data -O
-P template0

    Once I got in, I just ran a VACUUM VERBOSE ANALYZE; to "reset"
the XID counter."



So just to be clear on this statement:

"You misunderstand what you are looking at --- that's the cluster-wide
XID consumption rate, not any one database's."

In the docs it says that "the value stored in the datfrozenxid field is
the freeze cutoff XID that was used by the last database-wide vacuum
operation. All normal XIDs older than this cutoff XID are guaranteed to
have been replaced by FrozenXID within that database." The query to
check is:

select datname, age(datfrozenxid) from pg_database;

"The age column measures the number of transactions from the cutoff XID
to the current transaction's XID.  When the age approaches two billion,
the database must be vacuumed again to avoid risk of wraparound
failures."

So, if XID is database instance (cluster-wide), that would explain why
the actual increment of the age value is the same across all DB's, even
though the actual XID numbers are different per db?

bash-3.00$ date; psql -dpostgres -frun.sql
Mon Aug  6 11:20:03 MDT 2007
   datname    |    age
--------------+------------
 postgres     | 1075005030
 mxl          | 1075004446
 .
 .
 .
 template1    | 1074931654
 template0    | 1074740691
(7 rows)

bash-3.00$ date; psql -dpostgres -frun.sql
Mon Aug  6 11:23:53 MDT 2007
   datname    |    age
--------------+------------
 postgres     | 1075046568
 mxl          | 1075045984
 .
 .
 .
 template1    | 1074973192
 template0    | 1074782229
(7 rows)

Postgres increment: 41,538
Mxl increment: 41,538
Template 1 increment: 41,538
Template 0 increment: 41,538

But, the actual reset of the FrozenXID needs to take place at the
individual DB level, and that's where the full-db vacuum comes into
play, correct?

And the bug in 8.1.4 is allowing the FrozenXID on template0 to increment
when it shouldn't be doing so, and since the normal vacuum ignores a
read-only db such as template0, that is why we had to force a vacuum
after the DB shut down?

Do I have this right?

Thanks again,

Keaton


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, August 06, 2007 11:02 AM
To: Joshua D. Drake
Cc: Keaton Adams; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Template zero xid issue

"Joshua D. Drake" <jd@commandprompt.com> writes:
> 2. IIRC there is a bug in 8.1.4 that causes template0 to increment
when
> it shouldn't. Update the template0 to allow connections, vacuum full,
> dissallow connections, and then upgrade.

That should be "vacuum freeze" but otherwise this is the correct
recovery process.  However, before destroying the evidence I'd like to
find out how he got into this state ...

            regards, tom lane

pgsql-general by date:

Previous
From: Jeff Davis
Date:
Subject: Re: new line in psotgres
Next
From: "Ed L."
Date:
Subject: 8.1.2 select for update issue