Thread: Oldest xmin is far in the past
Hi,
Coming back from vacation, I find a PostgreSQL database cluster (Pg version: 9.0.4)
with 5 databases, which for the past 5 days have logged these messages:
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
The cluster has been running continuously for over 9 months with autovacuum enabled.
Doing a:
vacuumdb -h <host> -U postgres -v -a
vacuums the tables, and also emits the same messages while vacuuming:
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
Checking with:
"SELECT datname, age(datfrozenxid) FROM pg_database;"
I see that vacuum did not reduce the age of datfrozenxid:
datname | age
------------+-----------
template1 | 208249541
template0 | 208249541
postgres | 208249541
db1 | 208249541
db2 | 208249541
db3 | 208249541
db4 | 208249541
db5 | 208249541
Running
"SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind = 'r' order by age(relfrozenxid) desc;"
on each database, I see that the age is still high for some of the shared catalogs:
pg_class:
relname | age
-------------------------+-----------
pg_db_role_setting | 208249542
pg_authid | 208249542
pg_database | 208249542
pg_shdescription | 208249542
pg_shdepend | 208249542
pg_auth_members | 208249542
pg_pltemplate | 208249542
pg_tablespace | 208249542
Max age of user-generated tables (i.e. not pg_* tables):
max age(relfrozenxid)
db1: 169036643
db2: 208249548
db3: 169036553
db4: 191294331
db5: 64496585
Looking at 'db2' in the cluster, I see that in fact _every relation_ listed with:
"SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind = 'r' order by age(relfrozenxid) desc;"
has
age(relfrozenxid) = 208249548 (at the time of running the query).
Could the problem be in this 'db2'?
However, running 'vacuumdb -h <host> -U postgres -a' does not seem to fix the problem.
Using:
"select procpid, datname,usename,query_start from pg_stat_activity order by query_start;"
I cannot find any 'idle in transaction' connections, and I can see one 'idle' connection
(but not 'idle in transaction') which is almost 24 hours old; the rest are from early this morning.
Also:
ps auxww|grep 'trans'
finds no 'idle in transaction' backends.
So, I'm looking for ways to fix this 'wraparound' problem for this database cluster.
Are any of these options candidates for solving the wraparound issue?
OPTION 1) Shut down all application brokers connected to the databases and then do another
vacuumdb -h <host> -U postgres -v -a
OPTION 2) Shut down all application brokers connected, stop the database cluster, start the database cluster
and then do another
vacuumdb -h <host> -U postgres -v -a
OPTION 3) Close all brokers connected to the database,
do a full dump of db2 (which has user tables with highest xid age),
drop database db2
restore db2
OPTION 4) Close all brokers connected to the database,
do a full dump of all 5 databases,
rename the database cluster dir: mv /dbcluster /dbcluster.old
do a new initdb /dbcluster
restore the 5 databases to the new /dbcluster (a rough command sketch follows below)
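For OPTION 4, I imagine the commands would look roughly like this (only a sketch; paths and connection options would need adjusting for our setup):
pg_dumpall -h <host> -U postgres > cluster_dump.sql    # full dump: all 5 databases plus roles/tablespaces
pg_ctl -D /dbcluster stop
mv /dbcluster /dbcluster.old
initdb -D /dbcluster
pg_ctl -D /dbcluster start
psql -h <host> -U postgres -d postgres -f cluster_dump.sql    # restore into the fresh cluster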
Or is there anything else I could try to fix this?
Thank you
Christian
Christian Rosnes <christian.rosnes@gmail.com> writes:

> Hi,
>
> Coming back from vacation, I find a PostgreSQL database cluster (Pg version: 9.0.4)
> with 5 databases, which for the past 5 days have logged these messages:
>
>   WARNING:  oldest xmin is far in the past
>   HINT:  Close open transactions soon to avoid wraparound problems.
>
> The cluster has been running continuously for over 9 months with autovacuum enabled.

<snip>

> I cannot find any 'idle in transaction' connections, and I can see one 'idle' connection
> (but not 'idle in transaction') which is almost 24 hours old; the rest are from early this morning.

See if there is a prepared transaction still open. Query pg_prepared_xacts.

HTH

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 732.216.7255
On Sat, Jun 30, 2012 at 3:55 PM, Jerry Sievers <gsievers19@comcast.net> wrote:
Hi,
Many thanks for your reply.
Doing a:
psql -U postgres -d postgres -c "select * from pg_prepared_xacts;"
I see that there are 72 transactions (across the 5 databases in this cluster) with the 'prepared' values listed as
various dates in April 2012, i.e. over 2 months old.
I guess the solution could be that for each of the 'gid' values listed from the query above, do a:
psql -U postgres -d postgres -c "ROLLBACK PREPARED '<gid>';"
And then do a "vacuumdb -a" to see if it fixes the problem.
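If that works, a shell loop along these lines could roll back all 72 in one go (an untested sketch, assuming the gids contain no quotes or pipe characters). One detail: ROLLBACK PREPARED has to be issued from the database the transaction was prepared in, which pg_prepared_xacts reports in its 'database' column:
psql -U postgres -d postgres -At -c "SELECT database, gid FROM pg_prepared_xacts;" |
while IFS='|' read -r db gid; do
    # must connect to the owning database; ROLLBACK PREPARED fails from any other
    psql -U postgres -d "$db" -c "ROLLBACK PREPARED '$gid';"
done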
Thanks again,
Christian
On Sat, Jun 30, 2012 at 4:32 PM, Jerry Sievers <jerry.sievers@comcast.net> wrote:
Hi again,
Removing the prepared transactions from the various databases in the cluster,
and then doing 'vacuumdb -a', has reduced the xid age:
"SELECT datname, age(datfrozenxid) FROM pg_database;"
datname | age
------------+-----------
template1 | 15111324
template0 | 15110823
postgres | 117559433
db1 | 50012676
db2 | 50018385
db3 | 50010002
db4 | 125206156
db5 | 123204503
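If I wanted to force those ages all the way down immediately rather than wait for routine vacuums, I suppose a freeze vacuum of every database might do it, roughly (using vacuumdb's --freeze switch):
vacuumdb -h <host> -U postgres -F -a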
> >   psql -U postgres -d postgres -c "select * from pg_prepared_xacts;"
> >
> > I see that there are 72 transactions (across the 5 databases in this cluster) with the 'prepared' values listed as
> > various dates in April 2012, i.e. over 2 months old.
> >
> > I guess the solution could be that for each of the 'gid' values listed from the query above, do a:
> >   psql -U postgres -d postgres -c "ROLLBACK PREPARED '<gid>';"
> >
> > And then do a "vacuumdb -a" to see if it fixes the problem.
>
> It should resolve the problem of your DB moving towards wrap-around.
> That said, your system may be very bloated and require some of the
> aggressive measures that you already highlighted in your original memo
> to solve for that.
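Before deciding on that, a crude way to spot the worst offenders in each database (dead-tuple counts are only a rough proxy for bloat, so just a sketch):
SELECT relname, n_live_tup, n_dead_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;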
I could schedule a service window where a logical backup/restore
is combined with a PostgreSQL major version update.
Again, thanks so much for your help in resolving our wraparound situation.
Christian