Thread: Is it safe to reset relfrozenxid without using vacuum?
I have a heavily partitioned DW type database that has over 5,000 tables in it. Data is only ever inserted, read and then truncated after some period of time. Once the table is truncated, the constraints are changed and the table is reused. This works well until Postgres hits the autovacuum_freeze_max_age, which I currently have set @ 1billion). Since these tables are only ever inserted to and truncated, they are not normally vacuumed(which is what I want, since data is typically going to be truncated before needing to be vacuumed).
Unfortunately, since truncate does not change the relfrozenxid, once the autovacuum_freeze_max_age is reached, suddenly all tables in the schema need vacuuming at once. When this occurs, the autovacuum process gives priority to vacuuming all the tables(2TB+ of data), and query performance degenerates. More significantly, as new data comes into emptied partition tables they are not analyzed in a timely fashion and very poor query plans result. The DB remains in this vacuuming state for up to 3 weeks.
What I would like to do, is just vacuum the table when the truncate code is executed(This is currently done using PGSql functions), but I can't issue the vacuum call from within a transaction. Given that, I was wondering if it is "safe" to update pg_class directly for the table being truncated with a query like:
UPDATE pg_class SET relfrozenxid = ( select relfrozenxid from pg_class where age(relfrozenxid) in (select min(age(relfrozenxid)) from pg_class where relkind = 'r') limit 1) WHERE relname = '<table being truncated>';
Is there a better way of doing this?
For that matter, would it be reasonable to have the relfrozenxid reset on a successful truncate?
I am running Postgres 8.2.4
Regards...Mark Sherwood
Messenger wants to send you on a trip. Enter today.
Unfortunately, since truncate does not change the relfrozenxid, once the autovacuum_freeze_max_age is reached, suddenly all tables in the schema need vacuuming at once. When this occurs, the autovacuum process gives priority to vacuuming all the tables(2TB+ of data), and query performance degenerates. More significantly, as new data comes into emptied partition tables they are not analyzed in a timely fashion and very poor query plans result. The DB remains in this vacuuming state for up to 3 weeks.
What I would like to do, is just vacuum the table when the truncate code is executed(This is currently done using PGSql functions), but I can't issue the vacuum call from within a transaction. Given that, I was wondering if it is "safe" to update pg_class directly for the table being truncated with a query like:
UPDATE pg_class SET relfrozenxid = ( select relfrozenxid from pg_class where age(relfrozenxid) in (select min(age(relfrozenxid)) from pg_class where relkind = 'r') limit 1) WHERE relname = '<table being truncated>';
Is there a better way of doing this?
For that matter, would it be reasonable to have the relfrozenxid reset on a successful truncate?
I am running Postgres 8.2.4
Regards...Mark Sherwood
Messenger wants to send you on a trip. Enter today.
Arctic Toucan wrote: > > I have a heavily partitioned DW type database that has over 5,000 > tables in it. Data is only ever inserted, read and then truncated > after some period of time. Once the table is truncated, the > constraints are changed and the table is reused. This works well until > Postgres hits the autovacuum_freeze_max_age, which I currently have > set @ 1billion). Since these tables are only ever inserted to and > truncated, they are not normally vacuumed(which is what I want, since > data is typically going to be truncated before needing to be > vacuumed). AFAICS this should be safe. In fact, in 8.3 TRUNCATE advances relfrozenxid. (Perhaps you should consider upgrading if possible.) > UPDATE pg_class SET relfrozenxid = ( select relfrozenxid from pg_class > where age(relfrozenxid) in (select min(age(relfrozenxid)) from > pg_class where relkind = 'r') limit 1) WHERE relname = '<table being > truncated>'; Tou could just obtain the xid of the transaction that's going to do the import (for example by creating a temp table and getting it's xmin from pg_class) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > AFAICS this should be safe. In fact, in 8.3 TRUNCATE advances > relfrozenxid. (Perhaps you should consider upgrading if possible.) > ... > Tou could just obtain the xid of the transaction that's going to do the > import (for example by creating a temp table and getting it's xmin from > pg_class) That seems a bit risky. 8.3 resets relfrozenxid to RecentXmin, not the current transaction's XID. The OP's thought of taking the max existing relfrozenxid should be safe though. Or I guess you could make a temp table and take the relfrozenxid, rather than the xmin, from its pg_class entry. regards, tom lane
On a related note...
When I put this change in place, it should handle tables properly going forward and on a new install, but it looks like I have several instances with "clones" of this DB where there are 1000's of tables all with the same relfrozenxid and within just a few million transactions of the autovacuum_freeze_age. So the vast majority of those tables are still going to need to be vacuumed at the same time since the change to the truncation logic won't have been in place long enough to fix the relfrozenxids.
If I "know" that there are no more inserts going into those partitioned tables, can I do a bulk change of their relfrozenxids setting them back 500million transactions without causing problems? This will mean that the relfrozenxid is not representative of the row versions, but does that matter in this case(Essentially static stables)?
> To: alvherre@commandprompt.com
> CC: arctic_toucan@hotmail.com; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Is it safe to reset relfrozenxid without using vacuum?
> Date: Tue, 18 Nov 2008 13:07:16 -0500
> From: tgl@sss.pgh.pa.us
>
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > AFAICS this should be safe. In fact, in 8.3 TRUNCATE advances
> > relfrozenxid. (Perhaps you should consider upgrading if possible.)
> > ...
> > Tou could just obtain the xid of the transaction that's going to do the
> > import (for example by creating a temp table and getting it's xmin from
> > pg_class)
>
> That seems a bit risky. 8.3 resets relfrozenxid to RecentXmin, not the
> current transaction's XID. The OP's thought of taking the max existing
> relfrozenxid should be safe though.
>
> Or I guess you could make a temp table and take the relfrozenxid, rather
> than the xmin, from its pg_class entry.
>
> regards, tom lane
When I put this change in place, it should handle tables properly going forward and on a new install, but it looks like I have several instances with "clones" of this DB where there are 1000's of tables all with the same relfrozenxid and within just a few million transactions of the autovacuum_freeze_age. So the vast majority of those tables are still going to need to be vacuumed at the same time since the change to the truncation logic won't have been in place long enough to fix the relfrozenxids.
If I "know" that there are no more inserts going into those partitioned tables, can I do a bulk change of their relfrozenxids setting them back 500million transactions without causing problems? This will mean that the relfrozenxid is not representative of the row versions, but does that matter in this case(Essentially static stables)?
> To: alvherre@commandprompt.com
> CC: arctic_toucan@hotmail.com; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Is it safe to reset relfrozenxid without using vacuum?
> Date: Tue, 18 Nov 2008 13:07:16 -0500
> From: tgl@sss.pgh.pa.us
>
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > AFAICS this should be safe. In fact, in 8.3 TRUNCATE advances
> > relfrozenxid. (Perhaps you should consider upgrading if possible.)
> > ...
> > Tou could just obtain the xid of the transaction that's going to do the
> > import (for example by creating a temp table and getting it's xmin from
> > pg_class)
>
> That seems a bit risky. 8.3 resets relfrozenxid to RecentXmin, not the
> current transaction's XID. The OP's thought of taking the max existing
> relfrozenxid should be safe though.
>
> Or I guess you could make a temp table and take the relfrozenxid, rather
> than the xmin, from its pg_class entry.
>
> regards, tom lane
Arctic Toucan wrote: > If I "know" that there are no more inserts going into those > partitioned tables, can I do a bulk change of their relfrozenxids > setting them back 500million transactions without causing problems? > This will mean that the relfrozenxid is not representative of the row > versions, but does that matter in this case(Essentially static > stables)? The safest most current value you can use is that of the oldest transaction currently running (also known as RecentXmin in the code). If you choose anything older than that you're safe too. I don't think you can obtain RecentXmin in SQL (short of writing a C function) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Tuesday 18 November 2008 16:46:23 Alvaro Herrera wrote: > Arctic Toucan wrote: > > If I "know" that there are no more inserts going into those > > partitioned tables, can I do a bulk change of their relfrozenxids > > setting them back 500million transactions without causing problems? > > This will mean that the relfrozenxid is not representative of the row > > versions, but does that matter in this case(Essentially static > > stables)? > > The safest most current value you can use is that of the oldest > transaction currently running (also known as RecentXmin in the code). > If you choose anything older than that you're safe too. > > I don't think you can obtain RecentXmin in SQL (short of writing a C > function) > Hmm, I have a very similar problem on some of our larger dbs with years of archival data. I'm of the thought that someday I will have a problem that my db would grow large enough that it takes longer to vacuum the archival data (for purposes of advancing relfrozenxid, the data is never modified) than it takes to run through 2 billion transactions. Sounds like the solution might be to write such a C function to just update this directly and run that in cron, and avoid the vacuum mess. -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.com