Thread: Per-table freeze limit proposal

Per-table freeze limit proposal

From
Alvaro Herrera
Date:
Hackers,

As you've probably heard too many times already, I'm thinking in
improving vacuum, so we can keep track of the freeze Xid on a table
level, rather than database level.  Hopefully this will eliminate the
need for database-wide vacuums.

In fact this seems pretty easy to do.  Add a field to pg_class, tell
VACUUM to update it using the determined freezeLimit, and that's it.
(Note that if we ever implement partial vacuum, it won't be able to
update the freeze point.  But that was true before anyway.)

We also need to teach autovacuum to update pg_database.datfreezexid,
using the minimum from pg_class.  (I don't think it's a good idea to
seqscan pg_class to find out the minimum on each VACUUM call.) So, an
autovacuum iteration would issue all needed VACUUM/ANALYZE calls, then
get the minimum freezexid from pg_class to update pg_database.  This
way, GetNewTransactionId can continue checking pg_database.datfreezexid
as the hard limit for issuing warnings for Xid wraparound.

Does anyone see a need for anything other than the autovacuum process to
be updating pg_database.datfreezexid?  Of course, if autovacuum is not
in use, things would continue as now, that is, manual database-wide
VACUUM calls updating pg_database.datfreezexid.  But note that you can
mark all tables as disabled on pg_autovacuum, issue your manuals VACUUM
calls as needed (from cron or whatever), and use autovacuum to set
pg_database.datfreezexid -- so autovacuum would in fact do nothing
except set the freeze limit.

The problem is, this seems so awfully simple that I fear I am missing
something ...  Otherwise, does this sound like a plan?

-- 
Alvaro Herrera -- Valdivia, Chile         Architect, www.EnterpriseDB.com
The easiest way to resolve [trivial code guidelines disputes] is to fire
one or both of the people involved.                      (Damian Conway)


Re: Per-table freeze limit proposal

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> In fact this seems pretty easy to do.  Add a field to pg_class, tell
> VACUUM to update it using the determined freezeLimit, and that's it.

I think that it'd be worth fixing things so that the recorded value
is not the freeze cutoff value (as now), but the actual lowest
not-frozen XID present anywhere in the table.  The present code does not
do that because it's painful to track across multiple tables, but on a
per-table basis it seems easy.  In particular this rule allows you to
set a sane value for the pg_class field when the table is created (ie,
current transaction's XMIN, rather than a billion less).

> (Note that if we ever implement partial vacuum, it won't be able to
> update the freeze point.  But that was true before anyway.)

Sure.

> We also need to teach autovacuum to update pg_database.datfreezexid,
> using the minimum from pg_class.

No, no, no.  autovacuum is not a required part of the system and it's
not going to become so any time soon.  Updating the pg_database entry
will have to be the responsibility of VACUUM itself.  It's not that
terrible: you don't have to scan pg_class unless you see that the
pg_class.relfreezexid value you are replacing is equal to
pg_database.datfreezexid, and with the exact computation suggested
above, that won't be a common occurrence.
        regards, tom lane


Re: Per-table freeze limit proposal

From
"Jim Buttafuoco"
Date:
while you are at it, can you put in some audit timestamps as to when the vacuum occurred (full vs not full). 


---------- Original Message -----------
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
To: Hackers <pgsql-hackers@postgresql.org>
Sent: Wed, 14 Sep 2005 22:14:23 -0400
Subject: [HACKERS] Per-table freeze limit proposal

> Hackers,
> 
> As you've probably heard too many times already, I'm thinking in
> improving vacuum, so we can keep track of the freeze Xid on a table
> level, rather than database level.  Hopefully this will eliminate the
> need for database-wide vacuums.
> 
> In fact this seems pretty easy to do.  Add a field to pg_class, tell
> VACUUM to update it using the determined freezeLimit, and that's it.
> (Note that if we ever implement partial vacuum, it won't be able to
> update the freeze point.  But that was true before anyway.)
> 
> We also need to teach autovacuum to update pg_database.datfreezexid,
> using the minimum from pg_class.  (I don't think it's a good idea to
> seqscan pg_class to find out the minimum on each VACUUM call.) So, an
> autovacuum iteration would issue all needed VACUUM/ANALYZE calls, then
> get the minimum freezexid from pg_class to update pg_database.  This
> way, GetNewTransactionId can continue checking pg_database.datfreezexid
> as the hard limit for issuing warnings for Xid wraparound.
> 
> Does anyone see a need for anything other than the autovacuum process to
> be updating pg_database.datfreezexid?  Of course, if autovacuum is not
> in use, things would continue as now, that is, manual database-wide
> VACUUM calls updating pg_database.datfreezexid.  But note that you can
> mark all tables as disabled on pg_autovacuum, issue your manuals VACUUM
> calls as needed (from cron or whatever), and use autovacuum to set
> pg_database.datfreezexid -- so autovacuum would in fact do nothing
> except set the freeze limit.
> 
> The problem is, this seems so awfully simple that I fear I am missing
> something ...  Otherwise, does this sound like a plan?
> 
> -- 
> Alvaro Herrera -- Valdivia, Chile         Architect, www.EnterpriseDB.com
> The easiest way to resolve [trivial code guidelines disputes] is to fire
> one or both of the people involved.                      (Damian Conway)
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq
------- End of Original Message -------



Re: Per-table freeze limit proposal

From
Alvaro Herrera
Date:
On Wed, Sep 14, 2005 at 11:30:52PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > In fact this seems pretty easy to do.  Add a field to pg_class, tell
> > VACUUM to update it using the determined freezeLimit, and that's it.
> 
> I think that it'd be worth fixing things so that the recorded value
> is not the freeze cutoff value (as now), but the actual lowest
> not-frozen XID present anywhere in the table.

Cool.  I wonder if the exact figure should be

min(lowest non-frozen Xid in table, GetOldestXmin(false))

just in case a long-running transaction inserts a new tuple after the
vacuum is done.  Also GetOldestXmin should be the value used for empty
tables.  For shared relations, we'd use GetOldestXmin(true).

Also, in light of this, it seems a bad idea to use the name "freezexid"
for the pg_class column; I would name it relminxid or something like
that (suggestions welcome).  Not sure about renaming the pg_database
column -- I don't see why not.

-- 
Alvaro Herrera -- Valdivia, Chile         Architect, www.EnterpriseDB.com
"La Primavera ha venido. Nadie sabe como ha sido" (A. Machado)


Re: Per-table freeze limit proposal

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Cool.  I wonder if the exact figure should be
> min(lowest non-frozen Xid in table, GetOldestXmin(false))

Actually just min(lowest Xid in table, RecentXmin).  You only need to be
sure there are no running transactions older than what you put into the
field; their xmins are not at issue.

> Also, in light of this, it seems a bad idea to use the name "freezexid"
> for the pg_class column; I would name it relminxid or something like
> that (suggestions welcome).

Works for me.
        regards, tom lane


Re: Per-table freeze limit proposal

From
"Jim C. Nasby"
Date:
It would also be very handy to be able to tell how many transactions (or
inserts/updates/deletes) have occured since the last vacuum. Presumably
autovacuum needs to know this already, but is it exposed?

On Thu, Sep 15, 2005 at 07:46:26AM -0400, Jim Buttafuoco wrote:
> while you are at it, can you put in some audit timestamps as to when the vacuum occurred (full vs not full). 
> 
> 
> ---------- Original Message -----------
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> To: Hackers <pgsql-hackers@postgresql.org>
> Sent: Wed, 14 Sep 2005 22:14:23 -0400
> Subject: [HACKERS] Per-table freeze limit proposal
> 
> > Hackers,
> > 
> > As you've probably heard too many times already, I'm thinking in
> > improving vacuum, so we can keep track of the freeze Xid on a table
> > level, rather than database level.  Hopefully this will eliminate the
> > need for database-wide vacuums.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461