Re: [HACKERS] Per-table freeze limit proposal - Mailing list pgsql-patches

From Alvaro Herrera
Subject Re: [HACKERS] Per-table freeze limit proposal
Date
Msg-id 20051115024052.GA18588@surnet.cl
Whole thread Raw
Responses Re: [HACKERS] Per-table freeze limit proposal  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-patches
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Cool.  I wonder if the exact figure should be
> > min(lowest non-frozen Xid in table, GetOldestXmin(false))
>
> Actually just min(lowest Xid in table, RecentXmin).  You only need to be
> sure there are no running transactions older than what you put into the
> field; their xmins are not at issue.

Ok, patch attached.  Two new columns in pg_class store two Xids: the
"relminxid" is the one in the equation above.  The relvacuumxid is the
OldestXmin.  Two columns in pg_database replace the previous two,
datminxid is the minimum of all relminxid in the database, and
datvacuumxid is the minimum of all relvacuumxid.  (datfreezexid is no
more.)

So we can check the Xid wrap horizon using datminxid, and truncate clog
using datvacuumxid.  (Actually I was going to post the patch yesterday
without the datvacuumxid/relvacuumxid part, when I noticed that I was
truncating clog on datminxid which seemed a bad idea.)

Additionally I made DROP TABLE invalidate datminxid and datvacuumxid
when the table with the minimum is dropped.  New values for the
pg_database columns are calculated by scanning pg_class at the next
vacuum when any of them is invalid, or when the table which was holding
the minimum back is vacuumed.  New values for the pg_class columns are
updated every vacuum, as appropiate.

The whole thing is pretty fragile is somebody manually updates a
catalog.  But we tell people not to do that, so it should be their
fault, right?


I discovered one problem with the whole approach.  Per this patch, we
only store normal Xids in relminxid/relvacuumxid.  So if a table is
completely frozen, we will store RecentXmin.  We do this because it
would actually be unsafe to store, say, FrozenXid: if another
transaction stores/changes a tuple while we are vacuuming it, the Frozen
mark wouldn't be correct and thus the table could be corrupted if a Xid
wraparound happens (which is why we use RecentXmin in the first place:
to cope with the possibility of someone else using the table during the
vacuum.)

The problem comes when this is done to template1, and it is copied to
another database after some millions of transactions have come and go --
it will seem like the database has suffered wraparound.  We would need
to vacuum it completely after copied for the stats to be accurate.

I'm not sure what to do about that.  I think storing FrozenXid may not
actually be a totally bad idea.  Comments?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

pgsql-patches by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: drop if exists
Next
From: Andreas Seltenreich
Date:
Subject: Re: TODO item -- Improve psql's handling of multi-line queries