Thread: Min Xid problem proposal

Min Xid problem proposal

From
Alvaro Herrera
Date:
Anybody remembers my patch to allow tracking the minimum Xid present in
a table, allowing to update the freeze xid on a per-table basis?  The
motivation behind it was to remove the requirement of database-wide
vacuums.

The problem I found with it was that it required all tables to be
vacuumed at least once every billion transactions, even frozen tables,
because there was the danger that somebody may insert new tuples into
the table without marking that fact in the catalogs (thus minxid would
remain FrozenTransactionId but reality would be different.)

My proposal to solve that problem, is to make any transaction that
inserts or modifies tuples in a table that is marked as frozen, unfreeze
it first.  The problem I had last time was finding a good spot in the
code for doing so.  I'm now proposing to do it in the parser, in
setTargetTable().  This routine currently opens the target relation and
acquires RowExclusiveLock on it.  At this point we can check its
relminxid, and if it's FreezeTransactionId, open pg_class and change the
value there.

The problem with this is that it seems to turn a possibly innocuous
insert into an operation that needs to open pg_class.  But in the case
of a relation not in the relcache, it will happen anyway, so it's not
really all that serious.  (The assumption here is that an unfreeze event
is at least as unlikely as a cache miss or a cache invalidation.)

The attached patch implements this idea.  (Of course it doesn't work
standalone; it requires the rest of my min-xid patch.)  I attach it here
separately because it's small and the proposal can be reviewed
independently of the rest of the patch, which is quite bigger.

Note that there's a FIXME on heap_unfreeze() saying that the shared
invalidation would not occur if the transaction aborts.  This comment
comes verbatim from vacuum.c's vac_update_relstats(); however I made a
small experiment and it seems to be false.  I'm not sure about it, but
it seems to me to be critical to send a invalidation message so that
other backends will notice immediately when we unfreeze a relation.

Comments?

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La victoria es para quien se atreve a estar solo"

Attachment

Re: Min Xid problem proposal

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> My proposal to solve that problem, is to make any transaction that
> inserts or modifies tuples in a table that is marked as frozen, unfreeze
> it first.  The problem I had last time was finding a good spot in the
> code for doing so.  I'm now proposing to do it in the parser, in
> setTargetTable().

My god, no.  Do you have any idea how many paths for updates you've missed?
(Think about prepared plans for starters.)

Furthermore, you can't do this in the way you propose (non-WAL-logged
update to pg_class).  What if the system crashes without ever having
written this update to disk?  The inserted tuples might have made it ---
whether they're committed or not doesn't matter, you've still blown it.

I don't see any very good argument for allowing this mechanism to set
minxid = FrozenXid in the first place.  If there are only frozenXid in
the table, set minxid = current XID.  That eliminates the entire problem
at a stroke.

(Yes, I know what you are going to say.  The idea of freezing a table
and then never having to vacuum it at all is NOT worth the cost of
putting in a mechanism that would guarantee its safety.)
        regards, tom lane


Re: Min Xid problem proposal

From
Alvaro Herrera
Date:
Tom Lane wrote:

> I don't see any very good argument for allowing this mechanism to set
> minxid = FrozenXid in the first place.  If there are only frozenXid in
> the table, set minxid = current XID.  That eliminates the entire problem
> at a stroke.

Ok, so I shall go back to the original patch, which did exactly this.
Is it OK for applying?

(I'm using RecentXmin instead of current XID though, because a
currently-running transaction could insert tuples in the table I just
vacuumed.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Min Xid problem proposal

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Ok, so I shall go back to the original patch, which did exactly this.
> Is it OK for applying?

I haven't looked at it ... when did you post it exactly?

> (I'm using RecentXmin instead of current XID though, because a
> currently-running transaction could insert tuples in the table I just
> vacuumed.)

Duh, right.
        regards, tom lane


Re: Min Xid problem proposal

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Ok, so I shall go back to the original patch, which did exactly this.
> > Is it OK for applying?
> 
> I haven't looked at it ... when did you post it exactly?

From: Alvaro Herrera <alvherre@commandprompt.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Patches <pgsql-patches@postgresql.org>
Date: Mon, 14 Nov 2005 23:40:52 -0300
Subject: Re: [PATCHES] [HACKERS] Per-table freeze limit proposal

I have a version that applies cleanly to current CVS tip.  Do I post it
again?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Min Xid problem proposal

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I have a version that applies cleanly to current CVS tip.  Do I post it
> again?

No need unless you think the changes are significant.  I'll try to look
over the patch soon.
        regards, tom lane


Re: Min Xid problem proposal

From
Simon Riggs
Date:
On Fri, 2005-12-09 at 12:32 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > My proposal to solve that problem, is to make any transaction that
> > inserts or modifies tuples in a table that is marked as frozen, unfreeze
> > it first.  The problem I had last time was finding a good spot in the
> > code for doing so.  I'm now proposing to do it in the parser, in
> > setTargetTable().
> 
> My god, no.  Do you have any idea how many paths for updates you've missed?
> (Think about prepared plans for starters.)
> 
> Furthermore, you can't do this in the way you propose (non-WAL-logged
> update to pg_class).  What if the system crashes without ever having
> written this update to disk?  The inserted tuples might have made it ---
> whether they're committed or not doesn't matter, you've still blown it.
> 
> I don't see any very good argument for allowing this mechanism to set
> minxid = FrozenXid in the first place.  If there are only frozenXid in
> the table, set minxid = current XID.  That eliminates the entire problem
> at a stroke.
> 
> (Yes, I know what you are going to say.  The idea of freezing a table
> and then never having to vacuum it at all is NOT worth the cost of
> putting in a mechanism that would guarantee its safety.)

>From what's been said VACUUM FREEZE will not alter the fact that a
frozen table will need vacuuming again in the future and so cannot ever
be read-only. I can't really see any reason to run VACUUM FREEZE...

If you want to make a table read-only forever, we need a separate
command to do that, ISTM.

ALTER TABLE ... READONLY 
could set minXid = FrozenTransactionId, indicating no further VACUUMs
required, ever. We can then disallow INSERT/UPDATE/DELETE against the
table in the permissions layer.

Best Regards, Simon Riggs



Re: Min Xid problem proposal

From
Alvaro Herrera
Date:
Simon Riggs wrote:

> From what's been said VACUUM FREEZE will not alter the fact that a
> frozen table will need vacuuming again in the future and so cannot ever
> be read-only. I can't really see any reason to run VACUUM FREEZE...

Yeah.

> If you want to make a table read-only forever, we need a separate
> command to do that, ISTM.

Let's get this goose cooked and then we can improve it.  This patch has
been waiting on my queue for too long.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.