Thread: Re: [HACKERS] Per-table freeze limit proposal

Re: [HACKERS] Per-table freeze limit proposal

From

Alvaro Herrera

Date:

14 November 2005, 22:56:03

Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Cool.  I wonder if the exact figure should be
> > min(lowest non-frozen Xid in table, GetOldestXmin(false))
>
> Actually just min(lowest Xid in table, RecentXmin).  You only need to be
> sure there are no running transactions older than what you put into the
> field; their xmins are not at issue.

Ok, patch attached.  Two new columns in pg_class store two Xids: the
"relminxid" is the one in the equation above.  The relvacuumxid is the
OldestXmin.  Two columns in pg_database replace the previous two,
datminxid is the minimum of all relminxid in the database, and
datvacuumxid is the minimum of all relvacuumxid.  (datfreezexid is no
more.)

So we can check the Xid wrap horizon using datminxid, and truncate clog
using datvacuumxid.  (Actually I was going to post the patch yesterday
without the datvacuumxid/relvacuumxid part, when I noticed that I was
truncating clog on datminxid which seemed a bad idea.)

Additionally I made DROP TABLE invalidate datminxid and datvacuumxid
when the table with the minimum is dropped.  New values for the
pg_database columns are calculated by scanning pg_class at the next
vacuum when any of them is invalid, or when the table which was holding
the minimum back is vacuumed.  New values for the pg_class columns are
updated every vacuum, as appropiate.

The whole thing is pretty fragile is somebody manually updates a
catalog.  But we tell people not to do that, so it should be their
fault, right?

I discovered one problem with the whole approach.  Per this patch, we
only store normal Xids in relminxid/relvacuumxid.  So if a table is
completely frozen, we will store RecentXmin.  We do this because it
would actually be unsafe to store, say, FrozenXid: if another
transaction stores/changes a tuple while we are vacuuming it, the Frozen
mark wouldn't be correct and thus the table could be corrupted if a Xid
wraparound happens (which is why we use RecentXmin in the first place:
to cope with the possibility of someone else using the table during the
vacuum.)

The problem comes when this is done to template1, and it is copied to
another database after some millions of transactions have come and go --
it will seem like the database has suffered wraparound.  We would need
to vacuum it completely after copied for the stats to be accurate.

I'm not sure what to do about that.  I think storing FrozenXid may not
actually be a totally bad idea.  Comments?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

vacuum-minxid-2.patch

Re: [HACKERS] Per-table freeze limit proposal

From

Simon Riggs

Date:

15 November 2005, 18:36:39

On Mon, 2005-11-14 at 23:40 -0300, Alvaro Herrera wrote:

> The whole thing is pretty fragile is somebody manually updates a
> catalog.  But we tell people not to do that, so it should be their
> fault, right?

Hmmmm...sounds scary. Cool ideas in the patch though.

> I discovered one problem with the whole approach.  Per this patch, we
> only store normal Xids in relminxid/relvacuumxid.  So if a table is
> completely frozen, we will store RecentXmin.  We do this because it
> would actually be unsafe to store, say, FrozenXid: if another
> transaction stores/changes a tuple while we are vacuuming it, the Frozen
> mark wouldn't be correct and thus the table could be corrupted if a Xid
> wraparound happens (which is why we use RecentXmin in the first place:
> to cope with the possibility of someone else using the table during the
> vacuum.)

Yep. And because VACUUM FULL FREEZE is no longer possible.

> The problem comes when this is done to template1, and it is copied to
> another database after some millions of transactions have come and go --
> it will seem like the database has suffered wraparound.  We would need
> to vacuum it completely after copied for the stats to be accurate.

I don't understand the issue, can you explain more? I see no problem. If
an identical copy gives a problem then surely template1 should also.

> I'm not sure what to do about that.  I think storing FrozenXid may not
> actually be a totally bad idea.  Comments?

Its not a totally bad idea, but it has some risk, which where
transactions are concerned is not really acceptable.

Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with
a table lock and skip all that moving data around.

Best Regards, Simon Riggs

Re: [HACKERS] Per-table freeze limit proposal

From

Alvaro Herrera

Date:

15 November 2005, 21:09:30

Simon Riggs wrote:
> On Mon, 2005-11-14 at 23:40 -0300, Alvaro Herrera wrote:
>
> > The whole thing is pretty fragile is somebody manually updates a
> > catalog.  But we tell people not to do that, so it should be their
> > fault, right?
>
> Hmmmm...sounds scary. Cool ideas in the patch though.

Yeah, well, actually the problem is solved very easily by setting the
pg_database tuple manually, either to InvalidTransactionId or to the
minimum computed from pg_class.

> > The problem comes when this is done to template1, and it is copied to
> > another database after some millions of transactions have come and go --
> > it will seem like the database has suffered wraparound.  We would need
> > to vacuum it completely after copied for the stats to be accurate.
>
> I don't understand the issue, can you explain more? I see no problem. If
> an identical copy gives a problem then surely template1 should also.

Actually, template1 has the problem too.  The scenario is this:

- template1 is freezed.  datminxid <- X
- a long time passes, say INT_MAX * 0.75 transactions
- a new database D is created, which coming from template1 has datminxid=X
- the Xid counter is past the vacuum horizon for D.datminxid, so the
  system determines that the Xid counter could be wrapped already.
- The system automatically decides to stop accepting new transactions.

In fact there's no problem because in D, just like in template1, all
tuples are frozen.  How should we mark this on the catalogs?  I don't
see any way.

Note that setting relminxid = FrozenTransactionId is bogus in any case,
because even if we correctly lock and freeze the table, the next
transaction after the vacuum could insert a new tuple into the table.
But we don't want INSERT to be checking pg_class.relminxid!  (Or do we?)

Now, restating the problem, certainly template1 has the problem too.  In
fact we have a bigger problem: we are forcing all tables to be vacuumed
every so often, even if they have been completely frozen before!  This
is because setting relminxid = Frozen is really bogus.

> > I'm not sure what to do about that.  I think storing FrozenXid may not
> > actually be a totally bad idea.  Comments?
>
> Its not a totally bad idea, but it has some risk, which where
> transactions are concerned is not really acceptable.
>
> Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with
> a table lock and skip all that moving data around.

Doesn't work either because of the argument above.

What about assuming that if somebody executes a database-wide FREEZE, he
knows what he is doing and thus we can mark datminxid as
FrozenTransactionId?

Sadly, I see all this as proof that the whole idea doesn't work.  It
seems better than the current state of the system, where we rely on the
user to do certain things, or on pgstat which is inherently inexact.
But there is a big hole in the whole reasoning which hasn't been filled
yet.

Any ideas welcome.  The idea of any insert/delete/update operation
checking a bit in the Relation and resetting relminxid to
TopTransactionId if it's marked Frozen is the only one I have right now.
What do people think about it?

--
Alvaro Herrera                                http://www.PlanetPostgreSQL.org
"No hay cielo posible sin hundir nuestras raíces
 en la profundidad de la tierra"                        (Malucha Pinto)

Re: [HACKERS] Per-table freeze limit proposal

From

Simon Riggs

Date:

16 November 2005, 04:31:07

On Tue, 2005-11-15 at 21:58 -0300, Alvaro Herrera wrote:
> In fact there's no problem because in D, just like in template1, all
> tuples are frozen.  How should we mark this on the catalogs?  I don't
> see any way.

All tuples might be frozen or might not be, the point is you don't know.
That's why you can't use FrozenTransactionId.

> > Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with
> > a table lock and skip all that moving data around.
>
> Doesn't work either because of the argument above.
>
> What about assuming that if somebody executes a database-wide FREEZE, he
> knows what he is doing and thus we can mark datminxid as
> FrozenTransactionId?

If you lock the table before FREEZE then you will guarantee that all
rows will be frozen and you really can then set FrozenTransactionId.

Making VACUUM FREEZE take full table locks seems like a very useful
thing to me, and it would solve your problems also.

Best Regards, Simon Riggs

Re: [HACKERS] Per-table freeze limit proposal

From

Simon Riggs

Date:

16 November 2005, 05:53:56

On Wed, 2005-11-16 at 08:31 +0000, Simon Riggs wrote:
> On Tue, 2005-11-15 at 21:58 -0300, Alvaro Herrera wrote:
> > In fact there's no problem because in D, just like in template1, all
> > tuples are frozen.  How should we mark this on the catalogs?  I don't
> > see any way.
>
> All tuples might be frozen or might not be, the point is you don't know.
> That's why you can't use FrozenTransactionId.
>
> > > Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with
> > > a table lock and skip all that moving data around.
> >
> > Doesn't work either because of the argument above.
> >
> > What about assuming that if somebody executes a database-wide FREEZE, he
> > knows what he is doing and thus we can mark datminxid as
> > FrozenTransactionId?
>
> If you lock the table before FREEZE then you will guarantee that all
> rows will be frozen and you really can then set FrozenTransactionId.
>
> Making VACUUM FREEZE take full table locks seems like a very useful
> thing to me, and it would solve your problems also.

Thinking some more, when initdb issues VACUUM FREEZE we know for certain
that nobody else is issuing commands against the database at that point,
which is equivalent to a table lock. So we should be able to have a
VACUUM FREEZE detect that and if so, set FrozenTransactionId.

In normal concurrent running, I would like VACUUM FREEZE to issue a full
table SHARE lock to ensure that we can set FrozenTransactionId for that
also. Otherwise we will not be able to move frozen tables to read only
media.

Best Regards, Simon Riggs

Re: [HACKERS] Per-table freeze limit proposal

From

Alvaro Herrera

Date:

16 November 2005, 06:52:53

Simon Riggs wrote:
> On Wed, 2005-11-16 at 08:31 +0000, Simon Riggs wrote:

> > All tuples might be frozen or might not be, the point is you don't know.
> > That's why you can't use FrozenTransactionId.

> Thinking some more, when initdb issues VACUUM FREEZE we know for certain
> that nobody else is issuing commands against the database at that point,
> which is equivalent to a table lock. So we should be able to have a
> VACUUM FREEZE detect that and if so, set FrozenTransactionId.
>
> In normal concurrent running, I would like VACUUM FREEZE to issue a full
> table SHARE lock to ensure that we can set FrozenTransactionId for that
> also. Otherwise we will not be able to move frozen tables to read only
> media.

You missed one point however.  Even if VACUUM FREEZE freezes all tuples,
any transaction following that one is able to insert non-frozen tuples
into the table.  At that instant, having marked the table with Frozen is
bogus, no matter what amount of locks you took on it.

We can only do that (mark the table Frozen) if and only if all following
transactions are forced to mark the table "unfrozen" as soon as they
change it.  Are we going to do that?

Note that trying to unfreeze a table might be difficult -- we might need
to obtain a lock on pg_class after we have opened and locked the target
relation, leading to possible deadlock on INSERT.  Yikes.

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 17.7", W 73º 14' 26.8"
"Ciencias políticas es la ciencia de entender por qué
 los políticos actúan como lo hacen"  (netfunny.com)

Re: [HACKERS] Per-table freeze limit proposal

From

Simon Riggs

Date:

16 November 2005, 09:55:24

On Wed, 2005-11-16 at 07:52 -0300, Alvaro Herrera wrote:
> Simon Riggs wrote:
> > On Wed, 2005-11-16 at 08:31 +0000, Simon Riggs wrote:
>
> > > All tuples might be frozen or might not be, the point is you don't know.
> > > That's why you can't use FrozenTransactionId.
>
> > Thinking some more, when initdb issues VACUUM FREEZE we know for certain
> > that nobody else is issuing commands against the database at that point,
> > which is equivalent to a table lock. So we should be able to have a
> > VACUUM FREEZE detect that and if so, set FrozenTransactionId.
> >
> > In normal concurrent running, I would like VACUUM FREEZE to issue a full
> > table SHARE lock to ensure that we can set FrozenTransactionId for that
> > also. Otherwise we will not be able to move frozen tables to read only
> > media.
>
> You missed one point however.  Even if VACUUM FREEZE freezes all tuples,
> any transaction following that one is able to insert non-frozen tuples
> into the table.  At that instant, having marked the table with Frozen is
> bogus, no matter what amount of locks you took on it.

(OK I think we are getting there now, after my usual comms errors.)

We need something stronger than VACUUM FREEZE then. Perhaps an ALTER
TABLE READONLY. That would do a FREEZE and place a permanent table share
lock, so we wouldn't need to set/unset the Frozen state. We'd do that as
a permissions thing, rather than an actual lock. That way copies of the
data could still be taken with ease and the copies would not themselves
be READONLY.

Not sure what you'd call it to make a whole database readonly all at
once...but whatever we call it we know initdb wants to run it on
template1.

Then your original thought becomes fully viable.

This is particularly important because I see the need to be able to
freeze older data partitions and migrate them to readonly media as part
of very high volume data applications.

Best Regards, Simon Riggs