Thread: Re: [HACKERS] Per-table freeze limit proposal
Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > Cool. I wonder if the exact figure should be > > min(lowest non-frozen Xid in table, GetOldestXmin(false)) > > Actually just min(lowest Xid in table, RecentXmin). You only need to be > sure there are no running transactions older than what you put into the > field; their xmins are not at issue. Ok, patch attached. Two new columns in pg_class store two Xids: the "relminxid" is the one in the equation above. The relvacuumxid is the OldestXmin. Two columns in pg_database replace the previous two, datminxid is the minimum of all relminxid in the database, and datvacuumxid is the minimum of all relvacuumxid. (datfreezexid is no more.) So we can check the Xid wrap horizon using datminxid, and truncate clog using datvacuumxid. (Actually I was going to post the patch yesterday without the datvacuumxid/relvacuumxid part, when I noticed that I was truncating clog on datminxid which seemed a bad idea.) Additionally I made DROP TABLE invalidate datminxid and datvacuumxid when the table with the minimum is dropped. New values for the pg_database columns are calculated by scanning pg_class at the next vacuum when any of them is invalid, or when the table which was holding the minimum back is vacuumed. New values for the pg_class columns are updated every vacuum, as appropiate. The whole thing is pretty fragile is somebody manually updates a catalog. But we tell people not to do that, so it should be their fault, right? I discovered one problem with the whole approach. Per this patch, we only store normal Xids in relminxid/relvacuumxid. So if a table is completely frozen, we will store RecentXmin. We do this because it would actually be unsafe to store, say, FrozenXid: if another transaction stores/changes a tuple while we are vacuuming it, the Frozen mark wouldn't be correct and thus the table could be corrupted if a Xid wraparound happens (which is why we use RecentXmin in the first place: to cope with the possibility of someone else using the table during the vacuum.) The problem comes when this is done to template1, and it is copied to another database after some millions of transactions have come and go -- it will seem like the database has suffered wraparound. We would need to vacuum it completely after copied for the stats to be accurate. I'm not sure what to do about that. I think storing FrozenXid may not actually be a totally bad idea. Comments? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
On Mon, 2005-11-14 at 23:40 -0300, Alvaro Herrera wrote: > The whole thing is pretty fragile is somebody manually updates a > catalog. But we tell people not to do that, so it should be their > fault, right? Hmmmm...sounds scary. Cool ideas in the patch though. > I discovered one problem with the whole approach. Per this patch, we > only store normal Xids in relminxid/relvacuumxid. So if a table is > completely frozen, we will store RecentXmin. We do this because it > would actually be unsafe to store, say, FrozenXid: if another > transaction stores/changes a tuple while we are vacuuming it, the Frozen > mark wouldn't be correct and thus the table could be corrupted if a Xid > wraparound happens (which is why we use RecentXmin in the first place: > to cope with the possibility of someone else using the table during the > vacuum.) Yep. And because VACUUM FULL FREEZE is no longer possible. > The problem comes when this is done to template1, and it is copied to > another database after some millions of transactions have come and go -- > it will seem like the database has suffered wraparound. We would need > to vacuum it completely after copied for the stats to be accurate. I don't understand the issue, can you explain more? I see no problem. If an identical copy gives a problem then surely template1 should also. > I'm not sure what to do about that. I think storing FrozenXid may not > actually be a totally bad idea. Comments? Its not a totally bad idea, but it has some risk, which where transactions are concerned is not really acceptable. Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with a table lock and skip all that moving data around. Best Regards, Simon Riggs
Simon Riggs wrote: > On Mon, 2005-11-14 at 23:40 -0300, Alvaro Herrera wrote: > > > The whole thing is pretty fragile is somebody manually updates a > > catalog. But we tell people not to do that, so it should be their > > fault, right? > > Hmmmm...sounds scary. Cool ideas in the patch though. Yeah, well, actually the problem is solved very easily by setting the pg_database tuple manually, either to InvalidTransactionId or to the minimum computed from pg_class. > > The problem comes when this is done to template1, and it is copied to > > another database after some millions of transactions have come and go -- > > it will seem like the database has suffered wraparound. We would need > > to vacuum it completely after copied for the stats to be accurate. > > I don't understand the issue, can you explain more? I see no problem. If > an identical copy gives a problem then surely template1 should also. Actually, template1 has the problem too. The scenario is this: - template1 is freezed. datminxid <- X - a long time passes, say INT_MAX * 0.75 transactions - a new database D is created, which coming from template1 has datminxid=X - the Xid counter is past the vacuum horizon for D.datminxid, so the system determines that the Xid counter could be wrapped already. - The system automatically decides to stop accepting new transactions. In fact there's no problem because in D, just like in template1, all tuples are frozen. How should we mark this on the catalogs? I don't see any way. Note that setting relminxid = FrozenTransactionId is bogus in any case, because even if we correctly lock and freeze the table, the next transaction after the vacuum could insert a new tuple into the table. But we don't want INSERT to be checking pg_class.relminxid! (Or do we?) Now, restating the problem, certainly template1 has the problem too. In fact we have a bigger problem: we are forcing all tables to be vacuumed every so often, even if they have been completely frozen before! This is because setting relminxid = Frozen is really bogus. > > I'm not sure what to do about that. I think storing FrozenXid may not > > actually be a totally bad idea. Comments? > > Its not a totally bad idea, but it has some risk, which where > transactions are concerned is not really acceptable. > > Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with > a table lock and skip all that moving data around. Doesn't work either because of the argument above. What about assuming that if somebody executes a database-wide FREEZE, he knows what he is doing and thus we can mark datminxid as FrozenTransactionId? Sadly, I see all this as proof that the whole idea doesn't work. It seems better than the current state of the system, where we rely on the user to do certain things, or on pgstat which is inherently inexact. But there is a big hole in the whole reasoning which hasn't been filled yet. Any ideas welcome. The idea of any insert/delete/update operation checking a bit in the Relation and resetting relminxid to TopTransactionId if it's marked Frozen is the only one I have right now. What do people think about it? -- Alvaro Herrera http://www.PlanetPostgreSQL.org "No hay cielo posible sin hundir nuestras raíces en la profundidad de la tierra" (Malucha Pinto)
On Tue, 2005-11-15 at 21:58 -0300, Alvaro Herrera wrote: > In fact there's no problem because in D, just like in template1, all > tuples are frozen. How should we mark this on the catalogs? I don't > see any way. All tuples might be frozen or might not be, the point is you don't know. That's why you can't use FrozenTransactionId. > > Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with > > a table lock and skip all that moving data around. > > Doesn't work either because of the argument above. > > What about assuming that if somebody executes a database-wide FREEZE, he > knows what he is doing and thus we can mark datminxid as > FrozenTransactionId? If you lock the table before FREEZE then you will guarantee that all rows will be frozen and you really can then set FrozenTransactionId. Making VACUUM FREEZE take full table locks seems like a very useful thing to me, and it would solve your problems also. Best Regards, Simon Riggs
On Wed, 2005-11-16 at 08:31 +0000, Simon Riggs wrote: > On Tue, 2005-11-15 at 21:58 -0300, Alvaro Herrera wrote: > > In fact there's no problem because in D, just like in template1, all > > tuples are frozen. How should we mark this on the catalogs? I don't > > see any way. > > All tuples might be frozen or might not be, the point is you don't know. > That's why you can't use FrozenTransactionId. > > > > Perhaps we should reinstate VACUUM FULL FREEZE to do just a FREEZE with > > > a table lock and skip all that moving data around. > > > > Doesn't work either because of the argument above. > > > > What about assuming that if somebody executes a database-wide FREEZE, he > > knows what he is doing and thus we can mark datminxid as > > FrozenTransactionId? > > If you lock the table before FREEZE then you will guarantee that all > rows will be frozen and you really can then set FrozenTransactionId. > > Making VACUUM FREEZE take full table locks seems like a very useful > thing to me, and it would solve your problems also. Thinking some more, when initdb issues VACUUM FREEZE we know for certain that nobody else is issuing commands against the database at that point, which is equivalent to a table lock. So we should be able to have a VACUUM FREEZE detect that and if so, set FrozenTransactionId. In normal concurrent running, I would like VACUUM FREEZE to issue a full table SHARE lock to ensure that we can set FrozenTransactionId for that also. Otherwise we will not be able to move frozen tables to read only media. Best Regards, Simon Riggs
Simon Riggs wrote: > On Wed, 2005-11-16 at 08:31 +0000, Simon Riggs wrote: > > All tuples might be frozen or might not be, the point is you don't know. > > That's why you can't use FrozenTransactionId. > Thinking some more, when initdb issues VACUUM FREEZE we know for certain > that nobody else is issuing commands against the database at that point, > which is equivalent to a table lock. So we should be able to have a > VACUUM FREEZE detect that and if so, set FrozenTransactionId. > > In normal concurrent running, I would like VACUUM FREEZE to issue a full > table SHARE lock to ensure that we can set FrozenTransactionId for that > also. Otherwise we will not be able to move frozen tables to read only > media. You missed one point however. Even if VACUUM FREEZE freezes all tuples, any transaction following that one is able to insert non-frozen tuples into the table. At that instant, having marked the table with Frozen is bogus, no matter what amount of locks you took on it. We can only do that (mark the table Frozen) if and only if all following transactions are forced to mark the table "unfrozen" as soon as they change it. Are we going to do that? Note that trying to unfreeze a table might be difficult -- we might need to obtain a lock on pg_class after we have opened and locked the target relation, leading to possible deadlock on INSERT. Yikes. -- Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 17.7", W 73º 14' 26.8" "Ciencias políticas es la ciencia de entender por qué los políticos actúan como lo hacen" (netfunny.com)
On Wed, 2005-11-16 at 07:52 -0300, Alvaro Herrera wrote: > Simon Riggs wrote: > > On Wed, 2005-11-16 at 08:31 +0000, Simon Riggs wrote: > > > > All tuples might be frozen or might not be, the point is you don't know. > > > That's why you can't use FrozenTransactionId. > > > Thinking some more, when initdb issues VACUUM FREEZE we know for certain > > that nobody else is issuing commands against the database at that point, > > which is equivalent to a table lock. So we should be able to have a > > VACUUM FREEZE detect that and if so, set FrozenTransactionId. > > > > In normal concurrent running, I would like VACUUM FREEZE to issue a full > > table SHARE lock to ensure that we can set FrozenTransactionId for that > > also. Otherwise we will not be able to move frozen tables to read only > > media. > > You missed one point however. Even if VACUUM FREEZE freezes all tuples, > any transaction following that one is able to insert non-frozen tuples > into the table. At that instant, having marked the table with Frozen is > bogus, no matter what amount of locks you took on it. (OK I think we are getting there now, after my usual comms errors.) We need something stronger than VACUUM FREEZE then. Perhaps an ALTER TABLE READONLY. That would do a FREEZE and place a permanent table share lock, so we wouldn't need to set/unset the Frozen state. We'd do that as a permissions thing, rather than an actual lock. That way copies of the data could still be taken with ease and the copies would not themselves be READONLY. Not sure what you'd call it to make a whole database readonly all at once...but whatever we call it we know initdb wants to run it on template1. Then your original thought becomes fully viable. This is particularly important because I see the need to be able to freeze older data partitions and migrate them to readonly media as part of very high volume data applications. Best Regards, Simon Riggs