Re: RFC: Making TRUNCATE more "MVCC-safe" - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: RFC: Making TRUNCATE more "MVCC-safe" |
Date | |
Msg-id | CA+TgmobSDtyABRFvN=++m8zPUjoYSmm73Y99K1KoGyKOaq_rew@mail.gmail.com Whole thread Raw |
In response to | Re: RFC: Making TRUNCATE more "MVCC-safe" (Noah Misch <noah@leadboat.com>) |
Responses |
Re: RFC: Making TRUNCATE more "MVCC-safe"
Re: RFC: Making TRUNCATE more "MVCC-safe" |
List | pgsql-hackers |
On Fri, Feb 10, 2012 at 11:46 PM, Noah Misch <noah@leadboat.com> wrote: > On Fri, Feb 10, 2012 at 01:59:18PM -0500, Robert Haas wrote: >> On Fri, Feb 10, 2012 at 6:42 AM, Noah Misch <noah@leadboat.com> wrote: >> > I like the design you have chosen. ?It would find applications beyond >> > TRUNCATE, so your use of non-specific naming is sound. ?For example, older >> > snapshots will see an empty table "t" after "CREATE TABLE t AS SELECT 1" >> > commits; that's a comparable MVCC anomaly. ?Some of our non-MVCC-safe commands >> > should perhaps just become MVCC-safe, but there will always be use cases for >> > operations that shortcut MVCC. ?When one truly does want that, your proposal >> > for keeping behavior consistent makes plenty of sense. >> >> I guess I'm not particularly excited by the idea of trying to make >> TRUNCATE MVCC-safe. I notice that the example involves the REPEATABLE >> READ isolation level, which is already known to be busted in a variety >> of ways; that's why we now have SERIALIZABLE, and why most people use >> READ COMMITTED. Are there examples of this behavior at other >> isolation levels? > > I've yet to see an MVCC anomaly that one can reproduce at REPEATABLE READ and > not at READ COMMITTED. They tend to be narrow race conditions at READ > COMMITTED, yet easy to demonstrate at REPEATABLE READ. Related: > http://archives.postgresql.org/pgsql-performance/2011-02/msg00451.php Yeah. Well, that's actually an interesting example, because it illustrates how general this problem is. We could potentially get ourselves into a situation where just about every system catalog table needs an xmin field to store the point at which the object came into existence - or for that matter, was updated. But it's not quite the same as the xmin of the row itself, because some updates might be judged not to matter. There could also be intermediate cases where updates are invalidating for some purposes but not others. I think we'd better get our hands around more of the problem space before we start trying to engineer solutions. > Incidentally, people use READ COMMITTED because they don't question the > default, not because they know hazards of REPEATABLE READ. I don't know the > bustedness you speak of; could we improve the documentation to inform folks? The example that I remember was related to SELECT FOR UPDATE/SELECT FOR SHARE. The idea of those statements is that you want to prevent the row from being updated or deleted until some other concurrent action is complete; for example, in the case of a foreign key, we'd like to prevent the referenced row from being deleted or updated in the relevant columns until the inserting transaction is committed. But it doesn't work, because when the updating or deleting process gets done with the lock wait, they are still using the same snapshot as before, and merrily do exactly the the thing that the lock-wait was supposed to prevent. If an actual UPDATE is used, it's safe (I think): anyone who was going to UPDATE or DELETE the row will fail with some kind of serialization error. But a SELECT FOR UPDATE that commits is treated more like an UPDATE that rolls back: it's as if the lock never existed. Someone (Florian?) proposed a patch to change this, but it seemed problematic for reasons I no longer exactly remember. When using an actual foreign key, we work around this by taking a new snapshot to cross-check that things haven't changed under us, but user-level code can't do that. At READ COMMITTED, depending on the situation, either the fact that we take new snapshots pretty frequently or the EPQ machinery sometimes make things work sensibly anyway, and at SERIALIZABLE, SSI prevents these kinds of anomalies. But REPEATABLE READ has no protection. I wish I could find the thread where we discussed this before. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: