Thread: Problems with ALTER DOMAIN patch
I've been looking at the recently-committed ALTER DOMAIN patch, and I think it's got some serious if not fatal problems. Specifically, the approach to adding/dropping constraints associated with domains doesn't work. 1. Insufficient locking, guise 1: there's no protection against someone else dropping a column or whole table between the time you find a pg_attribute entry in get_rels_with_domain and the time you actually process it in AlterDomainNotNull or AlterDomainAddConstraint. The code appears to think that holding RowExclusiveLock on pg_attribute will protect it somehow, but that doesn't (and shouldn't) do any such thing. This will result in at best an elog and at worst coredump when you try to scan the no-longer-present table or column. 2. Insufficient locking, guise 2: there's no protection against someone else adding a column or table while you're processing an ALTER DOMAIN, either. This means that constraint checks will be missed. Example: << backend 1 >> regression=# create domain mydom int4; CREATE DOMAIN regression=# begin; BEGIN regression=# alter domain mydom set not null; ALTER DOMAIN << don't commit yet; in backend 2 do >> regression=# create table foo (f1 mydom); CREATE TABLE regression=# insert into foo values(null); INSERT 149688 1 << now in backend 1: >> regression=# commit; COMMIT << now in backend 2: >> regression=# insert into foo values(null); ERROR: Domain mydom does not allow NULL values regression=# select * from foo;f1 ---- (1 row) Not a very watertight domain constraint, is it? The begin/commit is not necessary to cause a failure, it just makes it easy to make the window for failure wide enough to hit in a manually entered example. 3. Too much locking, guise 1: the ALTER DOMAIN command will acquire exclusive lock on every table that it scans, and will hold all these locks until it commits. This can easily result in deadlocks --- against other ALTER DOMAIN commands, or just against any random other transaction that is unlucky enough to try to write any two tables touched by the ALTER DOMAIN. AFAICS you don't need an exclusive lock, you just want to prevent updates of the table until the domain changes are committed, so ShareLock would be sufficient; that would reduce but not eliminate the risk of deadlock. 4. Too much locking, guise 2: the ExclusiveLock acquired on pg_class by get_rels_with_domain has no useful effect, since it's released again at the end of the scan; it does manage to shut down most sorts of schema changes while get_rels_with_domain runs, however. This is bad enough, but: 5. Performance sucks. In the regression database on my machine, "alter domain mydom set not null" takes over six seconds --- that's for a freshly created domain that's not used *anywhere*. This can be blamed entirely on the inefficient implementation of get_rels_with_domain. In a database with more tables performance would get much worse; it's basically O(N^2). And it's holding ExclusiveLock on pg_class the whole time :-(. (A reasonably efficient way to make the same search would be to use pg_depend to look for columns that depend on the domain type --- this might find a few indirect dependencies, but it would certainly be lots faster than repeated seqscans over pg_attribute.) 6. Permission bogosity: as per discussion yesterday, ownership of a schema does not grant ownership rights on contained objects. 7. No mechanism for causing constraint changes to actually propagate after they are made. This is more a fault of the design of the domain constraint patch than it is of the alter patch, but nonetheless alter is what exposes it. The problem is particularly acute because you chose to insert a domain's constraint expressions into coercion operations at expression parsing time, which is far too early. A stored rule that has a coerce-to-domain operation in it will have a frozen idea of what constraints it should be enforcing. Probably the expression tree should just have a "CoerceToDomain foo" node in it, and at executor startup this node would have to look to the pg_type entry for foo to see exactly what it should be enforcing at the moment. Some of these are fixable, but I don't actually see any fix for point 2 short of creating some entirely new locking convention. Currently, only relations can be locked, but you'd really need an enforceable lock on the type itself to make a watertight solution, I think. Since we've never had any sort of supported ALTER TYPE operation before, the issue hasn't come up before ... regards, tom lane
On Tue, 2002-12-10 at 12:39, Tom Lane wrote: > I've been looking at the recently-committed ALTER DOMAIN patch, and I > think it's got some serious if not fatal problems. Specifically, the > approach to adding/dropping constraints associated with domains doesn't > work. > > 1. Insufficient locking, guise 1: there's no protection against someone > else dropping a column or whole table between the time you find a Ok.. I obviously have to spend some time to figure out how locking works and exactly what it affects. I had incorrectly assumed that since dropping a column required removal of the pg_attribute entry, that holding a RowExclusive on it would prevent that. > 2. Insufficient locking, guise 2: there's no protection against someone > else adding a column or table while you're processing an ALTER DOMAIN, > either. This means that constraint checks will be missed. Example: Locking the entry in pg_type doesn't prevent that? Afterall, something does a test to see if the type exists prior to allowing the client to add it. > 3. Too much locking, guise 1: the ALTER DOMAIN command will acquire > exclusive lock on every table that it scans, and will hold all these > locks until it commits. This can easily result in deadlocks --- against > other ALTER DOMAIN commands, or just against any random other > transaction that is unlucky enough to try to write any two tables > touched by the ALTER DOMAIN. AFAICS you don't need an exclusive lock, > you just want to prevent updates of the table until the domain changes > are committed, so ShareLock would be sufficient; that would reduce but > not eliminate the risk of deadlock. I noticed a completed TODO item that allows multiple locks to be obtained simultaneously, and had intended on using that for this -- but was having a hard time tracking down an example. > 4. Too much locking, guise 2: the ExclusiveLock acquired on pg_class by > get_rels_with_domain has no useful effect, since it's released again > at the end of the scan; it does manage to shut down most sorts of schema > changes while get_rels_with_domain runs, however. This is bad enough, > but: Yeah... Trying to transfer the lock to the attributes -- which as you've shown doesn't do what I thought. > 5. Performance sucks. In the regression database on my machine, "alter > domain mydom set not null" takes over six seconds --- that's for a > freshly created domain that's not used *anywhere*. This can be blamed > entirely on the inefficient implementation of get_rels_with_domain. Yes, I need to (and intend to) redo this with dependencies, but hadn't figured out how. I'm surprised it took 6 seconds though. I hardly notice any delay on a database with ~30 tables in it. > 6. Permission bogosity: as per discussion yesterday, ownership of a > schema does not grant ownership rights on contained objects. Patch submitted yesterday to correct this. > 7. No mechanism for causing constraint changes to actually propagate > after they are made. This is more a fault of the design of the domain > constraint patch than it is of the alter patch, but nonetheless alter is > what exposes it. The problem is particularly acute because you chose to > insert a domain's constraint expressions into coercion operations at > expression parsing time, which is far too early. A stored rule that has > a coerce-to-domain operation in it will have a frozen idea of what > constraints it should be enforcing. Probably the expression tree should > just have a "CoerceToDomain foo" node in it, and at executor startup > this node would have to look to the pg_type entry for foo to see exactly > what it should be enforcing at the moment. Thanks for the explanations. I'll see if I can 1) fix my poor knowledge of locking, 2) Add to my notes that I need to test stuff with Rules from now on, and 3) correct the above items. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Rod Taylor <rbt@rbt.ca> writes: >> 2. Insufficient locking, guise 2: there's no protection against someone >> else adding a column or table while you're processing an ALTER DOMAIN, >> either. This means that constraint checks will be missed. Example: > Locking the entry in pg_type doesn't prevent that? If there were such a thing as "locking the entry in pg_type", it might prevent that, but (a) there isn't, and (b) your code wouldn't invoke it if there were. Reading a row should surely not be tantamount to invoking an exclusive lock on it. In any case, other backends might have the pg_type entry in their syscaches, in which case their references to the type would be quite free of any actual read of the pg_type row that might fall foul of your hypothetical lock. To make this work in a reliable way, there needs to be some concept of acquiring a lock on the type as an entity, in the same way that LockRelation acquires a lock on a relation as an entity --- which has only the loosest possible connection to the notion of a lock on the relation's pg_class row. We have no such locks on types at present, but I think it may be time to invent 'em. regards, tom lane
On Tue, 2002-12-10 at 22:56, Tom Lane wrote: > Rod Taylor <rbt@rbt.ca> writes: > >> 2. Insufficient locking, guise 2: there's no protection against someone > >> else adding a column or table while you're processing an ALTER DOMAIN, > >> either. This means that constraint checks will be missed. Example: > > > Locking the entry in pg_type doesn't prevent that? > > If there were such a thing as "locking the entry in pg_type", it might > prevent that, but (a) there isn't, and (b) your code wouldn't invoke it > if there were. Reading a row should surely not be tantamount to > invoking an exclusive lock on it. Hrm... Yes.. I came to that conclusion while walking home. My concepts of locking, and what actually happens in PostgreSQL are two completely different things. > In any case, other backends might have the pg_type entry in their > syscaches, in which case their references to the type would be quite > free of any actual read of the pg_type row that might fall foul of > your hypothetical lock. So... Basically I'm cooked. > relation's pg_class row. We have no such locks on types at present, > but I think it may be time to invent 'em. I'd be happy to use them once created. Thanks again for the help. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Rod Taylor wrote: > > relation's pg_class row. We have no such locks on types at present, > > but I think it may be time to invent 'em. > > I'd be happy to use them once created. > > Thanks again for the help. Where does that leave the patch _until_ they are created? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Where does that leave the patch _until_ they are created? I'd say "it's under death sentence unless fixed before 7.4 release". I don't want to back it out in toto right now, because that will interfere with other edits I'm in process of making (and also Rod included some necessary fixes to the domain-constraint patch in the alter-domain patch; which wasn't too clean of him but it's done). For now, please put "fix or disable ALTER DOMAIN" on the must-do- before-7.4 part of TODO. regards, tom lane
Rod Taylor <rbt@rbt.ca> writes: > On Tue, 2002-12-10 at 22:56, Tom Lane wrote: >> relation's pg_class row. We have no such locks on types at present, >> but I think it may be time to invent 'em. > I'd be happy to use them once created. I think you misunderstood me ;=) ... that was a none-too-subtle suggestion that *you* should go invent 'em, seeing as how you're the one pushing the feature that makes 'em necessary. The lock manager itself deals with lock tags that could be almost anything. We currently only use lock tags that represent relations or specific pages in relations, but I see no reason that there couldn't also be lock tags representing types --- or other basic catalog entries. (I am trying hard to repress the thought that we may already need locking on other classes of entities as well.) What we need now is a little thought about exactly how to represent these different lock tags (should be easy), and about what semantics to assign to different lock modes applied to pg_type entities (perhaps not so easy). regards, tom lane
On Wed, 2002-12-11 at 00:05, Tom Lane wrote: > Rod Taylor <rbt@rbt.ca> writes: > > On Tue, 2002-12-10 at 22:56, Tom Lane wrote: > >> relation's pg_class row. We have no such locks on types at present, > >> but I think it may be time to invent 'em. > > > I'd be happy to use them once created. > > I think you misunderstood me ;=) ... that was a none-too-subtle Nope... I understood. But it could take quite a while. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Rod Taylor wrote: -- Start of PGP signed section. > On Wed, 2002-12-11 at 00:05, Tom Lane wrote: > > Rod Taylor <rbt@rbt.ca> writes: > > > On Tue, 2002-12-10 at 22:56, Tom Lane wrote: > > >> relation's pg_class row. We have no such locks on types at present, > > >> but I think it may be time to invent 'em. > > > > > I'd be happy to use them once created. > > > > I think you misunderstood me ;=) ... that was a none-too-subtle > > Nope... I understood. But it could take quite a while. I have an idea. Rather than doing some complex locking for types, why don't we just restrict ALTER DOMAIN to cases where we are the only one attached to the database, as seen in dropdb(). Seems like an easy way to get the feature in without adding a new locking method. Also, it would allow the regression test to work too because no one is attached to 'regression' at the time of the test. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I have an idea. Rather than doing some complex locking for types, why > don't we just restrict ALTER DOMAIN to cases where we are the only one > attached to the database, as seen in dropdb(). Yech! > would allow the regression test to work too because no one is attached > to 'regression' at the time of the test. No, we'd just get random reports of unreproducible failures. FWIW, I frequently attach to the regression database with a manual psql while the regression tests are running. It's a useful thing to do --- I've more than once found bugs that way. regards, tom lane
It is an idea if no better one can be found, unless we don't want ALTER DOMAIN at all, which doesn't seem good. --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I have an idea. Rather than doing some complex locking for types, why > > don't we just restrict ALTER DOMAIN to cases where we are the only one > > attached to the database, as seen in dropdb(). > > Yech! > > > would allow the regression test to work too because no one is attached > > to 'regression' at the time of the test. > > No, we'd just get random reports of unreproducible failures. > > FWIW, I frequently attach to the regression database with a manual psql > while the regression tests are running. It's a useful thing to do --- > I've more than once found bugs that way. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Wed, 2002-12-11 at 19:00, Bruce Momjian wrote: > It is an idea if no better one can be found, unless we don't want ALTER > DOMAIN at all, which doesn't seem good. I'll make a proposal for 'Object' locks as suggested, and we'll see where we go from there. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc