Thread: Reducing some DDL Locks to ShareLock

Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
It seems possible to change some DDL commands/subcommands to use a
ShareLock rather than an AccessExclusiveLock. Enclosed patch implements
this reduction for CREATE TRIGGER, CREATE RULE and ALTER TABLE.

We can relax locking as long as we can verify that the changes effect
DML statements *only*. I've marked the commands/subcommands this can be
applied to with an "s" (for ShareLock) and noted others that must remain
"x" (for AccessExclusiveLock).

s    CREATE RULE (only non-ON SELECT rules)

s    CREATE TRIGGER

ALTER TABLE
x    RENAME [ COLUMN ] column TO new_column
x    RENAME TO new_name
x    SET SCHEMA new_schema
x    ADD [ COLUMN ] column type [ column_constraint [ ... ] ]
x    DROP [ COLUMN ] column [ RESTRICT | CASCADE ]
x    ALTER [ COLUMN ] column TYPE type [ USING expression ]
x    ALTER [ COLUMN ] column SET DEFAULT expression
x    ALTER [ COLUMN ] column DROP DEFAULT
s    ALTER [ COLUMN ] column SET NOT NULL
x    ALTER [ COLUMN ] column DROP NOT NULL
s    ALTER [ COLUMN ] column SET STATISTICS integer
s    ALTER [ COLUMN ] column SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
s    ADD table_constraint
x    DROP CONSTRAINT constraint_name [ RESTRICT | CASCADE ]
s    DISABLE TRIGGER [ trigger_name | ALL | USER ]
s    ENABLE TRIGGER [ trigger_name | ALL | USER ]
s    ENABLE REPLICA TRIGGER trigger_name
s    ENABLE ALWAYS TRIGGER trigger_name
x    DISABLE RULE rewrite_rule_name
x    ENABLE RULE rewrite_rule_name
x    ENABLE REPLICA RULE rewrite_rule_name
x    ENABLE ALWAYS RULE rewrite_rule_name
s    CLUSTER ON index_name
s    SET WITHOUT CLUSTER
x    SET WITHOUT OIDS
s    SET ( storage_parameter = value [, ... ] )
s    RESET ( storage_parameter [, ... ] )
x    INHERIT parent_table
x    NO INHERIT parent_table
x    OWNER TO new_owner
x    SET TABLESPACE new_tablespace

If an ALTER TABLE has more than one sub-command we take the highest lock
to cover the execution of all sub-commands, since we want to avoid lock
upgrades and deadlocks.

e.g.
ALTER TABLE foo ADD PRIMARY KEY (col1); will take ShareLock
whereas
ALTER TABLE foo ADD PRIMARY KEY (col1), INHERIT bar; will require an
AccessExclusiveLock

I've checked definitions of these subcommands here
http://developer.postgresql.org/pgdocs/postgres/sql-altertable.html
As noted, Rules can be ON SELECT, so we can't enable/disable them
without an AccessExclusiveLock. Other than that, many of the subcommands
don't change the actual table data, they just change settings for future
DML changes to the table.

This will
* allow ALTER TABLE ADD FOREIGN KEY to run in parallel with pg_restore
* allow trigger-based replication systems to add triggers more easily
* reduce the number of AccessExclusiveLocks that have to be dealt with
in Hot Standby mode.

Couple of points of note:

* Patch implements this by changing pg_class.reltriggers so it only has
two values, 0 or 1. Do we want to change this to a boolean, or leave as
an int2 for backwards compatibility? Code works either way. Enclosed
patch keeps it as an int2 for now, easily changed on request.

* transformIndexStatement() and DefineIndex() both take the wrong level
of locks in existing code, when called from ALTER TABLE. Not sure
whether this should be fixed in conjunction with these changes or not.
Doesn't seem to make any difference, since the "wrong level" is taken
after the initial lock is taken, so we never actually notice.

* ATRewriteTables() uses the wrong lock level in existing code when we
call validateForeignKeyConstraint(). It uses RowShareLock(), which is
not sufficient to stop DML when checking using the SELECT in
RI_Initial_Check(). Again, at present we take an AccessExclusiveLock
first, so not a problem now. Changed to ShareLock in patch.

Patch passes make check on an assert build, plus manual observation of
lock order and locks held. I've also introduced two new lock functions
LockHeldByMe() and RelationLockedByMe() to allow Assert() checking of
lock levels held. These are added in key points in ALTER TABLE code to
ensure no mistakes are made about required lock levels.

 doc/src/sgml/mvcc.sgml              |   17 !!
 src/backend/commands/tablecmds.c    |  278 ++++++!!!!!!!!!!!!!!!!!!!!!
 src/backend/commands/trigger.c      |   88 -!!!!!!!!!!
 src/backend/parser/parse_utilcmd.c  |   21 !!
 src/backend/rewrite/rewriteDefine.c |   10 !
 src/backend/storage/lmgr/lmgr.c     |   12 +
 src/backend/storage/lmgr/lock.c     |   30 +++
 src/backend/utils/adt/ri_triggers.c |    2
 src/include/commands/tablecmds.h    |    2
 src/include/storage/lmgr.h          |    2
 src/include/storage/lock.h          |    1
 11 files changed, 107 insertions(+), 11 deletions(-), 345 mods(!)

Patch is overall fairly straightforward:
* Assess required lock level
* Pass down decision through APIs to allow recursing to inherited tables
* Change comments at various places (or not)
* Trigger code needs to work around no knowing how many triggers are
present, so relcache now handled similarly to Rules

Patch correctly handles lock levels in cases such a reindex of indexes
following ALTER TYPE of an indexed column. It might appear from the code
that this had been missed, but tracing code shows it is correctly set.

There is still scope for an "ALTER TABLE CONCURRENTLY" command as has
been suggested. This proposal neither blocks that nor is blocked by it.
Not every application can be changed to accept new syntax, nor is it
desirable for every application to issue these commands only in
top-level transactions.

There are probably even more tweaks that we could do than the above
list, but the above seems fairly straightforward for now. We can always
do more later.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Attachment

Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> It seems possible to change some DDL commands/subcommands to use a
> ShareLock rather than an AccessExclusiveLock. Enclosed patch implements
> this reduction for CREATE TRIGGER, CREATE RULE and ALTER TABLE.

What happens when two transactions try to do one of these things
concurrently to the same table?
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Mon, 2008-10-06 at 18:57 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > It seems possible to change some DDL commands/subcommands to use a
> > ShareLock rather than an AccessExclusiveLock. Enclosed patch implements
> > this reduction for CREATE TRIGGER, CREATE RULE and ALTER TABLE.
> 
> What happens when two transactions try to do one of these things
> concurrently to the same table?

It should be very similar to CREATE INDEX. If they hold ShareLocks then
their locks do not conflict at relation level.

Updates of their pg_class entry should be handled non-transactionally,
just as they are with CREATE INDEX. This is the reason for the change
from reltriggers being an exact count to being a boolean, i.e. it is
actually now relhastriggers = true | false. Multiple updaters will then
change the value to the same thing.

It was an excellent question because that aspect isn't handled correctly
in the enclosed patch for subcommands, other than index-creating
constraints. 

My main focus is on these commands
* CREATE TRIGGER
* ALTER TABLE ..  ADD PRIMARY KEY
* ALTER TABLE ..  ADD FOREIGN KEY

because those are the most painful ones. We could make it work against
more, but we'd need to rewrite lots and lots of catalog update code.

So I'll make CreateTrigger() use index_update_stats() so we get the
atomic update correct.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 07:21 +0100, Simon Riggs wrote:

> It was an excellent question because that aspect isn't handled correctly
> in the enclosed patch for subcommands, other than index-creating
> constraints.
>
> My main focus is on these commands
> * CREATE TRIGGER
> * ALTER TABLE ..  ADD PRIMARY KEY
> * ALTER TABLE ..  ADD FOREIGN KEY
 * CREATE RULE

> because those are the most painful ones. We could make it work against
> more, but we'd need to rewrite lots and lots of catalog update code.

OK, patch updated, tested, working.

Points of note.

1. I've left the infrastructure there for further changes, but we can
rip that out if desired.

2. Also need to decide whether we want pg_class.reltriggers as int2 (as
implemented here) or switch to relhastriggers as boolean.

3. The patch introduces a slight weirdness: if you create two FKs on the
same column at the same time you end up with two constraints with
identical names. Drop constraint then removes them both, though in other
respects they are both valid, just not uniquely. CREATE INDEX avoids
this by way of the unique index on relname. The equivalent index on
pg_constraint is not unique, though *cannot* be made unique without
breaking some corner cases of table inheritance.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Attachment

Re: Reducing some DDL Locks to ShareLock

From
Andrew Dunstan
Date:

Simon Riggs wrote:
>
> My main focus is on these commands
> * CREATE TRIGGER
> * ALTER TABLE ..  ADD PRIMARY KEY
> * ALTER TABLE ..  ADD FOREIGN KEY
>
> because those are the most painful ones. We could make it work against
> more, but we'd need to rewrite lots and lots of catalog update code.
>
>
>   

Anything that scans the table is a prime candidate. In particular, for 
parallel pg_dump, ALTER TABLE ... ADD UNIQUE is important, as well as 
possibly other table constraints.

cheers

andrew


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 08:30 -0400, Andrew Dunstan wrote:

> Simon Riggs wrote:
> >
> > My main focus is on these commands
> > * CREATE TRIGGER
> > * ALTER TABLE ..  ADD PRIMARY KEY
> > * ALTER TABLE ..  ADD FOREIGN KEY
> >
> > because those are the most painful ones. We could make it work against
> > more, but we'd need to rewrite lots and lots of catalog update code.

> Anything that scans the table is a prime candidate. In particular, for 
> parallel pg_dump, ALTER TABLE ... ADD UNIQUE is important, as well as 
> possibly other table constraints.

Yes, I should have mentioned: today's patch (v5) does ADD UNIQUE also.

I tested a concurrent mix of ALTER PK, FK and CREATE INDEX, all fine.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
"Robert Haas"
Date:
> 3. The patch introduces a slight weirdness: if you create two FKs on the
> same column at the same time you end up with two constraints with
> identical names. Drop constraint then removes them both, though in other
> respects they are both valid, just not uniquely. CREATE INDEX avoids
> this by way of the unique index on relname. The equivalent index on
> pg_constraint is not unique, though *cannot* be made unique without
> breaking some corner cases of table inheritance.

Urk... this seems pretty undesirable.

...Robert


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 10:05 -0400, Robert Haas wrote:
> > 3. The patch introduces a slight weirdness: if you create two FKs on the
> > same column at the same time you end up with two constraints with
> > identical names. Drop constraint then removes them both, though in other
> > respects they are both valid, just not uniquely. CREATE INDEX avoids
> > this by way of the unique index on relname. The equivalent index on
> > pg_constraint is not unique, though *cannot* be made unique without
> > breaking some corner cases of table inheritance.
> 
> Urk... this seems pretty undesirable.

OK, but please say what behaviour you would like in its place. 

Or are you saying you dislike this so much that you would prefer not to
be able to run ALTER TABLE concurrently?

Also, how often do you think this will be a problem? This only happens
when you have two FKs on the exact same column set. That happens seldom,
if ever, AFAICS.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Tue, 2008-10-07 at 10:05 -0400, Robert Haas wrote:
>>> 3. The patch introduces a slight weirdness: if you create two FKs on the
>>> same column at the same time you end up with two constraints with
>>> identical names. Drop constraint then removes them both, though in other
>>> respects they are both valid, just not uniquely. CREATE INDEX avoids
>>> this by way of the unique index on relname. The equivalent index on
>>> pg_constraint is not unique, though *cannot* be made unique without
>>> breaking some corner cases of table inheritance.
>> 
>> Urk... this seems pretty undesirable.

> OK, but please say what behaviour you would like in its place. 

I wonder whether this could be helped if we refactored pg_constraint.
The lack of a useful pkey for it has been annoying me for awhile,
and I think it stems from a misguided choice to put table and domain
constraints into the same catalog.  Suppose that

* table constraints go into pg_relation_constraint, with a unique key
on (conrelid, conname)

* domain constraints go into pg_domain_constraint, with a unique key
on (contypid, conname)

* pg_constraint can still exist as a union view, for client
compatibility

Then the unique key would prevent concurrent creation of
identically-named constraints for the same relation.

I'm confused by your comment about inheritance --- what cases are
you thinking this would break?
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
"Robert Haas"
Date:
>> Urk... this seems pretty undesirable.
>
> OK, but please say what behaviour you would like in its place.
>
> Or are you saying you dislike this so much that you would prefer not to
> be able to run ALTER TABLE concurrently?

Personally, yes.  I work mostly with small databases where ease of
management is a lot more important than increased concurrency, and
"constraints almost always have unique names but you're not allowed to
rely on that in any queries or code because there is this one wierd
case that you probably will never hit where it might not be true"
doesn't sound like ease of management to me.  However, I hope we're
not forced into that choice, because this sounds like a great feature
otherwise.

....Robert


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 10:35 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2008-10-07 at 10:05 -0400, Robert Haas wrote:
> >>> 3. The patch introduces a slight weirdness: if you create two FKs on the
> >>> same column at the same time you end up with two constraints with
> >>> identical names. Drop constraint then removes them both, though in other
> >>> respects they are both valid, just not uniquely. CREATE INDEX avoids
> >>> this by way of the unique index on relname. The equivalent index on
> >>> pg_constraint is not unique, though *cannot* be made unique without
> >>> breaking some corner cases of table inheritance.
> >> 
> >> Urk... this seems pretty undesirable.
> 
> > OK, but please say what behaviour you would like in its place. 
> 
> I wonder whether this could be helped if we refactored pg_constraint.
> The lack of a useful pkey for it has been annoying me for awhile,
> and I think it stems from a misguided choice to put table and domain
> constraints into the same catalog.  Suppose that
> 
> * table constraints go into pg_relation_constraint, with a unique key
> on (conrelid, conname)
> 
> * domain constraints go into pg_domain_constraint, with a unique key
> on (contypid, conname)
> 
> * pg_constraint can still exist as a union view, for client
> compatibility
> 
> Then the unique key would prevent concurrent creation of
> identically-named constraints for the same relation.

Sounds better. Doesn't make much sense as it is now.

> I'm confused by your comment about inheritance --- what cases are
> you thinking this would break?

Well, I made the index unique and got a boat load of errors. I guess I
might have misdiagnosed their cause. I'll have another look.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> 2. Also need to decide whether we want pg_class.reltriggers as int2 (as
> implemented here) or switch to relhastriggers as boolean.

I'd go for changing the column name/type.  Yeah, you will break any
clients that are still trying to manipulate reltriggers directly, but
better to break them obviously than non-obviously.  And I think a silent
change in the column semantics has significant risk of the latter.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Tue, 2008-10-07 at 10:35 -0400, Tom Lane wrote:
>> I wonder whether this could be helped if we refactored pg_constraint.

> Sounds better. Doesn't make much sense as it is now.

I looked at the code a bit, and it seems the only place where the
current design makes any sense is in ChooseConstraintName, which
explains itself thusly:
* Select a nonconflicting name for a new constraint.** The objective here is to choose a name that is unique within
the*specified namespace.  Postgres does not require this, but the SQL* spec does, and some apps depend on it.
Thereforewe avoid choosing* default names that so conflict.** Note: it is theoretically possible to get a collision
anyway,if someone* else chooses the same name concurrently.  This is fairly unlikely to be* a problem in practice,
especiallyif one is holding an exclusive lock on* the relation identified by name1.
 

(The last bit of the comment falls flat when you consider constraints
on domains...)

Note that this policy is used for system-selected constraint names;
it's not enforced against user-selected names.  We do attempt (in
ConstraintNameIsUsed) to reject duplicate user-selected constraint names
*on the same object*, but that test is not bulletproof against
concurrent additions.  The refactoring I suggested would make for
bulletproof enforcement via the unique indexes.

To preserve the same behavior for system-selected constraint names with
the new design, we'd still need to store namespace OIDs in the two new
tables (I had been thinking those columns would go away), and still have
nonunique indexes on (conname, connamespace), and probe both of the new
catalogs via these indexes to look for a match to a proposed constraint
name.  So that's a bit of a PITA but certainly doable.  Again, it's not
bulletproof against concurrent insertions, but the existing code is not
either.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 11:46 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2008-10-07 at 10:35 -0400, Tom Lane wrote:
> >> I wonder whether this could be helped if we refactored pg_constraint.
> 
> > Sounds better. Doesn't make much sense as it is now.
> 
> I looked at the code a bit, and it seems the only place where the
> current design makes any sense is in ChooseConstraintName, which
> explains itself thusly:
> 
>  * Select a nonconflicting name for a new constraint.
>  *
>  * The objective here is to choose a name that is unique within the
>  * specified namespace.  Postgres does not require this, but the SQL
>  * spec does, and some apps depend on it.  Therefore we avoid choosing
>  * default names that so conflict.
>  *
>  * Note: it is theoretically possible to get a collision anyway, if someone
>  * else chooses the same name concurrently.  This is fairly unlikely to be
>  * a problem in practice, especially if one is holding an exclusive lock on
>  * the relation identified by name1.
> 
> (The last bit of the comment falls flat when you consider constraints
> on domains...)
> 
> Note that this policy is used for system-selected constraint names;
> it's not enforced against user-selected names.  We do attempt (in
> ConstraintNameIsUsed) to reject duplicate user-selected constraint names
> *on the same object*, but that test is not bulletproof against
> concurrent additions.  The refactoring I suggested would make for
> bulletproof enforcement via the unique indexes.
> 
> To preserve the same behavior for system-selected constraint names with
> the new design, we'd still need to store namespace OIDs in the two new
> tables (I had been thinking those columns would go away), and still have
> nonunique indexes on (conname, connamespace), and probe both of the new
> catalogs via these indexes to look for a match to a proposed constraint
> name.  So that's a bit of a PITA but certainly doable.  Again, it's not
> bulletproof against concurrent insertions, but the existing code is not
> either.

How about we put a partial unique index on instead?

Dunno if its possible, but the above begins to sound too much froth for
such a small error.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 11:15 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > 2. Also need to decide whether we want pg_class.reltriggers as int2 (as
> > implemented here) or switch to relhastriggers as boolean.
>
> I'd go for changing the column name/type.  Yeah, you will break any
> clients that are still trying to manipulate reltriggers directly, but
> better to break them obviously than non-obviously.  And I think a silent
> change in the column semantics has significant risk of the latter.

New version with column type change.

Initdb forced, psql, pg_dump changes also.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Attachment

Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> How about we put a partial unique index on instead?

We don't currently support partial indexes on system catalogs.
I forget what all would need to be fixed to make it happen, but I'm
pretty sure it's not trivial.  Certainly refactoring pg_constraint
would be a lot easier.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Greg Stark
Date:
We can't do partial indexes on system tables. I forget exactly why nut  
if you search for relevant comments it should pop up.

greg

On 7 Oct 2008, at 07:38 PM, Simon Riggs <simon@2ndQuadrant.com> wrote:

>
> On Tue, 2008-10-07 at 11:46 -0400, Tom Lane wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> On Tue, 2008-10-07 at 10:35 -0400, Tom Lane wrote:
>>>> I wonder whether this could be helped if we refactored  
>>>> pg_constraint.
>>
>>> Sounds better. Doesn't make much sense as it is now.
>>
>> I looked at the code a bit, and it seems the only place where the
>> current design makes any sense is in ChooseConstraintName, which
>> explains itself thusly:
>>
>> * Select a nonconflicting name for a new constraint.
>> *
>> * The objective here is to choose a name that is unique within the
>> * specified namespace.  Postgres does not require this, but the SQL
>> * spec does, and some apps depend on it.  Therefore we avoid choosing
>> * default names that so conflict.
>> *
>> * Note: it is theoretically possible to get a collision anyway, if  
>> someone
>> * else chooses the same name concurrently.  This is fairly unlikely  
>> to be
>> * a problem in practice, especially if one is holding an exclusive  
>> lock on
>> * the relation identified by name1.
>>
>> (The last bit of the comment falls flat when you consider constraints
>> on domains...)
>>
>> Note that this policy is used for system-selected constraint names;
>> it's not enforced against user-selected names.  We do attempt (in
>> ConstraintNameIsUsed) to reject duplicate user-selected constraint  
>> names
>> *on the same object*, but that test is not bulletproof against
>> concurrent additions.  The refactoring I suggested would make for
>> bulletproof enforcement via the unique indexes.
>>
>> To preserve the same behavior for system-selected constraint names  
>> with
>> the new design, we'd still need to store namespace OIDs in the two  
>> new
>> tables (I had been thinking those columns would go away), and still  
>> have
>> nonunique indexes on (conname, connamespace), and probe both of the  
>> new
>> catalogs via these indexes to look for a match to a proposed  
>> constraint
>> name.  So that's a bit of a PITA but certainly doable.  Again, it's  
>> not
>> bulletproof against concurrent insertions, but the existing code is  
>> not
>> either.
>
> How about we put a partial unique index on instead?
>
> Dunno if its possible, but the above begins to sound too much froth  
> for
> such a small error.
>
> -- 
> Simon Riggs           www.2ndQuadrant.com
> PostgreSQL Training, Services and Support
>
>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 11:46 -0400, Tom Lane wrote: 
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2008-10-07 at 10:35 -0400, Tom Lane wrote:
> >> I wonder whether this could be helped if we refactored pg_constraint.
> 
> > Sounds better. Doesn't make much sense as it is now.
> 
> I looked at the code a bit, and it seems the only place where the
> current design makes any sense is in ChooseConstraintName, which
> explains itself thusly:
> 
>  * Select a nonconflicting name for a new constraint.
>  *
>  * The objective here is to choose a name that is unique within the
>  * specified namespace.  Postgres does not require this, but the SQL
>  * spec does, and some apps depend on it.  Therefore we avoid choosing
>  * default names that so conflict.
>  *
>  * Note: it is theoretically possible to get a collision anyway, if someone
>  * else chooses the same name concurrently.  This is fairly unlikely to be
>  * a problem in practice, especially if one is holding an exclusive lock on
>  * the relation identified by name1.
> 
> (The last bit of the comment falls flat when you consider constraints
> on domains...)
> 
> Note that this policy is used for system-selected constraint names;
> it's not enforced against user-selected names.  We do attempt (in
> ConstraintNameIsUsed) to reject duplicate user-selected constraint names
> *on the same object*, but that test is not bulletproof against
> concurrent additions.  The refactoring I suggested would make for
> bulletproof enforcement via the unique indexes.

Thought about this some more and examined the way CREATE INDEX works.

Current patch allows concurrent duplicate name creation. Putting a
unique index on it makes it wait for the first command to finish, then
causes it to fail. (That in itself is painful, surely DDL should fail if
it sees another DDL statement in progress trying to do same thing).

Simply creating two FKs concurrently doesn't give any kind of a problem.
Only problem occurs if we have duplicate names.

Problems won't go away if we have unique index. We will still fail to
get concurrent ALTER TABLE, since similar named constraints will wait
behind others, then fail. That is actually worse behaviour than now,
because now we wait behind first statement, then select new unique name.
So currently, we wait then succeed. If we have a unique index we will
wait and then fail.

So the problem AFAICS isn't the lack of a unique index, it is the name
selection itself. If we have more unique naming for constraints the
problem goes away.

Right now the name is selected based upon the column in the FK. The
relation referenced doesn't form any part of the name. So if you have a
column that references two separate tables (very weird) you get the
problem because the name doesn't uniquely describe the situation.

Currently we name things according to the pattern:
<tablename>_<columname1>_fkey. If we have multiple columns we still only
use the first column. That makes it much more likely to have duplicates.

An alternative would be
<tablename>_<referencedtablename>_<columname1>_fkey
which is more useful as well as more likely to be unique. This is a much
better situation than just appending a number to the end of the name.
That's always been annoying since if you assume that all constraints end
with "_fkey" you miss out some valid constraints. So it seems better to
have a longer, more unique name and enforce the idea that it always ends
with "_fkey".

If we *don't* do this, there's not much point adding a unique index,
because it will prevent concurrency in some cases and that's the whole
point here.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> (That in itself is painful, surely DDL should fail if
> it sees another DDL statement in progress trying to do same thing).

Surely not. The other transaction doing the DDL might roll back.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Wed, 2008-10-08 at 11:24 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > (That in itself is painful, surely DDL should fail if
> > it sees another DDL statement in progress trying to do same thing).
> 
> Surely not. The other transaction doing the DDL might roll back.

Maybe so, but trying to create a duplicate object in the first place is
also fairly questionable. Why would anybody be doing that and it *not*
be an error? Of potentially a more serious kind. I'd rather have it fail
with a sensible error message so I can work out what to do about that,
rather than wait for hours and then have it fail anyway.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Wed, 2008-10-08 at 11:24 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> (That in itself is painful, surely DDL should fail if
>>> it sees another DDL statement in progress trying to do same thing).
>> 
>> Surely not. The other transaction doing the DDL might roll back.

> Maybe so, but trying to create a duplicate object in the first place is
> also fairly questionable.

Indeed, which is why I wonder why you are concerning yourself with this
case at all.  I certainly don't think that it needs to drive the design.

In the case of a parallel restore, the restore script is going to be
specifying constraint names to match the old database; so the
name-selection code won't even be executed, and collisions aren't going
to happen.

I'd be happier with switching to the two-catalog design since it would
at least make one of the uniqueness conditions bulletproof; but that's
a cleanup issue that does not seem very relevant to parallel restore
performance issues.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-10-07 at 10:35 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2008-10-07 at 10:05 -0400, Robert Haas wrote:
> >>> 3. The patch introduces a slight weirdness: if you create two FKs on the
> >>> same column at the same time you end up with two constraints with
> >>> identical names. Drop constraint then removes them both, though in other
> >>> respects they are both valid, just not uniquely. CREATE INDEX avoids
> >>> this by way of the unique index on relname. The equivalent index on
> >>> pg_constraint is not unique, though *cannot* be made unique without
> >>> breaking some corner cases of table inheritance.
> >> 
> >> Urk... this seems pretty undesirable.
> 
> > OK, but please say what behaviour you would like in its place. 
> 
> I wonder whether this could be helped if we refactored pg_constraint.
> The lack of a useful pkey for it has been annoying me for awhile,
> and I think it stems from a misguided choice to put table and domain
> constraints into the same catalog.  Suppose that
> 
> * table constraints go into pg_relation_constraint, with a unique key
> on (conrelid, conname)
> 
> * domain constraints go into pg_domain_constraint, with a unique key
> on (contypid, conname)
> 
> * pg_constraint can still exist as a union view, for client
> compatibility
> 
> Then the unique key would prevent concurrent creation of
> identically-named constraints for the same relation.

I'm planning to squeeze this refactoring work in now also before freeze,
unless somebody else wants to pick this up?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Tue, 2008-10-07 at 11:15 -0400, Tom Lane wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> 2. Also need to decide whether we want pg_class.reltriggers as int2 (as
>>> implemented here) or switch to relhastriggers as boolean.
>> 
>> I'd go for changing the column name/type.  Yeah, you will break any
>> clients that are still trying to manipulate reltriggers directly, but
>> better to break them obviously than non-obviously.  And I think a silent
>> change in the column semantics has significant risk of the latter.

> New version with column type change. 

I'm starting to review this now.  It strikes me that while we are at it,
we should get rid of the useless pg_class columns relukeys, relfkeys,
and relrefs.  These haven't been maintained since Berkeley days, and
this patch puts the final kibosh on any thought that we'd ever start
to maintain relukeys and relfkeys counts.

Any objections?
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Sun, 2008-11-09 at 13:58 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2008-10-07 at 11:15 -0400, Tom Lane wrote:
> >> Simon Riggs <simon@2ndQuadrant.com> writes:
> >>> 2. Also need to decide whether we want pg_class.reltriggers as int2 (as
> >>> implemented here) or switch to relhastriggers as boolean.
> >> 
> >> I'd go for changing the column name/type.  Yeah, you will break any
> >> clients that are still trying to manipulate reltriggers directly, but
> >> better to break them obviously than non-obviously.  And I think a silent
> >> change in the column semantics has significant risk of the latter.
> 
> > New version with column type change. 
> 
> I'm starting to review this now.  It strikes me that while we are at it,
> we should get rid of the useless pg_class columns relukeys, relfkeys,
> and relrefs.  These haven't been maintained since Berkeley days, and
> this patch puts the final kibosh on any thought that we'd ever start
> to maintain relukeys and relfkeys counts.
> 
> Any objections?

None here.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Reviewing away ...

There's a fairly serious problem with this patch, which is that it
overlooks one of the reasons that index_update_stats can work the
way it does:
    * 3. Because we execute CREATE INDEX with just share lock on the parent    * rel (to allow concurrent index
creations),an ordinary update could    * suffer a tuple-concurrently-updated failure against another CREATE    * INDEX
committingat about the same time.  We can avoid that by having    * them both do nontransactional updates (we assume
theywill both be                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^    * trying to change the
pg_classrow to the same thing, so it doesn't      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    * matter which
goesfirst).
 

While the above assumption would still be true for two CreateTrigger
operations happening in parallel, it is surely *not* true for, say,
CreateTrigger and CreateIndex in parallel.

The window for a race condition is actually fairly wide, because the
various functions that call heap_inplace_update mostly start with a
tuple they got from syscache, which means it probably reflects the most
recent commit, and could be missing changes from recent nontransactional
updates.  So the second guy to apply his change could wipe out the first
guy's change.

I looked at the other existing uses of heap_inplace_update, and I think
there could possibly be a similar issue for vac_update_datfrozenxid(),
though the chance of creating a real problem there seems vanishingly
small.  In any case the proposed patch has got lots of chances for
trouble since it introduces several different not-mutually-monotonic
ways of changing a pg_class entry nontransactionally.

I do not think this is a fatal problem, but what it means is that we
have to get some kind of short-term lock on the target tuple and be
sure we read the actual old tuple contents *after* acquiring that lock.
It would seem to be a good idea to fix all the uses of
heap_inplace_update to work that way, even if they seem safe at the
moment.

Any thoughts about the best way to do it?  My immediate inclination is
to use heap_lock_tuple but it's a bit expensive.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Sun, 2008-11-09 at 17:12 -0500, Tom Lane wrote:

> Reviewing away ...

Thanks for reviewing.

> There's a fairly serious problem...

...

> Any thoughts about the best way to do it?  My immediate inclination is
> to use heap_lock_tuple but it's a bit expensive.

Not sure how non-transactional tuple locking would/could work.

The user space solution to this problem is optimistic locking. i.e.
re-read the row immediately prior to the update. If row has changed,
keep re-reading it until it stays same, then update. Rely on block
locking to protect us. I'm tired and handwaving a lot.

Will think some more and report back.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Will think some more and report back.

If you want to do some more development, here's the portion of the
patch as yet unapplied --- will save you extracting it for yourself.

            regards, tom lane


Attachment

Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Sun, 2008-11-09 at 20:18 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > Will think some more and report back.
> 
> If you want to do some more development, here's the portion of the
> patch as yet unapplied --- will save you extracting it for yourself.

Thanks.

More thought tells me that we should have a
LockRelationForCatalogUpdate() that uses nearly the same design pattern
as LockRelationForExtension(). There is no lockmode since we always take
the lock in exclusive mode.

Callers would grab the catalog update lock, re-read catalog, assemble
the new tuple, make in-place update and release lock. Lock is
non-transactional and exists only to serialise catalog updates from
concurrent DDL operations.

We then have the rule that all callers of heap_inplace_update() must
already hold the catalog update lock.

You like?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> More thought tells me that we should have a
> LockRelationForCatalogUpdate() that uses nearly the same design pattern
> as LockRelationForExtension(). There is no lockmode since we always take
> the lock in exclusive mode.

This works only for updaters that cooperate with the rule, though.

The scenario that is bothering me is a manual UPDATE on a pg_class row
happening concurrently with an operation that wants to apply a
nontransactional update.  While that's more or less deprecated, there
are still plenty of people out there who might try it (cf the old trick
for disabling triggers).  I don't mind if one or the other operation
fails and rolls back, but silently losing the nontransactional update
would be entirely unacceptable, as it would quickly lead to database
corruption.

The reason I was thinking about heap_lock_tuple is that it might provide
a suitable defense against that case.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Mon, 2008-11-10 at 19:15 -0500, Tom Lane wrote:

> The reason I was thinking about heap_lock_tuple is that it might provide
> a suitable defense against that case.

OK. Lock tuple works OK, but its the unlock that I'm worried about. How
would non-transactional un-lock tuple work?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Mon, 2008-11-10 at 19:15 -0500, Tom Lane wrote:
>> The reason I was thinking about heap_lock_tuple is that it might provide
>> a suitable defense against that case.

> OK. Lock tuple works OK, but its the unlock that I'm worried about. How
> would non-transactional un-lock tuple work?

I was imagining that the heap_inplace_update operation would release the
lock.  Is there some problem with the concept?
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-11-11 at 21:57 -0500, Tom Lane wrote: 
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Mon, 2008-11-10 at 19:15 -0500, Tom Lane wrote:
> >> The reason I was thinking about heap_lock_tuple is that it might provide
> >> a suitable defense against that case.
> 
> > OK. Lock tuple works OK, but its the unlock that I'm worried about. How
> > would non-transactional un-lock tuple work?
> 
> I was imagining that the heap_inplace_update operation would release the
> lock.  Is there some problem with the concept?

Not the concept, just the mechanism.

Current tuple lock requestors do XactLockTableWait() which waits until
the lock on the transaction is released and removed from procarray.

The only way I can see to do this is by having another lock type, using
an additional info bit on t_infomask2. If that is set then we just wait
on a tuple lock, rather on the transaction lock. So we would add another
wait case into the heap_lock_tuple(), heap_update() and heap_delete().
Maybe NonXactLockTupleWait().

Requirement is to have anybody placing a non-transactional tuple lock to
already have a lock on relation. We would not need to WAL log this type
of lock, since we will require the lock holder to keep the buffer
pinned. The lock type would not need to be replayed for hot standby,

That OK, or do you see another way?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Tue, 2008-11-11 at 21:57 -0500, Tom Lane wrote: 
>> I was imagining that the heap_inplace_update operation would release the
>> lock.  Is there some problem with the concept?

> Not the concept, just the mechanism.

> Current tuple lock requestors do XactLockTableWait() which waits until
> the lock on the transaction is released and removed from procarray.

Ah, I see.  Yeah, that won't work nicely.

> The only way I can see to do this is by having another lock type, using
> an additional info bit on t_infomask2.

The alternative I was thinking about involved taking an exclusive buffer
lock on the page containing the tuple to be updated in-place.  The point
being that you have to examine the old tuple contents and decide whether
to update after you have lock, not before.  I think this would amount to
refactoring heap_inplace_update into two operations: fetch/lock and
update/unlock.  (I guess there should be a third function to release
without updating, too --- that would really just be an unlock-buffer
operation, but it'd be better if callers didn't explicitly know that.)
The callers would probably still use the syscache to obtain the tuple
address, but they wouldn't rely on it to supply the tuple image.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Wed, 2008-11-12 at 16:25 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2008-11-11 at 21:57 -0500, Tom Lane wrote: 
> >> I was imagining that the heap_inplace_update operation would release the
> >> lock.  Is there some problem with the concept?
> 
> > Not the concept, just the mechanism.
> 
> > Current tuple lock requestors do XactLockTableWait() which waits until
> > the lock on the transaction is released and removed from procarray.
> 
> Ah, I see.  Yeah, that won't work nicely.
> 
> > The only way I can see to do this is by having another lock type, using
> > an additional info bit on t_infomask2.
> 
> The alternative I was thinking about involved taking an exclusive buffer
> lock on the page containing the tuple to be updated in-place.  The point
> being that you have to examine the old tuple contents and decide whether
> to update after you have lock, not before.  I think this would amount to
> refactoring heap_inplace_update into two operations: fetch/lock and
> update/unlock.  (I guess there should be a third function to release
> without updating, too --- that would really just be an unlock-buffer
> operation, but it'd be better if callers didn't explicitly know that.)
> The callers would probably still use the syscache to obtain the tuple
> address, but they wouldn't rely on it to supply the tuple image.

I'll look into this.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Wed, 2008-11-12 at 16:25 -0500, Tom Lane wrote:

> The alternative I was thinking about involved taking an exclusive buffer
> lock on the page containing the tuple to be updated in-place.  The point
> being that you have to examine the old tuple contents and decide whether
> to update after you have lock, not before.  I think this would amount to
> refactoring heap_inplace_update into two operations: fetch/lock and
> update/unlock.  (I guess there should be a third function to release
> without updating, too --- that would really just be an unlock-buffer
> operation, but it'd be better if callers didn't explicitly know that.)
> The callers would probably still use the syscache to obtain the tuple
> address, but they wouldn't rely on it to supply the tuple image.

Here's what I'm working on:

HeapTuple
heap_inplace_xxx(Relation relation, HeapTuple tuple, Buffer *buffer)

xxx = (fetch, update, release)

usage:

tuple = heap_inplace_fetch(rel, tuple, &buffer);

....

if (dirty)heap_inplace_fe(rel, tuple, &buffer);
elseheap_inplace_fetch(rel, tuple, &buffer);


heap_inplace_fetch takes as input "tuple" which is a palloc'd tuple,
extracts from it the tid of the tuple, reads the buffer, locks it, then
releases the original tuple. It then returns a copy of the on-block
tuple. So all other code the same as before when we were working on a
copy produced from the syscache.

Is that roughly what you intended?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Reducing some DDL Locks to ShareLock

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> heap_inplace_fetch takes as input "tuple" which is a palloc'd tuple,
> extracts from it the tid of the tuple, reads the buffer, locks it, then
> releases the original tuple. It then returns a copy of the on-block
> tuple. So all other code the same as before when we were working on a
> copy produced from the syscache.

> Is that roughly what you intended?

I'd suggest making it take a TID rather than presuming where the caller
is going to get the TID from.  Otherwise, +1.
        regards, tom lane


Re: Reducing some DDL Locks to ShareLock

From
Simon Riggs
Date:
On Tue, 2008-11-18 at 15:04 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > heap_inplace_fetch takes as input "tuple" which is a palloc'd tuple,
> > extracts from it the tid of the tuple, reads the buffer, locks it, then
> > releases the original tuple. It then returns a copy of the on-block
> > tuple. So all other code the same as before when we were working on a
> > copy produced from the syscache.
> 
> > Is that roughly what you intended?
> 
> I'd suggest making it take a TID rather than presuming where the caller
> is going to get the TID from.  Otherwise, +1.

Just to mention I haven't forgotten about this. I wrote a patch on 20/11
and was debugging it when I fell ill. I've updated the patch now to CVS
HEAD and will retest it in next few days.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support