Thread: namespace dilemma

namespace dilemma

From
jeff.greco@bluehavenmg.com
Date:
I came across an interesting feature regarding namespace name changes.  To illustrate suppose you have two
connections open whose commands occur in the following sequence:

Time | Session A                                        | Session B
-----+--------------------------------------------------+-----------------------------------------------1   | CREATE
SCHEMAmy_schema;                         |2   | CREATE TABLE my_schema.my_table (my_column int); |3   | BEGIN;
                                |4   | INSERT INTO my_schema.my_table VALUES (1);       |5   |
                       | BEGIN;6   |                                                  | ALTER SCHEMA my_schema RENAME
TOyour_schema;7   |                                                  | COMMIT;8   | SELECT my_column FROM
my_schema.my_table;       |
 

If this is attempted, then session A results in the following error after the command issued at time "8":
ERROR:  schema "my_schema" does not exist

This feature occurs when the isolation level is either READ COMMITED or SERIALIZABLE.  If you instead were to
attempt a table rename in session B, then session B would appropriately hang waiting for an ACCESS EXCLUSIVE
lock.

My humble opinion (as a non-PostgreSQL developer) is that renaming the schema in an implied rename of the table
from my_schema.my_table to your_schema.my_table.  Therefore it should also obtain a lock of some type.

As a result, all of my server side functions begin with something along the lines of:   SELECT oid FROM
pg_catalog.pg_namespaces  WHERE nspname = 'my_schema' FOR UPDATE;
 
I do this for every schema which the function consults through the SPI manager.  Also, AFAIK, to be very
careful (paranoid) would require this tedious approach for every transaction.

I attempted to get around this issue by adding various entries to pg_rewrite to try to force a select statement
on pg_namespace to be rewritten as a SELECT ... FOR UPDATE.  This failed.  I have not tried to patch the
source, though I imagine it would not be difficult.

Any opinions on approaches to this issue or the correctness of the current behavior of PostgreSQL are greatly
appreciated.

Thanks,
Jeff Greco



Re: namespace dilemma

From
Tom Lane
Date:
jeff.greco@bluehavenmg.com writes:
> [ there isn't any locking associated with namespaces ]

Nor any other kind of object except tables...

There have been some discussions about fixing this, but the idea has
bogged down in concerns about performance and possible deadlocks.

To be completely correct, a query as simple asselect * from foo where col1 = 42;
would have to obtain locks on not only table foo, but- foo's namespace- the integer '=' operator- the int4eq() function
underlyingthe operator- the namespace(s) of the operator and function- any other namespaces preceding these in your
searchpath  (else someone might insert a new foo or '=' that should  take precedence in your query)- maybe also some
lockson data types
 
so you can see that we'd probably be talking at least one order of
magnitude increase in the traffic through the lock manager, which is
already something of a bottleneck :-(.  Plus this traffic would be
very heavily concentrated on a few popular objects, such as the
pg_catalog schema.  The lock manager is not really designed to have
large numbers of processes taking out locks on a single object, and
would probably need considerable work to make the performance
acceptable.

The concern about extra deadlocks is more hypothetical, but I think
there would definitely be some risk there, with so many more lock
requests flying around the system.

All in all it looks like a lot of work to solve a problem that doesn't
affect very many people.  I can't recall many complaints other than
yours ...
        regards, tom lane