Simplifying OID lookups in the presence of namespaces - Mailing list pgsql-hackers

From Tom Lane
Subject Simplifying OID lookups in the presence of namespaces
Date
Msg-id 26213.1019507037@sss.pgh.pa.us
Whole thread Raw
List pgsql-hackers
We are about to need to fix a fair number of places in client code
(eg, psql and pg_dump) that presently do things like

SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'foo');

This does not work reliably anymore because there could be multiple
relations named 'foo' in different namespaces.  The sub-select to
get the relation OID will fail because it'll return multiple results.

The brute-force answer is

SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'foo'                 AND relnamespace = (SELECT oid FROM
pg_namespace                                    WHERE nspname = 'bar'));
 

But aside from being really ugly, this requires that the client code
know exactly which namespace contains the relation it's after.  If
the client is relying on namespace search then it may not know that;
in fact, the client code very possibly isn't even aware of the exact
namespace search path it's using.  I am planning to introduce an
informational function CURRENT_SCHEMAS() (or some such name) that
returns the current effective search path, probably as a NAME[] array.
But it looks really, really messy to write an SQL query that makes
use of such a function to look up the first occurrence of 'foo' in
the search path.  We need to encapsulate the lookup procedure somehow
so that we don't have lots of clients reinventing this wheel.

We already have some functions that accept a text string and do a
suitable lookup of a relation; an example is nextval(), for which
you can presently write
nextval('foo')        --- searches namespace path for foonextval('foo.bar')    --- looks only in namespace
foonextval('"Foo".bar')   --- quoting works for mixed-case names
 

Seems like what we want to do is make the lookup part of this available
separately, as a function that takes such a string and returns an OID.
We'd need such functions for each of the namespace-ified object kinds:
relations, datatypes, functions, and operators.

A variant of the idea of inventing functions is to extend the existing
datatype 'regproc' to do this, and invent also 'regclass', 'regtype',
'regoperator' datatypes to do the lookups for the other object kinds.
I proposed this in a different context last year,http://archives.postgresql.org/pgsql-hackers/2001-08/msg00589.php
but it seemed too late to do anything with the idea for 7.2.

If we went with the datatype approach then we'd be able to write
queries like

SELECT * FROM pg_attribute WHERE attrelid = 'foo'::regclass;

or

SELECT * FROM pg_attribute WHERE attrelid = 'foo.bar'::regclass;

or for that matter you could do

SELECT * FROM pg_attribute WHERE attrelid = regclass('foo');

which'd be syntactically indistinguishable from using a function.

The datatype approach seems a little bit odder at first glance, but it
has some interesting possibilities with respect to implicit casting
(see above-referenced thread).  So I'm inclined to go that route unless
someone's got an objection.

With a datatype, we also have outbound conversion to think of: so there
must be a function that takes an OID and produces a string.  What I am
inclined to do on that side is emit an unqualified name if the OID
refers to a relation/type/etc that would be found first in the current
namespace search path.  Otherwise, a qualified name (foo.bar) would be
emitted.  This will have usefulness for applications like pg_dump, which
will have exactly this requirement (per discussion a few days ago that
pg_dump should not qualify names unnecessarily).

One question is what to do with invalid input.  For example, if table
foo doesn't exist then what should 'foo'::regclass do?  The existing
regproc datatype throws an error, but I wonder whether it wouldn't be
more useful to return NULL.  Any thoughts on that?

Also, for functions and operators the name alone is not sufficient to
uniquely identify the object.  Type regproc currently throws an error
if asked to convert a nonunique function name; that severely limits its
usefulness.  I'm toying with allowing datatypes in the input string,
eg'sum(bigint)'::regproc
but I wonder if this will create compatibility problems.  In particular,
should the regproc and regoperator output converters include datatype
indicators in the output string?  (Always, never, only if not unique?)
Doing so would be a non-backwards-compatible change for regproc.
We might avoid that complaint by leaving regproc as-is and instead
inventing a parallel datatype (say regfunction) that supports datatype
indications.  But I'm not sure whether regproc is used enough to make
this an important concern.

Comments?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Patrick Macdonald
Date:
Subject: Re: Documentation on page files
Next
From: "Dann Corbit"
Date:
Subject: What is wrong with hashed index usage?