Thread: mapping object names to role IDs

mapping object names to role IDs

From
Robert Haas
Date:
Suppose you have an object name as a CString and you want to convert
it to an OID.  The method of doing this varies widely depending on the
object type:

oid = get_database_oid(name);
oid = get_tablespace_oid(name);
oid = GetForeignDataWrapperOidByName(name, true);
oid = GetForeignServerOidByName(name, true);
oid = LookupFuncNameTypeNames(name, args, true);
oid = LookupAggNameTypeNames(name, args, true);
oid = LookupFuncNameTypeNames(name, args, true);
oid = LookupOperNameTypeNames(name, args, true);
oid = GetSysCacheOid1(LANGNAME, CStringGetDatum(name));
oid = GetSysCacheOid1(NAMESPACENAME, CStringGetDatum(name));
oid = GetSysCacheOid1(AMNAME, CStringGetDatum(name));
oid = get_roleid(name);
oid = GetConstraintByName(reloid, name);
oid = FindConversionByName(list_make1(name));
oid = TSDictionaryGetDictid(list_make1(name), true);
oid = TSTemplateGetTmplid(list_make1(name), true);
oid = TSConfigGetCfgid(list_make1(name), true);

If you'd like to throw an error if the object doesn't exist, then you
can change the "true" parameter in the above examples to false, where
it exists, or you can use get_roleid_checked for roles.  For the types
where a direct syscache lookup is used, you get to write the check
yourself.  For constraints, GetConstraintByName throws an error
anyway; there's no equivalent that doesn't.  Some other object types -
like rules and casts - have even more complex logic that is sometimes
duplicated in multiple places; and others (like tablespaces) that have
convenience functions still manage to duplicate the logic anyway.
Long story short, this is kind of a mess.

Looking at the different cases, there seem to be three main cases:

1. Objects that just have one name, like databases and procedural languages.
2. Objects that have possibly-qualified names, like conversions and
text search dictionaries.
3. Objects that have both names and argument, like functions and operators.

There's enough complexity about the stuff in category #3 that I'm
inclined to leave it alone, but try to make the other stuff more
standardized.  What I would propose is that we create a new source
file somewhere (maybe utils/cache), move all of the other functions of
this type there, give them standardized names, and provide them all
with an argument that specifies whether an error is to be thrown if
the object doesn't exist.  Begin bikeshedding here.  I suggest that
the names be get_<object-type>_oid and that they take two parameters,
either a List * for possibly-qualified names (a little wonky, but it's
what we do now) or a char * for unqualified names, and a boolean
indicating whether to throw an error if the object isn't found.  Thus:

Oid get_<object-type>_oid(List *qualname, bool missingok);
-or-
Oid get_<object-type>_oid(char *name, bool missingok);

Thus get_database_oid and get_tablespace_oid would remain unchanged
except for taking a second argument, get_roleid and get_roleid_checked
would merge, and all of the others would change to match that style.

Thoughts?

P.S. Purple.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
Robert Haas
Date:
And of course I meant for the subject line to be "mapping object names
to OIDs", not "role IDs".

Sigh.

...Robert


Re: mapping object names to role IDs

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> Long story short, this is kind of a mess.

I agree that it's a bit of a mess.

> What I would propose is that we create a new source
> file somewhere (maybe utils/cache), move all of the other functions of
> this type there, give them standardized names, and provide them all
> with an argument that specifies whether an error is to be thrown if
> the object doesn't exist.

Something which has come up in the past is that putting all the
functions that do the same kind of thing, but operate on different
types of objects, into the same backend file/area ends up meaning that
such an area has an untold amount of knowledge about everything.
Additionally, what *does* go into said area has tons of things that are
only related by the kind of operation- not because they actually have
anything to do with each other.

This was one of the complaints levied at the approach for moving all the
ACL checking into one place.  I think it would be good to have a
consistant naming/calling scheme for these various functions, but I'm
not sure that moving them all to the same place makes sense.
Thanks,
    Stephen

Re: mapping object names to role IDs

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Long story short, this is kind of a mess.

> ... I think it would be good to have a
> consistant naming/calling scheme for these various functions, but I'm
> not sure that moving them all to the same place makes sense.

I'm with Stephen on this one.  I agree that standardizing the function
names and behavior would be a good idea, but don't try to put them all
in one place.

BTW, the plain-name cases should be "const char *", else some callers
will have to cast away const.  You could possibly make an argument for
"const List *" in the qualified-name cases, but we don't do that
anywhere else so I think it'd just look funny here (and would require
internally casting away const, too).
        regards, tom lane


Re: mapping object names to role IDs

From
Robert Haas
Date:
On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Robert Haas (robertmhaas@gmail.com) wrote:
>>> Long story short, this is kind of a mess.
>
>> ... I think it would be good to have a
>> consistant naming/calling scheme for these various functions, but I'm
>> not sure that moving them all to the same place makes sense.
>
> I'm with Stephen on this one.  I agree that standardizing the function
> names and behavior would be a good idea, but don't try to put them all
> in one place.

Some of the existing functions are in lsyscache.c, some are in files
in the commands directory, and some are in files in the parser
directory; also, even between commands and parser, not every object
type has its own file.  It would be nice to bring some consistency to
where the functions are located as well as what they do.  Any thoughts
on how to achieve that?

> BTW, the plain-name cases should be "const char *", else some callers
> will have to cast away const.  You could possibly make an argument for
> "const List *" in the qualified-name cases, but we don't do that
> anywhere else so I think it'd just look funny here (and would require
> internally casting away const, too).

Makes sense.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm with Stephen on this one. �I agree that standardizing the function
>> names and behavior would be a good idea, but don't try to put them all
>> in one place.

> Some of the existing functions are in lsyscache.c, some are in files
> in the commands directory, and some are in files in the parser
> directory; also, even between commands and parser, not every object
> type has its own file.  It would be nice to bring some consistency to
> where the functions are located as well as what they do.  Any thoughts
> on how to achieve that?

I think both Stephen and I are saying we don't see merit in that.
Moving around pre-existing functions won't accomplish much except
causing include-list churn.  Let's just standardize the names/APIs
and be done.
        regards, tom lane


Re: mapping object names to role IDs

From
Robert Haas
Date:
On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm with Stephen on this one.  I agree that standardizing the function
>>> names and behavior would be a good idea, but don't try to put them all
>>> in one place.
>
>> Some of the existing functions are in lsyscache.c, some are in files
>> in the commands directory, and some are in files in the parser
>> directory; also, even between commands and parser, not every object
>> type has its own file.  It would be nice to bring some consistency to
>> where the functions are located as well as what they do.  Any thoughts
>> on how to achieve that?
>
> I think both Stephen and I are saying we don't see merit in that.
> Moving around pre-existing functions won't accomplish much except
> causing include-list churn.  Let's just standardize the names/APIs
> and be done.

Where do we put the new functions?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think both Stephen and I are saying we don't see merit in that.
>> Moving around pre-existing functions won't accomplish much except
>> causing include-list churn. �Let's just standardize the names/APIs
>> and be done.

> Where do we put the new functions?

Probably in files that are already concerned with each object type.
        regards, tom lane


Re: mapping object names to role IDs

From
Robert Haas
Date:
On Sun, May 23, 2010 at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think both Stephen and I are saying we don't see merit in that.
>>> Moving around pre-existing functions won't accomplish much except
>>> causing include-list churn.  Let's just standardize the names/APIs
>>> and be done.
>
>> Where do we put the new functions?
>
> Probably in files that are already concerned with each object type.

Not every object type has a file, and the existing functions are split
across three different directories, sometimes in files that don't
really pertain to the object type being dealt with.  I think this is
going to be difficult to maintain if we intentionally spread out the
parallel code across essentially the entire backend.  But I guess I
can code it up and we can argue about it then.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Not every object type has a file, and the existing functions are split
> across three different directories, sometimes in files that don't
> really pertain to the object type being dealt with.  I think this is
> going to be difficult to maintain if we intentionally spread out the
> parallel code across essentially the entire backend.  But I guess I
> can code it up and we can argue about it then.

The only thing that seems really significantly parallel is the error
message to be issued for object-not-found.  I would suggest maybe
putting the code in lsyscache.c, except that lsyscache functions
generally are not expected to throw error on object-not-found.

As for "not every object type has a file", there is certainly code
someplace that would be calling these functions.  Whereever (the
preponderance of) such calls are would be an appropriate place for the
function, IMO.
        regards, tom lane


Re: mapping object names to role IDs

From
Peter Eisentraut
Date:
On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote:
> Oid get_<object-type>_oid(List *qualname, bool missingok);
> -or-
> Oid get_<object-type>_oid(char *name, bool missingok);
> 
> Thus get_database_oid and get_tablespace_oid would remain unchanged
> except for taking a second argument, get_roleid and get_roleid_checked
> would merge, and all of the others would change to match that style.

If you are doing some refactoring work in that area, maybe you can also
take care of the issue I talked about there:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php

"""
Our code contains about 200 copies of the following code:

tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0);
if (!HeapTupleIsValid(tuple))   elog(ERROR, "cache lookup failed for foo %u", fooid);
"""

It looks like your proposal would reduce that number significantly.




Re: mapping object names to role IDs

From
Robert Haas
Date:
On Wed, May 26, 2010 at 5:27 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote:
>> Oid get_<object-type>_oid(List *qualname, bool missingok);
>> -or-
>> Oid get_<object-type>_oid(char *name, bool missingok);
>>
>> Thus get_database_oid and get_tablespace_oid would remain unchanged
>> except for taking a second argument, get_roleid and get_roleid_checked
>> would merge, and all of the others would change to match that style.
>
> If you are doing some refactoring work in that area, maybe you can also
> take care of the issue I talked about there:
> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php
>
> """
> Our code contains about 200 copies of the following code:
>
> tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0);
> if (!HeapTupleIsValid(tuple))
>    elog(ERROR, "cache lookup failed for foo %u", fooid);
> """
>
> It looks like your proposal would reduce that number significantly.

Well, not directly, because I was proposing to do something about
wanting to turn an object identified by name into an OID, rather than
wanting to look up an OID and find a syscache tuple.  But the same
approach could be applied to the problem you mention.

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend.  It's true that that one file will then
depend on all the catalog stuff, but it actually can limit
dependencies a little bit on the other end, because if someone wants
to call a bunch of these functions from the same file, they only need
to include the one header where they are all declared, rather than all
the individual files that contain the individual functions.  And more
to the point, it's way easier from a maintenance standpoint: there is
much less chance that someone will change one function without
changing all the others if they are all in the same place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I still feel that we'd be better off putting all the functions that
> use the same design pattern in a single file, rather than spreading
> them out all over the backend.  It's true that that one file will then
> depend on all the catalog stuff, but it actually can limit
> dependencies a little bit on the other end, because if someone wants
> to call a bunch of these functions from the same file, they only need
> to include the one header where they are all declared,

This is nonsense, because the call sites are going to be places that are
actually *doing* something with that catalog, and so will need not only
the catalog .h file but any other support functions associated with
doing work on that catalog.  Centralizing the lookups will do nothing
whatsoever to reduce dependencies; it'll just create a central file
dependent on everything in addition to every dependency we have now.

The closest equivalent we have now is lsyscache.c, which is not exactly
a sterling example of how to design a module: it's got no conceptual
consistency whatsoever.

I'm for standardizing the API of lookup functions, but not for
relocating them.
        regards, tom lane


Re: mapping object names to role IDs

From
Robert Haas
Date:
On Wed, May 26, 2010 at 9:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This is nonsense

You can assert that, but I don't agree.  We certainly have places
(comment.c being the obvious example) where we need to look up a name
and map it to an OID without doing anything else, and actually I
believe there are useful ways to refactor the code that might lead to
more of this.  Anyway, I think the code maintenance argument ought to
carry a lot more weight than whether one or two small files get
rebuilt for dependencies slightly more often.  lsyscache.c might have
no conceptual consistency but it's extremely useful, and there are
plenty of other examples of where we've put code for different object
types into a single file to simplify maintenance and reduce code
complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
alvherre
Date:
Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:

> I still feel that we'd be better off putting all the functions that
> use the same design pattern in a single file, rather than spreading
> them out all over the backend.  It's true that that one file will then
> depend on all the catalog stuff, but it actually can limit
> dependencies a little bit on the other end, because if someone wants
> to call a bunch of these functions from the same file, they only need
> to include the one header where they are all declared, rather than all
> the individual files that contain the individual functions.

This doesn't buy you anything, because that one header will likely have
to #include all the other headers anyway.  And if this is so, then all
those headers will now be included in all files that require even a
single one of these functions.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: mapping object names to role IDs

From
Robert Haas
Date:
On Wed, May 26, 2010 at 11:01 AM, alvherre <alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:
>
>> I still feel that we'd be better off putting all the functions that
>> use the same design pattern in a single file, rather than spreading
>> them out all over the backend.  It's true that that one file will then
>> depend on all the catalog stuff, but it actually can limit
>> dependencies a little bit on the other end, because if someone wants
>> to call a bunch of these functions from the same file, they only need
>> to include the one header where they are all declared, rather than all
>> the individual files that contain the individual functions.
>
> This doesn't buy you anything, because that one header will likely have
> to #include all the other headers anyway.  And if this is so, then all
> those headers will now be included in all files that require even a
> single one of these functions.

Well, at any rate, I'm giving up on the argument.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
Tom Lane
Date:
alvherre <alvherre@commandprompt.com> writes:
> Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:
>> I still feel that we'd be better off putting all the functions that
>> use the same design pattern in a single file, rather than spreading
>> them out all over the backend.

> This doesn't buy you anything, because that one header will likely have
> to #include all the other headers anyway.  And if this is so, then all
> those headers will now be included in all files that require even a
> single one of these functions.

For the particular case Robert is proposing, the *header* isn't a
problem, because the only types it would deal in are Oid, bool,
const char *, and List *.  But you're right that in general this design
pattern carries a risk of having to include the world in a commonly-used
header file, which is certainly not a good idea.
        regards, tom lane


Re: mapping object names to role IDs

From
Robert Haas
Date:
On Wed, May 26, 2010 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> alvherre <alvherre@commandprompt.com> writes:
>> Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:
>>> I still feel that we'd be better off putting all the functions that
>>> use the same design pattern in a single file, rather than spreading
>>> them out all over the backend.
>
>> This doesn't buy you anything, because that one header will likely have
>> to #include all the other headers anyway.  And if this is so, then all
>> those headers will now be included in all files that require even a
>> single one of these functions.
>
> For the particular case Robert is proposing, the *header* isn't a
> problem, because the only types it would deal in are Oid, bool,
> const char *, and List *.  But you're right that in general this design
> pattern carries a risk of having to include the world in a commonly-used
> header file, which is certainly not a good idea.

Right.  I am very cognizant of the problem, but it isn't really an
issue in this case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: mapping object names to role IDs

From
alvherre
Date:
Excerpts from Robert Haas's message of mié may 26 10:34:00 -0400 2010:

> lsyscache.c might have no conceptual consistency but it's extremely
> useful,

I know I've been annoyed by lsyscache: looking for accessors to catalog
stuff, not finding them and so creating my own by using syscache
directly, only to find out later that they already existed there.
I think we should be moving in the direction of *removing* lsyscache,
not replicating it.

BTW I quite agree with both the suggestion you give in this thread
(modulo this issue), and Peter's idea of getting rid of the repetitive
syscache coding pattern.

> and there are
> plenty of other examples of where we've put code for different object
> types into a single file to simplify maintenance and reduce code
> complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.).

Well, that's all related to node manipulation, so I'm not so sure it's
exactly the same.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support