Thread: mapping object names to role IDs
Suppose you have an object name as a CString and you want to convert it to an OID. The method of doing this varies widely depending on the object type: oid = get_database_oid(name); oid = get_tablespace_oid(name); oid = GetForeignDataWrapperOidByName(name, true); oid = GetForeignServerOidByName(name, true); oid = LookupFuncNameTypeNames(name, args, true); oid = LookupAggNameTypeNames(name, args, true); oid = LookupFuncNameTypeNames(name, args, true); oid = LookupOperNameTypeNames(name, args, true); oid = GetSysCacheOid1(LANGNAME, CStringGetDatum(name)); oid = GetSysCacheOid1(NAMESPACENAME, CStringGetDatum(name)); oid = GetSysCacheOid1(AMNAME, CStringGetDatum(name)); oid = get_roleid(name); oid = GetConstraintByName(reloid, name); oid = FindConversionByName(list_make1(name)); oid = TSDictionaryGetDictid(list_make1(name), true); oid = TSTemplateGetTmplid(list_make1(name), true); oid = TSConfigGetCfgid(list_make1(name), true); If you'd like to throw an error if the object doesn't exist, then you can change the "true" parameter in the above examples to false, where it exists, or you can use get_roleid_checked for roles. For the types where a direct syscache lookup is used, you get to write the check yourself. For constraints, GetConstraintByName throws an error anyway; there's no equivalent that doesn't. Some other object types - like rules and casts - have even more complex logic that is sometimes duplicated in multiple places; and others (like tablespaces) that have convenience functions still manage to duplicate the logic anyway. Long story short, this is kind of a mess. Looking at the different cases, there seem to be three main cases: 1. Objects that just have one name, like databases and procedural languages. 2. Objects that have possibly-qualified names, like conversions and text search dictionaries. 3. Objects that have both names and argument, like functions and operators. There's enough complexity about the stuff in category #3 that I'm inclined to leave it alone, but try to make the other stuff more standardized. What I would propose is that we create a new source file somewhere (maybe utils/cache), move all of the other functions of this type there, give them standardized names, and provide them all with an argument that specifies whether an error is to be thrown if the object doesn't exist. Begin bikeshedding here. I suggest that the names be get_<object-type>_oid and that they take two parameters, either a List * for possibly-qualified names (a little wonky, but it's what we do now) or a char * for unqualified names, and a boolean indicating whether to throw an error if the object isn't found. Thus: Oid get_<object-type>_oid(List *qualname, bool missingok); -or- Oid get_<object-type>_oid(char *name, bool missingok); Thus get_database_oid and get_tablespace_oid would remain unchanged except for taking a second argument, get_roleid and get_roleid_checked would merge, and all of the others would change to match that style. Thoughts? P.S. Purple. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
And of course I meant for the subject line to be "mapping object names to OIDs", not "role IDs". Sigh. ...Robert
* Robert Haas (robertmhaas@gmail.com) wrote: > Long story short, this is kind of a mess. I agree that it's a bit of a mess. > What I would propose is that we create a new source > file somewhere (maybe utils/cache), move all of the other functions of > this type there, give them standardized names, and provide them all > with an argument that specifies whether an error is to be thrown if > the object doesn't exist. Something which has come up in the past is that putting all the functions that do the same kind of thing, but operate on different types of objects, into the same backend file/area ends up meaning that such an area has an untold amount of knowledge about everything. Additionally, what *does* go into said area has tons of things that are only related by the kind of operation- not because they actually have anything to do with each other. This was one of the complaints levied at the approach for moving all the ACL checking into one place. I think it would be good to have a consistant naming/calling scheme for these various functions, but I'm not sure that moving them all to the same place makes sense. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Robert Haas (robertmhaas@gmail.com) wrote: >> Long story short, this is kind of a mess. > ... I think it would be good to have a > consistant naming/calling scheme for these various functions, but I'm > not sure that moving them all to the same place makes sense. I'm with Stephen on this one. I agree that standardizing the function names and behavior would be a good idea, but don't try to put them all in one place. BTW, the plain-name cases should be "const char *", else some callers will have to cast away const. You could possibly make an argument for "const List *" in the qualified-name cases, but we don't do that anywhere else so I think it'd just look funny here (and would require internally casting away const, too). regards, tom lane
On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Stephen Frost <sfrost@snowman.net> writes: >> * Robert Haas (robertmhaas@gmail.com) wrote: >>> Long story short, this is kind of a mess. > >> ... I think it would be good to have a >> consistant naming/calling scheme for these various functions, but I'm >> not sure that moving them all to the same place makes sense. > > I'm with Stephen on this one. I agree that standardizing the function > names and behavior would be a good idea, but don't try to put them all > in one place. Some of the existing functions are in lsyscache.c, some are in files in the commands directory, and some are in files in the parser directory; also, even between commands and parser, not every object type has its own file. It would be nice to bring some consistency to where the functions are located as well as what they do. Any thoughts on how to achieve that? > BTW, the plain-name cases should be "const char *", else some callers > will have to cast away const. You could possibly make an argument for > "const List *" in the qualified-name cases, but we don't do that > anywhere else so I think it'd just look funny here (and would require > internally casting away const, too). Makes sense. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm with Stephen on this one. �I agree that standardizing the function >> names and behavior would be a good idea, but don't try to put them all >> in one place. > Some of the existing functions are in lsyscache.c, some are in files > in the commands directory, and some are in files in the parser > directory; also, even between commands and parser, not every object > type has its own file. It would be nice to bring some consistency to > where the functions are located as well as what they do. Any thoughts > on how to achieve that? I think both Stephen and I are saying we don't see merit in that. Moving around pre-existing functions won't accomplish much except causing include-list churn. Let's just standardize the names/APIs and be done. regards, tom lane
On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm with Stephen on this one. I agree that standardizing the function >>> names and behavior would be a good idea, but don't try to put them all >>> in one place. > >> Some of the existing functions are in lsyscache.c, some are in files >> in the commands directory, and some are in files in the parser >> directory; also, even between commands and parser, not every object >> type has its own file. It would be nice to bring some consistency to >> where the functions are located as well as what they do. Any thoughts >> on how to achieve that? > > I think both Stephen and I are saying we don't see merit in that. > Moving around pre-existing functions won't accomplish much except > causing include-list churn. Let's just standardize the names/APIs > and be done. Where do we put the new functions? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think both Stephen and I are saying we don't see merit in that. >> Moving around pre-existing functions won't accomplish much except >> causing include-list churn. �Let's just standardize the names/APIs >> and be done. > Where do we put the new functions? Probably in files that are already concerned with each object type. regards, tom lane
On Sun, May 23, 2010 at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I think both Stephen and I are saying we don't see merit in that. >>> Moving around pre-existing functions won't accomplish much except >>> causing include-list churn. Let's just standardize the names/APIs >>> and be done. > >> Where do we put the new functions? > > Probably in files that are already concerned with each object type. Not every object type has a file, and the existing functions are split across three different directories, sometimes in files that don't really pertain to the object type being dealt with. I think this is going to be difficult to maintain if we intentionally spread out the parallel code across essentially the entire backend. But I guess I can code it up and we can argue about it then. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > Not every object type has a file, and the existing functions are split > across three different directories, sometimes in files that don't > really pertain to the object type being dealt with. I think this is > going to be difficult to maintain if we intentionally spread out the > parallel code across essentially the entire backend. But I guess I > can code it up and we can argue about it then. The only thing that seems really significantly parallel is the error message to be issued for object-not-found. I would suggest maybe putting the code in lsyscache.c, except that lsyscache functions generally are not expected to throw error on object-not-found. As for "not every object type has a file", there is certainly code someplace that would be calling these functions. Whereever (the preponderance of) such calls are would be an appropriate place for the function, IMO. regards, tom lane
On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote: > Oid get_<object-type>_oid(List *qualname, bool missingok); > -or- > Oid get_<object-type>_oid(char *name, bool missingok); > > Thus get_database_oid and get_tablespace_oid would remain unchanged > except for taking a second argument, get_roleid and get_roleid_checked > would merge, and all of the others would change to match that style. If you are doing some refactoring work in that area, maybe you can also take care of the issue I talked about there: http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php """ Our code contains about 200 copies of the following code: tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0); if (!HeapTupleIsValid(tuple)) elog(ERROR, "cache lookup failed for foo %u", fooid); """ It looks like your proposal would reduce that number significantly.
On Wed, May 26, 2010 at 5:27 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote: >> Oid get_<object-type>_oid(List *qualname, bool missingok); >> -or- >> Oid get_<object-type>_oid(char *name, bool missingok); >> >> Thus get_database_oid and get_tablespace_oid would remain unchanged >> except for taking a second argument, get_roleid and get_roleid_checked >> would merge, and all of the others would change to match that style. > > If you are doing some refactoring work in that area, maybe you can also > take care of the issue I talked about there: > http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php > > """ > Our code contains about 200 copies of the following code: > > tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0); > if (!HeapTupleIsValid(tuple)) > elog(ERROR, "cache lookup failed for foo %u", fooid); > """ > > It looks like your proposal would reduce that number significantly. Well, not directly, because I was proposing to do something about wanting to turn an object identified by name into an OID, rather than wanting to look up an OID and find a syscache tuple. But the same approach could be applied to the problem you mention. I still feel that we'd be better off putting all the functions that use the same design pattern in a single file, rather than spreading them out all over the backend. It's true that that one file will then depend on all the catalog stuff, but it actually can limit dependencies a little bit on the other end, because if someone wants to call a bunch of these functions from the same file, they only need to include the one header where they are all declared, rather than all the individual files that contain the individual functions. And more to the point, it's way easier from a maintenance standpoint: there is much less chance that someone will change one function without changing all the others if they are all in the same place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > I still feel that we'd be better off putting all the functions that > use the same design pattern in a single file, rather than spreading > them out all over the backend. It's true that that one file will then > depend on all the catalog stuff, but it actually can limit > dependencies a little bit on the other end, because if someone wants > to call a bunch of these functions from the same file, they only need > to include the one header where they are all declared, This is nonsense, because the call sites are going to be places that are actually *doing* something with that catalog, and so will need not only the catalog .h file but any other support functions associated with doing work on that catalog. Centralizing the lookups will do nothing whatsoever to reduce dependencies; it'll just create a central file dependent on everything in addition to every dependency we have now. The closest equivalent we have now is lsyscache.c, which is not exactly a sterling example of how to design a module: it's got no conceptual consistency whatsoever. I'm for standardizing the API of lookup functions, but not for relocating them. regards, tom lane
On Wed, May 26, 2010 at 9:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > This is nonsense You can assert that, but I don't agree. We certainly have places (comment.c being the obvious example) where we need to look up a name and map it to an OID without doing anything else, and actually I believe there are useful ways to refactor the code that might lead to more of this. Anyway, I think the code maintenance argument ought to carry a lot more weight than whether one or two small files get rebuilt for dependencies slightly more often. lsyscache.c might have no conceptual consistency but it's extremely useful, and there are plenty of other examples of where we've put code for different object types into a single file to simplify maintenance and reduce code complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010: > I still feel that we'd be better off putting all the functions that > use the same design pattern in a single file, rather than spreading > them out all over the backend. It's true that that one file will then > depend on all the catalog stuff, but it actually can limit > dependencies a little bit on the other end, because if someone wants > to call a bunch of these functions from the same file, they only need > to include the one header where they are all declared, rather than all > the individual files that contain the individual functions. This doesn't buy you anything, because that one header will likely have to #include all the other headers anyway. And if this is so, then all those headers will now be included in all files that require even a single one of these functions. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, May 26, 2010 at 11:01 AM, alvherre <alvherre@commandprompt.com> wrote: > Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010: > >> I still feel that we'd be better off putting all the functions that >> use the same design pattern in a single file, rather than spreading >> them out all over the backend. It's true that that one file will then >> depend on all the catalog stuff, but it actually can limit >> dependencies a little bit on the other end, because if someone wants >> to call a bunch of these functions from the same file, they only need >> to include the one header where they are all declared, rather than all >> the individual files that contain the individual functions. > > This doesn't buy you anything, because that one header will likely have > to #include all the other headers anyway. And if this is so, then all > those headers will now be included in all files that require even a > single one of these functions. Well, at any rate, I'm giving up on the argument. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
alvherre <alvherre@commandprompt.com> writes: > Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010: >> I still feel that we'd be better off putting all the functions that >> use the same design pattern in a single file, rather than spreading >> them out all over the backend. > This doesn't buy you anything, because that one header will likely have > to #include all the other headers anyway. And if this is so, then all > those headers will now be included in all files that require even a > single one of these functions. For the particular case Robert is proposing, the *header* isn't a problem, because the only types it would deal in are Oid, bool, const char *, and List *. But you're right that in general this design pattern carries a risk of having to include the world in a commonly-used header file, which is certainly not a good idea. regards, tom lane
On Wed, May 26, 2010 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > alvherre <alvherre@commandprompt.com> writes: >> Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010: >>> I still feel that we'd be better off putting all the functions that >>> use the same design pattern in a single file, rather than spreading >>> them out all over the backend. > >> This doesn't buy you anything, because that one header will likely have >> to #include all the other headers anyway. And if this is so, then all >> those headers will now be included in all files that require even a >> single one of these functions. > > For the particular case Robert is proposing, the *header* isn't a > problem, because the only types it would deal in are Oid, bool, > const char *, and List *. But you're right that in general this design > pattern carries a risk of having to include the world in a commonly-used > header file, which is certainly not a good idea. Right. I am very cognizant of the problem, but it isn't really an issue in this case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Excerpts from Robert Haas's message of mié may 26 10:34:00 -0400 2010: > lsyscache.c might have no conceptual consistency but it's extremely > useful, I know I've been annoyed by lsyscache: looking for accessors to catalog stuff, not finding them and so creating my own by using syscache directly, only to find out later that they already existed there. I think we should be moving in the direction of *removing* lsyscache, not replicating it. BTW I quite agree with both the suggestion you give in this thread (modulo this issue), and Peter's idea of getting rid of the repetitive syscache coding pattern. > and there are > plenty of other examples of where we've put code for different object > types into a single file to simplify maintenance and reduce code > complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.). Well, that's all related to node manipulation, so I'm not so sure it's exactly the same. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support