Thread: refactoring comment.c
At PGCon, we discussed the possibility that a minimal SE-PostgreSQL implementation would need little more than a hook in ExecCheckRTPerms() [which we've since added] and a security label facility [for which KaiGai has submitted a patch]. I actually sat down to write the security label patch myself while we were in Ottawa, but quickly ran into difficulties: while the hook we have now can't do anything useful with objects other than relations, it's pretty clear from previous discussions on this topic that the demand for labels on other kinds of objects is not going to go away. Rather than adding additional syntax to every object type in the system (some of which don't even have ALTER commands at present), I suggested basing the syntax on the existing COMMENT syntax. After some discussion[1], we seem to have settled on the following: SECURITY LABEL [ FOR <provider> ] ON <object class> <object name> IS '<label>'; At present, there are some difficulties with generalizing this syntax to other object types. As I found out when I initially set out to write this patch, it'd basically require duplicating all of comment.c, which is an unpleasant prospect, because that file is big and crufty; it has a large amount of internal duplication. Furthermore, the existing locking mechanism that we're using for comments is known to be inadequate[2]. Dropping a comment while someone else is in the midst of commenting on it leaves orphaned comments lying around in pg_(sh)description that could later be inherited by a new object. That's a minor nuisance for comments and would be nice to fix, but is obviously a far larger problem for security labels, where even a small chance of randomly mislabeling an object is no good. So I wrote a patch. The attached patch factors out all of the code in comment.c that is responsible for translating parser representations into a new file parser/parse_object.c, leaving just the comment-specific stuff in commands/comment.c. It also adds appropriate locking, so that concurrent COMMENT/DROP scenarios don't leave behind leftovers. It's a fairly large patch, but the changes are extremely localized: comment.c gets a lot smaller, and parse_object.c gets bigger by a slightly smaller amount. Any comments? (ha ha ha...) [1] http://archives.postgresql.org/pgsql-hackers/2010-07/msg01328.php [2] http://archives.postgresql.org/pgsql-hackers/2010-07/msg00351.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Attachment
Excerpts from Robert Haas's message of vie ago 06 11:02:58 -0400 2010: > Any comments? (ha ha ha...) Interesting idea. The patch looks fine on a quick once-over. Two small things: this comment + /* + * Databases, tablespaces, and roles are cluster-wide objects, so any + * comments on those objects are record in the shared pg_shdescription + * catalog. Comments on all other objects are recorded in pg_description. + */ says "record" where it should say "recorded". Also, it strikes me that perhaps the ObjectAddress struct definition should be moved to the new header file; seems a more natural place for it (but then, it seems like a lot of files will need to include the new header, so perhaps that should be left to a subsequent patch). Thirdly, is it just me or just could replace a lot of code in DropFoo functions with this new auxiliary code? Seems it's just missing "missing_ok" support ... -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Fri, Aug 6, 2010 at 11:36 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Robert Haas's message of vie ago 06 11:02:58 -0400 2010: > >> Any comments? (ha ha ha...) > > Interesting idea. The patch looks fine on a quick once-over. Thanks for taking a look. > Two small > things: this comment > > + /* > + * Databases, tablespaces, and roles are cluster-wide objects, so any > + * comments on those objects are record in the shared pg_shdescription > + * catalog. Comments on all other objects are recorded in pg_description. > + */ > > says "record" where it should say "recorded". Thanks, good eye. > Also, it strikes me that perhaps the ObjectAddress struct definition > should be moved to the new header file; seems a more natural place for > it (but then, it seems like a lot of files will need to include the new > header, so perhaps that should be left to a subsequent patch). I thought about that, but erred on the side of being conservative and didn't move it. I like the idea, though. > Thirdly, is it just me or just could replace a lot of code in DropFoo > functions with this new auxiliary code? Seems it's just missing > "missing_ok" support ... I am not sure how much code it would save you at the level of the individual DropFoo() functions; I'd have to look through them more carefully. But now that you mention it, what about getting rid of all of the individual parse nodes for drop statements? Right now we have: DropTableSpaceStmt DropFdwStmt DropForeignServerStmt DropUserMappingStmt DropPLangStmt DropRoleStmt DropStmt (table, sequence, view, index, domain, conversion, schema, text search {parser, dictionary, template, configuration} DropPropertyStmt (rules and triggers) DropdbStmt (capitalized differently, just for fun) DropCastStmt RemoveFuncStmt (so you can't find it by grepping for Drop!) RemoveOpClassStmt RemoveOpFamilyStmt It seems like we could probably unify all of these into a single DropStmt, following the same pattern as CommentStmt, although I think perhaps that should be a follow-on patch rather than doing it as part of this one. GRANT also has some code to translate object names to OIDs, which I thought might be able to use this machinery as well, though I haven't really checked whether it makes sense. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Fri, 2010-08-06 at 11:02 -0400, Robert Haas wrote: > At PGCon, we discussed the possibility that a minimal SE-PostgreSQL > implementation would need little more than a hook in > ExecCheckRTPerms() [which we've since added] and a security label > facility [for which KaiGai has submitted a patch]. I actually sat > down to write the security label patch myself while we were in Ottawa, > but quickly ran into difficulties: while the hook we have now can't do > anything useful with objects other than relations, it's pretty clear > from previous discussions on this topic that the demand for labels on > other kinds of objects is not going to go away. Rather than adding > additional syntax to every object type in the system (some of which > don't even have ALTER commands at present), I suggested basing the > syntax on the existing COMMENT syntax. After some discussion[1], we > seem to have settled on the following: > > SECURITY LABEL [ FOR <provider> ] ON <object class> <object name> IS '<label>'; I understand the concept and it seems like it might work. Not too keen on pretending a noun is a verb. That leads to erroring. <verb> SECURITY LABEL? verb = CREATE, ADD, ... Can't objects have more than one label? How will you set default security labels on objects? Where do you define labels? Will there be a new privilege to define this? Presumably object owners would not be able to set that themselves, otherwise you could create an object, add a security label to it and then use it to see other things at that level. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services
On Fri, Aug 6, 2010 at 12:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > I understand the concept and it seems like it might work. Not too keen > on pretending a noun is a verb. That leads to erroring. > > <verb> SECURITY LABEL? verb = CREATE, ADD, ... > > Can't objects have more than one label? > > How will you set default security labels on objects? > > Where do you define labels? > > Will there be a new privilege to define this? Presumably object owners > would not be able to set that themselves, otherwise you could create an > object, add a security label to it and then use it to see other things > at that level. Uh, these are all good questions, but I think they'd be more appropriate on the thread about the security label patch, to which I linked in my previous email. Many of them have already been discussed there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
(2010/08/07 0:02), Robert Haas wrote: > At PGCon, we discussed the possibility that a minimal SE-PostgreSQL > implementation would need little more than a hook in > ExecCheckRTPerms() [which we've since added] and a security label > facility [for which KaiGai has submitted a patch]. I actually sat > down to write the security label patch myself while we were in Ottawa, > but quickly ran into difficulties: while the hook we have now can't do > anything useful with objects other than relations, it's pretty clear > from previous discussions on this topic that the demand for labels on > other kinds of objects is not going to go away. Rather than adding > additional syntax to every object type in the system (some of which > don't even have ALTER commands at present), I suggested basing the > syntax on the existing COMMENT syntax. After some discussion[1], we > seem to have settled on the following: > > SECURITY LABEL [ FOR<provider> ] ON<object class> <object name> IS '<label>'; > > At present, there are some difficulties with generalizing this syntax > to other object types. As I found out when I initially set out to > write this patch, it'd basically require duplicating all of comment.c, > which is an unpleasant prospect, because that file is big and crufty; > it has a large amount of internal duplication. Furthermore, the > existing locking mechanism that we're using for comments is known to > be inadequate[2]. Dropping a comment while someone else is in the > midst of commenting on it leaves orphaned comments lying around in > pg_(sh)description that could later be inherited by a new object. > That's a minor nuisance for comments and would be nice to fix, but is > obviously a far larger problem for security labels, where even a small > chance of randomly mislabeling an object is no good. > > So I wrote a patch. The attached patch factors out all of the code in > comment.c that is responsible for translating parser representations > into a new file parser/parse_object.c, leaving just the > comment-specific stuff in commands/comment.c. It also adds > appropriate locking, so that concurrent COMMENT/DROP scenarios don't > leave behind leftovers. It's a fairly large patch, but the changes > are extremely localized: comment.c gets a lot smaller, and > parse_object.c gets bigger by a slightly smaller amount. > > Any comments? (ha ha ha...) > > [1] http://archives.postgresql.org/pgsql-hackers/2010-07/msg01328.php > [2] http://archives.postgresql.org/pgsql-hackers/2010-07/msg00351.php > Thanks for your efforts. I believe the get_object_address() enables to implement security label features on various kind of database objects. I tried to look at the patch. Most part is fine, but I found out two issues. On the object_exists(), when we verify existence of a large object, it needs to scan pg_largeobject_metadata, instead of pg_largeobject. When we implement pg_largeobject_metadata catalog, we decided to set LargeObjectRelationId on object.classId due to the backend compatibility. | /* | * For object types that have a relevant syscache, we use it; for | * everything else, we'll have to do an index-scan. This switch | * sets either the cache to be used for the syscache lookup, or the | * index to be used for the index scan. | */ | switch (address.classId) | { | case RelationRelationId: | cache = RELOID; | break; | : | case LargeObjectRelationId: | indexoid = LargeObjectMetadataOidIndexId; | break; | : | } | | /* Found a syscache? */ | if (cache != -1) | return SearchSysCacheExists1(cache, ObjectIdGetDatum(address.objectId)); | | /* No syscache, so examine the table directly. */ | Assert(OidIsValid(indexoid)); | ScanKeyInit(&skey[0], ObjectIdAttributeNumber, BTEqualStrategyNumber, | F_OIDEQ, ObjectIdGetDatum(address.objectId)); | rel = heap_open(address.classId, AccessShareLock); ^^^^^^^^^^^^^^^ <- It tries to open pg_largeobject | sd = systable_beginscan(rel, indexoid, true, SnapshotNow, 1, skey); | found = HeapTupleIsValid(systable_getnext(sd)); | systable_endscan(sd); | heap_close(rel, AccessShareLock); | return found; | } Although no caller invokes get_object_address() with lockmode = NoLock, isn't it necessary to skip locking if NoLock was given. | /* | * If we're dealing with a relation or attribute, then the relation is | * already locked. If we're dealing with any other type of object, we need | * to lock it and then verify that it still exists. | */ | if (address.classId != RelationRelationId) | { | if (IsSharedRelation(address.classId)) | LockSharedObject(address.classId, address.objectId, 0, lockmode); | else | LockDatabaseObject(address.classId, address.objectId, 0, lockmode); | /* Did it go away while we were waiting for the lock? */ | if (!object_exists(address)) | elog(ERROR, "cache lookup failed for class %u object %u subobj %d", | address.classId, address.objectId, address.objectSubId); | } Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
On Fri, Aug 6, 2010 at 11:15 PM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote: > [brief review] OK, here's an updated patch: 1. I fixed the typo Alvaro spotted. 2. I haven't done anything about moving the definition of ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure quite where it ought to go. I still think it's a good idea, though I'm not dead set on it, either. Suggestions? 3. I fixed the issue Kaigai Kohei spotted, regarding LargeObjectRelationId vs. LargeObjectMetadataRelationId, by adding a grotty hack. However, I feel that I'm not so much adding a new grotty hack as working around an existing grotty hack which was added for reasons I'm unclear on. Is there a pg_upgrade-related reason not to revert the original hack instead? 4. In response to Kaigai Kohei's complaint about lockmode possibly being NoLock, I've just added an Assert() that it isn't, in lieu of trying to do something sensible in that case. I can't at present think of a situation in which being able to call it that way would be useful, and the Assert() seems like it ought to be enough warning to anyone coming along later that they'd better think twice before thinking that will work. 5. Since I'm hoping Tom will read this, I ran it through filterdiff. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Attachment
(2010/08/16 11:50), Robert Haas wrote: > On Fri, Aug 6, 2010 at 11:15 PM, KaiGai Kohei<kaigai@kaigai.gr.jp> wrote: >> [brief review] > > OK, here's an updated patch: > > 1. I fixed the typo Alvaro spotted. > > 2. I haven't done anything about moving the definition of > ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure > quite where it ought to go. I still think it's a good idea, though > I'm not dead set on it, either. Suggestions? > > 3. I fixed the issue Kaigai Kohei spotted, regarding > LargeObjectRelationId vs. LargeObjectMetadataRelationId, by adding a > grotty hack. However, I feel that I'm not so much adding a new grotty > hack as working around an existing grotty hack which was added for > reasons I'm unclear on. Is there a pg_upgrade-related reason not to > revert the original hack instead? > When we were developing largeobject access controls, Tom Lane commented as follows: * Re: [HACKERS] [PATCH] Largeobject access controls http://marc.info/?l=pgsql-hackers&m=125548822906571&w=2 | I notice that the patch decides to change the pg_description classoid for | LO comments from pg_largeobject's OID to pg_largeobject_metadata's. This | will break existing clients that look at pg_description (eg, pg_dump and | psql, plus any other clients that have any intelligence about comments, | for instance it probably breaks pgAdmin). And there's just not a lot of | return that I can see. I agree that using pg_largeobject_metadata would | be more consistent given the new catalog layout, but I'm inclined to think | we should stick to the old convention on compatibility grounds. Given | that choice, for consistency we'd better also use pg_largeobject's OID not | pg_largeobject_metadata's in pg_shdepend entries for LOs. He concerned about existing applications which have knowledge about internal layout of system catalogs, then I fixed up the patch according to the suggestion. > 4. In response to Kaigai Kohei's complaint about lockmode possibly > being NoLock, I've just added an Assert() that it isn't, in lieu of > trying to do something sensible in that case. I can't at present > think of a situation in which being able to call it that way would be > useful, and the Assert() seems like it ought to be enough warning to > anyone coming along later that they'd better think twice before > thinking that will work. > > 5. Since I'm hoping Tom will read this, I ran it through filterdiff. :-) > -- KaiGai Kohei <kaigai@ak.jp.nec.com>
Excerpts from KaiGai Kohei's message of lun ago 16 00:19:54 -0400 2010: > (2010/08/16 11:50), Robert Haas wrote: > When we were developing largeobject access controls, Tom Lane commented > as follows: > > * Re: [HACKERS] [PATCH] Largeobject access controls > http://marc.info/?l=pgsql-hackers&m=125548822906571&w=2 > | I notice that the patch decides to change the pg_description classoid for > | LO comments from pg_largeobject's OID to pg_largeobject_metadata's. This > | will break existing clients that look at pg_description (eg, pg_dump and > | psql, plus any other clients that have any intelligence about comments, > | for instance it probably breaks pgAdmin). And there's just not a lot of > | return that I can see. I agree that using pg_largeobject_metadata would > | be more consistent given the new catalog layout, but I'm inclined to think > | we should stick to the old convention on compatibility grounds. Given > | that choice, for consistency we'd better also use pg_largeobject's OID not > | pg_largeobject_metadata's in pg_shdepend entries for LOs. > > He concerned about existing applications which have knowledge about internal > layout of system catalogs, then I fixed up the patch according to the suggestion. I think that with this patch we have the return for the change that we didn't have previously. A patch that changes it should also fix pg_dump and psql at the same time, but IMHO it doesn't make sense to keep adding grotty hacks on top of it. Maybe we could do with a grotty hack in obj_description() instead? (...checks...) Oh, it's defined as a SQL function directly in pg_proc.h :-( -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Robert Haas <robertmhaas@gmail.com> writes: > 5. Since I'm hoping Tom will read this, I ran it through filterdiff. :-) OK, I looked ;-). It mostly looks good, but of course I've got some opinions... > 2. I haven't done anything about moving the definition of > ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure > quite where it ought to go. I still think it's a good idea, though > I'm not dead set on it, either. Suggestions? I think the problem is you're trying to put this into backend/parser which is not really the right place for it. It's an execution-time animal not a parse-time animal. I would put it into backend/catalog, perhaps named objectaddress.c, and similarly the header file would be objectaddress.h. Then it would be reasonable to move struct ObjectAddress into this header and have dependency.h #include it. There might be some other stuff in dependency.c that more naturally belongs here, too. > 3. I fixed the issue Kaigai Kohei spotted, regarding > LargeObjectRelationId vs. LargeObjectMetadataRelationId, by adding a > grotty hack. However, I feel that I'm not so much adding a new grotty > hack as working around an existing grotty hack which was added for > reasons I'm unclear on. Is there a pg_upgrade-related reason not to > revert the original hack instead? It's not pg_upgrade (nor psql, nor pg_dump) we are protecting here. It's third-party applications that might understand the contents of pg_description, pg_depend, etc. I think that hack is quite small and localized enough to live with, rather than causing a flag day for an unknown number of clients by changing the representation. + /* + * Translate the parser representation which identifies this object into + * an ObjectAddress. get_object_address() will throw an error if the + * object does not exist. + */ + address = get_object_address(stmt->objtype, stmt->objname, stmt->objargs, + &relation, ShareUpdateExclusiveLock); I think this comment should also explicitly mention that we're getting a lock to protect against concurrent DROPs. + /* + * Translate an object name and arguments (as passed by the parser) to an + * ObjectAddress. + * + * The returned object will be locked using the specified lockmode. If a + * sub-object is looked up, the parent object will be locked instead. + * + * We don't currently provide a function to release the locks acquired here; + * typically, the lock must be held until commit to guard against a concurrent + * drop operation. + */ + ObjectAddress + get_object_address(ObjectType objtype, List *objname, List *objargs, + Relation *relp, LOCKMODE lockmode) This comment's a bit shy of a load too, since it totally fails to mention the relp argument, or specify what the caller is required to do with it, or explain how the locking on the relation works. As a matter of style, I'd suggest putting the single externally callable function (ie get_object_address) at the top of the file not the bottom. People shouldn't have to read through the entire file before finding out what API it is supposed to provide to the outside world. regards, tom lane
On Mon, Aug 16, 2010 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 2. I haven't done anything about moving the definition of >> ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure >> quite where it ought to go. I still think it's a good idea, though >> I'm not dead set on it, either. Suggestions? > > I think the problem is you're trying to put this into backend/parser > which is not really the right place for it. It's an execution-time > animal not a parse-time animal. I would put it into backend/catalog, > perhaps named objectaddress.c, and similarly the header file would be > objectaddress.h. Then it would be reasonable to move struct > ObjectAddress into this header and have dependency.h #include it. > There might be some other stuff in dependency.c that more naturally > belongs here, too. If this isn't parse analysis, then you and I have very different ideas of what parse analysis is. And under this theory, what are routines like LookupAggNameTypeNames() doing in src/backend/parser? I'll make the rest of the changes you suggest... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Aug 16, 2010 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think the problem is you're trying to put this into backend/parser >> which is not really the right place for it. > If this isn't parse analysis, then you and I have very different ideas > of what parse analysis is. Maybe so, but the parser is expected to put out a representation that will still be valid when the command is executed some time later. That is exactly why utility statements have the barely-more-than-source parsetree representation they do: because we do not hold locks on the objects from parsing to execution, we could not expect an OID-level representation to remain good. This is a lot different from what we do with DML statements, but there are good reasons for it. I repeat my observation that this code doesn't belong in /parser. The code you're replacing was not in /parser, and that was because it didn't belong there, not because somebody didn't understand the system structure. regards, tom lane
I wrote: > Maybe so, but the parser is expected to put out a representation that > will still be valid when the command is executed some time later. Rereading this, I see I didn't make my point very clearly. The reason this code doesn't belong in parser/ is that there's no prospect the parser itself would ever use it. ObjectAddress is an execution-time creature because we don't want utility statement representations to be resolved to OID-level detail before they execute. regards, tom lane
On Tue, Aug 17, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Maybe so, but the parser is expected to put out a representation that >> will still be valid when the command is executed some time later. > > Rereading this, I see I didn't make my point very clearly. The reason > this code doesn't belong in parser/ is that there's no prospect the > parser itself would ever use it. ObjectAddress is an execution-time > creature because we don't want utility statement representations to be > resolved to OID-level detail before they execute. Well, that is a good reason for doing it your way, but I'm slightly fuzzy on why we need a crisp separation between parse-time and execution-time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Aug 17, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Rereading this, I see I didn't make my point very clearly. �The reason >> this code doesn't belong in parser/ is that there's no prospect the >> parser itself would ever use it. �ObjectAddress is an execution-time >> creature because we don't want utility statement representations to be >> resolved to OID-level detail before they execute. > Well, that is a good reason for doing it your way, but I'm slightly > fuzzy on why we need a crisp separation between parse-time and > execution-time. I don't insist that the separation has to be crisp. I'm merely saying that putting a large chunk of useful-only-at-execution-time code into backend/parser is the Wrong Thing. regards, tom lane
On Tue, Aug 17, 2010 at 2:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Aug 17, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Rereading this, I see I didn't make my point very clearly. The reason >>> this code doesn't belong in parser/ is that there's no prospect the >>> parser itself would ever use it. ObjectAddress is an execution-time >>> creature because we don't want utility statement representations to be >>> resolved to OID-level detail before they execute. > >> Well, that is a good reason for doing it your way, but I'm slightly >> fuzzy on why we need a crisp separation between parse-time and >> execution-time. > > I don't insist that the separation has to be crisp. I'm merely saying > that putting a large chunk of useful-only-at-execution-time code into > backend/parser is the Wrong Thing. OK, but there should be a reason for that. For example, if there are circumstances when we parse a statement, and then time passes, and then we execute it later, that's a good argument for what you're saying here. But otherwise, the fact that these bits of barely-parsed stuff get passed all over the backend seems like an inconvenient wart.I was actually thinking of proposing that we make morethings happen during the parsing process and postpone less to the execution phase, and I need to make sure that I understand why you don't want to do that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Aug 17, 2010 at 2:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I don't insist that the separation has to be crisp. �I'm merely saying >> that putting a large chunk of useful-only-at-execution-time code into >> backend/parser is the Wrong Thing. > OK, but there should be a reason for that. For example, if there are > circumstances when we parse a statement, and then time passes, and > then we execute it later, that's a good argument for what you're > saying here. Yeah, and that's exactly what happens with utility statements that (for example) get into the plan cache. > I was actually thinking of proposing that we make more things happen > during the parsing process and postpone less to the execution phase, > and I need to make sure that I understand why you don't want to do > that. The reason to not do that is that you'd have to hold more locks, and do more management of those locks, in order to protect the more-thoroughly-analyzed statements from parsing to execution. In the case of utility statements, particularly ones that affect a lot of objects, I think that would be a seriously bad idea. In fact it would fly in the face of lessons we learned the hard way. The whole of parser/parse_utilcmd.c is code that we used to execute "at parse time", and had to rearrange to postpone to execution time, in order to avoid nasty bugs. It's still in backend/parser because it fits well there and uses a lot of parser infrastructure that's shared with parse-time analysis of DML statements. But neither of those statements hold for the ObjectAddress resolution code. regards, tom lane
On Tue, Aug 17, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Aug 17, 2010 at 2:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I don't insist that the separation has to be crisp. I'm merely saying >>> that putting a large chunk of useful-only-at-execution-time code into >>> backend/parser is the Wrong Thing. > >> OK, but there should be a reason for that. For example, if there are >> circumstances when we parse a statement, and then time passes, and >> then we execute it later, that's a good argument for what you're >> saying here. > > Yeah, and that's exactly what happens with utility statements that (for > example) get into the plan cache. OK. I'll have to look through that code some time. Here's v3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Attachment
Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010: > Here's v3. The header comment in objectaddress.c contains a funny mistake: it says it works with ObjectAddresses. However, ObjectAddresses is a different type altogether, so I recommend not using that as plural for ObjectAddress. Maybe "ObjectAddress objects"? :-D -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010: >> Here's v3. > The header comment in objectaddress.c contains a funny mistake: it says > it works with ObjectAddresses. However, ObjectAddresses is a different > type altogether, so I recommend not using that as plural for > ObjectAddress. Maybe "ObjectAddress objects"? :-D Alternatively, maybe ObjectAddresses was a bad choice of type name, and it should be ObjectAddressList or ObjectAddressArray or some such. But changing that might be more trouble than it's worth. regards, tom lane
On Thu, Aug 19, 2010 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010: >>> Here's v3. > >> The header comment in objectaddress.c contains a funny mistake: it says >> it works with ObjectAddresses. However, ObjectAddresses is a different >> type altogether, so I recommend not using that as plural for >> ObjectAddress. Maybe "ObjectAddress objects"? :-D > > Alternatively, maybe ObjectAddresses was a bad choice of type name, > and it should be ObjectAddressList or ObjectAddressArray or some such. > But changing that might be more trouble than it's worth. Yeah, I think it was a bad choice of type name. If I were otherwise touching that code, I'd probably advocate for changing it, but since I'm not, I'm inclined to just reword the comment. It might be something to keep in mind if we ever overhaul that part of the system, though, since at that point anything that must be back-patched will have merge conflicts anyway. Any other kibitzing before I commit this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > Any other kibitzing before I commit this? Sure ... + * If the object is a relation or a child object of a relation (e.g. an + * attribute or contraint, *relp will set to point to that relation). This Parenthesis in the wrong place here, grammar and spelling not much better. Also, I still feel that this comment could do better about explaining the behavior, particularly with respect to locking. Perhaps say + * If the target object is a relation or a child object of a relation + * (e.g. an attribute or constraint), the relation is also opened, and *relp + * receives the open relcache entry pointer; otherwise *relp is set to NULL. + * This is a bit grotty but it makes life simpler, since the caller will + * typically need the relcache entry too. Caller must close the relcache + * entry when done with it. The relation is locked with the specified + * lockmode if the target object is the relation itself or an attribute, + * but for other child objects, only AccessShareLock is acquired on the + * relation. + ScanKeyInit(&skey[0], ObjectIdAttributeNumber, BTEqualStrategyNumber, + F_OIDEQ, ObjectIdGetDatum(address.objectId)); There's a standard convention for the layout of ScanKeyInit calls, and this isn't it. Trivial, I know, but it's better to make similar code look similar. There's no longer any need for a diff in src/backend/parser/Makefile. + #define OBJECTADDRESS_H + + #include "nodes/parsenodes.h" + #include "nodes/pg_list.h" + #include "storage/lock.h" + #include "utils/rel.h" You shouldn't need pg_list.h here, as parsenodes.h surely includes it. regards, tom lane
On Thu, Aug 19, 2010 at 3:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Any other kibitzing before I commit this? > > Sure ... > > [kibitzing] v4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Attachment
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Aug 19, 2010 at 3:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Any other kibitzing before I commit this? >> >> Sure ... >> >> [kibitzing] > v4. For my money, you could just have said "Applied with suggested changes". They were pretty darn trivial. regards, tom lane
On Fri, Aug 27, 2010 at 12:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Thu, Aug 19, 2010 at 3:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> Any other kibitzing before I commit this? >>> >>> Sure ... >>> >>> [kibitzing] > >> v4. > > For my money, you could just have said "Applied with suggested changes". > They were pretty darn trivial. Boy, tough to please. Complaints if you commit it too soon, complaints if you don't commit it soon enough. Anyway, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On fre, 2010-08-27 at 07:49 -0400, Robert Haas wrote: > Anyway, committed. I suppose this is responsible for this: gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -Werror -Wno-inline -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o objectaddress.o objectaddress.c -MMD -MP -MF .deps/objectaddress.Po cc1: warnings being treated as errors objectaddress.c: In function ‘get_object_address’: objectaddress.c:102: error: ‘address.objectId’ may be used uninitialized in this function objectaddress.c:102: error: ‘address.classId’ may be used uninitialized in this function make[1]: *** [objectaddress.o] Error 1
On Fri, Aug 27, 2010 at 4:16 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On fre, 2010-08-27 at 07:49 -0400, Robert Haas wrote: >> Anyway, committed. > > I suppose this is responsible for this: > > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing > -fwrapv -g -Werror -Wno-inline -I../../../src/include -D_GNU_SOURCE > -I/usr/include/libxml2 -c -o objectaddress.o objectaddress.c -MMD -MP > -MF .deps/objectaddress.Po > cc1: warnings being treated as errors > objectaddress.c: In function ‘get_object_address’: > objectaddress.c:102: error: ‘address.objectId’ may be used uninitialized > in this function > objectaddress.c:102: error: ‘address.classId’ may be used uninitialized > in this function > make[1]: *** [objectaddress.o] Error 1 I suppose this is unhappy because it things elog(ERROR) might return? For whatever reason, I don't get this on my machine. But I'll try plugging that hole and perhaps that will fix it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> wrote: > I suppose this is unhappy because it things elog(ERROR) might > return? It looks more like this code uses it without initialization: case OBJECT_INDEX: case OBJECT_SEQUENCE: caseOBJECT_TABLE: case OBJECT_VIEW: relation = get_relation_by_qualified_name(objtype, objname, lockmode); address.classId = RelationRelationId; address.objectId = RelationGetRelid(relation); address.objectSubId = 0; break; -Kevin
On Fri, Aug 27, 2010 at 5:35 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: >> I suppose this is unhappy because it things elog(ERROR) might >> return? > > It looks more like this code uses it without initialization: > > case OBJECT_INDEX: > case OBJECT_SEQUENCE: > case OBJECT_TABLE: > case OBJECT_VIEW: > relation = > get_relation_by_qualified_name(objtype, objname, > lockmode); > address.classId = RelationRelationId; > address.objectId = RelationGetRelid(relation); > address.objectSubId = 0; > break; OK, I must be missing something. Go through that a little slower? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> wrote: > OK, I must be missing something. Go through that a little slower? No, my mistake. Please ignore. I am getting the same warnings, but I think your take on the cause must be right. -Kevin
Robert Haas <robertmhaas@gmail.com> wrote: > I suppose this is unhappy because it things elog(ERROR) might > return? If I set these fields right in front of the elog(ERROR) the warnings go away for me. (Hopefully this observation will make up, to some extent, for my earlier brain fart.) -Kevin
On Fri, Aug 27, 2010 at 5:57 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: >> I suppose this is unhappy because it things elog(ERROR) might >> return? > > If I set these fields right in front of the elog(ERROR) the warnings > go away for me. (Hopefully this observation will make up, to some > extent, for my earlier brain fart.) I set them right after the ERROR so that those statements needn't actually be executed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> wrote: > I set them right after the ERROR so that those statements needn't > actually be executed. Good thinking. I checked and that also eliminates the warnings for me. (No surprise there, of course.) -Kevin
Excerpts from Robert Haas's message of vie ago 27 18:07:55 -0400 2010: > On Fri, Aug 27, 2010 at 5:57 PM, Kevin Grittner > <Kevin.Grittner@wicourts.gov> wrote: > > Robert Haas <robertmhaas@gmail.com> wrote: > >> I suppose this is unhappy because it things elog(ERROR) might > >> return? > > > > If I set these fields right in front of the elog(ERROR) the warnings > > go away for me. (Hopefully this observation will make up, to some > > extent, for my earlier brain fart.) > > I set them right after the ERROR so that those statements needn't > actually be executed. Didn't we inject some smarts so that the compiler would notice that elog(ERROR) doesn't return? Why isn't it working here? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Didn't we inject some smarts so that the compiler would notice that > elog(ERROR) doesn't return? No. If you know a portable (as in "works on every compiler") way to do that, we could talk. If only some compilers understand it, we'll probably end up worse off --- the ones that don't understand it will still need things like these unreachable assignments, while the ones that do understand will start warning about unreachable code. regards, tom lane
On Fri, Aug 27, 2010 at 9:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Didn't we inject some smarts so that the compiler would notice that >> elog(ERROR) doesn't return? > > No. If you know a portable (as in "works on every compiler") way > to do that, we could talk. If only some compilers understand it, > we'll probably end up worse off --- the ones that don't understand it > will still need things like these unreachable assignments, while the > ones that do understand will start warning about unreachable code. What's a bit odd about this is that I do get warnings for this kind of thing in general, which get turned into errors since I compile with -Werror; and in fact the version of the patch that I committed has no fewer than four places where there is a comment that says "placate compiler". But for some reason the compiler I used to develop this patch (gcc-4.2.1 i686-apple-darwin10) did not complain about this case, for reasons that are not quite clear to me. Which I guess also goes to your point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > What's a bit odd about this is that I do get warnings for this kind of > thing in general, which get turned into errors since I compile with > -Werror; and in fact the version of the patch that I committed has no > fewer than four places where there is a comment that says "placate > compiler". But for some reason the compiler I used to develop this > patch (gcc-4.2.1 i686-apple-darwin10) did not complain about this > case, for reasons that are not quite clear to me. gcc has been able to detect possibly-uninitialized scalar variables for many years, but only fairly-recent versions seem to apply the same type of logic to fields of local structs. I've also noticed that sometimes it can only spot the potential problem after inlining a function that sets the local variable, and so a more recent version and/or a more aggressive -O setting can also affect whether you get a warning. In short: this warning is a lot more context sensitive than you might guess. regards, tom lane
On Fri, Aug 27, 2010 at 9:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> What's a bit odd about this is that I do get warnings for this kind of >> thing in general, which get turned into errors since I compile with >> -Werror; and in fact the version of the patch that I committed has no >> fewer than four places where there is a comment that says "placate >> compiler". But for some reason the compiler I used to develop this >> patch (gcc-4.2.1 i686-apple-darwin10) did not complain about this >> case, for reasons that are not quite clear to me. > > gcc has been able to detect possibly-uninitialized scalar variables > for many years, but only fairly-recent versions seem to apply the > same type of logic to fields of local structs. I thought it might be something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Fri, Aug 27, 2010 at 09:35:55PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Didn't we inject some smarts so that the compiler would notice that > > elog(ERROR) doesn't return? > > No. If you know a portable (as in "works on every compiler") way > to do that, we could talk. If only some compilers understand it, > we'll probably end up worse off --- the ones that don't understand it > will still need things like these unreachable assignments, while the > ones that do understand will start warning about unreachable code. We've been here before: http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg72113.html The problem appears to be mainly avoiding double evaluation, as otherwise it's just some macro tweaking. On gcc you can avoid the double evaluation also, but it's other compilers which we also need to support. If we really wanted to we could arrange for GCC to throw an error if the first argument to elog was non-constant, which would prevent bugs creeping in due to double evaluation. That still won't help users of other compilers though. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle