Thread: Inconsistency in libpq connection parameters, and extension thereof
Hello list, I have been playing with the URI connection strings in the bleeding edge 9.2 and noticed an inconsistency with the old connection string behavior: $ psql 'host=/var/run/postgresql dbname=postgres arbitrary=property' psql: invalid connection option "arbitrary" (psql exits with an error code immediately) vs. $ psql postgres://%2Fvar%2Frun%2Fpostgresql/?arbitrary=property WARNING: ignoring unrecognized URI query parameter: arbitrary (subsequent successful connection) Both of these appear to be checked entirely client-side in libpq. I view both of these approaches as problematic, because what I really want for a more ambitious set of projects is to pass arbitrary properties in the startup packet from libpq, as so that FEBE proxies and potentially Postgres backend hooks/extensions can have a chance to process the extra startup properties. The clear downside of deferring some error messages until post-connection is that typos would be harder to catch, so it seems that warning should be retained for un-processed properties, but instead be issued by the server. That might also make forward-compatibility of libpq more complete, although I know that is not an express goal (consider cases like when application_name was added as a valid connection parameter, though). This could also be useful for Postgres extensions. Please send objections on the basis of principle. If objections on the basis of principle are not heard or are addressable, I'll follow-up with a summary of places where we might need changes and ask for objections on mechanism next. -- fdr
Daniel Farina <daniel@heroku.com> writes: > I have been playing with the URI connection strings in the bleeding > edge 9.2 and noticed an inconsistency with the old connection string > behavior: > $ psql 'host=/var/run/postgresql dbname=postgres arbitrary=property' > psql: invalid connection option "arbitrary" > vs. > $ psql postgres://%2Fvar%2Frun%2Fpostgresql/?arbitrary=property > WARNING: ignoring unrecognized URI query parameter: arbitrary Um. We oughta fix that. I'm not necessarily wedded to the old throw-an-error definition, but there seems no good reason for these two syntaxes to act inconsistently. > I view both of these approaches as problematic, because what I really > want for a more ambitious set of projects is to pass arbitrary > properties in the startup packet from libpq, as so that FEBE proxies > and potentially Postgres backend hooks/extensions can have a chance to > process the extra startup properties. The connection string does not really seem like the right place for that. > The clear downside of deferring some error messages until > post-connection is that typos would be harder to catch, Or, indeed, impossible to catch. What if the typo causes libpq to fail to contact the server at all? Good luck getting any assistance from libpq in spotting your typo, if that happens and we've adopted this sort of approach. We already have a mechanism (PGOPTIONS, aka options=foo) for passing through settings that will not be interpreted by libpq. I think that stuff that is meant to be handled at the server end should be confined to that. regards, tom lane
On Tue, Jun 5, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Daniel Farina <daniel@heroku.com> writes: >> I have been playing with the URI connection strings in the bleeding >> edge 9.2 and noticed an inconsistency with the old connection string >> behavior: > >> $ psql 'host=/var/run/postgresql dbname=postgres arbitrary=property' >> psql: invalid connection option "arbitrary" > >> vs. > >> $ psql postgres://%2Fvar%2Frun%2Fpostgresql/?arbitrary=property >> WARNING: ignoring unrecognized URI query parameter: arbitrary > > Um. We oughta fix that. I'm not necessarily wedded to the old > throw-an-error definition, but there seems no good reason for these > two syntaxes to act inconsistently. I agree with that. The URIs may have been done this way as a concession to some small fragmentation that may have taken place before URIs were standardized, but perhaps the author can speak to that (he has been put on the To: list for this mail). >> The clear downside of deferring some error messages until >> post-connection is that typos would be harder to catch, > > Or, indeed, impossible to catch. What if the typo causes libpq to fail > to contact the server at all? Good luck getting any assistance from > libpq in spotting your typo, if that happens and we've adopted this > sort of approach. > > We already have a mechanism (PGOPTIONS, aka options=foo) for passing > through settings that will not be interpreted by libpq. I think that > stuff that is meant to be handled at the server end should be confined > to that. Interesting. Okay, I did not know about that, and didn't want to go so far as to suggest a new thing to handle such options, but since there is an older thing, that's handy. However, it seems like as-is it may be too ugly to have is nested in an additional layer of quoting when not being specified via environment variable. Does that seem like a generally agreeable statement? If that is the case, is there a convention we can use to separate the parts of the connection string (in both representations) into the parts sent to the server and the part that the client needs? We already abuse this a little bit because URI syntax (in general, not just our rendition of it) leaves little room for extension for parameters on the client side. Consider ?sslmode=require. In both representations, the net effect of a typo would be that instead of magically reading some properties on the client side, they'd be sent to the server. How often is this going to be so wrong that one cannot send a response from the server indicating to the user their error? On casual inspection it doesn't seem like prohibitively often, but I haven't mulled over that for very long. On that topic, the description of it in the libpq source is a little bit mystifying, as I did see it but passed over it: {"options", "PGOPTIONS", DefaultOption, NULL,"Backend-Debug-Options", "D", 40}, If they're just options, what does this have to do with debugging? I presumed some more specific purpose. To end my email with a request for a pony, I think it'd be reasonable to send these kinds of options again even while remaining connected. One might be able appropriate "SET" statements to that purpose, but I am not sure. This is mostly to the purpose of routing software, but I could imagine other uses... -- fdr
Daniel Farina <daniel@heroku.com> writes: > On Tue, Jun 5, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> We already have a mechanism (PGOPTIONS, aka options=foo) for passing >> through settings that will not be interpreted by libpq. �I think that >> stuff that is meant to be handled at the server end should be confined >> to that. > Interesting. Okay, I did not know about that, and didn't want to go so > far as to suggest a new thing to handle such options, but since there > is an older thing, that's handy. However, it seems like as-is it may > be too ugly to have is nested in an additional layer of quoting when > not being specified via environment variable. Does that seem like a > generally agreeable statement? Yeah, I was wondering the same thing. The current syntax for options certainly carries plenty of historical baggage. The main point here IMO is that libpq should have some way of telling parameters-for-the- server from things that are meant to be its own parameters. > If that is the case, is there a convention we can use to separate the > parts of the connection string (in both representations) into the > parts sent to the server and the part that the client needs? Rather than imagining this as two parts of the connection string, I think it would be nicer if we could have a convention that modifies the names of server-side parameters. In the connection string syntax we could perhaps do something like 'host=localhost dbname=foo x:param=value1 x:anotherparam=value2' This preserves the benefits of order-independence of the items, and seems at least a little bit more able to recognize typos than your original concept. I don't know if the specific notation "x:" would translate well into URI notation, but if not that, maybe there's something else that would. > In both representations, the net effect of a typo would be that > instead of magically reading some properties on the client side, > they'd be sent to the server. How often is this going to be so wrong > that one cannot send a response from the server indicating to the user > their error? If you misspell "host" for instance, you've got a problem. More generally, if you misspell "user" or "dbname" you're likely to get an authentication-related error that would not be at all helpful at leading your mind to the problem. (And it's not like these things are so consistently named that nobody ever gets 'em wrong...) I would not expect the server to start examining optional parameters until the connection has been successfully authenticated, so any typo in the basic connection properties is going to lead to unhelpful behavior if it causes us to think the item is an optional parameter rather than a misspelled one. regards, tom lane
On Tue, Jun 5, 2012 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The main point here > IMO is that libpq should have some way of telling parameters-for-the- > server from things that are meant to be its own parameters. I agree with this. >> If that is the case, is there a convention we can use to separate the >> parts of the connection string (in both representations) into the >> parts sent to the server and the part that the client needs? > > Rather than imagining this as two parts of the connection string, > I think it would be nicer if we could have a convention that modifies > the names of server-side parameters. In the connection string syntax > we could perhaps do something like > > 'host=localhost dbname=foo x:param=value1 x:anotherparam=value2' > > This preserves the benefits of order-independence of the items, and > seems at least a little bit more able to recognize typos than your > original concept. I don't know if the specific notation "x:" would > translate well into URI notation, but if not that, maybe there's > something else that would. How about "x-param=foo&x-anotherparam=bar"? I think this could work in the connection syntax as well: "host=localhost x-param=value1". It doesn't scan as visibly as ":", but probably will cause less escaping pain across the board. It also looks a lot like a lower-cased HTTP header or email-header, so people might make the right association right away. >> In both representations, the net effect of a typo would be that >> instead of magically reading some properties on the client side, >> they'd be sent to the server. How often is this going to be so wrong >> that one cannot send a response from the server indicating to the user >> their error? > > If you misspell "host" for instance, you've got a problem. More > generally, if you misspell "user" or "dbname" you're likely to get > an authentication-related error that would not be at all helpful at > leading your mind to the problem. (And it's not like these things > are so consistently named that nobody ever gets 'em wrong...) Well, I wonder if there's a way to mitigate that, but I think as long as we're satisfied thus far with a prefix there's no need to talk about this particular thorn, as it can be considered sidestepped. If we go so far as to namespace with something like "x:" or "x-" and to have such a rigorous concept of extension in protocol related matters, we're in a good place to have a nice cohesive expansion (at some later time) into more "x-..." headers exchanged between host in client mid-communication, beyond just startup. At some later date. Would these hypothetical extension-pairs be using the "options" device at startup time, or something else (possibly brand new)? I'm not ideally informed as to contemplate on if reuse of an existing thing makes sense. -- fdr
Daniel Farina <daniel@heroku.com> writes: > Would these hypothetical extension-pairs be using the "options" device > at startup time, or something else (possibly brand new)? I'd argue for just translating them into "options", at least in the near term. If they use some new mechanism then they would only work with new servers, and it's generally not good for libpq to assume much about what version of server it's working with, especially not when sending an initial connection packet (when, by definition, it can't know that for sure). As far as changing such settings later in the session is concerned, isn't that what SET is for? regards, tom lane
On Tue, Jun 5, 2012 at 9:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Daniel Farina <daniel@heroku.com> writes: >> Would these hypothetical extension-pairs be using the "options" device >> at startup time, or something else (possibly brand new)? > > I'd argue for just translating them into "options", at least in the > near term. If they use some new mechanism then they would only work > with new servers, and it's generally not good for libpq to assume much > about what version of server it's working with, especially not when > sending an initial connection packet (when, by definition, it can't > know that for sure). > > As far as changing such settings later in the session is concerned, > isn't that what SET is for? Yeah; I mentioned that upthread. The only semantic difference I can think of -- and this is a persnickety detail -- is that startup-time options are evaluated all together. This is not unlike email or HTTP headers as well. The result is that if one expects to get several extension headers in one shot that life is made more difficult. This can make validation more tricky, but perhaps it could be fixed with a multi-set command, should one not already exist. Another issue is that it becomes more painful for proxies to efficiently and easily pick out session value adjustments, as is useful when composing them, so part of me would prefer to have a protocol-level-only construct, but yet another part of me would really like to just type "SET" into psql and try stuff. I guess it could be a backslashified command, though...I wasn't looking to design this at this time, but as long as you are willing to think and write about it, I'm happy for the opportunity. In the interest of unblocking a bit of exploratory work while discussing this, perhaps we think a convention around "x-" or whatnot and packing things into the PGOPTIONS makes sense? -- fdr
On Wed, Jun 6, 2012 at 4:38 AM, Daniel Farina <daniel@heroku.com> wrote: > On Tue, Jun 5, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Daniel Farina <daniel@heroku.com> writes: > If that is the case, is there a convention we can use to separate the > parts of the connection string (in both representations) into the > parts sent to the server and the part that the client needs? We > already abuse this a little bit because URI syntax (in general, not > just our rendition of it) leaves little room for extension for > parameters on the client side. Consider ?sslmode=require. > > In both representations, the net effect of a typo would be that > instead of magically reading some properties on the client side, > they'd be sent to the server. How often is this going to be so wrong > that one cannot send a response from the server indicating to the user > their error? On casual inspection it doesn't seem like prohibitively > often, but I haven't mulled over that for very long. I think that's an excellent example of this being a bad idea. If you mis-spell sslmode=require, that should absolutely result in an error on the client side. Otherwise, you might end up sending your password (or other details that are not as sensitive, but still sensitive) over an unencrypted connection. If you wait for the error from the server, it's too late. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Jun 6, 2012 at 1:09 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Wed, Jun 6, 2012 at 4:38 AM, Daniel Farina <daniel@heroku.com> wrote: >> On Tue, Jun 5, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Daniel Farina <daniel@heroku.com> writes: >> If that is the case, is there a convention we can use to separate the >> parts of the connection string (in both representations) into the >> parts sent to the server and the part that the client needs? We >> already abuse this a little bit because URI syntax (in general, not >> just our rendition of it) leaves little room for extension for >> parameters on the client side. Consider ?sslmode=require. >> >> In both representations, the net effect of a typo would be that >> instead of magically reading some properties on the client side, >> they'd be sent to the server. How often is this going to be so wrong >> that one cannot send a response from the server indicating to the user >> their error? On casual inspection it doesn't seem like prohibitively >> often, but I haven't mulled over that for very long. > > I think that's an excellent example of this being a bad idea. If you > mis-spell sslmode=require, that should absolutely result in an error > on the client side. Otherwise, you might end up sending your password > (or other details that are not as sensitive, but still sensitive) over > an unencrypted connection. If you wait for the error from the server, > it's too late. That is an excellent point. Is there enough time in the day to gripe about how sslmode=require is not the default? Well, this seems pretty obviated by the prefix-naming convention, but it's an iron clad example of how the older idea was a bad one. -- fdr
On Wed, Jun 6, 2012 at 6:58 PM, Daniel Farina <daniel@heroku.com> wrote: > On Wed, Jun 6, 2012 at 1:09 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Wed, Jun 6, 2012 at 4:38 AM, Daniel Farina <daniel@heroku.com> wrote: >>> On Tue, Jun 5, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Daniel Farina <daniel@heroku.com> writes: >>> If that is the case, is there a convention we can use to separate the >>> parts of the connection string (in both representations) into the >>> parts sent to the server and the part that the client needs? We >>> already abuse this a little bit because URI syntax (in general, not >>> just our rendition of it) leaves little room for extension for >>> parameters on the client side. Consider ?sslmode=require. >>> >>> In both representations, the net effect of a typo would be that >>> instead of magically reading some properties on the client side, >>> they'd be sent to the server. How often is this going to be so wrong >>> that one cannot send a response from the server indicating to the user >>> their error? On casual inspection it doesn't seem like prohibitively >>> often, but I haven't mulled over that for very long. >> >> I think that's an excellent example of this being a bad idea. If you >> mis-spell sslmode=require, that should absolutely result in an error >> on the client side. Otherwise, you might end up sending your password >> (or other details that are not as sensitive, but still sensitive) over >> an unencrypted connection. If you wait for the error from the server, >> it's too late. > > That is an excellent point. Is there enough time in the day to gripe > about how sslmode=require is not the default? Well, you'll get me first in line to back that the current default is stupid. But I'm not sure sslmode=require is a proper default either. Because then the connection will fail completely to the vast majority of servers, which simply don't have SSL support. > Well, this seems pretty obviated by the prefix-naming convention, but > it's an iron clad example of how the older idea was a bad one. Yeah, a prefix based solution would fix this, since we can keep throwing errors. However, not throwing errors on the URL syntax should be considered a bug, I think. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Jun 6, 2012 at 4:08 PM, Magnus Hagander <magnus@hagander.net> wrote: > However, not throwing errors on the URL syntax should be considered a > bug, I think. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jun 6, 2012 at 1:13 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jun 6, 2012 at 4:08 PM, Magnus Hagander <magnus@hagander.net> wrote: >> However, not throwing errors on the URL syntax should be considered a >> bug, I think. > > +1. +1 Here's a patch that just makes the thing an error. Of course we could revert it if it makes the URI feature otherwise unusable...but I don't see a huge and terrible blocker ATM. A major question mark for me any extra stuff in JDBC URLs. Nevertheless, here it is. -- fdr
Attachment
On Wed, Jun 6, 2012 at 6:04 PM, Daniel Farina <daniel@heroku.com> wrote: > On Wed, Jun 6, 2012 at 1:13 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Jun 6, 2012 at 4:08 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> However, not throwing errors on the URL syntax should be considered a >>> bug, I think. >> >> +1. > > +1 > > Here's a patch that just makes the thing an error. Of course we could > revert it if it makes the URI feature otherwise unusable...but I don't > see a huge and terrible blocker ATM. A major question mark for me any > extra stuff in JDBC URLs. It looks like the answer is "yes". http://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters ...but I'm inclined to think we should make this change anyway. If JDBC used libpq, then it might be nice to let JDBC parse out bits of the URL and then pass the whole thing, unmodified, through to libpq, without having libpq spit up. But it doesn't. And even if someone were inclined to try to do something of that type, the warnings we're omitting now would presumably discourage them. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jun 8, 2012 at 1:48 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jun 6, 2012 at 6:04 PM, Daniel Farina <daniel@heroku.com> wrote: >> On Wed, Jun 6, 2012 at 1:13 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Wed, Jun 6, 2012 at 4:08 PM, Magnus Hagander <magnus@hagander.net> wrote: >>>> However, not throwing errors on the URL syntax should be considered a >>>> bug, I think. >>> >>> +1. >> >> +1 >> >> Here's a patch that just makes the thing an error. Of course we could >> revert it if it makes the URI feature otherwise unusable...but I don't >> see a huge and terrible blocker ATM. A major question mark for me any >> extra stuff in JDBC URLs. > > It looks like the answer is "yes". > > http://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters > > ...but I'm inclined to think we should make this change anyway. If > JDBC used libpq, then it might be nice to let JDBC parse out bits of > the URL and then pass the whole thing, unmodified, through to libpq, > without having libpq spit up. But it doesn't. And even if someone > were inclined to try to do something of that type, the warnings we're > omitting now would presumably discourage them. > > Thoughts? I think we *have* to make the change for regular parameters, for security reasons. What we do with "prefixed parameters" can be debated... But we'll have to pass those to the server anyway for validation, so it might be an uninteresting case. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, Jun 8, 2012 at 7:53 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Jun 8, 2012 at 1:48 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Jun 6, 2012 at 6:04 PM, Daniel Farina <daniel@heroku.com> wrote: >>> On Wed, Jun 6, 2012 at 1:13 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>>> On Wed, Jun 6, 2012 at 4:08 PM, Magnus Hagander <magnus@hagander.net> wrote: >>>>> However, not throwing errors on the URL syntax should be considered a >>>>> bug, I think. >>>> >>>> +1. >>> >>> +1 >>> >>> Here's a patch that just makes the thing an error. Of course we could >>> revert it if it makes the URI feature otherwise unusable...but I don't >>> see a huge and terrible blocker ATM. A major question mark for me any >>> extra stuff in JDBC URLs. >> >> It looks like the answer is "yes". >> >> http://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters >> >> ...but I'm inclined to think we should make this change anyway. If >> JDBC used libpq, then it might be nice to let JDBC parse out bits of >> the URL and then pass the whole thing, unmodified, through to libpq, >> without having libpq spit up. But it doesn't. And even if someone >> were inclined to try to do something of that type, the warnings we're >> omitting now would presumably discourage them. >> >> Thoughts? > > I think we *have* to make the change for regular parameters, for > security reasons. > > What we do with "prefixed parameters" can be debated... But we'll have > to pass those to the server anyway for validation, so it might be an > uninteresting case. OK, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Daniel Farina <daniel@heroku.com> writes: > On Tue, Jun 5, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Um. We oughta fix that. I'm not necessarily wedded to the old >> throw-an-error definition, but there seems no good reason for these >> two syntaxes to act inconsistently. > > I agree with that. The URIs may have been done this way as a > concession to some small fragmentation that may have taken place > before URIs were standardized, but perhaps the author can speak to > that (he has been put on the To: list for this mail). Sorry for the silence. The original intent was to not error out on any extra parameters from JDBC or other existing URI implementations. The example of a possible typo in sslmode=require clearly demonstrates that this was not a well-thought decision. Anyway, I can see you've already sorted this out. -- Alex