Thread: Notes on implementing URI syntax for libpq
Hello, It was proposed a while ago for libpq to support URI syntax for specifying the connection information: http://archives.postgresql.org/message-id/1302114698.23164.17.camel@jd-desktop http://archives.postgresql.org/pgsql-hackers/2011-07/msg01144.php It appears to me that the consensus was that: 1) this feature is indeed going to be useful, and 2) that we would go by implementinga simple URI parser ourselves instead of adding dependency on any fancy external library. Now we're going to actually implement this. It is known that libpq (and, thus every utility using it to connect a database: psql, pg_dump, etc.) supports a way to specifysome of the connection parameters (or all of them) via a single conninfo string, e.g: psql -d "dbname=mydb host=example.net port=5433" This, in my opinion, is very similar to what we would like to achieve with the URI syntax, so the above could also be specifiedusing a URI parameter like this: psql -d postgresql://example.net:5433/mydb We can also support specifying extra parameters via the usual "?keyword=value&keyword2=other" syntax. As it was noted inthe original discussion, sticking to what JDBC provides makes the most sense: http://jdbc.postgresql.org/documentation/head/connect.html So we should support 'user', 'password' and 'ssl' parameters (and probably just ignore the rest, at least for start.) Upon libpq code inspection I come to think that the best place to plug this seems to be conninfo_array_parse function (wheredbname keyword is currently being checked for '=' symbol and expanded): http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/interfaces/libpq/fe-connect.c;h=ed9dce941e1d57cce51f2c21bf29769dfe2ee542;hb=HEAD#l4262 We could similarly check for "postgresql:" designator and if present, extract the connection options from the dbname keywordthought to be a connection URI. The check should obviously go before the current check for '=', if we're going supportthe extra parameters, as outlined above. I am going to sketch a work-in-progress patch in the background of a discussion here. Your thoughts on this are very welcome! -- Alex
> It was proposed a while ago for libpq to support URI syntax for specifying the connection information: > ... > Now we're going to actually implement this. Do you know that we had this feature (more or less) in libpq for years but it was removed quite a while ago. It should still be there in the archive, not sure though if the old code fits the requirements for this feature completely. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
* Alexander Shulgin: > This, in my opinion, is very similar to what we would like to achieve with the URI syntax, so the above could also be specifiedusing a URI parameter like this: > > psql -d postgresql://example.net:5433/mydb How would you specifiy a local port/UNIX domain socket? Would it be possible to add something like psql -d postgresql+ssh://fweimer@db5/var/run/postgresql/.s.PGSQL.5432 similar to what Subversion supports? (This might have security implications when used from untrusted PHP scripts.) -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Excerpts from Florian Weimer's message of Wed Nov 23 13:04:47 +0200 2011: > > * Alexander Shulgin: > > > This, in my opinion, is very similar to what we would like to achieve with the URI syntax, so the above could also bespecified using a URI parameter like this: > > > > psql -d postgresql://example.net:5433/mydb > > How would you specifiy a local port/UNIX domain socket? > > Would it be possible to add something like > > psql -d postgresql+ssh://fweimer@db5/var/run/postgresql/.s.PGSQL.5432 > > similar to what Subversion supports? (This might have security > implications when used from untrusted PHP scripts.) While it is really tempting to provide support for all that fancy stuff (or at least support "user:password@host" part insteadof the ugly "?user=&password=") this will make psql URIs backward-incompatible with the JDBC syntax, which is exactlywhat we want to avoid. The primary reason people even considering adding the syntax, IMO is compatibility and thus, it has to be compatible in bothdirections. If we support something that's more than JDBC provides, we're just adding to the soup of incompatible URIsyntaxes out there. -- Alex
Excerpts from Florian Weimer's message of Wed Nov 23 13:04:47 +0200 2011: > > * Alexander Shulgin: > > > This, in my opinion, is very similar to what we would like to achieve with the URI syntax, so the above could also bespecified using a URI parameter like this: > > > > psql -d postgresql://example.net:5433/mydb > > How would you specifiy a local port/UNIX domain socket? Missed that in my previous reply. If host part of the URI points to localhost, the UNIX domain socket would be considered by libpq just as if you would pass"-h localhost -p 5433".
Hey Alexander,
// Dmitriy.
2011/11/24 Alexander Shulgin <ash@commandprompt.com>
--
Excerpts from Florian Weimer's message of Wed Nov 23 13:04:47 +0200 2011:
>> * Alexander Shulgin:Missed that in my previous reply.
>
> > This, in my opinion, is very similar to what we would like to achieve with the URI syntax, so the above could also be specified using a URI parameter like this:
> >
> > psql -d postgresql://example.net:5433/mydb
>
> How would you specifiy a local port/UNIX domain socket?
If host part of the URI points to localhost, the UNIX domain socket would be considered by libpq just as if you would pass "-h localhost -p 5433".
But what if the user wants to connect exactly via socket or
TCP/IP ?
And what if the user needs to specify a socket file name extension?
TCP/IP ?
And what if the user needs to specify a socket file name extension?
// Dmitriy.
Excerpts from Dmitriy Igrishin's message of Thu Nov 24 09:19:02 +0200 2011: > > > If host part of the URI points to localhost, the UNIX domain socket would > > be considered by libpq just as if you would pass "-h localhost -p 5433". > > > But what if the user wants to connect exactly via socket or > TCP/IP ? > And what if the user needs to specify a socket file name extension? How do you achieve that with the current psql set of command line options (and, possibly environment variables?) I would think the same method will work with URI, as with the proposed approach the URI is just decomposed into host, portand dbname parts and the rest of the code works like if you've had specified "-h example.net -p 5433 -d mydb" insteadof the URI parameter. -- Alex
On Thu, Nov 24, 2011 at 08:59:56AM +0200, Alexander Shulgin wrote: > > How would you specifiy a local port/UNIX domain socket? > > Missed that in my previous reply. > > If host part of the URI points to localhost, the UNIX domain socket would be considered by libpq just as if you would pass"-h localhost -p 5433". Uh, no it doesn't. "-h localhost" uses TCP/IP (try it). This is one piece of mysql magic we don't copy. If you want to use the socket you need to specify "-h /tmp" or wherever you keep it. Leaving out the -h parameter also uses UNIX domain sockets. Which does raise the valid question of how to represent that in URI syntax. SQLAlchemy (for example) doesn't try with it's URL syntax, to connect to a non-default UNIX socket, you need to create the URL object directly. How about the "service" option, that's a nice way of handling non-default socket options. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
2011/11/24 Alexander Shulgin <ash@commandprompt.com>
Excerpts from Dmitriy Igrishin's message of Thu Nov 24 09:19:02 +0200 2011:>How do you achieve that with the current psql set of command line options (and, possibly environment variables?)
> > If host part of the URI points to localhost, the UNIX domain socket would
> > be considered by libpq just as if you would pass "-h localhost -p 5433".
> >
> But what if the user wants to connect exactly via socket or
> TCP/IP ?
> And what if the user needs to specify a socket file name extension?
For psql(1) see -h option and -p option
http://www.postgresql.org/docs/9.1/static/app-psql.html
For the libpq see host option and port option of PQconnectdbparams()
http://www.postgresql.org/docs/9.1/static/libpq-connect.html
In both cases:
If the value of host begins with a slash, it is used as the directory for the Unix-domain socket.
Port specifies the TCP port or the local Unix-domain socket file extension.
http://www.postgresql.org/docs/9.1/static/app-psql.html
For the libpq see host option and port option of PQconnectdbparams()
http://www.postgresql.org/docs/9.1/static/libpq-connect.html
In both cases:
If the value of host begins with a slash, it is used as the directory for the Unix-domain socket.
Port specifies the TCP port or the local Unix-domain socket file extension.
I would think the same method will work with URI, as with the proposed approach the URI is just decomposed into host, port and dbname parts and the rest of the code works like if you've had specified "-h example.net -p 5433 -d mydb" instead of the URI parameter.
Thats great, but see above.
--
// Dmitriy.
On Nov 24, 2011, at 9:40 AM, Martijn van Oosterhout wrote: > On Thu, Nov 24, 2011 at 08:59:56AM +0200, Alexander Shulgin wrote: >>> How would you specifiy a local port/UNIX domain socket? >> >> Missed that in my previous reply. >> >> If host part of the URI points to localhost, the UNIX domain socket would be considered by libpq just as if you wouldpass "-h localhost -p 5433". > > Uh, no it doesn't. "-h localhost" uses TCP/IP (try it). This is one > piece of mysql magic we don't copy. If you want to use the socket you > need to specify "-h /tmp" or wherever you keep it. Leaving out the -h > parameter also uses UNIX domain sockets. > > Which does raise the valid question of how to represent that in URI > syntax. SQLAlchemy (for example) doesn't try with it's URL syntax, to > connect to a non-default UNIX socket, you need to create the URL object > directly. > > How about the "service" option, that's a nice way of handling > non-default socket options. Another idea is to use local:/dir/name for UNIX domain socket instead of hostname:port, like it's displayed in the psql prompt. -- Alexey Klyukin http://www.commandprompt.com The PostgreSQL Company – Command Prompt, Inc.
Excerpts from Martijn van Oosterhout's message of Thu Nov 24 09:40:42 +0200 2011: > On Thu, Nov 24, 2011 at 08:59:56AM +0200, Alexander Shulgin wrote: > > > How would you specifiy a local port/UNIX domain socket? > > > > Missed that in my previous reply. > > > > If host part of the URI points to localhost, the UNIX domain socket would be considered by libpq just as if you wouldpass "-h localhost -p 5433". > > Uh, no it doesn't. "-h localhost" uses TCP/IP (try it). This is one > piece of mysql magic we don't copy. If you want to use the socket you > need to specify "-h /tmp" or wherever you keep it. Leaving out the -h > parameter also uses UNIX domain sockets. Oh, you're right -- I was under wrong impression (hacking in the wrong local install, you know.) > Which does raise the valid question of how to represent that in URI > syntax. SQLAlchemy (for example) doesn't try with it's URL syntax, to > connect to a non-default UNIX socket, you need to create the URL object > directly. Well, whatever syntax we're going to invent here: it is not supported by the JDBC driver. "Because Java does not support using unix sockets the PostgreSQL™ server must be configured to allow TCP/IP connections." http://jdbc.postgresql.org/documentation/head/prepare.html Or, this has to be done not in the URI syntax itself, but with the use of some external option. Or maybe we can just add &unixsocket=... and hope that JDBC simply ignores that? I think I will try the last option to seeif that's the case. (Looking at libpq code, I think we will also need to verify that host/hostaddr parameter is pointingto the local host and reset it to NULL, to actually make libpq consider UNIX sockets.) > How about the "service" option, that's a nice way of handling > non-default socket options. The service handling isn't going to be affected with the proposed approach. So, if PGSERVICE is given, the options fromthe service file are applied after the URI is parsed, filling any parameters not set using previous methods. -- Alex
On Nov 24, 2011, at 1:57 AM, Alexander Shulgin <ash@commandprompt.com> wrote: > While it is really tempting to provide support for all that fancy stuff (or at least support "user:password@host" partinstead of the ugly "?user=&password=") this will make psql URIs backward-incompatible with the JDBC syntax, which isexactly what we want to avoid. I think it would be really weird not to support user:pw@host:port. You can presumably also support the JDBC style for backwardcompatibility, but I don't think we should adopt that syntax as project standard. ...Robert
Excerpts from Robert Haas's message of Thu Nov 24 13:57:17 +0200 2011: > > I think it would be really weird not to support user:pw@host:port. You can presumably also support the JDBC style forbackward compatibility, but I don't think we should adopt that syntax as project standard. Well, I don't believe JDBC syntax is ideal either, but I don't recall any better option proposed in the original discussion:http://archives.postgresql.org/pgsql-hackers/2011-03/msg01945.php Do you suggest that we should reconsider? -- Alex
Excerpts from Alexander Shulgin's message of jue nov 24 05:58:57 -0300 2011: > Excerpts from Martijn van Oosterhout's message of Thu Nov 24 09:40:42 +0200 2011: > > Which does raise the valid question of how to represent that in URI > > syntax. SQLAlchemy (for example) doesn't try with it's URL syntax, to > > connect to a non-default UNIX socket, you need to create the URL object > > directly. > > Well, whatever syntax we're going to invent here: it is not supported by the JDBC driver. > > "Because Java does not support using unix sockets the PostgreSQL™ server must be configured to allow TCP/IP connections." > > http://jdbc.postgresql.org/documentation/head/prepare.html > > Or, this has to be done not in the URI syntax itself, but with the use of some external option. > > Or maybe we can just add &unixsocket=... and hope that JDBC simply ignores that? I think this is misguided. We don't need to have a URL that specifies a Unix socket to work on JDBC, because it's obviously not going to work; if you just have it "ignore" the &unixsocket bit, then the URI is no longer the same and you could have it connecting to a completely different server. I think we should just propose something that will not work in JDBC. Surely if the user wants an URL that works both in JDBC and libpq, they should just not use a Unix-domain-socket specifying URI in the first place. What about something like postgresql://<path-to-dir>:port/database where the < > are present, i.e. if you want to specify a different socket directory, postgresql://</var/run/postgresql>:5433/database and if you just want to use the default location, postgresql://<>:5433/database A coworker also suggested using a different designator: postgresqli:///path/to/socket:5433/database postgresqli://:5433/database > > How about the "service" option, that's a nice way of handling > > non-default socket options. > > The service handling isn't going to be affected with the proposed approach. So, if PGSERVICE is given, the options fromthe service file are applied after the URI is parsed, filling any parameters not set using previous methods. I think the question is allowing the URI to specify a service. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Nov 24, 2011 at 7:33 AM, Alexander Shulgin <ash@commandprompt.com> wrote: > > Excerpts from Robert Haas's message of Thu Nov 24 13:57:17 +0200 2011: >> >> I think it would be really weird not to support user:pw@host:port. You can presumably also support the JDBC style forbackward compatibility, but I don't think we should adopt that syntax as project standard. > > Well, I don't believe JDBC syntax is ideal either, but I don't recall any better option proposed in the original discussion:http://archives.postgresql.org/pgsql-hackers/2011-03/msg01945.php > > Do you suggest that we should reconsider? I guess my feeling is that if we're going to have URLs, we ought to try to adhere to the same conventions that are used for pretty much every other service that supports URLs. user:pw@host:port is widely supported by multiple protocols, so I think we would need a very good reason to decide to go off in a completely different direction. It would be nice to be compatible with whatever JDBC does (link?) but I'm not prepared to put that ahead of general good design. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Alvaro Herrera's message of Thu Nov 24 15:21:49 +0200 2011: > > I think the question is allowing the URI to specify a service. Huh? The service definitions are read from a local pg_service.conf, and are specified by setting PGSERVICE (and PGSERVICEFILE)environment variables, no? What would you do with such URI if you need to other people to connect to the same service? Send them URI along with thepg_service.conf? Or are we talking about different things completely?
Excerpts from Robert Haas's message of Thu Nov 24 15:35:36 +0200 2011: > > > Do you suggest that we should reconsider? > > I guess my feeling is that if we're going to have URLs, we ought to > try to adhere to the same conventions that are used for pretty much > every other service that supports URLs. user:pw@host:port is widely > supported by multiple protocols, so I think we would need a very good > reason to decide to go off in a completely different direction. It > would be nice to be compatible with whatever JDBC does (link?) but I'm > not prepared to put that ahead of general good design. What JDBC supports is rather weird and far from being ideal: http://jdbc.postgresql.org/documentation/head/connect.html The problem with supporting multiple syntaxes, IMO is that it makes libpq compatible in only one direction: from particularforeign syntax to libpq, but not from libqp to any other particular foreign syntax. So when you see psql -d <URL>you wouldn't know if you can copy that URL to JDBC or any other connection interface parameter, unless you check thedocs thoroughly. -- Alex
Excerpts from Robert Haas's message of jue nov 24 10:35:36 -0300 2011: > On Thu, Nov 24, 2011 at 7:33 AM, Alexander Shulgin > <ash@commandprompt.com> wrote: > > > > Excerpts from Robert Haas's message of Thu Nov 24 13:57:17 +0200 2011: > >> > >> I think it would be really weird not to support user:pw@host:port. You can presumably also support the JDBC style forbackward compatibility, but I don't think we should adopt that syntax as project standard. > > > > Well, I don't believe JDBC syntax is ideal either, but I don't recall any better option proposed in the original discussion:http://archives.postgresql.org/pgsql-hackers/2011-03/msg01945.php > > > > Do you suggest that we should reconsider? > > I guess my feeling is that if we're going to have URLs, we ought to > try to adhere to the same conventions that are used for pretty much > every other service that supports URLs. user:pw@host:port is widely > supported by multiple protocols, so I think we would need a very good > reason to decide to go off in a completely different direction. It > would be nice to be compatible with whatever JDBC does (link?) but I'm > not prepared to put that ahead of general good design. Apparently there's no standard: http://www.petefreitag.com/articles/jdbc_urls/ Supporting the usual user:pw@host convention, _in addition to_ what our own JDBC driver already supports, seems reasonable to me. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Martijn van Oosterhout's message of jue nov 24 04:40:42 -0300 2011: > How about the "service" option, that's a nice way of handling > non-default socket options. What about it? Are you suggesting we should support some way to specify a service name in the URI? If so, consider this: if you set up a pg_service.conf file, and then pass around a URI that specifies a service, no one else can use the URI until you also pass around the service file. So, in that light, do we still think that letting the user specify a service name in the URI makes sense? (My personal opinion is yes). -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Nov 24, 2011 at 8:50 AM, Alexander Shulgin <ash@commandprompt.com> wrote: > What JDBC supports is rather weird and far from being ideal: http://jdbc.postgresql.org/documentation/head/connect.html > > The problem with supporting multiple syntaxes, IMO is that it makes libpq compatible in only one direction: from particularforeign syntax to libpq, but not from libqp to any other particular foreign syntax. So when you see psql -d <URL>you wouldn't know if you can copy that URL to JDBC or any other connection interface parameter, unless you check thedocs thoroughly. Well, based on that document, I think that trying to be bug-compatible with the JDBC syntax is a, erm, doomed effort. I mean, what are you going to do with things like loglevel or logUnclosedConnections that change the behavior of JDBC, not PostgreSQL? I think we could do something like: postgresql://user:pw@host:port/database?param1=val1¶m2=val2¶m3=val3&... ...where the param and val bits are standard libpq connection parameters. And for compatibility you could allow "user" and "password" to be specified as connection parameters rather than included in the host portion of the string. But you're still not going to be 100% compatible with JDBC, because we're not going support unknownLenghth=42 in libpq just because JDBC has chosen to implement some weirdness in that area. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Alvaro Herrera's message of jue nov 24 10:21:49 -0300 2011: > A coworker also suggested using a different designator: > > postgresqli:///path/to/socket:5433/database > postgresqli://:5433/database I forgot to mention: this "i" thing comes from LDAP. Apparently you can use "ldapi://" to specify a Unix-domain socket connection. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Robert Haas's message of Thu Nov 24 15:59:08 +0200 2011: > > Well, based on that document, I think that trying to be bug-compatible > with the JDBC syntax is a, erm, doomed effort. I mean, what are you > going to do with things like loglevel or logUnclosedConnections that > change the behavior of JDBC, not PostgreSQL? The proposition was to ignore keywords not known to libpq (see top of this thread.)
On Thu, Nov 24, 2011 at 8:54 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Martijn van Oosterhout's message of jue nov 24 04:40:42 -0300 2011: > >> How about the "service" option, that's a nice way of handling >> non-default socket options. > > What about it? Are you suggesting we should support some way to specify > a service name in the URI? > > If so, consider this: if you set up a pg_service.conf file, and then > pass around a URI that specifies a service, no one else can use the URI > until you also pass around the service file. > > So, in that light, do we still think that letting the user specify a > service name in the URI makes sense? (My personal opinion is yes). service is just a connection parameter, so if we choose a URL format that allows any connection parameter to be specified, this falls out naturally, without any additional work. And if we don't choose such a URL format, we are, in my humble opinion, crazy. e.g. if we used the format suggested in my previous email, this would just boil down to: postgresql:///?service=foo -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Alvaro Herrera: > I think we should just propose something that will not work in JDBC. I'm not sure if this is a good idea. 8-) I plan to add UNIX Domain socket support to the JDBC driver. Eventually, the JDK will expose UNIX Domain sockets to Java code, too (they are already used internally for management functions). -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Excerpts from Alexey Klyukin's message of Thu Nov 24 10:22:21 +0200 2011: > > Another idea is to use local:/dir/name for UNIX domain socket instead of hostname:port, like it's displayed in the psqlprompt. So the whole thing would look like this: postgresql://local:/dir/name/dbname?param1=val1&... Where "/dir/name" is the absolute path to the directory containing the socket file. If one wants to use the default directorythe following syntax may serve the need: postgresql://local:/dbname -- Alex
Excerpts from Florian Weimer's message of jue nov 24 11:31:29 -0300 2011: > > * Alvaro Herrera: > > > I think we should just propose something that will not work in JDBC. > > I'm not sure if this is a good idea. 8-) > > I plan to add UNIX Domain socket support to the JDBC driver. > Eventually, the JDK will expose UNIX Domain sockets to Java code, too > (they are already used internally for management functions). Well, in that case, the JDBC could simply adopt whatever syntax that libpq ends up adopting. I just meant "something that will not work in JDBC *right now*" (i.e. with no local socket support). -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Excerpts from Robert Haas's message of Thu Nov 24 16:02:38 +0200 2011: > > > So, in that light, do we still think that letting the user specify a > > service name in the URI makes sense? (My personal opinion is yes). > > service is just a connection parameter, so if we choose a URL format > that allows any connection parameter to be specified, this falls out > naturally, without any additional work. And if we don't choose such a > URL format, we are, in my humble opinion, crazy. The patch draft I have uses that format, yes: so any keyword libqp recognizes can be given in form of param=value URI queryparameter. > e.g. if we used the format suggested in my previous email, this would > just boil down to: > > postgresql:///?service=foo Oh, well, that would make sense. It also appeared to me that we should deny overriding host, port and dbname by the queryparameters to prevent confusion, e.g: postgresql://host:port/dbname?host=otherhost&port=otherport&dbname=otherdb -- Alex
Excerpts from Florian Weimer's message of Thu Nov 24 16:31:29 +0200 2011: > > I plan to add UNIX Domain socket support to the JDBC driver. > Eventually, the JDK will expose UNIX Domain sockets to Java code, too > (they are already used internally for management functions). Do you maybe plan to support "user:pw@host" syntax too? :-) Apparently, that would make people happier, also JDBC and libpq URIs will become fully compatible (eventually.) -- Alex
* Alvaro Herrera: > Excerpts from Florian Weimer's message of jue nov 24 11:31:29 -0300 2011: >> >> * Alvaro Herrera: >> >> > I think we should just propose something that will not work in JDBC. >> >> I'm not sure if this is a good idea. 8-) >> >> I plan to add UNIX Domain socket support to the JDBC driver. >> Eventually, the JDK will expose UNIX Domain sockets to Java code, too >> (they are already used internally for management functions). > > Well, in that case, the JDBC could simply adopt whatever syntax that > libpq ends up adopting. I just meant "something that will not work in > JDBC *right now*" (i.e. with no local socket support). Ah, okay, your proposal looked like something which couldn't work with JDBC *at all* because of invalid URI syntax (but admittedly, I haven't checked that yet). -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
On Thu, Nov 24, 2011 at 9:40 AM, Alexander Shulgin <ash@commandprompt.com> wrote: >> Another idea is to use local:/dir/name for UNIX domain socket instead of hostname:port, like it's displayed in the psqlprompt. > > So the whole thing would look like this: > > postgresql://local:/dir/name/dbname?param1=val1&... > > Where "/dir/name" is the absolute path to the directory containing the socket file. If one wants to use the default directorythe following syntax may serve the need: > > postgresql://local:/dbname I think this is just weird. libpq treats any hostname that starts with a slash as hostname. And there's a standard way of URL-encoding characters that would otherwise be treated as terminators: you write a percent sign followed by two hex digits. So if you want the host to be /tmp, you just should just write: postgresql://%2Ftmp/fred Which is the equivalent of the connection string: host=/tmp dbname=fred This may appear to be slightly inconvenient notation, but there is little reason to reinvent syntax that the URL gods have already devised, and in practice specifying an explicit pathname in a connection string is quite rare. One normally specifies a local socket connection by omitting to specify a hostname at all, and that can work here, too. That is, postgresql:///fred should be equivalent to the connection string: dbname=fred ...which means it will use the default socket directory on UNIX, and a loopback connection on Windows. And postgresql:/// should be equivalent to an empty connection string, defaulting everything. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Robert Haas's message of Thu Nov 24 17:02:13 +0200 2011: > > On Thu, Nov 24, 2011 at 9:40 AM, Alexander Shulgin > <ash@commandprompt.com> wrote: > >> Another idea is to use local:/dir/name for UNIX domain socket instead of hostname:port, like it's displayed in the psqlprompt. > > > > So the whole thing would look like this: > > > > postgresql://local:/dir/name/dbname?param1=val1&... > > > > Where "/dir/name" is the absolute path to the directory containing the socket file. If one wants to use the defaultdirectory the following syntax may serve the need: > > > > postgresql://local:/dbname > > I think this is just weird. libpq treats any hostname that starts > with a slash as hostname. And there's a standard way of URL-encoding > characters that would otherwise be treated as terminators: you write a > percent sign followed by two hex digits. So if you want the host to > be /tmp, you just should just write: > > postgresql://%2Ftmp/fred > > Which is the equivalent of the connection string: > > host=/tmp dbname=fred Yeah, that should work, but it's giving the pathname a really weird look. Given that this is going to be used only rarely,this is less of a problem, though. > This may appear to be slightly inconvenient notation, but there is > little reason to reinvent syntax that the URL gods have already > devised, and in practice specifying an explicit pathname in a > connection string is quite rare. One normally specifies a local > socket connection by omitting to specify a hostname at all, and that > can work here, too. That is, postgresql:///fred should be equivalent > to the connection string: > > dbname=fred > > ...which means it will use the default socket directory on UNIX, and a > loopback connection on Windows. And postgresql:/// should be > equivalent to an empty connection string, defaulting everything. Hm... that's neat. Didn't appear to me due to a bit too restrictive parser rules in my draft patch. Now that I allow hostto be empty string, the above works like a charm! -- Alex
On tor, 2011-11-24 at 09:02 -0500, Robert Haas wrote: > e.g. if we used the format suggested in my previous email, this would > just boil down to: > > postgresql:///?service=foo More correct would be postgresql:?service=foo See http://en.wikipedia.org/wiki/URI_scheme for some inspiration.
On tor, 2011-11-24 at 15:43 +0200, Alexander Shulgin wrote: > Huh? The service definitions are read from a local pg_service.conf, > and are specified by setting PGSERVICE (and PGSERVICEFILE) environment > variables, no? > > What would you do with such URI if you need to other people to connect > to the same service? Send them URI along with the pg_service.conf? A full URI would also rely on host names or IP addresses being the same everywhere. It's all a matter of degree ...
Excerpts from Peter Eisentraut's message of Thu Nov 24 22:05:09 +0200 2011: > > On tor, 2011-11-24 at 15:43 +0200, Alexander Shulgin wrote: > > Huh? The service definitions are read from a local pg_service.conf, > > and are specified by setting PGSERVICE (and PGSERVICEFILE) environment > > variables, no? > > > > What would you do with such URI if you need to other people to connect > > to the same service? Send them URI along with the pg_service.conf? > > A full URI would also rely on host names or IP addresses being the same > everywhere. It's all a matter of degree ... True, but it is much more reasonable to expect that hostnames will resolve to the same addresses most of the time (save forzone changes propagation time.) Still I can imagine where this may be useful, like local networks with shared pg_service.conf files. And since we don'tneed to do anything special to support the behavior (i.e. postgresql:///?service=foo is going to work out of the box,)this seems to be a non-problem. -- Alex
Excerpts from Robert Haas's message of Thu Nov 24 15:59:08 +0200 2011: > > I think we could do something like: > > postgresql://user:pw@host:port/database?param1=val1¶m2=val2¶m3=val3&... I wonder if this should be allowed syntax (i.e. specify a user, but connect locally, so leave 'host' to be an empty string): postgresql://user@/ Furthermore, if we want to connect locally, but to a non-default port: postgresql://user@:5433/ I would also think that if one is to specify the password in the URI, and the password happen to contain the @-sign (e.g."!@#$%^",) it should be percent-encoded, like: postgresql://user:!%40#$%^@/ Reasonable? -- Alex
Excerpts from Robert Haas's message of Thu Nov 24 13:57:17 +0200 2011: > > I think it would be really weird not to support user:pw@host:port. You can presumably also support the JDBC style forbackward compatibility, but I don't think we should adopt that syntax as project standard. By the way, if we're already considering this, what about special syntax for SSL, instead of the JDBC's "&ssl=true" thingy? Given that the default sslmode is "prefer" I assume libpq tries SSL first, then falls back to plain text if that's not available. To me, it looks much more natural if the fact that SSL is/should be used is stated early in the URI syntax, like: "https://","svn+ssh://", etc., rather than in the query parameters (if the parameters were to be passed to remote serviceto process, like it's done with HTTP[S], this would not make any sense at all.) But given that sslmode can currently be either of: "disable", "allow", "prefer", "require", "verify-ca" or "verify-full"(and who knows if any new allowed mode could show up later,) allowing "&sslmode=whatever" makes sense. Note,that this is not the same as "&ssl=whatever". So how about this: postgresql:ssl://user:pw@host:port/dbname?sslmode=... The "postgresql:ssl://" designator would assume "sslmode=require", if not overriden in extra parameters and "postgresql://"would imply "sslmode=prefer". And to disable SSL you would pick either designator and append "sslmode=disable". The JDBC's "ssl=true" will translate to "sslmode=require". If we can decide on this, we should also put reasonable effort into making JDBC support the same syntax. Thoughts? -- Alex
On 11/24/2011 05:21 AM, Alvaro Herrera wrote: > A coworker also suggested using a different designator: > > postgresqli:///path/to/socket:5433/database > postgresqli://:5433/database This is not unprecedented. An example is how CUPS handles this problem when connecting printers using URIs: http://www.cups.org/documentation.php/network.html where you might see this for the usual port: lpd://ip-address-or-hostname/queue And this for AppSocket AKA JetDirect: socket://ip-address-or-hostname I am certainly not going to defend printing setup with CUPS as a model worth emulating, just noting the similarity here. I think we'll save miles of user headaches if there's only one designator.
Excerpts from Greg Smith's message of Mon Nov 28 10:08:42 +0200 2011: > > On 11/24/2011 05:21 AM, Alvaro Herrera wrote: > > A coworker also suggested using a different designator: > > > > postgresqli:///path/to/socket:5433/database > > postgresqli://:5433/database > > This is not unprecedented. An example is how CUPS handles this problem > when connecting printers using URIs: > http://www.cups.org/documentation.php/network.html where you might see > this for the usual port: > > lpd://ip-address-or-hostname/queue > > And this for AppSocket AKA JetDirect: > > socket://ip-address-or-hostname > > I am certainly not going to defend printing setup with CUPS as a model > worth emulating, just noting the similarity here. I think we'll save > miles of user headaches if there's only one designator. I'm not a big fan of using different designator for local socket connections either, especially if they differ so little(the added 'i' might be too hard to spot, moreso if the displayed using proportional font.) Not to mention that printersand compatibility don't go together that often, in my (probably way too limited) experience. ;-) As it was suggested downthread, "postgresql://[:port]/[mydb]" should work perfectly for this purpose, since it's just a matterof allowing empty host/addr in the URI. So, using the default port: "postgresql:///mydb" (notice the similarity withthe local-filesystem URI scheme: "file:///") Speaking of JDBC, the "postgresql:///mydb" notation may be abbreviated as "postgresql:mydb", which is not unreasonable tosupport in psql too. -- Regards, Alex
Excerpts from Alexander Shulgin's message of Sat Nov 26 21:46:32 +0200 2011: > > I would also think that if one is to specify the password in the URI, and the password happen to contain the @-sign (e.g."!@#$%^",) it should be percent-encoded, like: > > postgresql://user:!%40#$%^@/ Actually, like: postgresql://user:!%40#$%25^@/ since the %-sign has to be encoded itself.
Excerpts from Alexander Shulgin's message of Sat Nov 26 22:07:21 +0200 2011: > > So how about this: > > postgresql:ssl://user:pw@host:port/dbname?sslmode=... > > The "postgresql:ssl://" designator would assume "sslmode=require", if not overriden in extra parameters and "postgresql://"would imply "sslmode=prefer". And to disable SSL you would pick either designator and append "sslmode=disable". > > The JDBC's "ssl=true" will translate to "sslmode=require". Hey, I'm going to assume "no objections" equals "positive feedback" and continue hacking in this direction. > If we can decide on this, we should also put reasonable effort into making JDBC support the same syntax. What would be our plan on this? Since the syntax proposed here is strictly a superset of the existing JDBC syntax, I wouldthink this qualifies as an improvement and it would be backwards compatible with any previous version of the JDBC connector. -- Alex
On Tue, Nov 29, 2011 at 12:02 PM, Alexander Shulgin <ash@commandprompt.com> wrote: > > Excerpts from Alexander Shulgin's message of Sat Nov 26 22:07:21 +0200 2011: >> >> So how about this: >> >> postgresql:ssl://user:pw@host:port/dbname?sslmode=... >> >> The "postgresql:ssl://" designator would assume "sslmode=require", if not overriden in extra parameters and "postgresql://"would imply "sslmode=prefer". And to disable SSL you would pick either designator and append "sslmode=disable". >> >> The JDBC's "ssl=true" will translate to "sslmode=require". > > Hey, I'm going to assume "no objections" equals "positive feedback" and continue hacking in this direction. A couple of cents on this: I think the current direction is fine, although as Robert Haas has said, I am not really at all inclined to view JDBC compatibility as any kind of a plus. JDBC URLs are weird, and do the drivers actually link libpq anyway? That world is unto itself. Looking like a normal URL to the greater body of URL-dom seems like a much more desirable design trait to me, and after that decreasing the number of ways to write the same thing. So a weak -1 from me on adding two ways to do the same thing, of which the way to do it is weird by URL standards. At best I'd try to avoid choosing any notations that clash or visually confuse the URLs with their JDBC doppelgangers, and I think the more URL-ish notation is polite to JDBC in that regard. Here's a couple of URIs used by projects that do generally end up delegating to libpq-based drivers. It is precisely this slight fragmentation that I'd like to see put to rest by this feature in libpq. Sequel, for Ruby (all valid): DB = Sequel.connect('postgres://user:password@localhost/blog') # Uses the postgres adapter DB = Sequel.connect('postgres://localhost/blog?user=user&password=password') DB = Sequel.connect('postgres://localhost/blog' :user=>'user', :password=>'password') Amazingly, I don't find it trivial to find the format ActiveRecord (part of Rails) spelled out, but here's one thing that works, at least (same as Sequel, format one) postgres://username:password@localhost/myrailsdb Django: Doesn't use URLs at all, preferring dictionary structures, as far as I can tell. SQLAlchemy, Python: dialect+driver://user:password@host/dbname[?key=value..] I'm not familiar enough with the Perl and TCL worlds to comment, nor some of the newcomers like Golang or Node.js. Dialect would be 'postgresql', driver 'psycopg2', but at the libpq level clearly that wouldn't make much sense, so it's basically Sequel format one-compatible, except it goes by postgresql rather than postgres for the dialect. I do really like the attention paid to somehow making AF_UNIX sockets work; they're great for development. I agree a different scheme would be hard to swallow, but unfortunately the options to pack that information into URL notation is at least a little ugly, as far as I can tell. Using a different scheme is probably not worth the cost of an ugly URL string, in my book. Are there any provisions for choosing X.509/cert authentication? I imagine not, but out-of-band presentation of that information is the norm there, and I'm not sure if is any room for improvement within reach. >> If we can decide on this, we should also put reasonable effort into making JDBC support the same syntax. > > What would be our plan on this? Since the syntax proposed here is strictly a superset of the existing JDBC syntax, I wouldthink this qualifies as an improvement and it would be backwards compatible with any previous version of the JDBC connector. I suppose that is nice, but is this designed, or coincidental? Is there any fundamental reason why the JDBC driver will remain so similar to libpq in the future? Will people realistically be able to use one URL across their Java and libpq projects in most situations, now and in the forseeable future, including the keyword options? Because as soon as one encounters the need to maintain two URLs for any reason, the otherwise real convenience regresses into bloat. So there's my pile of opinions. -- fdr
Excerpts from Daniel Farina's message of Mon Dec 05 11:56:19 +0200 2011: > > I think the current direction is fine, although as Robert Haas has > said, I am not really at all inclined to view JDBC compatibility as > any kind of a plus. JDBC URLs are weird, and do the drivers actually > link libpq anyway? That world is unto itself. Daniel, The JDBC driver is special in that it intentionally does not use libpq. Given every other binding (think Ruby, Python, Perl,Tcl, etc.) does use libpq, it makes perfect sense to me to make the syntax compatible with JDBC. I see this as a two-fold effort: add URI syntax to libpq *and* improve JDBC's syntax to support the usual "user:pw@" notation. This way, not only the above language's bindings URI syntaxes would become compatible with each other (eventually,with release of new libpq and new drivers' versions,) but they would also be interchangeable with JDBC's newsyntax (also, eventually.) > Are there any provisions for choosing X.509/cert authentication? I > imagine not, but out-of-band presentation of that information is the > norm there, and I'm not sure if is any room for improvement within > reach. Since the idea is to parse any supported URI query parameters, this is likely going to Just Work(tm) if you add proper "sslcert=&sslkey="query parameters to the connection URI. > >> If we can decide on this, we should also put reasonable effort into making JDBC support the same syntax. > > > > What would be our plan on this? Since the syntax proposed here is strictly a superset of the existing JDBC syntax, Iwould think this qualifies as an improvement and it would be backwards compatible with any previous version of the JDBCconnector. > > I suppose that is nice, but is this designed, or coincidental? Is > there any fundamental reason why the JDBC driver will remain so > similar to libpq in the future? Will people realistically be able to > use one URL across their Java and libpq projects in most situations, > now and in the forseeable future, including the keyword options? > Because as soon as one encounters the need to maintain two URLs for > any reason, the otherwise real convenience regresses into bloat. See above. The hope is that URIs will be compatible sans the driver-specific extra query parameters which might be not recognizedby either party. -- Regards, Alex
On Fri, Dec 9, 2011 at 6:03 AM, Alexander Shulgin <ash@commandprompt.com> wrote: > See above. The hope is that URIs will be compatible sans the driver-specific extra query parameters which might be notrecognized by either party. Yeah. I am not that concerned with being stupidity-compatible with anyone else ... but neither am I inclined to go out of our way to be incompatible. It seems like the design on the table might allow us to get the best of both worlds. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Dec 9, 2011 at 3:03 AM, Alexander Shulgin <ash@commandprompt.com> wrote: > The JDBC driver is special in that it intentionally does not use libpq. Given every other binding (think Ruby, Python,Perl, Tcl, etc.) does use libpq, it makes perfect sense to me to make the syntax compatible with JDBC. I am with you until the reasoning of the last part, which is can be (as I understand it) rephrased to "every other major language ecosystem uses libpq, so therefore it makes perfect sense to have the syntax compatible with JDBC." To which I say, "what?" I guess if I move the parenthetical grouping of logic around, what you are probably intending to say is "everyone except this one ecosystem does the normal thing, so we have an opportunity to Unite The Clans, by absorbing a unique aspect of one of them" > I see this as a two-fold effort: add URI syntax to libpq *and* improve JDBC's syntax to support the usual "user:pw@" notation. This way, not only the above language's bindings URI syntaxes would become compatible with each other (eventually,with release of new libpq and new drivers' versions,) but they would also be interchangeable with JDBC's newsyntax (also, eventually.) Okay. This is how I teased out my interpretation above. > Since the idea is to parse any supported URI query parameters, this is likely going to Just Work(tm) if you add proper"sslcert=&sslkey=" query parameters to the connection URI. Hmm. I guess the only downside is one has to have files materialized to make that work, and one cannot really embed them in the URL (for files that long, it is madness anyway). But I do appreciate the exact symmetry with the libpq options. > The hope is that URIs will be compatible sans the driver-specific extra query parameters which might be not recognizedby either party. How about a more general quirk of the hypothetical URL parsing code: because JDBC's URLs are not URLs (schemes cannot have colons, per rfc1738, section 2.1) treat the JDBC string specially and ignore it, and parse the rest as a legitimate URL. One could be even more permissive and state that all tokens before the last colon-delimited part of the scheme are ignored: i:am:feeling:like:postgresql://(etc) jdbc:postgresql://(etc) psycopg2:postgresql://(etc) Which would reduce to the same thing as: postgresql://(etc) What I can't get excited about is: postgresql:ssl://user:pw@host:port/dbname?sslmode=... Since this is not actually a URL, and the "scheme" using the above rule would be "ssl". If you really want to have SSL be part of the scheme (given ssl=require exists, I'd prefer One Way that involves no scheme alterations to denote the transport), then you can use an RFC-compatible notation like "+": postgresql+ssl://.... For which the "scheme" would be "postgresql+ssl". Again, I'm not terribly excited about having a scheme that denotes the transport (in spite of it being semi-commonly done as in svn+ssh), especially if redundant with query string options. -- fdr
Excerpts from Daniel Farina's message of Fri Dec 09 23:04:26 +0200 2011: > > I guess if I move the parenthetical grouping of logic around, what you > are probably intending to say is "everyone except this one ecosystem > does the normal thing, so we have an opportunity to Unite The Clans, > by absorbing a unique aspect of one of them" Yes, what I meant is something more or less like that. > i:am:feeling:like:postgresql://(etc) > jdbc:postgresql://(etc) > psycopg2:postgresql://(etc) > > Which would reduce to the same thing as: > > postgresql://(etc) Well, it wasn't suggested that JDBC people paste their URIs to psql, while keeping the "jdbc:" prefix, that would be reallyweird thing to do. However, I have just noticed they *do* require that part themselves, like in: String url = "jdbc:postgresql://localhost/test"; Connection conn = DriverManager.getConnection(url); It is really weird, since as far as I can see from the docs, the "jdbc:" part is always discarded by the driver manager. That must be some true Java way of doing things. :-p > What I can't get excited about is: > > postgresql:ssl://user:pw@host:port/dbname?sslmode=... > > Since this is not actually a URL, and the "scheme" using the above > rule would be "ssl". If you really want to have SSL be part of the > scheme (given ssl=require exists, I'd prefer One Way that involves no > scheme alterations to denote the transport), then you can use an > RFC-compatible notation like "+": > > postgresql+ssl://.... > > For which the "scheme" would be "postgresql+ssl". Again, I'm not > terribly excited about having a scheme that denotes the transport (in > spite of it being semi-commonly done as in svn+ssh), especially if > redundant with query string options. Yeah, I was also considering "+ssl", but don't recall if I ever suggested that on the list. My primary motivation behind making SSL stand out in the URI is that it "feels wrong" when that is pushed to the query parameters. In a real-world URI that would be impossible, since it's the server which is supposed to parse the parameters,not the client, but that can only happen after the connection has been established. However, since we're parsing all of the "query parameters" locally in a client, this becomes less of a problem, so I wouldagree that we don't need a special scheme for SSL connections. Especially, since the default SSL mode is "prefer" andto override that you still need to add a "sslmode=" query parameter. -- Alex