Thread: Policy on pulling in code from other projects?
Hey, So I am looking intently on what it is going to take to get the URI patch done for psql [1] and was digging around the web and have a URI parser library. It is under the New BSD license and is strictly RFC RFC 3986 [2] compliant . Now I have not dug into the code but the parser is used by other projects. So question is: Assuming the code actually makes this patch easier, do we: A. Pull in the code into the main tree B. Instead have it as a requirement via configure? 1. http://archives.postgresql.org/message-id/1302114698.23164.17.camel@jd-desktop 2. http://tools.ietf.org/html/rfc3986 Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On 21 July 2011 18:43, Joshua D. Drake <jd@commandprompt.com> wrote: > So I am looking intently on what it is going to take to get the URI patch > done for psql [1] and was digging around the web and have a URI parser > library. It is under the New BSD license and is strictly RFC RFC 3986 [2] > compliant . > > Now I have not dug into the code but the parser is used by other projects. > So question is: > > Assuming the code actually makes this patch easier, do we: > > A. Pull in the code into the main tree > B. Instead have it as a requirement via configure? > > 1. > http://archives.postgresql.org/message-id/1302114698.23164.17.camel@jd-desktop > > 2. http://tools.ietf.org/html/rfc3986 Without commenting on the practicalities of what you'd like to do, including code from other projects in the tree is well precedented. Off the top of my head, I can tell you that pgcrypto uses code from various sources, while preserving the original copyright notices. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
"Joshua D. Drake" <jd@commandprompt.com> writes: > So I am looking intently on what it is going to take to get the URI > patch done for psql [1] and was digging around the web and have a URI > parser library. It is under the New BSD license and is strictly RFC RFC > 3986 [2] compliant . Surely we do not need a whole library to parse URIs. regards, tom lane
On 07/21/2011 11:13 AM, Tom Lane wrote: > "Joshua D. Drake"<jd@commandprompt.com> writes: >> So I am looking intently on what it is going to take to get the URI >> patch done for psql [1] and was digging around the web and have a URI >> parser library. It is under the New BSD license and is strictly RFC RFC >> 3986 [2] compliant . > > Surely we do not need a whole library to parse URIs. Shrug, standards compliant, already runs on windows.... Seems like a good idea to me? http://uriparser.sourceforge.net/ Sincerely, Joshua D. Drake > > regards, tom lane > -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On 07/21/2011 11:13 AM, Tom Lane wrote: > "Joshua D. Drake"<jd@commandprompt.com> writes: >> So I am looking intently on what it is going to take to get the URI >> patch done for psql [1] and was digging around the web and have a URI >> parser library. It is under the New BSD license and is strictly RFC RFC >> 3986 [2] compliant . > > Surely we do not need a whole library to parse URIs. Also: http://uriparser.git.sourceforge.net/git/gitweb-index.cgi Sincerely, Joshua D. Drake > > regards, tom lane > -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On 07/21/2011 11:13 AM, Tom Lane wrote: > "Joshua D. Drake"<jd@commandprompt.com> writes: >> So I am looking intently on what it is going to take to get the URI >> patch done for psql [1] and was digging around the web and have a URI >> parser library. It is under the New BSD license and is strictly RFC RFC >> 3986 [2] compliant . > > Surely we do not need a whole library to parse URIs. > > regards, tom lane > Any other comments? I don't want to do a bunch of work just to have it tossesd on a technicality. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On Fri, Jul 22, 2011 at 10:51 AM, Joshua D. Drake <jd@commandprompt.com> wrote: > On 07/21/2011 11:13 AM, Tom Lane wrote: >> >> "Joshua D. Drake"<jd@commandprompt.com> writes: >>> >>> So I am looking intently on what it is going to take to get the URI >>> patch done for psql [1] and was digging around the web and have a URI >>> parser library. It is under the New BSD license and is strictly RFC RFC >>> 3986 [2] compliant . >> >> Surely we do not need a whole library to parse URIs. > > Any other comments? I don't want to do a bunch of work just to have it > tossesd on a technicality. Well, you haven't explained what feature you are trying to develop, so it's a bit premature to speculate on whether that feature is valuable enough to justify the carrying cost of another library. Like Tom, I'm not real sure why we would need a library for this. It seems like a hundred lines of C would be sufficient for most things you'd likely want to do, and less work than integrating someone else's code. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 07/22/2011 08:36 AM, Robert Haas wrote: > On Fri, Jul 22, 2011 at 10:51 AM, Joshua D. Drake<jd@commandprompt.com> wrote: >> On 07/21/2011 11:13 AM, Tom Lane wrote: >>> >>> "Joshua D. Drake"<jd@commandprompt.com> writes: >>>> >>>> So I am looking intently on what it is going to take to get the URI >>>> patch done for psql [1] and was digging around the web and have a URI >>>> parser library. It is under the New BSD license and is strictly RFC RFC >>>> 3986 [2] compliant . >>> >>> Surely we do not need a whole library to parse URIs. >> >> Any other comments? I don't want to do a bunch of work just to have it >> tossesd on a technicality. > > Well, you haven't explained what feature you are trying to develop, so > it's a bit premature to speculate on whether that feature is valuable > enough to justify the carrying cost of another library. Yes I did: OP of this thread: http://archives.postgresql.org/pgsql-hackers/2011-07/msg01144.php It is letting pgsql use URI syntax. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On Fri, Jul 22, 2011 at 11:43 AM, Joshua D. Drake <jd@commandprompt.com> wrote: > OP of this thread: > > http://archives.postgresql.org/pgsql-hackers/2011-07/msg01144.php > > It is letting pgsql use URI syntax. Sorry, I missed that the first time. IMHO, it seems like it would be simpler to do that by rolling our own code rather than importing someone else's. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 07/22/2011 09:03 AM, Robert Haas wrote: > On Fri, Jul 22, 2011 at 11:43 AM, Joshua D. Drake<jd@commandprompt.com> wrote: >> OP of this thread: >> >> http://archives.postgresql.org/pgsql-hackers/2011-07/msg01144.php >> >> It is letting pgsql use URI syntax. > > Sorry, I missed that the first time. > > IMHO, it seems like it would be simpler to do that by rolling our own > code rather than importing someone else's. I guess it would depend on how full featured we want the code. The library provides validation of various things that rolling our own code likely wouldn't. I mean, it is a library just like libssl or any other such thing. I would imagine that rolling our own code would likely be simpler but the code itself would be dumber (not bad, just not as capable) and we would be duplicating the effort of an already established project. Do we really want to do that? Sincerely, Joshua D. Drake > -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On Fri, Jul 22, 2011 at 1:26 PM, Joshua D. Drake <jd@commandprompt.com> wrote: >> IMHO, it seems like it would be simpler to do that by rolling our own >> code rather than importing someone else's. > > I guess it would depend on how full featured we want the code. The library > provides validation of various things that rolling our own code likely > wouldn't. I mean, it is a library just like libssl or any other such thing. Uhm, that's a complete apples-and-oranges comparison. libssl is at least three orders of magnitude more complicated than URL parsing, and provides a critical feature we would otherwise lack. Parsing URLs is trivial and something we can easily code ourselves, and supports a marginal feature that would be nice to have but is not make-or-break. If we decide to pull their source code into our repo, then will you also expect us to update it every time they release a new feature? What about when they fix a bug? What about when they fix a security bug? And what happens when the next 15 people who want similarly minor features want some library included to support their feature, too? Shall we now be responsible for syncing up with all of those projects? > I would imagine that rolling our own code would likely be simpler but the > code itself would be dumber (not bad, just not as capable) and we would be > duplicating the effort of an already established project. Do we really want > to do that? Odds are good that most of the extra features they've added are things we don't need or can't benefit from. AFAIK, we only care about pgsql URLs, not http or wais or whatever other crazy things they support. So we'll be increasing either the amount of code we have to maintain - and the size of our binaries on disk - for basically no benefit. Do we really want to do that? I mean, look, there is precedent for doing this. We sucked in chunks of snowball for tsearch and, separately, somebody's regex implementation. We depend on libedit or libreadline for command-line editing, libssl for encryption, and various other libraries for LDAP and Kerberos. So if you're asking: would we ever consider doing this?Sure, because we already have. There is no rule againstit. But if you're arguing that the benefits of this library for this feature are sufficient to justify sucking it in, I'm so far unconvinced. What does it do that is so much better than what we could code up on our own in a couple of hours? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 07/22/2011 10:57 AM, Robert Haas wrote: > On Fri, Jul 22, 2011 at 1:26 PM, Joshua D. Drake<jd@commandprompt.com> wrote: >>> IMHO, it seems like it would be simpler to do that by rolling our own >>> code rather than importing someone else's. > I mean, look, there is precedent for doing this. We sucked in chunks > of snowball for tsearch and, separately, somebody's regex > implementation. We depend on libedit or libreadline for command-line > editing, libssl for encryption, and various other libraries for LDAP > and Kerberos. So if you're asking: would we ever consider doing this? > Sure, because we already have. There is no rule against it. But if > you're arguing that the benefits of this library for this feature are > sufficient to justify sucking it in, I'm so far unconvinced. What > does it do that is so much better than what we could code up on our > own in a couple of hours? > I am not neccessarily looking to include it in our tarball. What I asked was: Assuming the code actually makes this patch easier, do we: A. Pull in the code into the main tree B. Instead have it as a requirement via configure? I am not arguing one way or the other. I am trying to find the best solution. I personally think you are handwaving if you think we can build out a robust parser in a couple of hours. Remember this library follows the RFC for URIs which is why I even brought it up. If it was just some random parser, I wouldn't even have bothered. Do we care about the RFC for URIs? (I am not being sarcastic), if we don't let's just throw some simple code together, if we do, I think this library makes sense, whether in our tree or not. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On Fri, Jul 22, 2011 at 2:09 PM, Joshua D. Drake <jd@commandprompt.com> wrote: > I am not neccessarily looking to include it in our tarball. What I asked > was: > > Assuming the code actually makes this patch easier, do we: > > A. Pull in the code into the main tree > B. Instead have it as a requirement via configure? Well, both ways are going to cause a certain number of problems. Pulling it into the tree means we've got to maintain it forever; requiring it via configure means that everyone who packages PostgreSQL now has to include that library if they want to include PostgreSQL. > I am not arguing one way or the other. I am trying to find the best > solution. I personally think you are handwaving if you think we can build > out a robust parser in a couple of hours. I respect your opinion, but, in this case, I disagree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
"Joshua D. Drake" <jd@commandprompt.com> wrote: > I personally think you are handwaving if you think we can > build out a robust parser in a couple of hours. I don't think so. If you have a regular expression engine available, you should be able to get there by picking one of these and modifying: http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2 -Kevin
On Fri, Jul 22, 2011 at 2:22 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > "Joshua D. Drake" <jd@commandprompt.com> wrote: >> I personally think you are handwaving if you think we can >> build out a robust parser in a couple of hours. > > I don't think so. If you have a regular expression engine > available, you should be able to get there by picking one of these > and modifying: > > http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2 Well, JD is talking about doing this client-side, so I don't think our regex stuff would be available. Still, the fact that you can even write a regex for it means it can't be that hard. It's usually pretty darn straightforward to translate a regex into C code. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On fre, 2011-07-22 at 11:09 -0700, Joshua D. Drake wrote: > Assuming the code actually makes this patch easier, do we: > > A. Pull in the code into the main tree > B. Instead have it as a requirement via configure? B
On 07/22/2011 02:09 PM, Joshua D. Drake wrote: > Remember this library follows the RFC for URIs which is why I even > brought it up. If it was just some random parser, I wouldn't even have > bothered. Do we care about the RFC for URIs? The main components of the RFC involve: -Decoding escaped characters entered by percent-encoding -Parsing the permissible IPv4 and IPv6 addresses -Handling absolute vs. relative addresses. This is a lot of the spec, and it's not really relevant for PostgreSQL URIs -Splitting the URI into its five main components I know I've seen a URL-oriented %-encoding decoder as a PostgreSQL function already (I think Gabriele here wrote one). Surely useful IP address decoding functions are already around. And the splitting part seems like a fairly straightforward bit of regular expression work. I think one crossover point where it's absolutely worth using the external library for this purpose is if you have an app that has to follow all of the rules around path names. If this project didn't already have libraries around for things like IP address parsing, using the library instead would also make more sense. The remaining chores if you don't worry about all the path name trivia, and know how to interpret an IP address, seem feasible to do directly. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
Arguments in favor of coding from scratch: 1) Does not introduce new dependencies into postgresql-client packages. (note how much of a problem Readline has been) 2) keeps psql as lightweight as possible 3) We don't need to be able to parse any potential URI, just the ones we accept. Arguments against: a) waste of time coding a parser b) eventual feature creep as people ask us for more and more syntax support Overall, though, I'd say that argument (1) against is pretty powerful. Count the number of complaints and build bug reports about readline & libedit on the lists. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
<p>I generally agree, Josh, but I think readline is getting pointed at a bit too much. Yeah, it's a bad one, but we alsoinclude other stuff like zlib that doesn't commonly come up as an issue.<p>I'd argue something just a wee bit different...<p>Bythe time we would add in:<p>- autoconf rules to detect it,<br /> - makefile rules to link it in<br /> -include file changes<br /> - wrappers to ensure use of pmalloc<br /> - Debian guys add build dependancies<br /> - rpm dependenciesget added<br /> - BSD ports dependencies<p>That is likely rather more code than 1 not terribly large file ofC needed to do it ourselves. And this code is rather worse, as it is in a bunch of languages, spread all over.<p>If wewere gaining a lot of extra functionality "for free" it would be one thing. That is true for libssl, and likely zlib,but not here.
On Saturday, July 23, 2011, Christopher Browne <cbbrowne@gmail.com> wrote: > I generally agree, Josh, but I think readline is getting pointed at a bit too much. Yeah, it's a bad one, but we alsoinclude other stuff like zlib that doesn't commonly come up as an issue. > I'd argue something just a wee bit different... > By the time we would add in: > - autoconf rules to detect it, > - makefile rules to link it in > - include file changes > - wrappers to ensure use of pmalloc > - Debian guys add build dependancies > - rpm dependencies get added > - BSD ports dependencies > That is likely rather more code than 1 not terribly large file of C needed to do it ourselves. And this code is ratherworse, as it is in a bunch of languages, spread all over. > If we were gaining a lot of extra functionality "for free" it would be one thing. That is true for libssl, and likelyzlib, but not here. > Also consider if the library is widely available on common distros or not. If not, packagers are going to have to start packaging that first, in order to build the PostgreSQL packages. This is a *huge* issue for use if we want to use wxWidgets addon libraries with pgAdmin. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 07/22/2011 10:51 AM, Joshua D. Drake wrote: > On 07/21/2011 11:13 AM, Tom Lane wrote: >> "Joshua D. Drake"<jd@commandprompt.com> writes: >>> So I am looking intently on what it is going to take to get the URI >>> patch done for psql [1] and was digging around the web and have a URI >>> parser library. It is under the New BSD license and is strictly RFC >>> RFC >>> 3986 [2] compliant . >> >> Surely we do not need a whole library to parse URIs. >> >> regards, tom lane >> > > Any other comments? I don't want to do a bunch of work just to have it > tossesd on a technicality. > > 1. I think the proposed use is of very marginal value at best, and certainly not worth importing an external library for. 2. Even if we have the feature, we do not need to parse URIs generally. A small amount of hand written C code should suffice. If it doesn't that would itself be an argument against it. cheers andrew
On 07/22/2011 05:00 PM, Josh Berkus wrote: > Arguments in favor of coding from scratch: > > 1) Does not introduce new dependencies into postgresql-client packages. > (note how much of a problem Readline has been) Readline has license issues, this doesn't. > > 2) keeps psql as lightweight as possible > Agreed on that one. > 3) We don't need to be able to parse any potential URI, just the ones we > accept. True. > > Arguments against: > > a) waste of time coding a parser > > b) eventual feature creep as people ask us for more and more syntax support Yep. > > Overall, though, I'd say that argument (1) against is pretty powerful. > Count the number of complaints and build bug reports about readline& > libedit on the lists. > That is a good argument. -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
On 07/23/2011 03:39 AM, Andrew Dunstan wrote: > > 1. I think the proposed use is of very marginal value at best, and > certainly not worth importing an external library for. > > 2. Even if we have the feature, we do not need to parse URIs generally. > A small amount of hand written C code should suffice. If it doesn't that > would itself be an argument against it. Well the one area that I think this might be useful in the future is the pg_hba.conf but that is for a whole other thread. JD > > cheers > > andrew > > -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579
Excerpts from Dave Page's message of sáb jul 23 02:25:30 -0400 2011: > Also consider if the library is widely available on common distros or > not. If not, packagers are going to have to start packaging that > first, in order to build the PostgreSQL packages. This is a *huge* > issue for use if we want to use wxWidgets addon libraries with > pgAdmin. More likely, they are going to ignore it and pass the --disable-liburi (whatever) configure parameter and the functionality is going to be absent most of the time anyway. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Sun, Jul 24, 2011 at 10:12 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Dave Page's message of sáb jul 23 02:25:30 -0400 2011: > >> Also consider if the library is widely available on common distros or >> not. If not, packagers are going to have to start packaging that >> first, in order to build the PostgreSQL packages. This is a *huge* >> issue for use if we want to use wxWidgets addon libraries with >> pgAdmin. > > More likely, they are going to ignore it and pass the --disable-liburi > (whatever) configure parameter and the functionality is going to be > absent most of the time anyway. That wouldn't be too good either... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jul 25, 2011 at 3:12 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Dave Page's message of sáb jul 23 02:25:30 -0400 2011: > >> Also consider if the library is widely available on common distros or >> not. If not, packagers are going to have to start packaging that >> first, in order to build the PostgreSQL packages. This is a *huge* >> issue for use if we want to use wxWidgets addon libraries with >> pgAdmin. > > More likely, they are going to ignore it and pass the --disable-liburi > (whatever) configure parameter and the functionality is going to be > absent most of the time anyway. Yup. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Jul 23, 2011 at 3:39 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
1. I think the proposed use is of very marginal value at best, and certainly not worth importing an external library for.
Now that I've seen two people who seem to think that this is not an important feature I'll wade in and respond to this idea.
I think it's very easy to doubt the value of a definitively recognizable string that represents a postgres database when you don't have a heterogenous environment with more than a hundred thousand applications of all types in it. To make matters worse, as language support on that platform continues to widen beyond its humble beginnings, there isn't a standard across those languages for what constitutes a postgres URL. This is the current situation at Heroku, where we currently run ~150,000 individual databases on our infrastructure as well as a variety of other databases such as MySQL, Redis, Mongo, Couch, Riak, Cassandra, &c.
To head off the most obvious criticism, we aren't using connection strings in our system because there isn't any reasonable way to recognize them. A PG conninfo string is just a set of key value pairs with no dependably present signifier. This is why almost every database library from Ruby to Python to Java takes some form of a URL with a protocol called "postgres" in it in order to help select which driver to use.
Further, support (and syntax!) for the more esoteric connection parameters varies from library to library as well as between languages. A good spec by the project would go a long way in resolving this, and I can at least be confident that we could get it adopted very quickly by all three of the Ruby-community Postgres libraries.
In conclusion, this is a serious operational concern for me and my team and I will be personally dealing with fires caused by this for years to come regardless of the outcome of this thread.
-pvh
--
Peter van Hardenberg
Department of Data
Heroku
"Everything was beautiful, and nothing hurt." -- Kurt Vonnegut
On Aug 9, 2011, at 6:00 PM, Peter van Hardenberg wrote: > In conclusion, this is a serious operational concern for me and my team and I will be personally dealing with fires causedby this for years to come regardless of the outcome of this thread. Do you have an interest in funding development of the necessary URI-parsing code to get the feature done for 9.2? Best, David