Thread: PostGIS spatial extensions
Tom Lane wrote:
>
> [ why is this thread hiding in -patches? It should be on -hackers or
> -general, methinks. ]
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Let me suggest a solution. What if we took the part of the GIS code
> > that duplicated our existing code (geometric types) and replaced what we
> > had in the core distribution with the GIS version.
>
> This is a complete nonstarter unless the GIS guys are willing to accept
> BSD licensing of that part of their code; which I doubt given Paul's
> prior comments.
>
> regards, tom lane

Hi Tom,

I have discussed this with Dave Blasby, who has done all of the programming to date (and will no doubt pop up here soon to put his oar in). There are a few issues germane to us in this:

1) Protection of important intellectual property under the GPL so that a core of geospatial algorithms can begin to coalesce.
2) Promotion of PostGIS as a central OpenGIS component (the University of Minnesota Mapserver is another) which will hopefully bring our business some consulting work over time.
3) Promotion of PostgreSQL/PostGIS as an open-source alternative to things like OracleSpatial or SDE/Oracle.

Our feeling is that the basic database objects and their hooks into GiST are not the core of IP we are interested in protecting. The most important code for PostGIS and open source GIS is not yet incorporated: it is the overlay, union, binary predicate algorithms specified by the OpenGIS spec. Those are the bits we want to have GPL'ed.

We are not averse to having the objects and spatial indexing under BSD and in the core pgsql distribution, but would like the rest of the OpenGIS Simple Feature Spec to be part of a GPL package (the functions, the supporting triggers and consistency maintenance devices, blah blah blah).

So, 1) we can do by maintaining the important OpenGIS algorithms in an external package while the objects and indexes are brought into the pgsql main tree. 2) and 3) are better served by being part of the main tree, where everyone can use the main objects, and the savants can learn about OpenGIS and move on to the complete package.

Now, why would you want these objects?

- they are toastable, which removes one of the big GIS usability bugaboos with the old geometries
- they are indexable, using GiST, and do lossy indexing so the "large polygon" bugaboo is not a problem
- they follow an existing spec for GIS-in-a-database
- they support polygons-with-holes
- 3d coordinates supported

Why don't you want these objects?

- some of the existing functionality is missing, because it is not in the OpenGIS spec
- no circles, or arcs
- different canonical representations (e.g., a point is 'POINT(1 2)' not '(1,2)')
- superannuation of a lot of the operator notation
in short...
- not backward compatible

I'm sure there are other reasons as well.

Something I would like Dave to comment on is how cleanly we can split the object/indexing from the OpenGIS spec'ed support tables and reference systems. I am thinking about the canonical representation in particular, which could be pretty ugly with the SRS id's hanging in there for no purpose. The OpenGIS spec is at http://www.opengis.org/techno/specs/99-049.pdf
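For readers unfamiliar with the "lossy indexing" point in the list above, here is a minimal sketch of the idea. It is written in Python purely for illustration and is not PostGIS or GiST code; every name in it is hypothetical. The assumption is that the index stores only each geometry's bounding box, so an index entry stays the same size however many vertices the polygon has, and an index probe returns candidates that still need an exact recheck.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(ring: List[Point]) -> Box:
    """Return the smallest axis-aligned box containing every vertex."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidates(index: Dict[str, Box], query: Box) -> List[str]:
    """Lossy probe: compares boxes only, so false positives are possible."""
    return sorted(name for name, box in index.items() if boxes_overlap(box, query))


# Hypothetical data: an L-shaped polygon and a small square far away from it.
l_shape = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]
square = [(20, 20), (21, 20), (21, 21), (20, 21)]
index = {"l_shape": bounding_box(l_shape), "square": bounding_box(square)}

# The query box sits in the empty corner of the L: inside its bounding box
# but outside the polygon itself, so the probe reports a false positive
# that an exact geometry test would have to discard afterwards.
hits = candidates(index, (5, 5, 6, 6))
```

The exact recheck is deliberately left out; the sketch only shows why a box-based index entry is cheap and why its answers are approximate.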
> Something I would like Dave to comment on is how cleanly we can split
> the object/indexing from the OpenGIS spec'ed support tables and
> reference systems. I am thinking about the canonical representation in
> particular, which could be pretty ugly with the SRS id's hanging in
> there for no purpose. The OpenGIS spec is at
> http://www.opengis.org/techno/specs/99-049.pdf

I am thinking we can turn off the prefix tags with some postgresql.conf option.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
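To make the two canonical representations under discussion concrete, here is a small Python sketch. It is only an illustration: the `with_prefix` flag is a hypothetical stand-in for the kind of switch Bruce floats, not an actual postgresql.conf setting, and the parser handles only the simple two-coordinate '(x,y)' form mentioned in the thread.

```python
def format_point(x: float, y: float, with_prefix: bool = True) -> str:
    """Render a point in the OpenGIS-style text form or the older native form.

    with_prefix is a hypothetical toggle used for illustration only.
    """
    if with_prefix:
        return f"POINT({x:g} {y:g})"
    return f"({x:g},{y:g})"


def native_to_wkt(text: str) -> str:
    """Convert the native '(x,y)' text form into 'POINT(x y)'."""
    body = text.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError(f"not a native point literal: {text!r}")
    parts = body[1:-1].split(",")
    if len(parts) != 2:
        raise ValueError(f"expected two coordinates: {text!r}")
    x, y = (float(part) for part in parts)
    return format_point(x, y)
```

With this sketch, `native_to_wkt("(1,2)")` yields the 'POINT(1 2)' form quoted earlier in the thread, and `format_point(1, 2, with_prefix=False)` yields '(1,2)'. It says nothing about how SRS ids would be carried, which is the part Paul flags as potentially ugly.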
I think it would be great for PostgreSQL to be an 'OpenGIS Simple Feature Specification for SQL' compliant database with robust spatial operators right-out-of-the-box.

Currently, PostGIS implements most of the OpenGIS specification. The unimplemented portions are the important ones: spatial operators (the DE-9IM spatial relationship matrix) and boolean functions (union, intersection, XOR, etc.). Since these are extremely difficult algorithms, the PostGIS team will probably translate the JTS (Java Topology Suite) to C++.

The JTS is a soon-to-be-released robust Java implementation of the OpenGIS simple feature type. Vivid Solutions (cf. http://www.vividsolutions.com/jts/jtshome.htm) will be releasing it under the LGPL. JTS is the only open-source robust spatial library I've ever heard of. The PostGIS developers and Vivid Solutions want this to remain Free Software and not be co-opted and closed.

Since PostgreSQL cannot have LGPL code in its core, this would make it impossible to ever have a fully-compliant PostGIS in its core. In fact, it's unlikely that anyone will spend the huge effort required to create a BSD equivalent spatial library when there is already an LGPL one available.

This leaves the option of creating a semi-compliant OpenGIS core inside PostgreSQL and having an LGPL add-on for the complex spatial operations (making a fully compliant implementation).

The next question is, of course, what does 'semi-compliant' mean? Or, more interesting, why would you want a semi-compliant database? For most people's simple tasks, the built-in geometry types are adequate. Those interested in doing more complex tasks will probably want the full OpenGIS implementation.

A few people have suggested that we simplify PostGIS, release it as BSD, and use that in the core of PostgreSQL. The simplified PostGIS would have the basic types, indexing, and a few operations (for those following PostGIS development, this is very much like version 0.5 and earlier). The 'full' PostGIS (with JTS) would have the entire OpenGIS spec.

Unfortunately, this is easier said than done. The full implementation requires a properly maintained metadata table (with information about every geometry column in the DB), a spatial referencing system table (info about each map projection used), and each geometry must have spatial referencing information. The JTS may also require precision grid (offset/scale) information in each geometry. This would make it really difficult (and confusing) to upgrade to the fully compliant version from the partially compliant version - friction I don't want.

Secondly, as Paul has already pointed out, there wouldn't be very many operations you could do on these objects.

dave

For those reading the OpenGIS spec, PostGIS is most accurately described as "SQL92 with Geometry Types Implementation of FeatureTables".
Dave Blasby <dblasby@refractions.net> writes:
> [snip] Vivid Solutions (cf.
> http://www.vividsolutions.com/jts/jtshome.htm) will be releasing it
> under the LGPL.
> [snip]
> This leaves the option for creating a semi-compliant OpenGIS core inside
> PostgreSQL and having a LGPL add-on for the complex spatial operations
> (making a fully compliant implementation).

Um, the tarfile that Paul sent us contained the GPL license, not LGPL. There's a pretty substantial difference. Please clarify exactly which license you intend to use.

regards, tom lane
Tom Lane wrote:
>
> Dave Blasby <dblasby@refractions.net> writes:
> > [snip] Vivid Solutions (cf.
> > http://www.vividsolutions.com/jts/jtshome.htm) will be releasing it
> > under the LGPL.
> > [snip]
> > This leaves the option for creating a semi-compliant OpenGIS core inside
> > PostgreSQL and having a LGPL add-on for the complex spatial operations
> > (making a fully compliant implementation).
>
> Um, the tarfile that Paul sent us contained the GPL license, not LGPL.
> There's a pretty substantial difference. Please clarify exactly which
> license you intend to use.

PostGIS is currently released under the GPL, and is developed by Refractions Research.

JTS (Java Topology Suite) will be released under the LGPL, and is developed by Vivid Solutions. JTS hasn't been released yet, and it will need to be converted to C++ before it could be incorporated into PostGIS.

Even if PostGIS is converted to BSD at some point in the future, it will always have an LGPL component if we decide to use the JTS to do the complex spatial relations and operations.

Sorry for the confusion,

dave
Dave Blasby wrote:
>
> The next question is, of course, what does 'semi-compliant' mean? Or,
> more interesting, why would you want a semi-compliant database? For
> most people's simple tasks, the built in geometry types are adequate.
> Those interested in doing more complex tasks will probably want the full
> OpenGIS implementation.

I would argue that for most people's simple tasks the built-in geometry types are in fact not adequate. The fact that they choke on large objects and are mostly not indexable (polygons and boxes excepted) should be enough to discourage most people with GIS intentions.

I would tend to say that a semi-compliant database would be good enough to hack with, but not good enough to plug-n-play with an existing OpenGIS client. It would include the objects, indexing and accessors. Dump data in, search real fast, dump data out.

A more philosophical question would be whether a semi-compliant database is desirable from a public good point of view: semi-compliant infrastructure will encourage non-standard applications, which will in turn weaken the raison d'etre of the standard in the first place.

> A few people have suggested that we simplify PostGIS, release it as BSD,
> and use that in the core of PostgreSQL. The simplified PostGIS would
> have the basic types, indexing, and a few operations (those following
> PostGIS development, this is very much like version 0.5 and earlier).
> The 'full' PostGIS (with JTS) would have the entire OpenGIS spec.
>
> Unfortunately, this is easier said than done. The full implementation
> requires a properly maintained metadata table (with information about
> every geometry column in the DB), a spatial referencing system table
> (info about each map projection used), and each geometry must have
> spatial referencing information. The JTS may also require precision
> grid (offset/scale) information in each geometry. This would make it
> really difficult (and confusing) to upgrade to the fully compliant
> version from the partially compliant version - friction I don't want.
>
> Secondly, as paul has already pointed out, there wouldn't be very many
> operations you could do on these objects.

You forgot to finish your thought :) "Therefore, I do not think we should cleave the distribution into a BSD core and GPL support package."

I am not opposed to that philosophically, but I really do think that Bruce's suggestion regarding becoming more integrated has merits around acceptance by the larger PgSQL community. Being a good neighbor means both receiving and giving.

Perhaps we could back up at this point and revisit 'contrib' ... at what point in the size/licence/redundancy spectrum do we become reasonable candidates for 'contrib', if ever? The current tenor seems to be that at 600K/GPL/point-line-polygon we are "too big"/"too restrictive and/or too free"/"overlapping". Would moving on any of those axes be sufficient, or do we have to address all three (practically speaking, I do not think there is anything to be done about size)?
Paul Ramsey writes:

> Perhaps we could back up at this point and revisit 'contrib' ... at what
> point in the size/licence/redundace spectrum do we become reasonable
> candidates for 'contrib', if ever? The current tenor seems to be that at
> 600K/GPL/point-line-polygon we are "too big"/"too restrictive and/or too
> free"/"overlapping". Would moving on any of those axes be sufficient, or
> do we have to address all three (practically speaking, I not think there
> is anything to be done about size).

Historically, contrib was the place for small pieces of code that a) could/would/should not go into the core for some reason, b) were unreasonable to distribute otherwise (too small, not general enough), and c) served as examples of how to use the type/function extension features. You satisfy a), you do not satisfy b), and I doubt that c) is still applicable.

Projects that are as organized, professional, and value-adding as yours can surely stand on their own. I compare this to the recently released OpenFTS. If we start including projects of this size we'd explode in size and maintenance overhead.

I don't want to give the impression that I don't like you guys. It's just that we have to realize that there is a *lot* of coding using PostgreSQL these days, and it's unreasonable to include all of this in our distribution, while at the other end people are crying about removing the documentation from the tarball because it's too big already.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
Peter Eisentraut wrote:

> Projects that are as organized, professional, and value-adding as yours is
> can surely stand on their own. I compare this to the recently released
> OpenFTS. If we start including projects of this size we'd explode in size
> and maintenance overhead.

Fair enough... perhaps we should turn then to some kind of discussion on packaging standards for postgresql extensions?

- One of the things we have run up against is that for most Linux distributions, the postgresql-devel package does not include postgres.h in the header package. This is not necessary for client-side programs, but it is for server-side extensions. So people cannot compile our extension without jettisoning their RPM version of postgresql and moving to the tarball.
- Compile our own RPM you say? Yes and no. We could provide a SRPM, but then we have the same problem: absent a complete postgresql source tree, we cannot compile. And even if we *do* provide our own RPM...
- Where should extensions be installed by default? The RPM package has some rules, the tarball has some other rules. Should extensions spread themselves out over the postgresql tree (libs under lib, docs under doc, etc) or should they be self-contained (postgis/lib postgis/doc) under some other location?

In order to provide a rational RPM source package I ended up having to provide a complete SRPM of postgresql with the postgis stuff bundled in. You must build the whole package in order to get the postgis component.

The issue of the extensions' dependence on the core is pretty important.

--
      __
     /
     | Paul Ramsey
     | Refractions Research
     | Email: pramsey@refractions.net
     | Phone: (250) 885-0632
     \_
Paul Ramsey writes:

> - One of the things we have run up against is that for most linux
> distributions, the postgresql-devel package does not include postgres.h
> in the header package. This is not necessary for client-side programs,
> but it is for server-side extensions. So people cannot compile our
> extension without jettisoning their RPM version of postgresql and moving
> to the tarball.

The 7.1 RPMs should contain the server side headers somewhere. Earlier versions only included a not very well defined subset of them.

> - Where should extensions be installed by default? The RPM package has
> some rules, the tarball has some other rules. Should extensions spread
> themselves out over the postgresql tree (libs under lib, docs under doc,
> etc) or should they be self-contained (postgis/lib postgis/doc) under
> some other location?

This is a matter of taste, or of the file system standard of the system you use. If you use autoconf and thus the GNU layout for your source package then the default is going to end up something like

/usr/local/lib/postgis/postgis.so
/usr/local/share/postgis/install-postgis.sql

For binary distributions you'd fiddle with the configure --xxxdir flags a little.

Maybe you had in mind some sort of standard layout under a standard directory, such as /usr/lib/postgresql/site-stuff (cf. perl), but this sort of arrangement is a major pain. For instance, it won't allow non-root installs.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
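The prefix-relative layout Peter describes can be sketched in a few lines of Python. This is only an illustration of how the two example paths derive from a single prefix; the function name and dictionary keys are hypothetical and are not part of autoconf or of any PostGIS build script.

```python
from pathlib import PurePosixPath
from typing import Dict


def default_layout(prefix: str = "/usr/local", package: str = "postgis") -> Dict[str, PurePosixPath]:
    """Derive per-package install directories from one prefix (GNU-style)."""
    root = PurePosixPath(prefix)
    return {
        "libdir": root / "lib" / package,     # shared objects, e.g. postgis.so
        "datadir": root / "share" / package,  # SQL install scripts and the like
    }


# Default prefix reproduces the paths quoted in the message above.
system_paths = default_layout()

# A prefix the user owns keeps the same shape without needing root.
# The home directory used here is a made-up example.
user_paths = default_layout(prefix="/home/alice/pgsql")
```

Because everything hangs off the prefix, a non-root user only has to pick a prefix they can write to, which is the property the fixed /usr/lib/postgresql/site-stuff arrangement would lose.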
Peter Eisentraut wrote:

> The 7.1 RPMs should contain the server side headers somewhere. Earlier
> versions only included a not very well defined subset of them.

Indeed they do (nice!), which brings me to a different question:

1 - I download the tarball
2 - ./configure ; make ; make install
3 - Delete the source tree

I now have a complete working pgsql installation, with all the libs to run the server, and all the headers to build custom clients, but *not* enough headers to build server extensions, because postgres.h is missing. However, if I have an RPM-based installation, I *will* have the server headers I need. Why do we discriminate against people who compile from the tarball?

> This is a matter taste, or of the file system standard of the system you
> use. If you use autoconf and thus the GNU layout for your source package
> then the default is going to end up something like
>
> /usr/local/lib/postgis/postgis.so
> /usr/local/share/postgis/install-postgis.sql
>
> For binary distributions you'd fiddly with the configure --xxxdir flags a
> little.
>
> Maybe you had in mind some sort of standard layout under a standard
> directory, such as /usr/lib/postgresql/site-stuff (cf. perl), but this
> sort of a arrangement is a major pain. For instance, it won't allow
> non-root installs.

I am tempted to start moving the postgis release to a completely independent package (not living in contrib by default), with its own configure script, etc etc, but until the availability of postgres.h is resolved that might be ill-advised.
I would take a hard look at R's extension packaging system (www.r-project.org). It's the best in the business. It consolidates all aspects of creating packages, including configuring, building, run-time linking, documentation and testing. It also allows non-root users to install packages in their own account.

Tim

Peter Eisentraut wrote:

>Paul Ramsey writes:
>
>>- One of the things we have run up against is that for most linux
>>distributions, the postgresql-devel package does not include postgres.h
>>in the header package. This is not necessary for client-side programs,
>>but it is for server-side extensions. So people cannot compile our
>>extension without jettisoning their RPM version of postgresql and moving
>>to the tarball.
>>
>
>The 7.1 RPMs should contain the server side headers somewhere. Earlier
>versions only included a not very well defined subset of them.
>
>>- Where should extensions be installed by default? The RPM package has
>>some rules, the tarball has some other rules. Should extensions spread
>>themselves out over the postgresql tree (libs under lib, docs under doc,
>>etc) or should they be self-contained (postgis/lib postgis/doc) under
>>some other location?
>>
>
>This is a matter taste, or of the file system standard of the system you
>use. If you use autoconf and thus the GNU layout for your source package
>then the default is going to end up something like
>
>/usr/local/lib/postgis/postgis.so
>/usr/local/share/postgis/install-postgis.sql
>
>For binary distributions you'd fiddly with the configure --xxxdir flags a
>little.
>
>Maybe you had in mind some sort of standard layout under a standard
>directory, such as /usr/lib/postgresql/site-stuff (cf. perl), but this
>sort of a arrangement is a major pain. For instance, it won't allow
>non-root installs.
>

--
Timothy H. Keitt
Department of Ecology and Evolution
State University of New York at Stony Brook
Stony Brook, New York 11794 USA
Phone: 631-632-1101, FAX: 631-632-7626
http://life.bio.sunysb.edu/ee/keitt/
Paul Ramsey <pramsey@refractions.net> writes:
> However, if I have an RPM-based installation, I *will* have
> the server headers I need. Why do we discriminate against people who
> compile from the tarball?

We don't. We do, however, assume that they read the installation instructions:

    The standard install installs only the header files needed for
    client application development. If you plan to do any server-side
    program development (such as custom functions or datatypes written
    in C), then you may want to install the entire PostgreSQL include
    tree into your target include directory. To do that, enter

        gmake install-all-headers

    This adds a megabyte or two to the install footprint, and is only
    useful if you don't plan to keep the whole source tree around for
    reference. (If you do, you can just use the source's include
    directory when building server-side software.)

If Peter's notion of installing server-side headers into a separate subdirectory pans out, it might be worth thinking about installing all headers all the time. Right now I'd vote against it, on the grounds that it adds too much include-file clutter for something that very few people need.

regards, tom lane
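An extension's build script could check up front whether the server-side headers are present instead of failing halfway through compilation. The following Python sketch shows one way to do that under simple assumptions: the caller already knows the installation's include directory, and finding postgres.h anywhere beneath it is treated as good enough. The function name is hypothetical and nothing here is part of PostgreSQL's own tooling.

```python
import os
from typing import Optional


def find_server_header(include_dir: str, header: str = "postgres.h") -> Optional[str]:
    """Return the path of the first matching header under include_dir, or None.

    A result of None suggests only the client headers were installed, in which
    case the install-all-headers step quoted above (or a source tree) is needed.
    """
    for dirpath, _dirnames, filenames in os.walk(include_dir):
        if header in filenames:
            return os.path.join(dirpath, header)
    return None
```

Walking the whole tree rather than testing one fixed path is a deliberate choice, since the thread leaves open whether server headers would live directly in the include directory or in a separate subdirectory.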
> Projects that are as organized, professional, and value-adding as yours is
> can surely stand on their own. I compare this to the recently released
> OpenFTS. If we start including projects of this size we'd explode in size
> and maintenance overhead.

Doesn't this discussion indicate that the time is fast approaching, if not already past, for some type of system for handling installation of 3rd party software? It would seem that two prerequisites would need to be satisfied to do this:

- Definition and implementation of the interface to be provided for extensions. Presumably, this would involve defining a well-designed set of public header files and associated libraries at the right level of granularity. For example, encapsulating each type in its own header file with a standardized set of operations defined in a server-side library would be extremely valuable. The library (or libraries) could be used to link the backend and installed for extensions to take advantage of preexisting types when others need to construct new more complex ones.

- Definition and implementation of a consistent extension management system for retrieving, compiling, and installing extensions. This could even be used for installing the system itself, thereby making the entire operation of managing the software consistent.

I point out that the NetBSD pkgsrc system[1] does the latter in an extremely flexible and well-designed manner, and has been a major foundation for the openpackages project. It even includes 7 distinct packages[2] for different elements of PostgreSQL, not including a number of other packages broken out for different interfaces. The same system could be adopted for managing 3rd party extensions.

Having been involved in defining what header files to install, and having been actively involved in developing new types for use in our installation, I can say that external packaging of PostgreSQL, local extension of PostgreSQL, and management of 3rd party software would be greatly enhanced by an effort to address the two prerequisites mentioned above.

Cheers,
Brook

---------------------------------------------------------------------------
[1] http://www.netbsd.org/Documentation/software/packages.html
[2] ftp://ftp.netbsd.org/pub/NetBSD/packages/pkgsrc/databases/README.html