Thread: The Contrib Roundup (long)
Folks,

I had a lot of time to kill on airplanes recently, so I've gone digging through /contrib in an effort to sort out what's in there and try to apply some consistent rules to it. Before people read further, please understand that this is just an initial discussion on what will and won't be in contrib for 8.1; nobody has made any decisions yet.

What Should Be In Contrib?
-------------------------------
Looking over what's in there, most of the reasonable contrib options fall into 3 groups: extra data types, extra functions and backend utilities. These all seem reasonable things to put into contrib, with the addition of other code being tested for inclusion in the core. These categories also pretty much cover things that need to be inside the PostgreSQL source to build.

What Shouldn't Be In Contrib?
-------------------------------
The things I think we should exclude from contrib are rather more varied. Based on examples:

a) Code with major external dependencies other than a programming language. Partly this is because such code is useful to fewer users; more importantly, it is because the external dependencies mean that the release cycle for these tools is likely to be determined by the external dependency and not by PostgreSQL's release cycle. Further, the external dependencies make it less likely that the PostgreSQL core programmers can maintain them in the event that the original developer goes away. The MySQL conversion scripts are a good example of this; I don't believe that my2pg even works with MySQL 4.

b) Alpha-quality code and unfinished projects. Shipping something with the PostgreSQL source code implies a certain level of stability, completeness and quality. We shouldn't be including scripts which took 2 hours to write and have only been tested on one platform. This stuff can get developed on pgFoundry and moved to contrib when it's close to mature.

c) Differently licensed code. I'm not an attorney: I won't pretend to know which licenses it's legal to bundle in our tarballs and which are not. But I do know that most users and redistributors aren't going to grep contrib looking for other licenses, and putting differently licensed stuff in there is bad PR at best, and a legal booby trap at worst. (Particularly, there are 3 contrib modules by Massimo Dal Zotto, which are GPL licensed. According to the FSF's licensing admin, installing any of these contrib modules will instantly make that copy of PostgreSQL GPL.)

d) Application code and example code. Contrib is *not* a good place for "here's how you do this in an application" kind of code. It's not visible enough to be documentation, and such examples aren't generally useful to the majority of users as code.

Moving to PgFoundry is NOT "Demotion"
----------------------------------------
I know that I'm going to get a lot of resistance for the idea of moving some projects to pgFoundry, because authors feel that it's a "demotion" for their code not to be shipped with the PostgreSQL source. However, being on pgFoundry increases the visibility of your code and allows a wider array of people to contribute to it -- and even find it. And for items of particularly broad utility, stuff can always go from pgFoundry into the core when mature or when utility is demonstrated.

Contrib Subdirectories?
-------------------------------------
I think it would also be helpful to users if we could create subdirectories to organize contrib into categories. This would help users and packagers find what they want. These directories would be:
    data_types/
    functions/
    utilities/
I've noted below which contrib code I think should go in those subdirs.

Contrib Build Options?
---------------------------
I'll point out that several people (including one of our RPM builders) spoke up in favor of the idea of adding ./contrib command line options for individual contrib items. Discussion was dropped without a decision being reached. That would work like:
    ./configure --with-perl --prefix=/usr/pgsql --with-tsearch2 --with-fuzzystrmatch

Documentation
--------------------------
As previously mentioned, all contrib modules need to have documentation in the main PostgreSQL docs. Probably their own section, called "Optional Modules".

Contrib Item Listing
--------------------------------
What follows are my notes on individual contrib projects. Many contain questions because I don't know enough about the item. Please read through them and provide what feedback you can. Especially, provide feedback on the items I'm suggesting eliminating or moving out. I've noted the author contact info where I'm thinking of moving modules, and will be attempting to contact those authors if we decide to change status.

adddepend: is this still needed, or would a proper dump-and-reload from 7.2 add the dependency information anyway?

array: placeholder for old array module; contains only a readme. Should probably be dropped for 8.2.

btree_gist: data_types/

chkpass: data_types/

cube: README needs documentation on what the module is *for*.

dbmirror: should be on pgfoundry/gborg with other replication systems. Stephen Singer (ssinger@navtechinc.com)

dbsize: functions/

earthdistance: data_types/

findoidjoins: again, it's not clear what this module is for. Bruce?

fulltextindex: Obsolesced by Tsearch2. Also rather a brute-force technique for FTI, possibly more useful as an illustration of advanced trigger use than as an index. Move to pgfoundry or techdocs? Maarten Boekhold (maartenb@dutepp0.et.tudelft.nl)

fuzzystrmatch: functions/

intagg: what does this module do which is not already available through the built-in array functions and operators? Maybe I don't understand what it does. Unattributed in the README. Move to pgfoundry?

intarray: data_types/

ipc_check: nice idea, possibly useful but works only on FreeBSD. Needs to be vastly expanded to support multiple platforms. Work on replacing with "Configurator" project at pgfoundry. Author unattributed. Recommend removal.

isbn_issn: more data types. Has anyone tested this one lately? It appears not to have been modified since 7.2. data_types/

lo: another special data type. Is its functionality required anymore? It appears to be a workaround to some limitations of our large object interface which may no longer exist. Author Peter Mount ( peter@retep.org.uk ) data_types/

ltree: data_types/

msql_interface: does anyone use mSQL anymore? In any case, conversion and foreign-database-connection tools definitely belong on pgFoundry. Author Aldrin Leal ( aldrin@americasnet.com ).

mac: A special purpose script which I doubt works on all platforms. Belongs on pgFoundry so that maybe someone will take an interest in expanding it.

misc_utils: I believe that all of these utils are obsolesced by builtin system commands or easily written userspace functions (like max(x,y)). Also, it is under the GPL (see above). Author Massimo Dal Zotto (dz@cs.unitn.it)

mysql: these utilities have been moved to project sites (such as GBorg), and I believe that my2pg is broken with current versions of MySQL. Can we remove this from contrib?

noupdate: this is a cool example of a simple C trigger and would be lovely to have in a doc somewhere. However, its functionality is easily replicated through a simple PL/pgSQL trigger so it seems unnecessary as a contrib module. Author unattributed.

oid2name: a useful backend utility which is used by a number of external tools. What would it take to make this a builtin binary? utilities/

oracle: again, very useful and I wish to move it to pgFoundry and take over maintenance of it. Author Gilles Darold (gilles@darold.net).

pg_autovacuum: moving into the backend.

pg_buffercache: another useful backend utility. Seems perfect for contrib. utilities/

pg_dumplo: is this still required for pg large objects? If so, can't we integrate it into the core? utilities/

pg_trgm: data_types/

pg_upgrade: what's the status of this, Bruce? Does it work at all? Shouldn't this be moved to the pgfoundry project of the same name until it's stable?

pgbench: I see repeated complaints on -performance about how pgbench results are misleading. Why are we shipping it with PostgreSQL then? Shouldn't this be on pgFoundry, maybe in the testperf project? Shouldn't all performance tests be on pgFoundry instead of in the code, unless they're part of regression tests?

pgcrypto: more for /functions. And a good reason to keep the main PostgreSQL ftp servers outside the US :-b

pgstattuple: utilities/

reindexdb: now obsolete per the REINDEX {database} command. Remove from contrib.

rtree_gist: data_types/

seg: data_types/

spi: contains TimeTravel functions. Do these actually still work? The spi stuff is good for documentation purposes anyway ... but if the functions aren't working, it should be in the docs and not /contrib.

start-scripts: utilities/. Needs to be expanded and checked against more OSes.

string: data_types/. Same problem as Massimo's other library; it's GPL. Also, is it really needed at this point? Massimo (dz@cs.unitn.it).

tablefunc: functions/

tips: this is a proto-apache-log-slurping project, in *alpha*. As such, it really needs to be on pgFoundry. Author Terry Mackintosh (terry@terrym.com)

tools: Two of these are emacs scripts, and would be better on pgFoundry if not on Savannah. The find-sources shell script is again GPL and should probably be removed, and moreover appears to have nothing to do with PostgreSQL.

tsearch: obsolesced by tsearch2. Should be moved to pgfoundry where it can be maintained by users needing backwards compatibility.

userlocks: another GPL script, with the problems that entails. Also problematic as it relies heavily on per-record OIDs, something we tell users not to do. Overall, should be removed. Author: Massimo.

vacuumlo: is this still required? If so, utilities/.

xml and xml2: both by John Gray (jgray@azuli.co.uk). John, why do we have two of these? Otherwise, data_types/.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
A few comments scattered inline...

On Tue, Jun 07, 2005 at 02:53:32PM -0300, Josh Berkus wrote:
> c) Differently licensed code. I'm not an attorney: I won't
> pretend to know which licenses it's legal to bundle in our
> tarballs and which are not. But I do know that most users and
> redistributors aren't going to grep contrib looking for other
> licenses, and putting differently licensed stuff in there is
> bad PR at best, and a legal booby trap at worst.
> (Particularly, there are 3 contrib modules by Massimo Dal Zotto,
> which are GPL licensed. According to the FSF's licensing admin,
> installing any of these contrib modules will instantly make that
> copy of PostgreSQL GPL.)

I agree that anything that is not BSD licensed should not go into contrib.

> Contrib Subdirectories?
> -------------------------------------
> I think it would also be helpful to users if we could create
> subdirectories to organize contrib into categories. This would
> help users and packagers find what they want. These
> directories would be:
>     data_types/
>     functions/
>     utilities/
> I've noted below which contrib code I think should go in those
> subdirs.

These directories are misleading since all data types include functions. If we are paring down contrib, I see no reason to reorganize them.

> btree_gist: data_types/

Actually this is an index, not a datatype.

> earthdistance: data_types/

Isn't this just a function?

> intarray: data_types/

What does this do that arrays do not?
On Tue, 2005-06-07 at 13:53, Josh Berkus wrote:
> mysql: these utilities have been moved to project sites (such as
> GBorg), and I believe that my2pg is broken with current versions
> of MySQL. Can we remove this from contrib?

I believe this version now lives at http://gborg.postgresql.org/project/mysql2psql/projdisplay.php, although there are other versions. I agree it should be removed.

> reindexdb: now obsolete per the REINDEX {database} command.
> Remove from contrib.

Actually, I think part of the point of this was to give a command-line version of the reindex command, like we have for vacuum. If that still matters, then it should probably stay. Actually it should probably be converted to C and moved to /src/bin.

> xml and xml2: both by John Gray (jgray@azuli.co.uk). John, why
> do we have two of these? Otherwise, data_types/.

ISTR that xml2 had some expanded capabilities at the expense of additional security issues, but we should wait for the author to jump in.

Josh, was this comprehensive? I don't see dblink, and was thinking there were some others missing... soundex?

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
On 2005-06-07, Josh Berkus <josh@agliodbs.com> wrote:
> userlocks: another GPL script, with the problems that entails.
> Also problematic as it relies heavily on per-record OIDs,
> something we tell users not to do. Overall, should be removed.
> Author: Massimo.

userlocks is just a very thin interface to functionality that's really in the backend. What's left in contrib/userlock probably isn't even copyrightable in any case. The best bet is probably to re-implement it in the backend directly.

Removing it certainly isn't a good idea; the functionality is important.

(It doesn't "rely on per-record OIDs" either.)

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
On Tue, Jun 07, 2005 at 02:53:32PM -0300, Josh Berkus wrote:
> Moving to PgFoundry is NOT "Demotion"
> ----------------------------------------

Yeah, I agree. Lots of people understand "search in pgfoundry.org" much more easily than "see contrib/adddepend".

(I agree with most of the rest of your comments as well.)

> adddepend: is this still needed, or would a proper
> dump-and-reload from 7.2 add the dependency information anyway?

Yes, it's still needed: a normal dump/reload doesn't fix the problem.

> findoidjoins: again, it's not clear what this module is for.
> Bruce?

I don't think this should be a contrib at all. It's more like a developer tool.

> lo: another special data type. Is its functionality required
> anymore? It appears to be a workaround to some limitations of
> our large object interface which may no longer exist.

No, it's still needed I think. It's somewhat redundant with vacuumlo apparently? The functionality of both should be incorporated into the backend somehow, I'd think.

> pg_dumplo: is this still required for pg large objects? If
> so, can't we integrate it into the core? utilities/

I believe pg_dump has this functionality, with -O.

> reindexdb: now obsolete per the REINDEX {database} command.
> Remove from contrib.

No, this is different than REINDEX DATABASE.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
Oh, oh, the Galacian girls will do it for pearls,
and those of Arrakis for water! But if you seek dames
who are consumed like flames,
try a daughter of Caladan! (Gurney Halleck)
>>lo: another special data type. Is its functionality required
>>anymore? It appears to be a workaround to some limitations of
>>our large object interface which may no longer exist.

I **think** the lo datatype is for ODBC binary access.

Sincerely,

Joshua D. Drake

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
> adddepend: is this still needed, or would a proper
> dump-and-reload from 7.2 add the dependency information anyway?

No, a 7.2 to 7.3 or later upgrade will not have full dependency information using pg_dump.

That said, I would abandon the module anyway. I don't recall testing it for a 7.2 to 8.0 upgrade, let alone to 8.1. It's probably been broken in some way by now (table spaces?)

--
Andrew,

> userlocks is just a very thin interface to functionality that's really in
> the backend. What's left in contrib/userlock probably isn't even
> copyrightable in any case. The best bet is probably to re-implement it in
> the backend directly.
>
> Removing it certainly isn't a good idea; the functionality is important.

Hmm. It needs to be re-written from scratch then so that we can remove the GPL, or if you can get an attorney to say it's not copyrightable ...

> (It doesn't "rely on per-record OIDs" either.)

Ah, I misread the code then. It still seems like application code to me, but I'll happily admit to not really understanding it.

--
Josh Berkus
Aglio Database Solutions
San Francisco
"Joshua D. Drake" <jd@commandprompt.com> writes: >> >>>lo: another special data type. Is its functionality required >>>anymore? It appears to be a workaround to some limitations of >>>our large object interface which may no longer exist. > > I **think** the lo datatype is for ODBC binary access. Yes, ISTR needing to install it to use ODBC BLOBs. I wonder if it should be packaged with the ODBC driver instead of being in contrib/? -Doug
Josh Berkus wrote:
> intagg: what does this module do which is not already available
> through the built-in array functions and operators? Maybe I
> don't understand what it does. Unattributed in the README. Move
> to pgfoundry?

Short summary:

Is there an equivalent of "int_array_enum()" built in? I use it for substantial (9X) performance improvements for doing joins similar to those described in its README. I think it can be used to do somewhat similar things with integer arrays that the SQL2003 UNNEST operator does on MULTISETs (but yeah, they're quite different too).

Long and boring, but with examples:

I find that it can speed up certain kinds of joins (like those described in its README) drastically. I have a pretty big application that has a lot of joins that use int_array_enum() to expand an array stored in one column into something that looks like a table, instead of having a third join table connecting two tables. Note that this is often much faster than the array IN/ANY/SOME/NOT IN comparisons because the planner can consider all the various join plans, such as hash joins, while the array operators seem to just do linear searches of the arrays.

This trick is especially useful in conjunction with an aggregate based on the "_int_union" function from the intarray/ contrib module (similar to the FUSION operator for MULTISETS) when you only want distinct values for that type of join.

Sample queries from an actual application showing a factor-of-9 performance improvement (7 seconds to 800ms) are shown below.

    -- similar to the standard FUSION operator for MULTISETS.
    create aggregate intarray_union_agg (
        sfunc = _int_union,
        basetype = int[],
        stype = int[],
        initcond = '{-1}'
    );

    explain analyze
    select fac_nam
      from userfeatures.point_features
      join entity_facets using (entity_id)
     where featureid=115
     group by fac_nam;
    -- Total runtime: 7125.322 ms

    explain analyze
    select fac_nam
      from (select distinct int_array_enum(fac_ids) as fac_id
              from (select distinct fac_ids
                      from entity_facids natural join point_features
                     where featureid=115) as a) as a
      join facet_lookup using (fac_id);
    -- Total runtime: 1297.558 ms

    explain analyze
    select fac_nam
      from (select distinct int_array_enum(fac_ids) as fac_id
              from (select intarray_union_agg(fac_ids) as fac_ids
                      from entity_facids natural join point_features
                     where featureid=115) as a) as a
      join facet_lookup using (fac_id);
    -- Total runtime: 803.187 ms

I don't have access to the system right now, so I don't have the full table definitions - but the basic problem is that there are many "facets" for each row in the "point_features" table and there are many "features" with featureid=115. The queries are trying to find the names of each facet available from that set of point_features.

> intarray: data_types/

Well, the array-of-ints data type is built in, so I think this module is more about the functions, operators, and indexes that it provides that operate on arrays of ints. Would that make it fit better under functions/ in your new directory tree?

If I had a vote, I'd think it nice if the intagg module got merged with the intarray module (wherever it ends up) because they really are quite complementary in providing useful tools for manipulating arrays of ints.
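
To make the join trick concrete without the full application schema, here is a stripped-down sketch. The tables `entity_facids_demo` and `facet_lookup_demo` and their rows are invented for illustration and only mimic the shape of the tables in the queries above; the sketch also assumes contrib/intagg is installed in the database so that int_array_enum() exists.

```sql
-- Hypothetical stand-ins for the tables used in the queries above.
CREATE TABLE facet_lookup_demo (
    fac_id  int PRIMARY KEY,
    fac_nam text
);
CREATE TABLE entity_facids_demo (
    entity_id int PRIMARY KEY,
    fac_ids   int[]      -- takes the place of an entity/facet join table
);

INSERT INTO facet_lookup_demo VALUES (1, 'color');
INSERT INTO facet_lookup_demo VALUES (2, 'size');
INSERT INTO entity_facids_demo VALUES (10, '{1,2}');
INSERT INTO entity_facids_demo VALUES (11, '{2}');

-- int_array_enum() expands each array into one row per element, so the
-- subquery looks like an ordinary table and the planner is free to pick
-- a hash or merge join against the lookup table.
SELECT DISTINCT fac_nam
  FROM (SELECT int_array_enum(fac_ids) AS fac_id
          FROM entity_facids_demo) AS e
  JOIN facet_lookup_demo USING (fac_id);
```

The design trade-off is the one described in the message: the array column saves a third table, and expanding it at query time keeps the join visible to the planner instead of hiding it inside an array comparison.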
elein wrote:
>> intarray: data_types/
>
> what does this do that arrays do not?

It provides lossy indexes that work well on big arrays, as well as some quite useful convenience functions that work on arrays of ints.
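
A rough sketch of what that looks like in practice, assuming contrib/intarray is installed in the database. The table `entity_tags` and its columns are hypothetical; the operator class `gist__intbig_ops`, the `&&` overlap operator and the `sort()`/`uniq()` functions are the ones the intarray README describes, but it is worth checking them against the version you actually have installed.

```sql
-- Hypothetical table: one row per entity, tags kept in an int[] column.
CREATE TABLE entity_tags (
    entity_id int PRIMARY KEY,
    tag_ids   int[]
);

-- GiST index using an operator class supplied by contrib/intarray.
-- gist__intbig_ops is the lossy, signature-based variant aimed at
-- large arrays; gist__int_ops is the one for small and medium arrays.
CREATE INDEX entity_tags_tag_ids_idx
    ON entity_tags USING gist (tag_ids gist__intbig_ops);

-- Overlap test: rows whose tag_ids share at least one element with the
-- given array. This is the kind of predicate the index can serve.
SELECT entity_id
  FROM entity_tags
 WHERE tag_ids && '{115,116}'::int[];

-- Convenience functions: sort the array, then drop adjacent duplicates.
SELECT uniq(sort('{3,1,3,2,1}'::int[]));
```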
On Tuesday, 7 June 2005 19:53, Josh Berkus wrote:
> I think it would also be helpful to users if we could create
> subdirectories to organize contrib into categories. This would
> help users and packagers find what they want. These
> directories would be:
>     data_types/
>     functions/
>     utilities/

I think this is out of the question both because these categories are fuzzy and it would destroy the CVS history. It might be equally effective to organize the README file along these lines.

> I'll point out that several people (including one of our
> RPM builders) spoke up in favor of the idea of adding ./contrib
> command line options for individual contrib items.

Packagers should simply build all contrib items. No extra options are needed.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Hi Robert,

> > reindexdb: now obsolete per the REINDEX {database} command.
> > Remove from contrib.
>
> actually I think part of the point of this was to give a command line
> version of the reindex command, like we have for vacuum. If that still
> matters, then it should probably stay. Actually it should probably be
> converted to C and moved to /src/bin.

I'm thinking of converting it so Windows users can benefit from it. Do we have to move it to /src/bin/scripts? It's similar to the other scripts and can benefit from some code shared by the other scripts. I'll submit a patch ASAP.

Comments?

Euler Taveira de Oliveira
euler[at]yahoo_com_br
On Tue, Jun 07, 2005 at 02:53:32PM -0300, Josh Berkus wrote:
> noupdate: this is a cool example of a simple C trigger and would
> be lovely to have in a doc somewhere. However, its
> functionality is easily replicated through a simple PL/pgSQL
> trigger so it seems unnecessary as a contrib module. Author
> unattributed.

Does noupdate even work correctly? The README is pretty thin so maybe I've misunderstood something. First of all, the example fails due to a case problem:

    CREATE TABLE TEST ( COL1 INT, COL2 INT, COL3 INT );
    CREATE TRIGGER BT BEFORE UPDATE ON TEST
        FOR EACH ROW EXECUTE PROCEDURE noup ('COL1');
    INSERT INTO TEST VALUES (10,20,30);
    UPDATE TEST SET COL1 = 5;
    ERROR:  noup: there is no attribute COL1 in relation test

If we fix the case problem then this particular example works:

    DROP TRIGGER BT ON TEST;
    CREATE TRIGGER BT BEFORE UPDATE ON TEST
        FOR EACH ROW EXECUTE PROCEDURE noup ('col1');
    UPDATE TEST SET COL1 = 5;
    WARNING:  col1: update not allowed
    UPDATE 0

But the trigger won't allow updates on other columns either:

    UPDATE TEST SET COL2 = 15;
    WARNING:  col1: update not allowed
    UPDATE 0

...unless we *do* change COL1 to NULL:

    UPDATE TEST SET COL1 = NULL, COL2 = 15;
    UPDATE 1

The code rejects the update if the new value for the designated column (col1 in this case) is not NULL, rather than checking if its value has changed. Is that the intended behavior?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
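
For comparison, the "simple PL/pgSQL trigger" mentioned in the quoted text could look roughly like the sketch below. This is only an illustration, not the contrib module's code: the function name `test_noup_col1` and trigger name `bt_plpgsql` are made up, it reuses the TEST table and BT trigger from the example above, it assumes the plpgsql language is installed in the database, and it hard-codes the column instead of taking its name as a trigger argument the way noup() does. Unlike the behavior reported above, it compares the old and new values, so updates that leave col1 alone go through.

```sql
-- Hypothetical PL/pgSQL stand-in for noup('col1'); names are illustrative.
CREATE FUNCTION test_noup_col1() RETURNS trigger AS $$
BEGIN
    -- IS DISTINCT FROM compares NULLs like ordinary values, so a change
    -- from or to NULL also counts as a change.
    IF NEW.col1 IS DISTINCT FROM OLD.col1 THEN
        RAISE WARNING 'col1: update not allowed';
        RETURN NULL;   -- skip this row instead of raising an error
    END IF;
    RETURN NEW;        -- col1 unchanged: let the update proceed
END;
$$ LANGUAGE plpgsql;

-- Swap the C trigger from the example for the PL/pgSQL one.
DROP TRIGGER BT ON TEST;
CREATE TRIGGER bt_plpgsql BEFORE UPDATE ON TEST
    FOR EACH ROW EXECUTE PROCEDURE test_noup_col1();

UPDATE TEST SET COL2 = 15;                -- col1 untouched, so permitted
UPDATE TEST SET COL1 = 5 WHERE COL2 = 15; -- skipped if it would change col1
```

Returning NULL from a BEFORE ROW trigger silently skips the row, which matches the "WARNING plus UPDATE 0" behavior shown above; raising an exception instead would abort the whole statement, which may be the better choice depending on the application.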
Peter,

> Packagers should simply build all contrib items. No extra options are
> needed.

No, they shouldn't. 3 of the packages currently in /contrib are GPL. Building them makes all of PostgreSQL GPL.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Peter, > I think this is out of the question both because these categories are fuzzy > and it would destroy the CVS history. It might be equally effective to > organize the README file along these lines. Ach, I forgot about this lovely property of CVS. Well, scratch that proposal. SVN is looking better and better ... > Packagers should simply build all contrib items. No extra options are > needed. Hmmm, when an RPM builds a contrib item, where does the .sql file go? How does an RPM user actually add the functions/datatypes/etc to their database? -- Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes:
>> Packagers should simply build all contrib items. No extra options are
>> needed.

> No, they shouldn't. 3 of the packages currently in /contrib are GPL.
> Building them makes all of PostgreSQL GPL.

The fix for that is to remove or relicense those packages, not to complicate the build process.

regards, tom lane
Tom, > The fix for that is to remove or relicense those packages, not to > complicate the build process. OK. Then we'll make BSD licensing an absolute requirement for /contrib? Also, we'll add --build-all-contrib to ./configure? -- Josh Berkus Aglio Database Solutions San Francisco
On Wed, Jun 08, 2005 at 08:45:42AM -0700, Josh Berkus wrote:
> Peter,
>
> > Packagers should simply build all contrib items. No extra options are
> > needed.
>
> No, they shouldn't. 3 of the packages currently in /contrib are GPL.
> Building them makes all of PostgreSQL GPL.

No, it means the distributors are illegally distributing software they don't have permission to distribute. The GPL doesn't make everything else GPL right away, that's a myth.

The only entity that can change PostgreSQL's license is the copyright owner. Since it's a rather big and unidentified entity, that's difficult. So the only lawful (legal?) way to distribute a binary PostgreSQL distribution is to refrain from distributing GPL-licensed contrib modules.

Or we could remove them from contrib.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Hoy es el primer día del resto de mi vida"
Josh Berkus <josh@agliodbs.com> writes: > Tom, >> The fix for that is to remove or relicense those packages, not to >> complicate the build process. > OK. Then we'll make BSD licensing an absolute requirement for /contrib? That's been the intention for a very long time: everything in the core tarball should be under the same license. Someone's got to do the legwork of contacting the module authors involved to see if they're willing to relicense ... and so far it just hasn't gotten to the top of the to-do queue. regards, tom lane
On Wed, Jun 08, 2005 at 08:59:37AM -0700, Josh Berkus wrote: > Peter, > > > I think this is out of the question both because these categories are fuzzy > > and it would destroy the CVS history. It might be equally effective to > > organize the README file along these lines. > > Ach, I forgot about this lovely property of CVS. Well, scratch that proposal. > SVN is looking better and better ... I'll argue for a change as soon as Monotone is able to import our current repository. I think a distributed SCM is the way to go ... -- Alvaro Herrera (<alvherre[a]surnet.cl>) "Acepta los honores y aplausos y perderás tu libertad"
On Wednesday 08 June 2005 12:05, Alvaro Herrera wrote:
> On Wed, Jun 08, 2005 at 08:45:42AM -0700, Josh Berkus wrote:
> > Peter,
> >
> > > Packagers should simply build all contrib items. No extra options are
> > > needed.
> >
> > No, they shouldn't. 3 of the packages currently in /contrib are GPL.
> > Building them makes all of PostgreSQL GPL.
>
> No, it means the distributors are illegally distributing software they
> don't have permission to distribute. The GPL doesn't make everything
> else GPL right away, that's a myth.
>

In the above scenario, the packages must be distributed under the GPL. This is perfectly legal for both postgresql and those gpl contrib modules. It would be incorrect (and therefore technically illegal) to distribute the above combo with postgresql as bsd and the contribs as gpl, since that violates the license that has been granted by the contrib modules.

> The only entity that can change PostgreSQL's license is the copyright
> owner. Since it's a rather big and unidentified entity, that's
> difficult. So the only lawful (legal?) way to distribute a binary
> PostgreSQL distribution is to refrain from distributing GPL-licensed
> contrib modules.
>

That's just not true. Anyone can relicense their own distribution of postgresql under any license they see fit, as long as they adhere to the license that they were given with their copy of postgresql (which basically just means keeping the copyrights intact). That's how folks can sell proprietary packages under closed licenses.

> Or we could remove them from contrib.

That's what I would recommend if we can't get them relicensed.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
People:

> > No, it means the distributors are illegally distributing software they
> > don't have permission to distribute. The GPL doesn't make everything
> > else GPL right away, that's a myth.

I'm not talking out of my hat here. I consulted a staff member of the FSF about it (will give name as soon as I sort through my business cards from the conference). According to him, if someone builds PostgreSQL with a GPL contrib module, then all of *their copy* of PostgreSQL becomes GPL. While there is nothing illegal about this, it would not be desirable for most PostgreSQL users and they would be absolutely right to be mad at us for building a "licensing booby trap" into /contrib.

> That's what I would recommend if we can't get them relicensed.

I will point out that all three "GPL" modules are currently unmaintained. I don't know that anyone has seen Massimo in years. Simply dropping them seems the easiest answer.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
Josh Berkus <josh@agliodbs.com> writes:
> I will point out that all three "GPL" modules are currently unmaintained.
> I don't know that anyone has seen Massimo in years. Simply dropping them
> seems the easiest answer.

The original authors of the backend code haven't been seen on this list in a long time, either ;-). That doesn't make either the backend or these contrib modules "unmaintained". userlock in particular is not practical to drop without a replacement, because people are definitely using it.

A quick grep for "General Public License" finds these files:

    contrib/dbmirror/clean_pending.pl
    contrib/miscutil/README.misc_utils
    contrib/miscutil/misc_utils.c
    contrib/miscutil/misc_utils.sql.in
    contrib/string/README.string_io
    contrib/string/string_io.c
    contrib/string/string_io.sql.in
    contrib/tools/find-sources
    contrib/tsearch/dict/porter_english.dct
    contrib/userlock/README.user_locks
    contrib/userlock/user_locks.c
    contrib/userlock/user_locks.sql.in

I think the dbmirror one is just a mistake, since the README has a BSD-type license, but we'd need to get Steven Singer to confirm that. The tsearch one is more of a problem, but Oleg and Teodor have been wanting to obsolete tsearch anyway, so dropping it would work. That leaves us with four modules to be looked at ... and yeah, they do all seem to be Massimo's. Anyone want to try to contact him?

regards, tom lane
On Wed, Jun 08, 2005 at 11:13:01AM -0700, Josh Berkus wrote:
> People:
>
> > > No, it means the distributors are illegally distributing software they
> > > don't have permission to distribute. The GPL doesn't make everything
> > > else GPL right away, that's a myth.
>
> I'm not talking out of my hat here.

I don't expect you to talk inside your hat either.

Anyway, I realized that our license does allow this to happen without any sort of problem; but I don't see what's so bad about it. The person who receives a GPL'd Postgres can get a BSD Postgres just as easily, should the need arise.

The opinion I gave earlier, quoted above, is correct for other licenses, e.g. any commercial license. Open source detractors tell everyone who listens to them that including GPL code in their packages automatically makes those packages GPL'ed as well, which is false.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"You knock on that door or the sun will be shining on places inside you that the sun doesn't usually shine" (in Death: "The High Cost of Living")
On Wed, 8 Jun 2005, Josh Berkus wrote:

> Peter,
>
>> Packagers should simply build all contrib items. No extra options are
>> needed.
>
> No, they shouldn't. 3 of the packages currently in /contrib are GPL.
> Building them makes all of PostgreSQL GPL.

Then they should be removed ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
On Wed, 8 Jun 2005, Peter Eisentraut wrote: > Am Dienstag, 7. Juni 2005 19:53 schrieb Josh Berkus: >> I think it would also be helpful to users if we could create >> subdirectories to organize contrib into categories. This would >> help users and packagers find what they want. These >> directories would be: >> data_types/ >> functions/ >> utilities/ > > I think this is out of the question both because these categories are fuzzy > and it would destroy the CVS history. Why would it destroy the history? Its easy enough to move the files to a subdirectory without losing any history ... hell, we did it when we moved JDBC/ODBC out of core, the history was maintained ... ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
"Marc G. Fournier" <scrappy@postgresql.org> writes: > On Wed, 8 Jun 2005, Peter Eisentraut wrote: >> I think this is out of the question both because these categories are fuzzy >> and it would destroy the CVS history. > Why would it destroy the history? Its easy enough to move the files to a > subdirectory without losing any history ... hell, we did it when we moved > JDBC/ODBC out of core, the history was maintained ... I don't think you can just move the files --- that will break future builds of the back branches (unless you intend to make the rearrangement retroactive). I agree with Peter's point anyway: the main value of classifying these things is for documentation, and just structuring the top-level README that way would be sufficient. Physically changing the hierarchy is just much more work than it's worth. regards, tom lane
On Wed, Jun 08, 2005 at 04:21:46PM -0300, Marc G. Fournier wrote: > On Wed, 8 Jun 2005, Peter Eisentraut wrote: > > >Am Dienstag, 7. Juni 2005 19:53 schrieb Josh Berkus: > >>I think it would also be helpful to users if we could create > >>subdirectories to organize contrib into categories. This would > >>help users and packagers find what they want. These > >>directories would be: > >>data_types/ > >>functions/ > >>utilities/ > > > >I think this is out of the question both because these categories are fuzzy > >and it would destroy the CVS history. > > Why would it destroy the history? Its easy enough to move the files to a > subdirectory without losing any history ... hell, we did it when we moved > JDBC/ODBC out of core, the history was maintained ... Can't do that, because if you do the files will disappear in (say) 7.4 releases, and we don't want that, do we? -- Alvaro Herrera (<alvherre[a]surnet.cl>) "No reniegues de lo que alguna vez creíste"
On Wed, 8 Jun 2005, Alvaro Herrera wrote: > On Wed, Jun 08, 2005 at 04:21:46PM -0300, Marc G. Fournier wrote: >> On Wed, 8 Jun 2005, Peter Eisentraut wrote: >> >>> Am Dienstag, 7. Juni 2005 19:53 schrieb Josh Berkus: >>>> I think it would also be helpful to users if we could create >>>> subdirectories to organize contrib into categories. This would >>>> help users and packagers find what they want. These >>>> directories would be: >>>> data_types/ >>>> functions/ >>>> utilities/ >>> >>> I think this is out of the question both because these categories are fuzzy >>> and it would destroy the CVS history. >> >> Why would it destroy the history? Its easy enough to move the files to a >> subdirectory without losing any history ... hell, we did it when we moved >> JDBC/ODBC out of core, the history was maintained ... > > Can't do that, because if you do the files will disappear in (say) 7.4 > releases, and we don't want that, do we? Hrmmm, good point that I hadn't thought of ... unless, of course, we back-patch the build changes ... ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
On Tue, 07 Jun 2005 14:53:32 -0300, Josh Berkus wrote: [Discussion snipped] > xml and xml2: both by John Gray (jgray@azuli.co.uk). John, why do we have > two of these? Otherwise, data_types/. contrib/xml2 is a lot better than /xml. When I submitted the new code, Bruce felt that /xml should be kept for compatibility in case there were people using it (the API changed completely). Personally, I'd be very happy for /xml to go - it's not nearly as good as /xml2 (/xml2 has some serious production users as far as I can tell- anyone who's ever asked me about building /xml has been pointed to /xml2). There's no maintenance effort for /xml, but I'm still working on /xml2. Hope that clears things up a bit! Regards John
On Wed, Jun 08, 2005 at 06:50:06PM -0300 I heard the voice of Marc G. Fournier, and lo! it spake thus: > On Wed, 8 Jun 2005, Alvaro Herrera wrote: > >On Wed, Jun 08, 2005 at 04:21:46PM -0300, Marc G. Fournier wrote: > >> > >>Why would it destroy the history? Its easy enough to move the files to a > >>subdirectory without losing any history ... hell, we did it when we moved > >>JDBC/ODBC out of core, the history was maintained ... > > > >Can't do that, because if you do the files will disappear in (say) 7.4 > >releases, and we don't want that, do we? > > Hrmmm, good point that I hadn't thought of ... unless, of course, we > back-patch the build changes ... That's why you COPY the files in the repo, cvs rm the old locations (so they still exist on older tags/branches), and do some surgery on the new locations to remove the old tags (though you can't remove branches last I checked, without more serious magic; you could go in and cvs rm on the branches I guess, which is better than nothing, though more work) so they don't start showing up on old release co's. It's nasty, but it works. -- Matthew Fuller (MF4839) | fullermd@over-yonder.net Systems/Network Administrator | http://www.over-yonder.net/~fullermd/ On the Internet, nobody can hear you scream.
On Wed, Jun 08, 2005 at 05:54:08PM -0500, Matthew D. Fuller wrote:

> That's why you COPY the files in the repo, cvs rm the old locations
> (so they still exist on older tags/branches), and do some surgery on

Hmm, while we are on the subject of playing with our CVS server, could we fix some other things?

* cvsweb could display the correct $PostgreSQL$ tags ... right now it's only getting what's changed at each commit (i.e. it's always one version behind)

  (Additionally it'd be nice to know what on earth one has to set so that a CVSup-acquired repository behaves the same with local checkouts.)

* cvsweb could display tabs as 4 spaces, just like the code expects. This is mostly a minor annoyance.

I had other gripes but I forget them right now :-(

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"El realista sabe lo que quiere; el idealista quiere lo que sabe" (Anónimo)
On Wed, 8 Jun 2005, Matthew D. Fuller wrote:

> On Wed, Jun 08, 2005 at 06:50:06PM -0300 I heard the voice of
> Marc G. Fournier, and lo! it spake thus:
>> On Wed, 8 Jun 2005, Alvaro Herrera wrote:
>>> On Wed, Jun 08, 2005 at 04:21:46PM -0300, Marc G. Fournier wrote:
>>>>
>>>> Why would it destroy the history? Its easy enough to move the files to a
>>>> subdirectory without losing any history ... hell, we did it when we moved
>>>> JDBC/ODBC out of core, the history was maintained ...
>>>
>>> Can't do that, because if you do the files will disappear in (say) 7.4
>>> releases, and we don't want that, do we?
>>
>> Hrmmm, good point that I hadn't thought of ... unless, of course, we
>> back-patch the build changes ...
>
> That's why you COPY the files in the repo, cvs rm the old locations
> (so they still exist on older tags/branches), and do some surgery on
> the new locations to remove the old tags (though you can't remove
> branches last I checked, without more serious magic; you could go in
> and cvs rm on the branches I guess, which is better than nothing,
> though more work) so they don't start showing up on old release co's.

Actually, simpler yet would be to just 'cvs add' everything in the new place, and cvs remove it on HEAD in the old ... the 'cvs add' log entry should state something like 'moved from ...'

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
"Marc G. Fournier" <scrappy@postgresql.org> writes: > On Wed, 8 Jun 2005, Matthew D. Fuller wrote: >> That's why you COPY the files in the repo, cvs rm the old locations >> (so they still exist on older tags/branches), and do some surgery on >> the new locations to remove the old tags (though you can't remove >> branches last I checked, without more serious magic; you could go in >> and cvs rm on the branches I guess, which is better than nothing, >> though more work) so they don't start showing up on old release co's. > Actually, simplier yet would be to just 'cvs add' everything in th enew > place, and cvs remove it on HEAD in the old ... the 'cvs add' log entry > should state something like 'moved from ...' Yeah ... at that point you've more or less erased the history from the new files anyway, no? Might as well just cut it over. regards, tom lane
Tom Lane wrote: > That's been the intention for a very long time: everything in the core > tarball should be under the same license. Someone's got to do the > legwork of contacting the module authors involved to see if they're > willing to relicense ... and so far it just hasn't gotten to the top > of the to-do queue. I've volunteered to do this in the past, and the response was that it is something that only members of core are in a position to do this. That is perfectly reasonable, but that was quite some time ago -- it would be nice to see some movement on this... I think shipping a pure-BSD 8.1 is a reasonable goal. -Neil
Neil, > I've volunteered to do this in the past, and the response was that it is > something that only members of core are in a position to do this. That > is perfectly reasonable, but that was quite some time ago -- it would be > nice to see some movement on this... I thought I *was* moving on this. Frankly, until Marc posted I wasn't aware that it was *possible* to have differently-licensed stuff except in /contrib. -- Josh Berkus Aglio Database Solutions San Francisco
On Wed, 8 Jun 2005, Josh Berkus wrote: > Neil, > >> I've volunteered to do this in the past, and the response was that it is >> something that only members of core are in a position to do this. That >> is perfectly reasonable, but that was quite some time ago -- it would be >> nice to see some movement on this... > > I thought I *was* moving on this. Frankly, until Marc posted I wasn't > aware that it was *possible* to have differently-licensed stuff except > in /contrib. What did I post? *raised eyebrow* ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
Marc, > What did I post? *raised eyebrow* Didn't you grep the source for "GPL"? Or was it someone else? -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: > Didn't you grep the source for "GPL"? Or was it someone else? That was me... regards, tom lane
On Thu, 9 Jun 2005, Josh Berkus wrote: > Marc, > >> What did I post? *raised eyebrow* > > Didn't you grep the source for "GPL"? Or was it someone else? Someone else :) ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
> actually I think part of the point of this was to give a command line > version of the reindex command, like we have for vaccum. If that still > matters, then it should probably stay. Actually it should probably be > converted to C and moved to /src/bin. > Wouldn't something like echo 'REINDEX DATABASE {database};' | psql {database} be easier? Of course it would need a working shell, but even Windows can do this, I believe.
On Friday 10 June 2005 10:54 am, Kaare Rasmussen wrote: > > actually I think part of the point of this was to give a command > > line version of the reindex command, like we have for vaccum. If > > that still matters, then it should probably stay. Actually it > > should probably be converted to C and moved to /src/bin. > > Wouldn't something like > > echo 'REINDEX DATABASE {database};' | psql {database} > > be easier? But not as easy as: psql -c "reindex database {database}" {database} Add connection options as desired. Cheers, Steve
Josh Berkus <josh@agliodbs.com> writes:
> I had a lot of time to kill on airplanes recently so I've gone
> digging through /contrib in an effort to sort out what's in
> there and try to apply some consistent rules to it.

Sorry for not responding sooner; I'm catching up on back email.

As already noted, I agree with most of your goals here, though I'm with Peter that restructuring the directory hierarchy is more trouble than it's worth; just organizing the docs that way should be sufficient.

Here are some comments about the individual modules (where not stated, I agree with your evaluation):

> adddepend: is this still needed, or would a proper
> dump-and-reload from 7.2 add the dependency information anyway?

It is still theoretically useful, but given that the author feels it is unmaintained and possibly broken, I would agree with removing it (or maybe better, push to pgfoundry). Anyone trying to jump straight from 7.2-or-earlier to 8.1 is probably going to have worse issues than lack of dependencies anyway.

>> dbase

You seem to have missed this one. I would argue that it should go to pgfoundry as it is a data conversion tool.

> findoidjoins: again, it's not clear what this module is for.

We need it to generate the oidjoins regression test. Possibly it should move into src/tools; I can't see any reason that ordinary users would want it.

> intagg: what does this module do which is not already available
> through the built-in array functions and operators? Maybe I
> don't understand what it does. Unattributed in the README. Move
> to pgfoundry?

The aggregate is functionally equivalent to ARRAY(sub-SELECT) but I think that the aggregate notation is probably easier to use in many scenarios. The other function is basically the reverse conversion: given an array, return a setof integers. I don't think that we currently have a built-in equivalent to that.

The functionality is useful but severely limited by the fact that it only works on one datatype (int4) --- I'd like to see it reimplemented as polymorphic functions.

> lo: another special data type. Is its functionality required
> anymore?

The datatype as such is pretty much a waste of time --- you might as well use OID. (We could replace the datatype with a domain over OID and have a compatible one-line implementation...) The useful part of this is the "lo_manage" trigger, which essentially supports automatic dropping of large objects when the (assumed unique) references to them from the main database go away. It'd perhaps make sense to migrate lo_manage into the main backend and lose the rest.

> misc_utils: I believe that all of these utils are obsolesced by
> builtin system commands or easily written userspace functions
> (like max(x,y)). Also, is under the GPL (see above). Author
> Massimo Dal Zotto (dz@cs.unitn.it)

I agree with just summarily removing this one.

> noupdate: this is a cool example of a simple C trigger and would
> be lovely to have in a doc somewhere.

As somebody else noted, it's completely broken: it does not do at all what the documentation claims. There are much more interesting trigger examples under spi/, so I'd agree with removal.

> pg_dumplo: is this still required for pg large objects? If
> so, can't we integrate it into the core? utilities/

Probably drop; this was long ago superseded by functionality in pg_dump.

> pg_upgrade: what's the status of this, Bruce? Does it work at
> all? Shouldn't this be moved to the pgfoundry project of the
> same name until it's stable?

Doesn't work and hasn't worked in a long time. I'd agree with removal.

> pgbench: I see repeated complaints on -performance about how
> pgbench results are misleading. Why are we shipping it with
> PostgreSQL then?

It's handy to have *some* simple concurrent-behavior test included, even if it's not something we put a lot of stock in. The parallel regression tests are a joke as far as exercising concurrent updates go --- I think pgbench is an important test tool for that reason. I'd not vote to remove this without a better replacement.

> reindexdb: now obsolete per the REINDEX {database} command.
> Remove from contrib.

Per other followups, this isn't obsolete at all. Possibly the functionality could be merged into vacuumdb, rather than writing a whole 'nother program?

> spi: contains TimeTravel functions. Do these actually still
> work? The spi stuff is good for documentation purposes anyway
> ... but if the functions aren't working, should be in the docs
> and not /contrib.

Not only do they work, several of them are used in the regression tests.

> string: data_types/ Same problem as Massimo's
> other library; it's GPL. Also, is it really needed at this
> point? Massimo (dz@cs.unitn.it).

Actually, I've never looked closely at this before, and now that I have I've got a serious problem with the proposed mode of use: overwriting the typoutput functions for standard datatypes is just a guaranteed recipe for breaking client code left and right. The functions might be safe and useful if invoked manually though.

> userlocks: another GPL script, with the problems that entails.

As already pointed out, we should rewrite this from scratch in the main backend.

> vacuumlo: is this still required? If utilities/.

Yes, it is if you aren't using the lo_manage trigger ...

> xml and xml2: both by John Gray (jgray@azuli.co.uk). John, why
> do we have two of these? Otherwise, data_types/.

xml needs to be retired, same as tsearch.

regards, tom lane
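To make the remark about lo concrete, the "domain over OID" idea might look roughly like the sketch below. It assumes the contrib lo type is not installed but a lo_manage trigger function is available, which is the end state the message describes and not how things stand today. The table image and column raster are hypothetical names used only for illustration.

```sql
-- Assumption: a one-line stand-in for the contrib "lo" datatype.
CREATE DOMAIN lo AS oid;

-- Hypothetical table holding a reference to a large object.
CREATE TABLE image (title text, raster lo);

-- lo_manage takes the column name as its argument; the intent is that the
-- referenced large object is dropped when the row is deleted or the
-- reference is changed.
CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image
    FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
```

Because the domain is just an OID underneath, existing columns declared as lo would keep storing the same values; only the trigger does any real work.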
> But not as easy as: > psql -c "reindex database {database}" {database} Well it was just to show that there really is no need for a program just for this functionality.
On 2005-06-10, Kaare Rasmussen <kar@kakidata.dk> wrote: >> But not as easy as: >> psql -c "reindex database {database}" {database} > > Well it was just to show that there really is no need for a program just for > this functionality. Either you're misunderstanding what "reindex database" does (it reindexes only the system catalogs), or you're misunderstanding what reindexdb does (it reindexes all indexes in the specified database, or all databases, unless told otherwise). -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
> Either you're misunderstanding what "reindex database" does (it reindexes > only the system catalogs), or you're misunderstanding what reindexdb does OK, I was taking the face value here. I consider this a bug, or at least a badly thought out name. I can't understand that someone approved 'reindex database' to mean 'reindex the system tables of a database'.
Kaare Rasmussen wrote: > > > Either you're misunderstanding what "reindex database" does (it reindexes > > only the system catalogs), or you're misunderstanding what reindexdb does > > OK, I was taking the face value here. > > I consider this a bug, or at least a badly thought out name. I can't > understand that someone approved 'reindex database' to mean 'reindex the > system tables of a database'. Agreed. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On 6/10/2005 3:04 PM, Tom Lane wrote:

>> pgbench: I see repeated complaints on -performance about how
>> pgbench results are misleading. Why are we shipping it with
>> PostgreSQL then?
>
> It's handy to have *some* simple concurrent-behavior test included,
> even if it's not something we put a lot of stock in. The parallel
> regression tests are a joke as far as exercising concurrent updates
> go --- I think pgbench is an important test tool for that reason.
> I'd not vote to remove this without a better replacement.

In any case it shouldn't have a name that suggests any relationship with performance measurement. Maybe we can rename this one to pgstresstest1 or something similar?

>> spi: contains TimeTravel functions. Do these actually still
>> work? The spi stuff is good for documentation purposes anyway
>> ... but if the functions aren't working, should be in the docs
>> and not /contrib.
>
> Not only do they work, several of them are used in the regression tests.

But I wonder about their general usefulness. Most of the functions in here are really examples of how to develop simple triggers in C: triggers that can be defined in 5 lines of pl/pgsql, and looking at the logs they all predate procedural languages (and foreign keys in the refint case).

I'd say they have more educational character and should move into documentation. Those functions used by the regression test are supposed to be under src/test/regression.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Kaare Rasmussen wrote:
>> I consider this a bug, or at least a badly thought out name. I can't
>> understand that someone approved 'reindex database' to mean 'reindex the
>> system tables of a database'.

> Agreed.

It's always bothered me too. How about

    REINDEX SYSTEM   -> system tables (current meaning of R. DATABASE)
    REINDEX USER     -> all non-system tables
    REINDEX DATABASE -> both of the above

regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Kaare Rasmussen wrote: > >> I consider this a bug, or at least a badly thought out name. I can't > >> understand that someone approved 'reindex database' to mean 'reindex the > >> system tables of a database'. > > > Agreed. > > It's always bothered me too. How about > > REINDEX SYSTEM -> system tables (current meaning of R. DATABASE) > REINDEX USER -> all non-system tables > REINDEX DATABASE -> both of the above I like that. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Fri, Jun 10, 2005 at 03:04:49PM -0400, Tom Lane wrote:
> > misc_utils: I believe that all of these utils are obsolesced by
> > builtin system commands or easily written userspace functions
> > (like max(x,y)). Also, is under the GPL (see above). Author
> > Massimo Dal Zotto (dz@cs.unitn.it)

What about just adding max(x,y) and min(x,y) to the system functions? Personally I always end up adding these anyway... Versions that accept arrays would probably be useful as well.

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
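For anyone who has not written these by hand, the kind of userspace helper being discussed is only a few lines. The sketch below is illustrative, not code from misc_utils: the names max2 and min2 are hypothetical, chosen so they cannot be mistaken for the max() and min() aggregates, and it covers integer only, so each further type would need its own pair.

```sql
-- Hypothetical two-argument helpers (names are illustrative, not from contrib).
-- STRICT means the result is NULL if either argument is NULL.
CREATE FUNCTION max2(integer, integer) RETURNS integer AS
    'SELECT CASE WHEN $1 >= $2 THEN $1 ELSE $2 END'
    LANGUAGE sql IMMUTABLE STRICT;

CREATE FUNCTION min2(integer, integer) RETURNS integer AS
    'SELECT CASE WHEN $1 <= $2 THEN $1 ELSE $2 END'
    LANGUAGE sql IMMUTABLE STRICT;

-- Compares two values within a single row: returns 7 and 3.
SELECT max2(3, 7), min2(3, 7);
```

These compare two values in one row, whereas the aggregates collapse one column across many rows, so the need to repeat the definition per datatype is a fair argument for having something built in.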
On Sat, 11 Jun 2005, Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Kaare Rasmussen wrote:
>>> I consider this a bug, or at least a badly thought out name. I can't
>>> understand that someone approved 'reindex database' to mean 'reindex the
>>> system tables of a database'.
>
>> Agreed.
>
> It's always bothered me too. How about
>
>     REINDEX SYSTEM    -> system tables (current meaning of R. DATABASE)
>     REINDEX USER      -> all non-system tables
>     REINDEX DATABASE  -> both of the above

Why all the choices? What cases are there for doing one without the
other? If you want to get 'fine tuned', do a 'REINDEX TABLE' ... I can
see REINDEX SYSTEM and REINDEX DATABASE (includes SYSTEM), but not the
USER one ..

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

"Marc G. Fournier" <scrappy@postgresql.org> writes: > On Sat, 11 Jun 2005, Tom Lane wrote: >> It's always bothered me too. How about >> >> REINDEX SYSTEM -> system tables (current meaning of R. DATABASE) >> REINDEX USER -> all non-system tables >> REINDEX DATABASE -> both of the above > Why all the choices? What cases are there for doing one without the > other? If you want to get 'fine tuned', do a 'REINDEX TABLE' ... I can > see REINDEX SYSTEM and REINDEX DATABASE (includes SYSTEM), but not the > USER one .. The main argument I can think of for REINDEX USER is that it could be executed by someone who isn't necessarily superuser. Not sure how important that is, though. regards, tom lane
"Jim C. Nasby" <decibel@decibel.org> writes: > What about just adding max(x,y) and min(x,y) to the system functions? There's already a patch in the queue to do these using the Oracle spellings (ie, GREATEST(...), LEAST(...)). ISTM that those are better choices of name, because (a) they are a de facto standard, and (b) they don't invite confusion with the max() and min() aggregates, which do something significantly different. regards, tom lane
On Sun, 12 Jun 2005, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
>> On Sat, 11 Jun 2005, Tom Lane wrote:
>>> It's always bothered me too. How about
>>>
>>>     REINDEX SYSTEM    -> system tables (current meaning of R. DATABASE)
>>>     REINDEX USER      -> all non-system tables
>>>     REINDEX DATABASE  -> both of the above
>
>> Why all the choices? What cases are there for doing one without the
>> other? If you want to get 'fine tuned', do a 'REINDEX TABLE' ... I can
>> see REINDEX SYSTEM and REINDEX DATABASE (includes SYSTEM), but not the
>> USER one ..
>
> The main argument I can think of for REINDEX USER is that it could be
> executed by someone who isn't necessarily superuser. Not sure how
> important that is, though.

Couldn't behaviour of REINDEX DATABASE not take that into account, and
'skip' the system indices if not superuser?

----
Marc G. Fournier

"Marc G. Fournier" <scrappy@postgresql.org> writes: > >> Why all the choices? What cases are there for doing one without the > >> other? If you want to get 'fine tuned', do a 'REINDEX TABLE' ... I can > >> see REINDEX SYSTEM and REINDEX DATABASE (includes SYSTEM), but not the > >> USER one .. > > > > The main argument I can think of for REINDEX USER is that it could be > > executed by someone who isn't necessarily superuser. Not sure how > > important that is, though. > > Couldn't behaviour of REINDEX DATABASE not take that into account, and 'skip' > the system indices if not superuser? I can see a reasonable argument for them to be separated like this. If I wanted to reindex everything in sight in a large database I would want to control when each of my user tables was reindexed -- some of them would take all night for a single table. But all the system tables together should never be so large as to be a problem doing them in a single batch and I would never be able to enumerate them all myself. So I would probably start with a REINDEX SYSTEM and then go through my tables and group them into chunks to run in each maintenance window available. Of course online index rebuilds would be even better :) -- greg
On Mon, 13 Jun 2005, Greg Stark wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
>
>>>> Why all the choices? What cases are there for doing one without the
>>>> other? If you want to get 'fine tuned', do a 'REINDEX TABLE' ... I can
>>>> see REINDEX SYSTEM and REINDEX DATABASE (includes SYSTEM), but not the
>>>> USER one ..
>>>
>>> The main argument I can think of for REINDEX USER is that it could be
>>> executed by someone who isn't necessarily superuser. Not sure how
>>> important that is, though.
>>
>> Couldn't behaviour of REINDEX DATABASE not take that into account, and 'skip'
>> the system indices if not superuser?
>
> I can see a reasonable argument for them to be separated like this. If I
> wanted to reindex everything in sight in a large database I would want to
> control when each of my user tables was reindexed -- some of them would take
> all night for a single table.
>
> But all the system tables together should never be so large as to be a problem
> doing them in a single batch and I would never be able to enumerate them all
> myself.
>
> So I would probably start with a REINDEX SYSTEM and then go through my tables
> and group them into chunks to run in each maintenance window available.
>
> Of course online index rebuilds would be even better :)

Right, so that would be in favor of REINDEX {SYSTEM,DATABASE}, which I
don't question ... the only one I question is adding a third USER one ...
it can't hurt, but is it required?

----
Marc G. Fournier

On 6/12/2005 8:03 PM, Marc G. Fournier wrote:

> Couldn't behaviour of REINDEX DATABASE not take that into account, and
> 'skip' the system indices if not superuser?

Silently doing something other than what the user requested ... I don't
think this is the right way to become the most popular open source
database in the world.


Jan

On Monday, 13.06.2005, at 08:16 -0400, Jan Wieck wrote:
> On 6/12/2005 8:03 PM, Marc G. Fournier wrote:
>
> > Couldn't behaviour of REINDEX DATABASE not take that into account, and
> > 'skip' the system indices if not superuser?
>
> Silently doing something other than what the user requested ... I don't
> think this is the right way to become the most popular open source
> database in the world.

Hehe. The currently most "popular" database actually has this habit of
silently doing something different from what the user requested.

But I agree: regardless of that bad example, postgres should not work
like this ;)

--
Tino Wildenhain <tino@wildenhain.de>

On Mon, 13 Jun 2005, Jan Wieck wrote:

> On 6/12/2005 8:03 PM, Marc G. Fournier wrote:
>
>> Couldn't behaviour of REINDEX DATABASE not take that into account, and
>> 'skip' the system indices if not superuser?
>
> Silently doing something other than what the user requested ... I don't think
> this is the right way to become the most popular open source database in the
> world.

But, we are already doing that, no? :) I know I'm one that has been
bitten by 'DATABASE' != "all tables in the database" :)

----
Marc G. Fournier

"Marc G. Fournier" <scrappy@postgresql.org> writes: > On Mon, 13 Jun 2005, Jan Wieck wrote: >> Silently doing something other than what the user requested ... I >> don't think this is the right way to become the most popular open >> source database in the world. > But, we are already doing that, no? :) I know I'm one that has been > bitten by 'DATABASE' != "all tables in the database" :) But that's what we're trying to fix here --- ie, eliminate surprises. Perhaps we should follow the precedent of VACUUM: it skips over tables you don't have permission to process, but emits a WARNING. regards, tom lane
Marc G. Fournier wrote:

> On Mon, 13 Jun 2005, Jan Wieck wrote:
>
>> On 6/12/2005 8:03 PM, Marc G. Fournier wrote:
>>
>>> Couldn't behaviour of REINDEX DATABASE not take that into account,
>>> and 'skip' the system indices if not superuser?
>>
>> Silently doing something other than what the user requested ... I
>> don't think this is the right way to become the most popular open
>> source database in the world.
>
> But, we are already doing that, no? :) I know I'm one that has been
> bitten by 'DATABASE' != "all tables in the database" :)

If we are, then we should stop.

cheers

andrew

On Mon, 13 Jun 2005, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
>> On Mon, 13 Jun 2005, Jan Wieck wrote:
>>> Silently doing something other than what the user requested ... I
>>> don't think this is the right way to become the most popular open
>>> source database in the world.
>
>> But, we are already doing that, no? :) I know I'm one that has been
>> bitten by 'DATABASE' != "all tables in the database" :)
>
> But that's what we're trying to fix here --- ie, eliminate surprises.
>
> Perhaps we should follow the precedent of VACUUM: it skips over tables
> you don't have permission to process, but emits a WARNING.

That sounds perfect ...

----
Marc G. Fournier

On Mon, 13 Jun 2005, Andrew Dunstan wrote:

> Marc G. Fournier wrote:
>
>> On Mon, 13 Jun 2005, Jan Wieck wrote:
>>
>>> On 6/12/2005 8:03 PM, Marc G. Fournier wrote:
>>>
>>>> Couldn't behaviour of REINDEX DATABASE not take that into account, and
>>>> 'skip' the system indices if not superuser?
>>>
>>> Silently doing something other than what the user requested ... I don't
>>> think this is the right way to become the most popular open source
>>> database in the world.
>>
>> But, we are already doing that, no? :) I know I'm one that has been bitten
>> by 'DATABASE' != "all tables in the database" :)
>
> If we are, then we should stop.

sorry, I wasn't advocating that we do so, just pointing out that we were
... I like Tom's suggestion of following our VACUUM precedent, where we do
skip, but warn that we are doing so ...

----
Marc G. Fournier

On 6/13/2005 2:29 PM, Marc G. Fournier wrote:

> On Mon, 13 Jun 2005, Andrew Dunstan wrote:
>
>> Marc G. Fournier wrote:
>>
>>> On Mon, 13 Jun 2005, Jan Wieck wrote:
>>>
>>>> On 6/12/2005 8:03 PM, Marc G. Fournier wrote:
>>>>
>>>>> Couldn't behaviour of REINDEX DATABASE not take that into account, and
>>>>> 'skip' the system indices if not superuser?
>>>>
>>>> Silently doing something other than what the user requested ... I don't
>>>> think this is the right way to become the most popular open source
>>>> database in the world.
>>>
>>> But, we are already doing that, no? :) I know I'm one that has been bitten
>>> by 'DATABASE' != "all tables in the database" :)
>>
>> If we are, then we should stop.
>
> sorry, I wasn't advocating that we do so, just pointing out that we were
> ... I like Tom's suggestion of following our VACUUM precedent, where we do
> skip, but warn that we are doing so ...

That is a perfectly fine compromise. Do what's possible but warn the
user that not all she asked for has been accomplished.


Jan
