Thread: Re: [GENERAL] Future of PostgreSQL
At 06:25 PM 12/25/99 +0100, Peter Eisentraut wrote:
>On Sat, 25 Dec 1999, Bruce Momjian wrote:
>
> > Consider this:
> >
> >     The stock market going crazy over Linux stocks
> >     Interbase users are considering moving en masse to PostgreSQL
> >     Publishers are crawling all over each other to publish a
> >     PostgreSQL book
> >
> > With these signs, it is possible we may be _very_ popular in the near
> > future. I am not sure this will happen, but I didn't think this would
> > happen to Linux.
>
>The worst thing we could do is to intentionally try to stay less than
>popular. There's a reason Linux is taking off and *BSD isn't really, and
>it's not technology. (Sorry, Marc.)

I don't think the *BSDs have intentionally tried any such thing. You
could possibly have picked up these vibes from certain members of the
OpenBSD camp, but I wouldn't extend them to encompass the *BSD community
at large. (And I wonder if I should comment about how Linux people are
migrating to the *BSD camps in droves.... But I guess it'd be best to
just let it slide ;-o)

> > My big question is, what new challenges will we face as we become more
> > popular?
> >
> > Do we have the proper license? I know this has the possibility of
> > opening up a GPL vs. BSD slugfest. However, I will not let such a
> > discussion get out of control.
>
>One thing we should definitely attempt to do before 7.0 is write our own
>license or at least our own copyright in addition to the BSD license,
>since none of us (?) actually work at UCB.

I come to pgsql from MySQL. I've not done much of anything with it yet
really, so I should probably keep my mouth shut. But I thought you might
be interested in my perspective. And, after all, you did ask... My
hands-on experience with pgsql is minimal, but what follows is the sense
I get from the larger community and from having lurked on this list for
a bit.

The primary "feature" that has me looking at pgsql again is the
license. I like MySQL. The MySQL community is great.
I don't like their license, however, and feel pretty strongly about it.
I would counsel against developing your own. Why reinvent the wheel
unless you've got some special agenda that requires it? I prefer the
more liberal BSD, but GPL is fine.

Transaction support is also nice, but secondary to license issues.
There are MySQL workarounds for the absence of transaction support, but
it's hard to get around the license and still be honest.

I would hope that future development continues to focus on reliability,
functionality, and speed. Your work will then speak for itself and more
people will adopt pgsql.

I initially ruled it out because of reliability and speed concerns. The
past year has seen much improvement in these areas, enough to have
piqued my interest once again. The *perception* remains, however, that
pgsql still leaves a bit to be desired in the areas of reliability and
maintainability. This needs to be remedied. Like I said, progress has
been made, but it appears pgsql isn't quite out of the woods yet.

Well, there's my $0.02. Thanks for your indulgence.

Regards--
Ken
> I don't think the *BSDs have intentionally tried any such thing. You
> could possibly have picked up these vibes from certain members of the
> OpenBSD camp, but I wouldn't extend them to encompass the *BSD
> community at large. (And I wonder if I should comment about how Linux
> people are migrating to the *BSD camps in droves.... But I guess it'd
> be best to just let it slide ;-o)

There are a lot of us that are finding that...one of my colleagues
comments to friends that ask him about Linux that "he's matured"...Linux,
IMHO, is the biggest thing that has happened to the "Unix
Environment"...but as Linux increases its market share in leaps and
bounds, it's also making it easier for those of us using *BSDs to slip
it into our work environments (I've so far succeeded in migrating 3
co-workers from Microsoft -> FreeBSD) ...

I don't really care what OS someone runs, as anyone that has been here
for a long time already knows .. the Linux "fanatics" are just soooo
much easier to pick on, that's all :)

> The primary "feature" that has me looking at pgsql again is the
> license. I like MySQL. The MySQL community is great. I don't like
> their license, however, and feel pretty strongly about it. I would
> counsel against developing your own. Why reinvent the wheel unless
> you've got some special agenda that requires it? I prefer the more
> liberal BSD, but GPL is fine.

I'm against any change in license, except for the upcoming extension of
the copyright dates to include our work (one of my projects for the new
year)...PostgreSQL will always be open source...BSD vs GPL doesn't change
that...Postgres is a BSD project that we, as a community, have extended
to where it is now...as long as there are ppl developing on it, the
source will always be available too, and I really can't see development
really ever stopping (too many ppl are having too much fun)...
The BSD license has served us perfectly for the past 4+ years, and I
haven't heard, over those years, any argument to the effect that it won't
continue to serve us perfectly for the next 4+ years...

> once again. The *perception* remains, however, that pgsql still
> leaves a bit to be desired in the areas of reliability and
> maintainability. This needs to be remedied. Like I said, progress
> has been made, but it appears pgsql isn't quite out of the woods yet.

I keep hearing the old "reliability" argument...there are a lot of us
using PostgreSQL for "mission critical" apps, and haven't seen these
problems. Can you provide more details on this? I'm not doubting that
you are hitting a "little known bug" that makes PostgreSQL unreliable for
you, but without details, we have no way of diagnosing and improving it...
Howdy

A question: if you still have some code in the source that
originated at Berkeley, how do you change the license?

Do you break out new code from old code and have a different
license for old vs. new code?

> I'm against any change in license, except for the upcoming extension of
> the copyright dates to include our work (one of my projects for the new
> year)...PostgreSQL will always be open source...BSD vs GPL doesn't change
> that...Postgres is a BSD project that we, as a community, have extended to
> where it is now...as long as there are ppl developing on it, the source
> will always be available too, and I really can't see development really
> ever stopping (too many ppl are having too much fun)...
>
> The BSD license has served us perfectly for the past 4+ years, and I
> haven't heard, over those years, any argument to the effect that it won't
> continue to serve us perfectly for the next 4+ years...

Diana Eichert
VP Technical Services
Nothing in Particular at the Moment, Inc.
deichert@wrench.com
> Howdy
>
> A question: if you still have some code in the source that
> originated at Berkeley, how do you change the license?
>
> Do you break out new code from old code and have a different
> license for old vs. new code?

Just add it to the top. If someone wants it without our name on it,
they have to get the tarball from Berkeley.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> > once again. The *perception* remains, however, that pgsql still
> > leaves a bit to be desired in the areas of reliability and
> > maintainability. This needs to be remedied. Like I said, progress
> > has been made, but it appears pgsql isn't quite out of the woods yet.
>
> I keep hearing the old "reliability" argument...there are a lot of us using
> PostgreSQL for "mission critical" apps, and haven't seen these
> problems. Can you provide more details on this? I'm not doubting that
> you are hitting a "little known bug" that makes PostgreSQL unreliable for
> you, but without details, we have no way of diagnosing and improving it...
>
> ************

As an org that uses postgres as _THE_ SQL database for our
activities, I'll provide some details about our reliability
problems:

1) Up front, I'll state that we use 6.3, so a number of
   the technical glitches may have been solved since...

2) We could never reliably use multiple tasks accessing the
   database at the same time. I could _reliably_ crash
   a back-end (and thus cause all back-ends to quit) by
   having 3-4 tasks actively doing inserts, updates,
   and selects. (Our workaround - a db semaphore built into
   our apps that allows only a single task at a time to
   access the db)

3) We cannot use vacuum. Why? Because it takes indefinitely
   longer to vacuum a database than it does to dump and
   reload. An example case:

   a table declared as
       fld1 varchar(80),
       fld2 int4,
       fld3 varchar(32),
       fld4 varchar(80),
       fld5 varchar(20)
   with indices
       unique index index1 on table(fld1, fld2)
       index index on table(fld3)

   We have NEVER been able to successfully vacuum the table after
   only one day of churn through the database, churn being defined
   as 600,000 updates of fld3, fld4 and fld5 in a table with
   2 million rows. (Heap assertion error given, on a system with
   128 Meg RAM and 96 Meg swap space.)

4) We could never get any reliability-related questions
   answered by any of the development team or by anyone else
   on the various postgres discussion groups. We
   would ask the question, post the relevant error log
   message, describe the scenario we thought caused the
   problem, and it's as if the question disappeared into
   a black hole.

Believe it or not, it's actually item #4 that annoys us
the most. Workarounds are a pain, but at least they
accomplish something - the problem no longer occurs.
But when you bang your head against a problem, and no
one seems to have heard of the issue ever, or even
acknowledges the post in question, it definitely
detracts from the value of the product.

Case in point: a long time ago we found a problem affecting
insertions into the database - doing many inserts (I believe
where the record already existed) caused a memory leak when
the insert was rejected due to duplicate index entries.
This forced us to inject a drop/reconnect sequence into the
code to avoid using up all of our memory. We asked about
the problem - no response; we posted the bug in the PR
database - no response; 6 months later, we saw someone
else ask the exact same question (not sure of release,
I thought he was on 6.4, but don't hold me to that one).

It's that kind of non-responsiveness that in our mind makes
the db reliability an issue.

Now don't get me wrong - I realize that you get what you
pay for. But I believe in at the very least responding
to users' questions/problems. A simple "We've seen/not seen
that problem before, and haven't had the time to track down
the root cause and fix it." would have been much preferable,
and gone a long way to making us feel that problems are being
addressed for subsequent releases.

Cheers, Thomas

--
------------------------------------------------------------
Thomas Reinke                            Tel: (905) 331-2260
Director of Technology                   Fax: (905) 331-2504
E-Soft Inc.                         http://www.e-softinc.com
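
For illustration only (this is not code from the thread): the workaround in item 2, a semaphore built into the application so that only one task at a time touches the database, might look roughly like the Python sketch below. It assumes the tasks are threads inside one process; separate processes would need a file lock or a system-wide semaphore instead. The names `exclusive_db_access` and `run_tasks_serialized` are hypothetical, and `task` stands in for whatever insert, update, or select the application performs.

```python
import threading
import time
from contextlib import contextmanager

# Hypothetical application-level gate: one permit means one task in the db at a time.
_db_gate = threading.Semaphore(1)


@contextmanager
def exclusive_db_access():
    """Block until no other task is using the database, then hold the gate."""
    _db_gate.acquire()
    try:
        yield
    finally:
        _db_gate.release()


def run_tasks_serialized(task, count):
    """Run `count` threads that each call `task` inside the gate.

    Returns the peak number of tasks that were inside the gated section
    simultaneously, which should be 1 if the gate works.
    """
    state = {"active": 0, "peak": 0}
    guard = threading.Lock()  # protects the bookkeeping counters only

    def worker():
        with exclusive_db_access():
            with guard:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            try:
                task()  # stand-in for the real database work
            finally:
                with guard:
                    state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    peak = run_tasks_serialized(lambda: time.sleep(0.01), 4)
    print("peak concurrent tasks inside the gate:", peak)
```

The cost of this approach is the one the message implies: all database work is serialized, so throughput is limited to a single task regardless of how many are running.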
Thomas Reinke wrote:

> 1) Up front, I'll state that we use 6.3, so a number of
>    the technical glitches may have been solved since...

6.3 is unbelievably old. Perhaps you weren't getting responses since
most people don't use versions of PostgreSQL that old? I know I tend not
to respond to posts about versions that old. Perhaps that's wrong...

>
> 4) We could never get any reliability-related questions
>    answered by any of the development team or by anyone else
>    on the various postgres discussion groups. We
>    would ask the question, post the relevant error log
>    message, describe the scenario we thought caused the
>    problem, and it's as if the question disappeared into
>    a black hole.
>
> Believe it or not, it's actually item #4 that annoys us
> the most. Workarounds are a pain, but at least they
> accomplish something - the problem no longer occurs.
> But when you bang your head against a problem, and no
> one seems to have heard of the issue ever, or even
> acknowledges the post in question, it definitely
> detracts from the value of the product.
>
> Case in point: a long time ago we found a problem affecting
> insertions into the database - doing many inserts (I believe
> where the record already existed) caused a memory leak when
> the insert was rejected due to duplicate index entries.
> This forced us to inject a drop/reconnect sequence into the
> code to avoid using up all of our memory. We asked about
> the problem - no response; we posted the bug in the PR
> database - no response; 6 months later, we saw someone
> else ask the exact same question (not sure of release,
> I thought he was on 6.4, but don't hold me to that one).
>
> It's that kind of non-responsiveness that in our mind makes
> the db reliability an issue.
>

The following are fixes relating to problems you described since 6.3:

Bug Fixes
---------
Fix for a tiny memory leak in PQsetdb/PQfinish(Bryan)
Fix for buffer leaks in large object calls(Pascal)
Fix memory leak in libpgtcl's pg_select(Constantin)
libpq memory overrun fix
pg_dump fixes for memory leak, inheritance constraints, layout change
Fix memory overruns(Tatsuo)
Memory overrun cleanups(Tatsuo)
Drop buffers before destroying database files(Bruce)
Fix for memory leak in executor with fjIsNull
Fix for aggregate memory leaks(Erik Riedel)
Fix for memory leak in failed queries(Tom)
Fix vacuum's memory consumption(Hiroshi,Tatsuo)
Reduce the total memory consumption of vacuum(Tom)
This is to re-use space on index pages freed by vacuum(Vadim)
Repair incorrect cleanup of heap memory allocation during transaction
abort(Tom)

These are just the ones I pulled from the change logs on
www.postgresql.org related to memory. There are hundreds of other fixes
listed as well. I realize that the answer of "upgrade" to problems you
are experiencing is not a definitive solution, since the bugs might very
well be present in current releases. For example, I can guess it's still
going to take quite a while for vacuum to remove 600,000 rows from your
database. Some have suggested (and I agree) that vacuum ought to:

1) Estimate the number of rows to be removed
2) If over a certain threshold:
     A. drop all indexes on tables
     B. vacuum away dead tuples
     C. rebuild indexes

Having said that, I must say that my general impression has been that
the major code developers took over code which was probably 50%
bug-ridden garbage and worked away at it with each release performing
MAJOR bug fixes. Just read Bruce Momjian's HISTORY document to get an
idea of the monumental tasks they have undertaken. I normally don't
upgrade other software at each minor release -- but I do with PostgreSQL.
You can tell that they've made huge advances against the otherwise
uncharted, bug-ridden pieces of 1980s Berkeley code... They're getting
closer and closer to what one might call "robustness" at an accelerated
pace, so keep the faith! :-)

Merry Christmas,

Mike Mascari

P.S.: We've been running 6.5beta in a production environment with
record counts similar to the ones you've described and have only run
into trouble twice. Once was due to aborting a transaction which
contained DDL statements, and the other was, what I believe, a problem
with the 2.0.36 Linux kernel. I hope you can read into this that our
eagerness to get to 6.5 meant using 6.5 beta in production over using
6.4.2....
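
As a rough sketch of the vacuum strategy suggested in the message above (estimate the rows to be removed and, past a threshold, drop the indexes, vacuum, then rebuild), the Python below only builds the list of SQL statements such a procedure would issue. It is an assumption-laden illustration, not how vacuum is or was implemented: `IndexDef`, `vacuum_plan`, the dead-row estimate, and the threshold are all hypothetical, and the table and index names in the usage example (`churn_table`, `index2`) are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexDef:
    """Hypothetical description of an index, enough to drop and recreate it."""
    name: str
    columns: List[str]
    unique: bool = False


def vacuum_plan(table: str, indexes: List[IndexDef],
                estimated_dead_rows: int, threshold: int) -> List[str]:
    """Return the SQL statements for one vacuum pass over `table`.

    Below the threshold a plain VACUUM is issued. Above it, the indexes are
    dropped first so vacuum does not have to maintain them, then rebuilt.
    """
    if estimated_dead_rows <= threshold:
        return [f"VACUUM {table};"]

    plan = [f"DROP INDEX {ix.name};" for ix in indexes]      # A. drop indexes
    plan.append(f"VACUUM {table};")                          # B. remove dead tuples
    for ix in indexes:                                       # C. rebuild indexes
        kind = "CREATE UNIQUE INDEX" if ix.unique else "CREATE INDEX"
        cols = ", ".join(ix.columns)
        plan.append(f"{kind} {ix.name} ON {table} ({cols});")
    return plan


if __name__ == "__main__":
    indexes = [
        IndexDef("index1", ["fld1", "fld2"], unique=True),
        IndexDef("index2", ["fld3"]),
    ]
    for stmt in vacuum_plan("churn_table", indexes, 600_000, 100_000):
        print(stmt)
```

One caveat with this kind of plan: while a unique index is dropped, nothing enforces uniqueness, so it would only be safe with other writers locked out for the duration.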
Mike Mascari wrote:
>
> Thomas Reinke wrote:
>
> > 1) Up front, I'll state that we use 6.3, so a number of
> >    the technical glitches may have been solved since...
>
> 6.3 is unbelievably old. Perhaps you weren't getting responses since most
> people don't use versions of PostgreSQL that old? I know I tend not to respond
> to posts about versions that old. Perhaps that's wrong...

We have not posted for a long time (since 6.4 was out more than
2-3 months), because we knew that we'd be told to upgrade. When
we posted, 6.3 was current...

>
> Having said that, I must say that my general impression has been that the
> major code developers took over code which was probably 50% bug-ridden garbage
> and worked away at it with each release performing MAJOR bug fixes. Just read
> Bruce Momjian's HISTORY document to get an idea of the monumental tasks they
> have undertaken. I normally don't upgrade other software at each minor release
> -- but I do with PostgreSQL. You can tell that they've made huge advances
> against the otherwise uncharted, bug-ridden pieces of 1980s Berkeley code...
> They're getting closer and closer to what one might call "robustness" at an
> accelerated pace, so keep the faith! :-)

Yup...and they're doing a damn good job, as far as I'm concerned.
(Else I would have switched a long time ago.) My post here was
simply to point out what our perception was on the robustness
issue, and that is that although the code was a problem, it
was _not_ the major problem...

--
------------------------------------------------------------
Thomas Reinke                            Tel: (905) 331-2260
Director of Technology                   Fax: (905) 331-2504
E-Soft Inc.                         http://www.e-softinc.com