Reliabilty, was [GENERAL] Future of PostgreSQL - Mailing list pgsql-general

From Thomas Reinke
Subject Reliabilty, was [GENERAL] Future of PostgreSQL
Date
Msg-id 386688D8.C714BDD1@e-softinc.com
In response to Re: [GENERAL] Future of PostgreSQL  ("Marc G. Fournier" <scrappy@hub.org>)
List pgsql-general
>
> > once again.  The *perception* remains, however, that pgsql still
> > leaves a bit to be desired in the areas of reliability and
> > maintainability.  This needs to be remedied.  Like I said, progress
> > has been made, but it appears pgsql isn't quite out of the woods yet.
>
> I keep hearing the old "reliability" argument...there are a lot of us using
> PostgreSQL for "mission critical" apps, and haven't seen these
> problems.  Can you provide more details on this?  I'm not doubting that
> you are hitting a "little known bug" that makes PostgreSQL unreliable for
> you, but without details, we have no way of diagnosing and improving it...
>
> ************

As an org that uses postgres as _THE_ SQL database for our activities,
I'll provide some details about our reliability problems:

1) Up front, I'll state that we use 6.3, so a number of
   the technical glitches may have been solved since...

2) We could never reliably use multiple tasks accessing
   the database at the same time.  I could _reliably_
   crash a back-end (and thus cause all back-ends to quit)
   by having 3-4 tasks actively doing inserts, updates,
   and selects. (Our workaround - a db semaphore built
   into our apps that allows only a single task at a time
   to access the db)

3) We cannot use vacuum. Why? Because it takes vastly
   longer to vacuum a database than it does to dump and
   reload. An example case: a table declared as
     fld1 varchar(80), fld2 int4, fld3 varchar(32),
     fld4 varchar(80), fld5 varchar(20)
   with indices
     unique index index1 on table(fld1, fld2)
     index index on table(fld3)

   We have NEVER been able to successfully vacuum the
   table after only one day of churn through the database,
   churn being defined as 600,000 updates of fld3,fld4 and fld5
   in a table with 2 million rows. (Heap assertion error given,
   on a system with 128Meg Ram, and 96Meg swap space.)

4) We could never get reliability-related questions
   answered by any of the development team or by anyone
   else on the various postgres discussion groups. We
   would ask the question, post the relevant error log
   message, describe the scenario we thought caused the
   problem, and it's as if the question disappeared into
   a black hole.
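
For readers wondering what the db semaphore in item 2 might look
like, here is a minimal sketch in Python. It is only an illustration
and not the code described above: it assumes every task runs on the
same Unix host and can reach one shared lock file. The names
LOCK_PATH, db_semaphore and run_exclusively are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock-file location; any path visible to all tasks would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "app_db.lock")


@contextmanager
def db_semaphore(lock_path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks while another task holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def run_exclusively(work, lock_path=LOCK_PATH):
    """Run work() (the inserts, updates or selects) while holding the lock."""
    with db_semaphore(lock_path):
        return work()
```

Each task would wrap its database calls in run_exclusively(...), so
that only one of them talks to a back-end at any moment. The cost is
that all database access is serialized, which is what makes this a
workaround rather than a fix.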

Believe it or not, it's actually item #4 that annoys us
the most. Workarounds are a pain, but at least they
accomplish something - the problem no longer occurs.
But when you bang your head against a problem, and no
one seems to have heard of the issue ever, or even
acknowledges the post in question, it definitely
detracts from the value of the product.

Case in point: a long time ago we found a problem affecting
insertions into the database - doing many inserts (I believe
where the record already existed) caused a memory leak when
the insert was rejected due to duplicate index entries.
This forced us to inject a drop/reconnect sequence into the
code to avoid using up all of our memory. We asked about
the problem - no response; we posted the bug in the PR
database - no response; 6 months later, we saw someone
else ask the exact same question (not sure of the release;
I thought he was on 6.4, but don't hold me to that one).

It's that kind of non-responsiveness that in our minds makes
db reliability an issue.

Now don't get me wrong - I realize that you get what you
pay for. But I believe in at the very least responding
to users' questions/problems. A simple "We've seen/not seen
that problem before, and haven't had the time to track down
the root cause and fix it." would have been much
preferable, and gone a long way to making us feel that
problems are being addressed for subsequent releases.

Cheers, Thomas
--
------------------------------------------------------------
Thomas Reinke                            Tel: (905) 331-2260
Director of Technology                   Fax: (905) 331-2504
E-Soft Inc.                         http://www.e-softinc.com
