Thread: Perceived weaknesses of postgres

Perceived weaknesses of postgres

From: Csaba Nagy

http://www.theserverside.com/news/thread.tss?thread_id=48339

The interesting part is where somebody asks why NOT to use postgres, and
the answers could give some additional hints on what people find missing
from postgres before adopting it.

Just to summarize some of the answers:
* major PITA to upgrade between major versions;
* executing a single query on multiple cpus/cores;
* no direct table cache control;
* accent-insensitive text comparisons;
* fast select count(*);

Wrong assumptions (though people seem to be sure they are true):
* no hotbackup except pg_dump (wrong: there are in fact a few different
ways to do that);
* pg_dump the only way to cleanly upgrade (wrong: slony is good for
that);
* missing 2 phase commit (wrong: it is now implemented);
* inadequate performance with really large databases (wrong: there are
known examples of really large postgres DBs);

There are other claims like (quoting): "RAC, enterprise backup
capabilities, database on raw partition, compatibility with enterprise
storage (SAN,...)", for which I don't know whether postgres has adequate
solutions or not.

Cheers,
Csaba.



Re: Perceived weaknesses of postgres

From: Dawid Kuroczko

On Feb 13, 2008 10:49 AM, Csaba Nagy <nagy@ecircle-ag.com> wrote:
> http://www.theserverside.com/news/thread.tss?thread_id=48339
>
> The interesting part is where somebody asks why NOT to use postgres, and
> the answers could give some additional hints on what people find missing
> from postgres before adopting it.
>
> Just to summarize some of the answers:
> * major PITA to upgrade between major versions;

Would be nice, though I must say that while the data migration is a pain,
the SQL compatibility PostgreSQL provides is a blessing.  With other open
source RDBMSs, changing the queries in applications is the major PITA. ;-)))

> * executing a single query on multiple cpus/cores;

I wonder if our most popular open source rival can do it.  I have heard
people claim so, but I would not consider them authoritative. :)

> * no direct table cache control;

Could you elaborate more on this one?

> * accent-insensitive text comparisons;

Yesss, that would be quite useful.  A more general problem is supporting
multiple collations in one cluster, hopefully at the column level.

> * fast select count(*);
> Wrong assumptions (though people seem to be sure they are true):
> * no hotbackup except pg_dump (wrong: there are in fact a few different
> ways to do that);

...and pg_dump is not exactly hot.  PITR using log archiving is a great
thing: I've set up backups using Tivoli Storage Manager, and it works
quite well.
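
Roughly, for 8.3 the log-archiving part looks like this (the archive path
and the backup label are just examples):

    -- in postgresql.conf:
    --   archive_mode = on
    --   archive_command = 'cp %p /backup/wal/%f'
    -- then take the base backup while the server keeps running:
    SELECT pg_start_backup('nightly');
    -- ... copy the data directory with tar/rsync/TSM ...
    SELECT pg_stop_backup();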

BTW, there is an urban legend that Oracle stops writing to its datafiles
when you ALTER TABLESPACE into backup mode.  This is not true; they keep
modifying the files much the same as we do. :-)

> * pg_dump the only way to cleanly upgrade (wrong: slony is good for
> that);

Slony is good as long as no DDL is issued.  And it's easy to
shoot oneself in the foot if one is not careful (some time ago I lost
all the triggers while upgrading from 8.1 to 8.2; it was my fault,
since I ran pg_dump -s on a slave database, not on the master...).

This could be solved with the help of DDL/DCL triggers, which would be
helpful for other things too, I guess.

> * missing 2 phase commit (wrong: it is now implemented);
> * inadequate performance with really large databases (wrong: there are
> known examples of really large postgres DBs);
>
> There are other claims like (quoting): "RAC, enterprise backup
> capabilities, database on raw partition, compatibility with enterprise
> storage (SAN,...)", for which I don't know whether postgres has adequate
> solutions or not.

  Regards,
     Dawid

Re: Perceived weaknesses of postgres

From: Csaba Nagy

On Wed, 2008-02-13 at 13:29 +0100, Dawid Kuroczko wrote:
> > * no direct table cache control;
>
> Could you elaborate more on this one?

Well, I was just summarizing what other people wrote :-)

But I guess they refer to table-level control of how much cache memory to
use.  There are DBMSs where you can tune cache usage per table or table
group, so you can make sure the important tables (where you want instant
response even if the table was not accessed in the last hours) are always
in cache.  Currently that's not possible in postgres: it is true that the
cache memory will be used in a way that is optimal for mean performance,
but sometimes a suboptimal mean is still good enough, and you would
rather eliminate worst-case scenarios for some queries.  In that case
more control (over the caching strategy too) is better...

Cheers,
Csaba.



Re: Perceived weaknesses of postgres

From: Csaba Nagy

On Wed, 2008-02-13 at 13:39 +0100, Csaba Nagy wrote:
> On Wed, 2008-02-13 at 13:29 +0100, Dawid Kuroczko wrote:
> > > * no direct table cache control;
> >
> > Could you elaborate more on this one?
>

OK, re-reading what I just wrote makes me think it was not clear enough:
I think they mean you can _reserve_ some cache memory directly for
specific tables/table groups, so that the reserved cache is used only by
those tables and is therefore not affected by any activity other than
access to those tables.  In particular, a sequential scan on another big
table will not touch that reserved cache, nor will any other big sweep of
data access on other tables.

Not sure I actually made it clearer this time though :-)

Cheers,
Csaba.



Re: Perceived weaknesses of postgres

From: Tino Wildenhain

Csaba Nagy wrote:
> http://www.theserverside.com/news/thread.tss?thread_id=48339
>
> The interesting part is where somebody asks why NOT to use postgres, and
> the answers could give some additional hints on what people find missing
> from postgres before adopting it.
>
> Just to summarize some of the answers:
> * major PITA to upgrade between major versions;

I wonder which other databases (also commercial) can do that.

> * executing a single query on multiple cpus/cores;
> * no direct table cache control;
> * accent-insensitive text comparisons;

multiple collations would be nice...
just accent-insensitive would not make much sense imho.

> * fast select count(*);

SELECT count(*) WHERE False; is very fast for me ;)
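
More seriously, when an estimate is good enough, the usual workaround is
to read the planner's statistics (kept up to date by VACUUM/ANALYZE);
'mytable' is just a placeholder:

    -- approximate row count; instant, but only an estimate:
    SELECT reltuples::bigint AS approx_rows
    FROM pg_class
    WHERE relname = 'mytable';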

Greets :-)
Tino


Re: Perceived weaknesses of postgres

From: Magnus Hagander

Dawid Kuroczko wrote:
> On Feb 13, 2008 10:49 AM, Csaba Nagy <nagy@ecircle-ag.com> wrote:
>> http://www.theserverside.com/news/thread.tss?thread_id=48339
>>
>> The interesting part is where somebody asks why NOT to use postgres, and
>> the answers could give some additional hints on what people find missing
>> from postgres before adopting it.
>>
>> Just to summarize some of the answers:
>> * major PITA to upgrade between major versions;
>
> Would be nice, though I must say that while the data migration is a pain,
> the SQL compatibility PostgreSQL provides is a blessing.  With other open
> source RDBMSs, changing the queries in applications is the major PITA. ;-)))

I don't think these people are comparing to other opensource ones...
They're comparing to the commercial ones (at least in this case)


>> * executing a single query on multiple cpus/cores;
>
> I wonder if our most popular open source rival can do it.  I have heard
> people claim so, but I would not consider them authoritative. :)

Again, the commercial ones do, and that's what we're compared to...


>> * no direct table cache control;
>
> Could you elaborate more on this one?

I would guess they're referring to the ability to "pin" a table into
memory, so that it always stays in the cache regardless of what else the
database is doing. There is a narrow use-case where this can be very
useful, but it can also be a very dangerous tool (hint: if you pin a
table that grows up to say 80-90% of your RAM size, your database will
not be fast for anything else)

>> * pg_dump the only way to cleanly upgrade (wrong: slony is good for
>> that);
>
> Slony is good as long as no DDL is issued.  And it's easy to
> shoot oneself in the foot if one is not careful (some time ago I lost
> all the triggers while upgrading from 8.1 to 8.2; it was my fault,
> since I ran pg_dump -s on a slave database, not on the master...).

You can do DDL with Slony, but it requires some planning (DDLSCRIPT).
Doesn't solve all problems, but it does solve some.
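
From memory, the slonik side looks roughly like this (cluster and path
names are made up, and the syntax may be off in details):

    # slonik sketch: push one DDL file through Slony-I so all nodes apply it
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=app host=master';
    node 2 admin conninfo = 'dbname=app host=slave';
    execute script (
        set id = 1,                        # the replication set the DDL affects
        filename = '/tmp/add_column.sql',  # plain SQL file with the DDL
        event node = 1                     # node where the script runs first
    );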

//Magnus

Re: Perceived weaknesses of postgres

From: Csaba Nagy

On Wed, 2008-02-13 at 13:56 +0100, Magnus Hagander wrote:
> I don't think these people are comparing to other opensource ones...
> They're comparing to the commercial ones (at least in this case)

Yes, that's definitely the case. And that can actually be taken as a
compliment to the status postgres has already attained...

> I would guess they're referring to the ability to "pin" a table into
> memory, so that it always stays in the cache regardless of what else the
> database is doing. There is a narrow use-case where this can be very
> useful, but it can also be a very dangerous tool (hint: if you pin a
> table that grows up to say 80-90% of your RAM size, your database will
> not be fast for anything else)

If you could not only pin the complete table, but also allocate a fixed
amount of memory for caching _only that table_, you would get a
compromise better than either of the two extremes: either the table is
cached in a mixed bag together with everything else, ending up
effectively 0-100% cached depending on all the activity on the DB, or it
is always 100% cached independently of any activity on the DB.  The
intermediate solution is: the table is always cached using x amount of
memory, independently of what happens on the DB.  That makes sure the
table is completely cached if the allocated memory is big enough, and
that at least the most used records stay cached if the table grows bigger
than the allocated cache, while there is no risk of overrunning the
available memory.  This could also be applied to groups of tables.
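
Purely as an illustration, in some invented syntax (nothing like this
exists in postgres today), it could look like:

    -- HYPOTHETICAL syntax, not a real postgres feature:
    -- reserve a fixed slice of the buffer cache for one table
    ALTER TABLE important_table SET CACHE RESERVE '256MB';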

Cheers,
Csaba.


Re: Perceived weaknesses of postgres

From: Brad Nicholson

Dawid Kuroczko wrote:
> Slony is good as long as no DDL is issued.  And it's easy to
> shoot oneself in the foot if one is not careful (some time ago I lost
> all the triggers while upgrading from 8.1 to 8.2; it was my fault,
> since I ran pg_dump -s on a slave database, not on the master...).
>
>
It can also be a very time-consuming process.  It often involves having
to juggle resources around: finding extra disk space somewhere to build
extra replicas, or abandoning your failover target for a period of time
while you upgrade it.

BTW, the trigger issue goes away with 8.3 and Slony 2.0.

Brad.

Re: Perceived weaknesses of postgres

From: Stephen Cook

Magnus Hagander wrote:
> I would guess they're referring to the ability to "pin" a table into
> memory, so that it always stays in the cache regardless of what else the
> database is doing. There is a narrow use-case where this can be very
> useful, but it can also be a very dangerous tool (hint: if you pin a
> table that grows up to say 80-90% of your RAM size, your database will
> not be fast for anything else)

I know that MS removed this ability in SQL Server 2005, for pretty much
this reason; its usefulness was greatly outweighed by people screwing
up their systems by not calculating things correctly.

Re: Perceived weaknesses of postgres

From: paul rivers

Stephen Cook wrote:
> Magnus Hagander wrote:
>> I would guess they're referring to the ability to "pin" a table into
>> memory, so that it always stays in the cache regardless of what else
>> the database is doing. There is a narrow use-case where this can be
>> very useful, but it can also be a very dangerous tool (hint: if you
>> pin a table that grows up to say 80-90% of your RAM size, your
>> database will not be fast for anything else)
>
> I know that MS removed this ability in SQL Server 2005, for pretty much
> this reason; its usefulness was greatly outweighed by people screwing
> up their systems by not calculating things correctly.
>

What they removed was "dbcc pintable", which would specify that data
pages for a table should be pinned in the general buffer cache as they
were requested.  It didn't allow you to divide up the buffer cache, so
this rather pointless feature went away in SQL Server 2005.  A few
large, active pinned tables are obviously going to really wreck
performance for most databases.

What SQL Server never had is more like what you get with Sybase, where
you can partition your buffer cache into different regions of whatever
fraction of the overall buffer cache you wish.  This is IMHO a far more
useful implementation.  You can specify which regions a particular (set
of) tables should use.  You can further specify different page sizes for
each buffer (say 2k pages for the intensely oltp stuff, and 16k pages
for the more olapy things).  You don't end up trying to leave *every*
page of a table in memory this way, since LRU (or whatever method) will
still recycle pages as needed within a named cache.  This was all there
in version 11 of the product, which was the last one I ever worked with
(and is very dated at this point).  This feature never made it to SQL
Server since Microsoft went off and did their own thing well before this.
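
From memory, setting one of these up looked roughly like the sketch below
(procedure arguments and all the names may be off after all these years):

    -- create a 200 MB named cache, add a 16k-page pool, bind a table to it:
    sp_cacheconfig 'hot_cache', '200M'
    go
    sp_poolconfig 'hot_cache', '50M', '16K'
    go
    sp_bindcache 'hot_cache', 'mydb', 'dbo.hot_table'
    go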

It's more this Sybase-type implementation that I assumed the original
poster was asking about?

You can do something kind of similar in MySQL with the MyISAM storage
engine, but I've not heard of too many shops actually doing this (who
knows?).  The MySQL manual seems to strongly recommend it, anyway.
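
For reference, the MyISAM mechanism is a named key cache, and it applies
to indexes only, not data pages; the names here are placeholders:

    -- create a dedicated key cache and assign indexes to it:
    SET GLOBAL hot_cache.key_buffer_size = 128*1024*1024;
    CACHE INDEX orders, customers IN hot_cache;
    LOAD INDEX INTO CACHE orders;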

Paul