Re: - Mailing list pgsql-performance

From Stephen Frost
Subject Re:
Date
Msg-id 20050120142603.GM10437@ns.snowman.net
Whole thread Raw
In response to  ("Matt Casters" <Matt.Casters@advalvas.be>)
Responses Re:  (Alex Turner <armtuk@gmail.com>)
Re:  ("Matt Casters" <Matt.Casters@advalvas.be>)
List pgsql-performance
* Matt Casters (Matt.Casters@advalvas.be) wrote:
> I have the go ahead of a customer to do some testing on Postgresql in a couple of weeks as a
> replacement for Oracle.
> The reason for the test is that the number of users of the warehouse is going to increase and this
> will have a serious impact on licencing costs. (I bet that sounds familiar)

Rather familiar, yes... :)

> We're running a medium sized data warehouse on a Solaris box (4CPU, 8Gb RAM) on Oracle.
> Basically we have 2 large fact tables to deal with: one going for 400M rows, the other will be
> hitting 1B rows soon.
> (around 250Gb of data)

Quite a bit of data.  There's one big thing to note here I think-
Postgres will not take advantage of multiple CPUs for a given query,
Oracle will.  So, it depends on your workload as to how that may impact
you.  Situations where this will be unlikely to affect you:

Your main bottle-neck is IO/disk and not CPU.
You run multiple queries in parallel frequently.
There are other processes on the system which chew up CPU time anyway.

Situations where you're likely to be affected would be:

You periodically run one big query.
You run a set of queries in sequential order.

> My questions to the list are: has this sort of thing been attempted before? If so, what where the
> performance results compared to Oracle?

I'm pretty sure it's been attempted before but unfortunately I don't
have any numbers on it myself.  My data sets aren't that large (couple
million rows) but I've found PostgreSQL at least as fast as Oracle for
what we do, and much easier to work with.

> I've been reading up on partitioned tabes on pgsql, will the performance benefit will be
> comparable to Oracle partitioned tables?

In this case I would think so, except that PostgreSQL still won't use
multiple CPUs for a given query, even against partitioned tables, aiui.

> What are the gotchas?

See above? :)  Other issues are things having to do w/ your specific
SQL- Oracle's old join syntax isn't supported by PostgreSQL (what is it,
something like select x,y from a,b where x=%y; to do a right-join,
iirc).

> Should I be testing on 8 or the 7 version?

Now that 8.0 is out I'd say probably test with that and just watch for
8.0.x releases before you go production, if you have time before you
have to go into production with the new solution (sounds like you do-
changing databases takes time anyway).

> Thanks in advance for any help you may have, I'll do my best to keep pgsql-performance up to date
> on the results.

Hope that helps.  Others on here will correct me if I misspoke. :)

    Stephen

Attachment

pgsql-performance by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: PostgreSQL clustering VS MySQL clustering
Next
From: Stephen Frost
Date:
Subject: Re: PostgreSQL clustering VS MySQL clustering