Re: Data Warehouse Reevaluation - MySQL vs Postgres -- - Mailing list pgsql-performance

From Joe Conway
Subject Re: Data Warehouse Reevaluation - MySQL vs Postgres --
Msg-id 4148AC71.80908@joeconway.com
In response to Re: Data Warehouse Reevaluation - MySQL vs Postgres --  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-performance
Simon Riggs wrote:
> Joe,
>
> Your application is very interesting. I've just read your OSCON paper. I'd
> like to talk more about that. Very similar to Kalido.
>
> ...but back to partitioning momentarily: Does the performance gain come from
> partition elimination of the inherited tables under the root?

I think the major part of the performance gain comes from the fact that
the source database has different needs in terms of partitioning
criteria because of its different purpose. The source data is basically
partitioned by customer installation instead of by date. Our converted
scheme partitions by date, which is in line with the analytical queries
run at the corporate office. Again, this is an argument in favor of not
simply porting what you're handed.
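
A minimal sketch of what date-based partitioning with inherited tables can look like. The parent table `events` and timestamp column `event_ts` are hypothetical names chosen for illustration, not the schema discussed in this thread. The Python below only builds the PostgreSQL DDL as a string; it does not connect to a database.

```python
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the following month)."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def monthly_child_ddl(parent, ts_col, year, month):
    """Build DDL for one monthly child table that inherits from `parent`.

    The CHECK constraint records the half-open date range [start, end)
    that the child table is meant to hold.
    """
    start, end = month_bounds(year, month)
    child = f"{parent}_{year:04d}_{month:02d}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ({ts_col} >= DATE '{start}' AND {ts_col} < DATE '{end}')\n"
        f") INHERITS ({parent});"
    )


# Hypothetical names: "events" and "event_ts" are assumptions.
print(monthly_child_ddl("events", "event_ts", 2004, 9))
```

Queries against the parent table also scan its children, so the analytical queries can keep addressing `events`. Whether the planner uses the CHECK constraint to skip children that cannot match depends on the PostgreSQL version and configuration.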

We might get similar query performance with a single large table and
multiple partial indexes (e.g. one per month), but there would be one
tradeoff and one disadvantage to that:
1) The indexes would need to be generated periodically -- this is a
tradeoff since we currently need to create inherited tables at the same
periodicity
2) It would be much more difficult to "roll off" a month's worth of data
when needed. The general idea is that each month we create a new monthly
table, then archive and drop the oldest monthly table. If all the data
were in one big table we would have to delete many millions of rows from
a (possibly) multibillion row table, and then vacuum that table -- no
thanks ;-)
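
To make the comparison concrete, here is a rough sketch of the statements each approach implies. The table `events`, the columns `event_ts` and `customer_id`, and the `<parent>_YYYY_MM` naming scheme are all assumptions made for illustration. The code only generates PostgreSQL statements as strings, and the archive step before the drop is left out.

```python
from datetime import date


def next_month(d):
    """First day of the month following `d` (expects a first-of-month date)."""
    return date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)


def child_name(parent, month):
    """Hypothetical naming scheme: <parent>_YYYY_MM."""
    return f"{parent}_{month.year:04d}_{month.month:02d}"


def partitioned_roll_off(parent, oldest):
    """With one inherited table per month, roll-off is a single DROP."""
    return [f"DROP TABLE {child_name(parent, oldest)};"]


def single_table_roll_off(table, ts_col, oldest):
    """With one big table, roll-off is a bulk DELETE followed by VACUUM."""
    end = next_month(oldest)
    return [
        f"DELETE FROM {table} WHERE {ts_col} >= DATE '{oldest}' "
        f"AND {ts_col} < DATE '{end}';",
        f"VACUUM {table};",
    ]


def partial_index_ddl(table, ts_col, key_col, month):
    """One partial index per month on the single big table."""
    end = next_month(month)
    return (
        f"CREATE INDEX {child_name(table, month)}_idx ON {table} ({key_col}) "
        f"WHERE {ts_col} >= DATE '{month}' AND {ts_col} < DATE '{end}';"
    )


oldest = date(2003, 9, 1)
for stmt in partitioned_roll_off("events", oldest):
    print(stmt)
for stmt in single_table_roll_off("events", "event_ts", oldest):
    print(stmt)
print(partial_index_ddl("events", "event_ts", "customer_id", date(2004, 9, 1)))
```

The DROP removes the month's rows along with its table, while the DELETE has to visit every affected row and leaves dead tuples behind for VACUUM to reclaim.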

Joe
