Thread: Setup for large database

Setup for large database

From

"mlartz@gmail.com"

Date:

21 April 2006, 14:49:18

Couple of questions.  I have a project that promises to generate a very
large database of network-style data (think banners, flows, etc).  I
was fortunate enough to fall into some kick ass hardware (quad Opteron,
16GB RAM, 3+ TB of fibre channel HDs).

As I'm still in the design phase, I was wondering if I could get any
decent recommendations on hardware/partition setup, and perhaps some
database sanity checks.  My use case is mostly datawarehouse-style
stuff:  scheduled bulk batch inserts and lots of queries.  Like I said
before, my rows are based on network data and are all keyed by an IP
address, and I'm hoping to keep each row under 1K.  I'm estimating
ending up with about 3TB of total data after a year of operation.

1) How anal should I be about my hardware setup?  I have about 15 300GB
10K RPM SCSI drives, 4 of which I can directly attach to the server and
the rest one the FC array.  Should I just put the OS and transaction
logs on the direct attached storage and and then RAID10 the rest of
them and be done, or would I significantly benefit from separating out
the indexes and partitioning across tablespaces across drives?  Would
RAID5 across 10+ drives yield acceptable performance numbers?

3) I've currently installed RHEL4 AS for my OS, which I am very
comfortable with.  I was going to go with EXT3 on everything (noatime)
... sound good?

2) Assuming that my data is roughly evenly distributed among IP
addresses, I figured that a naive partitioning based on the first octet
of the IP (i.e. ~255 partitions) would suffice for such a table, making
each partition ~12GB and keeping the IPs clustered to easily to quickly
query network blocks.  Would it be wise to go to even more partitions?
How does Pg do under a *lot* of partitions (655356)?  Would it be wise
to put each partition in a separate tablespace?

3) I guess I don't quite understand Bizgres.  At the moment, it seems
to be just a development beta of Postgres ... is this true?  I realize
that the focus is on BI/ETL stuff, but the current improvements seem to
benefit Postgres as a whole.  Is there currently or can you imagine a
case where a feature in Bizgres won't get integrated into Postgres?
How significant is the fork between Bizgres and Postgres?  I've also
considered taking a look at Bizgres MPP.  I know that its the wrong
forum, but any comments?

4) Not to start any sort of flame war, but my company has an Oracle
license and there are a bunch of people wanting me to go that way.
I've been doing just fine with Postgres at the moment and am quite
comfortable with it, but am being pressured to go with our Oracle
license.  Cost (and prejudices) aside, do you think it would be wise to
go with Oracle to begin with, considering the size of the database that
I'm planning?

Thanks for any comments,
-Mike

Re: Setup for large database

From

Vivek Khera

Date:

21 April 2006, 15:34:11

On Apr 20, 2006, at 9:02 PM, mlartz@gmail.com wrote:

> 1) How anal should I be about my hardware setup?  I have about 15
> 300GB
> 10K RPM SCSI drives, 4 of which I can directly attach to the server
> and
> the rest one the FC array.  Should I just put the OS and transaction
> logs on the direct attached storage and and then RAID10 the rest of
> them and be done, or would I significantly benefit from separating out
> the indexes and partitioning across tablespaces across drives?  Would
> RAID5 across 10+ drives yield acceptable performance numbers?

If you have that many drives, make one RAID1 pair for the OS and a
dedicated RAID1 pair for the pg_xlog.  I'd put the rest into a RAID10
if you're not willing to do experimentation first...  If I had the
time I'd put a RAID5 on it and simulate the expected load on it, then
compare the RAID10 under same load.

It depends a lot on your RAID controller, how much cache (battery
backed!!!) it has, and your usage patterns.

The pgsql-performance list may have more helpful responses.

As for partitioning based on octet, you should look at your
distribution of addresses and decide if it scatters the data evenly
enough for you.

Re: Setup for large database

From

"Jim C. Nasby"

Date:

21 April 2006, 19:55:42

On Fri, Apr 21, 2006 at 11:34:00AM -0400, Vivek Khera wrote:
> As for partitioning based on octet, you should look at your
> distribution of addresses and decide if it scatters the data evenly
> enough for you.

A much more important question: how will you be querying the data?

Partitioning is not a magic-bullet to performance, and when done
incorrectly it can end up hurting.

In this case, if the OP will be querying mostly on things that fit
within a class A, then partitioning on the first octet probably makes a
lot of sense. In fact, partitioning on the first two octets might make a
lot of sense, so long as there's very littly querying across partitions.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Setup for large database

From

"Jim C. Nasby"

Date:

21 April 2006, 20:02:25

On Thu, Apr 20, 2006 at 06:02:17PM -0700, mlartz@gmail.com wrote:
> 1) How anal should I be about my hardware setup?  I have about 15 300GB
> 10K RPM SCSI drives, 4 of which I can directly attach to the server and
> the rest one the FC array.  Should I just put the OS and transaction
> logs on the direct attached storage and and then RAID10 the rest of
> them and be done, or would I significantly benefit from separating out
> the indexes and partitioning across tablespaces across drives?  Would
> RAID5 across 10+ drives yield acceptable performance numbers?

My experience is more OLTP than OLAP, but for a warehouse envirenment
RAID5 can be a good solution since there's typically not a lot of
updating going on.

I've yet to see much gain from moving pg_xlog onto it's own seperate set
of drives; there's usually not enough traffic from the OS to justify it.
But it is possible that you could end up generating enough WAL traffic
that pg_xlog would become a performance limiter on only 2 drives, though
I suspect you'd have to have over 20-30 drives for data before that
happened.

> 3) I've currently installed RHEL4 AS for my OS, which I am very
> comfortable with.  I was going to go with EXT3 on everything (noatime)
> ... sound good?

There's a data=writeback option for ext3 that can make a big performance
difference.

> 2) Assuming that my data is roughly evenly distributed among IP
> addresses, I figured that a naive partitioning based on the first octet

See my other reply...

> 3) I guess I don't quite understand Bizgres.  At the moment, it seems
> to be just a development beta of Postgres ... is this true?  I realize
> that the focus is on BI/ETL stuff, but the current improvements seem to
> benefit Postgres as a whole.  Is there currently or can you imagine a
> case where a feature in Bizgres won't get integrated into Postgres?
> How significant is the fork between Bizgres and Postgres?  I've also
> considered taking a look at Bizgres MPP.  I know that its the wrong
> forum, but any comments?

You'd probably be better off asking on a bizgres list...

> 4) Not to start any sort of flame war, but my company has an Oracle
> license and there are a bunch of people wanting me to go that way.
> I've been doing just fine with Postgres at the moment and am quite
> comfortable with it, but am being pressured to go with our Oracle
> license.  Cost (and prejudices) aside, do you think it would be wise to
> go with Oracle to begin with, considering the size of the database that
> I'm planning?

There's certainly people out there running multi-terrabyte databases on
PostgreSQL. Unless there's a sound technical reason to switch, I'd stick
with PostgreSQL, especially because migrating to Oracle from PostgreSQL
is fairly easy.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461