Re: very large db performance question - Mailing list pgsql-performance

From Thierry Missimilly
Subject Re: very large db performance question
Date
Msg-id 3FC5B8F6.49A663E7@BULL.NET
Whole thread Raw
In response to very large db performance question  (LIANHE SHAO <lshao2@jhmi.edu>)
List pgsql-performance
Hi,

I have done some performance tests on 1Gb and 4 Gb Databases on a mono
Pentium 4 , 1 Gb RAM, IDE disk, SCSI disks and RAID0 LUN on DAS 5300 on
Linux RedHat 7.3.

In each cases my tests make select, update and insert.
One of them is pgbench. You can find it in Postgres/contrib/pgbench.
The other one is DBT1 form OSDL. I have port it on Postgres and you can find
it in Source Forge. Jenny Wang is writting a better Postgres DBT1 based on C
transactions instead of PL/pgSQL Transactions.

With this size of database, even after a fine tuning of Postgres the problem
is I/O Wait. So in you case with 100 Gb, you will have I/O Wait.
To resume my observations regarding diskss performances :
1) IDE disk are the slower.
2) SCSI disks are a little more faster but you can decrease I/O Wait by 25%
by creating a stripped volume group on 3 disks.
3) A RAID0 on 5 DAS5300 disks improve again performances by 20% as the DAS
Storage Processeur use internal caches

One thing, very important in my case was the time of (hot) backup / restore.

In that case the pgbench database schema is to simple to have an idea but
DBT1 schema is enough complex and on the RAID0 LUN the backup takes 12 min
but the restore takes 16 min + 10 min to recreate the indexes + 255 min to
recreate the Foreign Keys. So 4h41 for a 4Gb database.

That means for a 100 Gb database, if your schema as Foreign keys and indexes
: about 5 hours to backup and 117 hours to restore (~5 days).
So, if your database in a critical database, it is better to use cold backup
with Snapshot tools.

Regards,
Thierry Missimilly

LIANHE SHAO wrote:

> Hello All,
> We will have a very large database to store
> microarray data (may exceed 80-100G some day). now
> we have 1G RAM, 2G Hz Pentium 4, 1 CPU. and enough
> hard disk.
>
> I never touched such large database before. I ask
> several dbas if the hardware is ok, some said it is
> ok for the query, but I am not so convinced. Because
> I check the mailing list and learned that it is not
> unreasonable to take several minutes to do the
> query. But  I want to query to be as fast as possible.
>
> Could anybody tell me that our hardware is an issue
> or not? do we really need better hardware to make
> real difference?
>
> Regards,
> William
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend

Attachment

pgsql-performance by date:

Previous
From: Steve Atkins
Date:
Subject: Re: For full text indexing, which is better, tsearch2 or
Next
From: Stefan Champailler
Date:
Subject: Re: Impossibly slow DELETEs