Re: Which hardware ? - Mailing list pgsql-performance

From Jon D
Subject Re: Which hardware ?
Date
Msg-id BAY102-W12ED7E7A2005665400F9D2A5AB0@phx.gbl
Whole thread Raw
In response to Which hardware ?  ("Lionel" <lionel@art-informatique.com>)
List pgsql-performance

----------------------------------------
> From: lionel@art-informatique.com
> Subject: [PERFORM] Which hardware ?
> Date: Tue, 17 Jun 2008 15:38:59 +0200
> To: pgsql-performance@postgresql.org
>
> Hi,
>
> I need to install a 8.3 database and was wondering which hardware would be
> sufficient to have good performances (less than 30s for² slowest select).
>
> Database size: 25 Go /year, 5 years of history
> One main table containing 40 million lines per year.
> Batch inserts of 100000 lines. Very very few deletes, few updates.
>
> 30 other tables, 4 levels of hierarchy, containing from 10 lines up to 20000
> lines.
> 5 of them have forein keys on the main table.
>
> I will use table partitionning on the year column.
>
> Statements will mainly do sums on the main table, grouped by whatever column
> of the database (3-5 joined tables, or join on join), with some criterions
> that may vary, lots of "joined varchar in ('a','b',...,'z')".
> It's almost impossible to predict what users will do via the webapplication
> that queries this database: almost all select, join, group by, where...
> possibilities are available.
>
> Up to 4 simultaneous users.
>
> I'm planning to host it on a quad xeon 2.66Ghz with 8Go of DDR2, and a dual
> (RAID1) SATA2 750Go HD.
> Perharps with another HD for indexes.
>
> Do you think it will be enough ?
> Is another RAID for better performances a minimum requirement ?
> Will a secondary HD for indexes help ?
>
> Which OS would you use ? (knowing that there will be a JDK 1.6 installed
> too)
>
> With 5 millions of lines, the same application runs quite fast on windows
> 2000 on a single P4 2.8 GHz (very few statements last more than 10s, mostly
> when concurrent statements are made). Each statement consumes 100% of the
> CPU.
>
>
> thanks for advices.
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


I think hardware isnt going to solve your problem,  especially the cpu.  You only have four users..  and postgres can
onlyuse 1 core per query.   If you have sequential scans that span this table and say it has 60-80 million rows,  It
cancould take longer then 30 seconds.  Even if you have alot of ram.  Just imagine what postgres is doing... if its
targetsearch is going to end in searching 40 million rows and it has to aggregate on two,  or three columns its going
tobe slow.  No amount of hardware is going to fix this.   Sure you can gain some speed by having entire tables in ram.
No magic bullet here.   Disk is definitely not a magic bullet.    Even if you have a bunch of fast disks its still much
slowerthen RAM in performing reads.    So if you read heavy then adding more disk isnt going to just solve all your
problems.  RAM is nice.   The more pages you can keep in ram the less reading from the disk.   

Even with that all said and done...   aggregating lots of rows takes time.    I suggest you come up with a system from
preaggregatingyour data if possible.  Identify all of your target dimensions.   If your lucky,  you only have a few key
dimensionswhich can reduce size of table by lots and reduce queries to 1-2 seconds.   There are a number of ways to
tacklethis,  but postgres is a nice db to do this with, since writers do not block readers.    

I think you should focus on getting this system to work well with minimal hardware first.  Then you can upgrade.   Over
thenext few years the db is only going to get larger.   You have 4 users now.. but who's to say what it will evolve
into.    
_________________________________________________________________
Earn cashback on your purchases with Live Search - the search that pays you back!
http://search.live.com/cashback/?&pkw=form=MIJAAF/publ=HMTGL/crea=earncashback

pgsql-performance by date:

Previous
From: Greg Smith
Date:
Subject: Re: Which hardware ?
Next
From: Howard Cole
Date:
Subject: Re: Tsearch2 Initial Search Speed