Re: scalablility problem

From: Xiaoning Ding
Subject: Re: scalablility problem
Date: ,
Msg-id: 460E8198.1010604@cse.ohio-state.edu
(view: Whole thread, Raw)
In response to: Re: scalablility problem  (Scott Marlowe)
Responses: Re: scalablility problem  ("Joshua D. Drake")
List: pgsql-performance

Tree view

scalablility problem  (Xiaoning Ding <-state.edu>, )
 Re: scalablility problem  ("Dave Dutcher", )
 Re: scalablility problem  (Tom Lane, )
  Re: scalablility problem  (Xiaoning Ding <-state.edu>, )
   Re: scalablility problem  (Scott Marlowe, )
   Re: scalablility problem  (Ron Mayer, )
    Re: scalablility problem  (Xiaoning Ding <-state.edu>, )
     Re: scalablility problem  (Dave Cramer, )
 Re: scalablility problem  (Scott Marlowe, )
  Re: scalablility problem  (Xiaoning Ding <-state.edu>, )
   Re: scalablility problem  ("Joshua D. Drake", )
    Re: scalablility problem  (Tom Lane, )
     Re: scalablility problem  (Xiaoning Ding <-state.edu>, )
 Re: scalablility problem  (Christopher Browne, )
  Re: scalablility problem  (Guido Neitzer, )
   Re: scalablility problem  (Michael Stone, )
    Re: scalablility problem  (Xiaoning Ding <-state.edu>, )
  Re: scalablility problem  (Xiaoning Ding <-state.edu>, )

Scott Marlowe wrote:
> On Fri, 2007-03-30 at 15:25, Xiaoning Ding wrote:
>> Hi all,
>>
>> When I run multiple TPC-H queries (DBT3) on  postgresql, I found the system
>> is not scalable.  My machine has 8GB memory, and 4 Xeon Dual Core processor
>> ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. I
>> run multiple
>> q2 queries simultaneously. The results are:
>>
>> 1 process takes 0.65 second to finish.
>> 2 processes take 1.07 seconds.
>> 4 processes take 4.93 seconds.
>> 8 processes take 16.95 seconds.
>>
>> For 4-process case and 8-process case, queries takes even more time than
>> they are executed serially one after another. Because the system has 8GB
>> memory, which is much bigger than the DB size(SF=1), and I warmed the cache
>> before I run the test, I do not think the problem was caused by disk I/O.
>
> You may be right, you may be wrong.  What did top / vmstat have to say
> about IO wait / disk idle time?
>
> PostgreSQL has to commit transactions to disk.  TPC-H does both business
> decision mostly read queries, as well as mixing in writes.  If you have
> one hard drive, it may well be that activity is stacking up waiting on
> those writes.

Shouldn't writes be asynchronous in linux ?

>> I think it might be caused by some contentions. But I do not know postgresql
>> much. May anybody give me some clue to find the reasons?
>
> Others have mentioned your version of postgresql.  7.3 is quite old, as
> it came out at the end of 2002.  Seeing as 7.3 is the standard pgsql
> version supported by RHEL3, and RHEL came with a 2.6.9 kernel, I'm gonna
> guess your OS is about that old too.
>
> pgsql 7.3 cannot take advantage of lots of shared memory, and has some
> issues scaling to lots of CPUs / processes.

I use RHEL 4. I can not understand how the scalability related with
shared memory?

> While RHEL won't be EOLed for a few more years (redhat promises 7 years
> I think) it's really not a great choice for getting started today.
> RHEL5 just released and RHEL4 is very stable.
>
> There are several things to look at to get better performance.
>
> 1:  Late model PostgreSQL.  Go with 8.2.3 or as a minimum 8.1.8
> 2:  Late model Unix.
> 3:  RAID controller with battery backed cache
> 4:  Plenty of memory.
> 5:  Lots of hard drives
> 6:  4 to 8 CPUs.
>
> Then, google postgresql performance tuning. There are three or four good
> tuning guides that pop up right at the top.
>
>



pgsql-performance by date:

From: Xiaoning Ding
Date:
Subject: Re: scalablility problem
From: Andrew - Supernews
Date:
Subject: Re: Scaling SELECT:s with the number of disks on a stripe