Re: weird performances problem - Mailing list pgsql-performance

From Ron
Subject Re: weird performances problem
Msg-id 6.2.5.6.0.20051122093227.040fd900@earthlink.net
In response to Re: weird performances problem  (Guillaume Smet <guillaume.smet@openwide.fr>)
Responses Re: weird performances problem
List pgsql-performance
At 09:26 AM 11/22/2005, Guillaume Smet wrote:
>Ron wrote:
>>If I understand your HW config correctly, all of the pg stuff is on
>>the same RAID 10 set?
>
>No, the system and the WAL are on a RAID 1 array and the data on
>their own RAID 10 array.

As has been noted many times around here, put the WAL on its own
dedicated HDs.  You don't want any head movement on those HDs.


>As I said earlier, there's only a few writes in the database so I'm
>not really sure the WAL can be a limitation: IIRC, it's only used
>for writes isn't it?

When you reach a checkpoint, pg flushes all dirty data pages to HD and
records the checkpoint in the WAL... ...and that burst of writes can
stall almost everything else until it is done.


>Don't you think we should have some io wait if the database was
>waiting for the WAL? We _never_ have any io wait on this server but
>our CPUs are still 30-40% idle.
_Something_ is doing long bursts of write IO on sdb and sdb1 every 30
minutes or so according to your previous posts.

Profile your DBMS and find out what.
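
A first step is pinning down the period of those bursts.  What follows
is only a minimal sketch: it assumes you have logged (timestamp in
seconds, KB written per second) samples for sdb, e.g. taken from iostat
output.  The sample format, the 1000 KB/s threshold and the function
names are illustrative assumptions, not anything from your setup.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # (timestamp in seconds, KB written per second)
Burst = Tuple[float, float]    # (first busy timestamp, last busy timestamp)

def find_write_bursts(samples: List[Sample],
                      threshold_kb_s: float = 1000.0) -> List[Burst]:
    """Group consecutive samples above the threshold into bursts."""
    bursts: List[Burst] = []
    start = None
    last = None
    for ts, kb_s in samples:
        if kb_s > threshold_kb_s:
            if start is None:
                start = ts          # a new burst begins here
            last = ts
        elif start is not None:
            bursts.append((start, last))
            start = None
    if start is not None:           # the log ended in the middle of a burst
        bursts.append((start, last))
    return bursts

def burst_intervals(bursts: List[Burst]) -> List[float]:
    """Seconds between the starts of successive bursts."""
    return [b[0] - a[0] for a, b in zip(bursts, bursts[1:])]
```

If the intervals come out nearly constant, something scheduled (inside
pg or outside it) is the likely source, and the burst start times tell
you which log entries to look at.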


>A typical top we have on this server is:
>  15:22:39  up 24 days, 13:30,  2 users,  load average: 3.86, 3.96, 3.99
>156 processes: 153 sleeping, 3 running, 0 zombie, 0 stopped
>CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
>            total   50.6%    0.0%    4.7%   0.0%     0.6%    0.0%   43.8%
>            cpu00   47.4%    0.0%    3.1%   0.3%     1.5%    0.0%   47.4%
>            cpu01   43.7%    0.0%    3.7%   0.0%     0.5%    0.0%   51.8%
>            cpu02   58.9%    0.0%    7.7%   0.0%     0.1%    0.0%   33.0%
>            cpu03   52.5%    0.0%    4.1%   0.0%     0.1%    0.0%   43.0%
>Mem:  3857224k av, 3307416k used,  549808k free,       0k shrd,   80640k buff
>                    2224424k actv,  482552k in_d,   49416k in_c
>Swap: 4281272k av,   10032k used, 4271240k free                 2602424k cached
>
>As you can see, we don't swap, we have free memory, we have all our
>data cached (our database size is 1.5 GB).
>
>Context switches are between 10,000 and 20,000 per second.
That's actually a reasonably high CS rate.  Again, why?
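
To see when the rate climbs rather than just its average, you can
sample the kernel's cumulative counter yourself.  A minimal sketch,
assuming a Linux box whose /proc/stat contains a "ctxt" line; the
function names and the 5 second interval are my own choices for
illustration.

```python
import time

def read_ctxt(stat_text: str) -> int:
    """Extract the cumulative context-switch counter from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")

def ctxt_rate(before: str, after: str, elapsed_s: float) -> float:
    """Context switches per second between two /proc/stat snapshots."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (read_ctxt(after) - read_ctxt(before)) / elapsed_s

def sample_ctxt_rate(interval_s: float = 5.0, path: str = "/proc/stat") -> float:
    """Take two snapshots interval_s apart and return the rate (Linux only)."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return ctxt_rate(before, after, interval_s)
```

Logging that number with a timestamp next to your query log makes it
possible to see whether the CS rate jumps during the slow periods or is
high all the time.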


>>This concept works for other tables as well.  If you have tables
>>that both want services at the same time, disk arm contention will
>>drag performance into the floor when they are on the same HW set.
>>Profile your HD access and put tables that want to be accessed at
>>the same time on different HD sets.  Even if you have to buy more HW to do it.
>
>I use iostat and I can only see a little write activity and no read
>activity on both raid arrays.
Remember it's not just the overall amount, it's _when_ and _where_ the
write activity takes place.  If you have almost no write activity,
but whenever it happens it all happens to the same place by multiple
things contending for the same HDs, your performance during that time
will be poor.

Since the behavior you are describing fits that cause very well, I'd
see if you can verify that's what's going on.
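
One rough way to check, sketched under assumptions: given the times of
your slow queries and the (start, end) windows of write activity on the
array, both in seconds on the same clock, compute what fraction of the
slow queries fall inside a write window.  The input format, the slack
parameter and the function name are hypothetical; how you collect the
two lists is up to you.

```python
from bisect import bisect_right
from typing import List, Tuple

def fraction_in_bursts(event_times: List[float],
                       bursts: List[Tuple[float, float]],
                       slack_s: float = 0.0) -> float:
    """Fraction of events inside any burst window widened by slack_s.

    bursts must be sorted by start time and must not overlap.
    """
    if not event_times:
        return 0.0
    starts = [start for start, _ in bursts]
    hits = 0
    for t in event_times:
        # Last burst whose widened start (start - slack_s) is <= t.
        i = bisect_right(starts, t + slack_s) - 1
        if i >= 0 and t <= bursts[i][1] + slack_s:
            hits += 1
    return hits / len(event_times)
```

A fraction close to 1.0 supports the contention explanation; a fraction
near the share of wall-clock time the bursts occupy means the two are
probably unrelated.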

Ron


