Re: weird performances problem - Mailing list pgsql-performance
From | Ron |
---|---|
Subject | Re: weird performances problem |
Date | |
Msg-id | 6.2.5.6.0.20051122093227.040fd900@earthlink.net |
In response to | Re: weird performances problem (Guillaume Smet <guillaume.smet@openwide.fr>) |
Responses | Re: weird performances problem |
List | pgsql-performance |
At 09:26 AM 11/22/2005, Guillaume Smet wrote:
>Ron wrote:
>>If I understand your HW config correctly, all of the pg stuff is on
>>the same RAID 10 set?
>
>No, the system and the WAL are on a RAID 1 array and the data on
>their own RAID 10 array.

As has been noted many times around here, put the WAL on its own dedicated HD's. You don't want any head movement on those HD's.

>As I said earlier, there's only a few writes in the database so I'm
>not really sure the WAL can be a limitation: IIRC, it's only used
>for writes isn't it?

When you reach a WAL checkpoint, pg commits WAL data to HD...
...and does almost nothing else until said commit is done.

>Don't you think we should have some io wait if the database was
>waiting for the WAL? We _never_ have any io wait on this server but
>our CPUs are still 30-40% idle.

_Something_ is doing long bursts of write IO on sdb and sdb1 every 30 minutes or so according to your previous posts. Profile your DBMS and find out what.

>A typical top we have on this server is:
> 15:22:39 up 24 days, 13:30, 2 users, load average: 3.86, 3.96, 3.99
>156 processes: 153 sleeping, 3 running, 0 zombie, 0 stopped
>CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
>           total   50.6%    0.0%    4.7%   0.0%     0.6%    0.0%   43.8%
>           cpu00   47.4%    0.0%    3.1%   0.3%     1.5%    0.0%   47.4%
>           cpu01   43.7%    0.0%    3.7%   0.0%     0.5%    0.0%   51.8%
>           cpu02   58.9%    0.0%    7.7%   0.0%     0.1%    0.0%   33.0%
>           cpu03   52.5%    0.0%    4.1%   0.0%     0.1%    0.0%   43.0%
>Mem:  3857224k av, 3307416k used, 549808k free, 0k shrd, 80640k buff
>      2224424k actv, 482552k in_d, 49416k in_c
>Swap: 4281272k av, 10032k used, 4271240k free, 2602424k cached
>
>As you can see, we don't swap, we have free memory, we have all our
>data cached (our database size is 1.5 GB).
>
>Context switch are between 10,000 and 20,000 per seconds.

That's actually a reasonably high CS rate. Again, why?

>>This concept works for other tables as well. If you have tables
>>that both want services at the same time, disk arm contention will
>>drag performance into the floor when they are on the same HW set.
>>Profile your HD access and put tables that want to be accessed at
>>the same time on different HD sets. Even if you have to buy more HW to do it.
>
>I use iostat and I can only see a little write activity and no read
>activity on both raid arrays.

Remember it's not just the overall amount, it's _when_ and _where_ the write activity takes place. If you have almost no write activity, but whenever it happens it all happens to the same place by multiple things contending for the same HDs, your performance during that time will be poor.

Since the behavior you are describing fits that cause very well, I'd see if you can verify that's what's going on.

Ron
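
One way to check the "when and where" of the write bursts is to timestamp per-device write volume over short intervals instead of looking at long-run averages. The following is only a rough sketch, not something from the thread: it assumes a Linux kernel that exposes /proc/diskstats (2.6 or later), and the function names, the 5-second interval and the 8 MB threshold are made up for illustration.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def parse_sectors_written(diskstats_text):
    """Map device name -> cumulative sectors written (10th field of each line)."""
    written = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip short or malformed lines
        written[fields[2]] = int(fields[9])
    return written


def write_deltas(before, after):
    """Bytes written per device between two snapshots."""
    return {dev: (after[dev] - before[dev]) * SECTOR_BYTES
            for dev in after if dev in before}


def find_bursts(samples, threshold_bytes):
    """samples is a list of (timestamp, {device: bytes_written}).
    Return (timestamp, device, bytes) for every interval at or above the threshold."""
    return [(ts, dev, nbytes)
            for ts, deltas in samples
            for dev, nbytes in sorted(deltas.items())
            if nbytes >= threshold_bytes]


def watch(interval=5.0, threshold_bytes=8 * 1024 * 1024, rounds=None,
          path="/proc/diskstats"):
    """Print a timestamped line whenever a device exceeds the write threshold.
    Runs forever when rounds is None."""
    with open(path) as f:
        prev = parse_sectors_written(f.read())
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        with open(path) as f:
            cur = parse_sectors_written(f.read())
        now = time.time()
        for ts, dev, nbytes in find_bursts([(now, write_deltas(prev, cur))],
                                           threshold_bytes):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(ts))
            print(f"{stamp} {dev} wrote {nbytes / 1e6:.1f} MB in {interval:.0f}s")
        prev = cur
        done += 1
```

Calling watch() and leaving it running across a couple of the 30-minute cycles would give timestamps for bursts on sdb and sdb1 that can then be lined up against the PostgreSQL log to see what the DBMS was doing at those moments.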