Re: Finding bottleneck - Mailing list pgsql-performance
| From | Ron |
|---|---|
| Subject | Re: Finding bottleneck |
| Date | |
| Msg-id | 6.2.3.4.0.20050817001405.01fc3cd8@pop.earthlink.net |
| In response to | Finding bottleneck (Kari Lavikka <tuner@bdb.fi>) |
| List | pgsql-performance |
I think I have a solution for you. You have posted that you presently have these RAID volumes and behaviors:

  sda: data (10 spindles, raid10)
  sdb: xlog & clog (2 spindles, raid1)
  sdc: os and other stuff

Usually iostat (2 second interval) says:

  avg-cpu:  %user   %nice    %sys  %iowait   %idle
            32.38    0.00   12.88    11.62   43.12

  Device:      tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
  sda       202.00     1720.00        0.00      3440         0
  sdb       152.50        4.00     2724.00         8      5448
  sdc         0.00        0.00        0.00         0         0

And during checkpoint:

  avg-cpu:  %user   %nice    %sys  %iowait   %idle
            31.25    0.00   14.75    54.00    0.00

  Device:      tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
  sda      3225.50     1562.00    35144.00      3124     70288
  sdb       104.50       10.00     2348.00        20      4696
  sdc         0.00        0.00        0.00         0         0

During checkpoints sda is becoming saturated, essentially halting all other DB activity involving sda. A lesser version of the problem is probably occurring every time multiple entities on sda are accessed simultaneously, particularly for simultaneous writes.

My proposed solution (a tablespace sketch follows below):

  Put comment and its index on their own dedicated RAID volume.
  Put comment_archive and its index on their own dedicated RAID volume.
  Put the rest of the tables currently part of "data" on their own dedicated RAID volume.
  Put the rest of the indexes of the tables currently part of "data" on their own dedicated RAID volume.
  Put xlog on its own dedicated RAID volume.

The general idea is to put any tables or indexes that tend to require simultaneous access, particularly write access, on different spindles. Like all things, there's a point of diminishing returns that depends on the HW used and the DB load.

If you must wring every last bit of IO out of the HD subsystem, a more exact set of spindle assignments can be made by analyzing your queries and then 1) making sure writes that tend to be simultaneous go to different spindles, then (if you still need better IO) 2) making sure reads that tend to be simultaneous go to different spindles (a starting query for that analysis is sketched below). At some point your controller will become the bottleneck, and at some point beyond that the IO channels on the mainboard will. My suggestion should get you to within 80-90% of optimal if I've understood the implications of your posts correctly.

The other suggestion I'd make is to bump your RAM from 16GB to 32GB as soon as you can afford it, and then tune your PostgreSQL parameters to make the best use of it. The more RAM resident your DB is, the better.
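To make those volume assignments concrete: tablespaces (new in PostgreSQL 8.0) can express the table-to-volume mapping without hand-maintained symlinks. A minimal sketch follows; the tablespace names and mount points are made up for illustration (only the index name is taken from your EXPLAIN output), so substitute whatever you actually create:

  -- Each LOCATION must be an empty directory owned by the postgres
  -- user, sitting on its own dedicated RAID volume (paths are
  -- hypothetical).
  CREATE TABLESPACE comment_data LOCATION '/vol_comment/pgdata';
  CREATE TABLESPACE comment_idx  LOCATION '/vol_comment_idx/pgdata';

  -- Move the hot table and its index onto their dedicated spindles.
  ALTER TABLE comment SET TABLESPACE comment_data;
  ALTER INDEX comment_uid_target_image_id_comment_id_20050527
        SET TABLESPACE comment_idx;

Each SET TABLESPACE physically rewrites the relation and holds an exclusive lock while it runs, so the moves belong in a maintenance window. The same pattern repeats for comment_archive and for the catch-all "rest of the tables" and "rest of the indexes" volumes.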
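If you want to check which relations are actually generating the simultaneous I/O before committing to a layout, the statistics views give a rough picture. A sketch, assuming block- and row-level stats collection is switched on (stats_block_level / stats_row_level in postgresql.conf):

  -- Relations ranked by physical block reads (heap plus index).
  SELECT relname,
         heap_blks_read,
         coalesce(idx_blks_read, 0) AS idx_blks_read
    FROM pg_statio_user_tables
   ORDER BY heap_blks_read + coalesce(idx_blks_read, 0) DESC
   LIMIT 15;

  -- Relations ranked by row write activity.
  SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables
   ORDER BY n_tup_ins + n_tup_upd + n_tup_del DESC
   LIMIT 15;

Tables that land near the top of both lists are the first candidates for their own spindles.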
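Once the extra RAM is in, the postgresql.conf settings that usually matter most are along these lines. The values are illustrative only and assume the box is dedicated to the DB; note the 8.0-era units (shared_buffers and effective_cache_size in 8kB pages, work_mem and maintenance_work_mem in kB):

  # Illustrative starting points for a ~32GB machine dedicated to PostgreSQL.
  shared_buffers = 131072          # ~1GB of dedicated buffer cache
  effective_cache_size = 3145728   # ~24GB; roughly what the OS cache will hold
  work_mem = 16384                 # 16MB per sort/hash operation
  maintenance_work_mem = 524288    # 512MB for VACUUM, CREATE INDEX, etc.

How far shared_buffers should really go is workload dependent; the point is simply that the parameters have to be raised before PostgreSQL and the OS cache can make use of the new RAM.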
Hope this helps,
Ron Peacetree

===========Original Message Follows===========
From: Kari Lavikka <tuner ( at ) bdb ( dot ) fi>
To: Merlin Moncure <merlin ( dot ) moncure ( at ) rcsonline ( dot ) com>
Subject: Re: Finding bottleneck
Date: Mon, 8 Aug 2005 19:19:09 +0300 (EETDST)
----------

Actually I modified postgresql.conf a bit and there isn't a commit delay any more. That didn't make a noticeable difference though.

Workload is generated by a website with about 1000 dynamic page views a second. Finland's biggest site among youths, btw.

Anyway, there are about 70 tables; here are some of the most important:

             relname              |  reltuples
----------------------------------+-------------
 comment                          | 1.00723e+08
 comment_archive                  | 9.12764e+07
 channel_comment                  | 6.93912e+06
 image                            | 5.80314e+06
 admin_event                      | 5.1936e+06
 user_channel                     | 3.36877e+06
 users                            |      325929
 channel                          |      252267

Queries to the "comment" table are mostly IO-bound but are performing quite well. Here's an example:

(SELECT
    u.nick, c.comment, c.private, c.admin, c.visible, c.parsable,
    c.uid_sender,
    to_char(c.stamp, 'DD.MM.YY HH24:MI') AS stamp,
    c.comment_id
 FROM comment c
 INNER JOIN users u ON u.uid = c.uid_sender
 WHERE u.status = 'a'
   AND c.image_id = 15500900
   AND c.uid_target = 780345
 ORDER BY uid_target DESC, image_id DESC, c.comment_id DESC)
LIMIT 36

And explain analyze:

 Limit  (cost=0.00..6.81 rows=1 width=103) (actual time=0.263..17.522 rows=12 loops=1)
   ->  Nested Loop  (cost=0.00..6.81 rows=1 width=103) (actual time=0.261..17.509 rows=12 loops=1)
         ->  Index Scan Backward using comment_uid_target_image_id_comment_id_20050527 on "comment" c  (cost=0.00..3.39 rows=1 width=92) (actual time=0.129..16.213 rows=12 loops=1)
               Index Cond: ((uid_target = 780345) AND (image_id = 15500900))
         ->  Index Scan using users_pkey on users u  (cost=0.00..3.40 rows=1 width=15) (actual time=0.084..0.085 rows=1 loops=12)
               Index Cond: (u.uid = "outer".uid_sender)
               Filter: (status = 'a'::bpchar)
 Total runtime: 17.653 ms

We are having performance problems with some smaller tables and very simple queries. For example:

SELECT u.uid, u.nick, extract(epoch from uc.stamp) AS stamp
  FROM user_channel uc
 INNER JOIN users u USING (uid)
 WHERE channel_id = 281321
   AND u.status = 'a'
 ORDER BY uc.channel_id, upper(uc.nick)

And explain analyze:

 Nested Loop  (cost=0.00..200.85 rows=35 width=48) (actual time=0.414..38.128 rows=656 loops=1)
   ->  Index Scan using user_channel_channel_id_nick on user_channel uc  (cost=0.00..40.18 rows=47 width=27) (actual time=0.090..0.866 rows=667 loops=1)
         Index Cond: (channel_id = 281321)
   ->  Index Scan using users_pkey on users u  (cost=0.00..3.40 rows=1 width=25) (actual time=0.048..0.051 rows=1 loops=667)
         Index Cond: ("outer".uid = u.uid)
         Filter: (status = 'a'::bpchar)
 Total runtime: 38.753 ms

Under heavy load these queries tend to take several minutes to execute although there's plenty of free CPU available. There aren't any blocking locks in pg_locks.

Kari Lavikka - tuner ( at ) bdb ( dot ) fi - (050) 380 3808