Finding bottleneck

From: Kari Lavikka
Subject: Finding bottleneck
Date: ,
Msg-id: Pine.HPX.4.62.0507281215420.2164@purple.bdb.fi
(view: Whole thread, Raw)
Responses: Re: Finding bottleneck  (Gavin Sherry)
Re: Finding bottleneck  (Claus Guttesen)
Re: Finding bottleneck  ("Luke Lonergan")
List: pgsql-performance

Tree view

Finding bottleneck  (Kari Lavikka, )
 Re: Finding bottleneck  (Gavin Sherry, )
 Re: Finding bottleneck  (Claus Guttesen, )
 Re: Finding bottleneck  ("Merlin Moncure", )
 Re: Finding bottleneck  ("Luke Lonergan", )
  Re: Finding bottleneck  (Kari Lavikka, )
   Re: Finding bottleneck  (Tom Lane, )
 Re: Finding bottleneck  ("Merlin Moncure", )
  Re: Finding bottleneck  (Tom Lane, )
  Re: Finding bottleneck  (Kari Lavikka, )
   Re: Finding bottleneck  (Tom Lane, )
    Re: Finding bottleneck  (Kari Lavikka, )
     Re: Finding bottleneck  (Tom Lane, )
    Re: Finding bottleneck  (Kari Lavikka, )
     Re: Finding bottleneck  (Tom Lane, )
 Re: Finding bottleneck  (Ron, )
 Re: Finding bottleneck  ("Merlin Moncure", )
  Re: Finding bottleneck  (Tom Lane, )
 Re: Finding bottleneck  ("Merlin Moncure", )
 Re: Finding bottleneck  ("Merlin Moncure", )
  Re: Finding bottleneck  (Tom Lane, )
 Re: Finding bottleneck  ("Merlin Moncure", )

Hello,

we recently upgraded our dual Xeon Dell to a brand new Sun v40z with 4
opterons, 16GB of memory and MegaRAID with enough disks. OS is Debian
Sarge amd64, PostgreSQL is 8.0.3. Size of database is something like 80GB
and our website performs about 600 selects and several updates/inserts a
second.

v40z performs somewhat better than our old Dell but mostly due to
increased amount of memory. The problem is.. there seems to by plenty of
free CPU available and almost no IO-wait but CPU bound queries seem to
linger for some reason. Problem appears very clearly during checkpointing.
Queries accumulate and when checkpointing is over, there can be something
like 400 queries running but over 50% of cpu is just idling.

procs -----------memory------------ ---swap-- -----io---- --system-- ----cpu----
  r  b   swpd   free   buff    cache   si   so    bi    bo   in    cs us sy id wa
  3  1      0 494008 159492 14107180    0    0   919  3164 3176 13031 29 12 52  8
  5  3      0 477508 159508 14118452    0    0  1071  4479 3474 13298 27 13 47 13
  0  0      0 463604 159532 14128832    0    0   922  2903 3352 12627 29 11 52  8
  3  1      0 442260 159616 14141668    0    0  1208  3153 3357 13163 28 12 52  9

An example of a lingering query (there's usually several of these or similar):

SELECT u.uid, u.nick, u.name, u.showname, i.status, i.stamp, i.image_id,
i.info, i.t_width, i.t_height FROM users u INNER JOIN image i ON i.uid =
u.uid INNER JOIN user_online uo ON u.uid = uo.uid WHERE u.city_id = 5 AND
i.status = 'd' AND u.status = 'a' ORDER BY city_id, upper(u.nick) LIMIT
(40 + 1) OFFSET 320

Tables involved contain no more than 4 million rows. Those are constantly
accessed and should fit nicely to cache. But database is just slow because
of some unknown reason. Any ideas?

----------------->8 Relevant rows from postgresql.conf 8<-----------------

shared_buffers = 15000          # min 16, at least max_connections*2, 8KB each
work_mem = 1536                 # min 64, size in KB
maintenance_work_mem = 32768    # min 1024, size in KB

max_fsm_pages = 1000000         # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 5000        # min 100, ~50 bytes each

vacuum_cost_delay = 15          # 0-1000 milliseconds
vacuum_cost_limit = 120         # 0-10000 credits

bgwriter_percent = 2            # 0-100% of dirty buffers in each round

fsync = true                    # turns forced synchronization on or off
                                 # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 128               # min 4, 8KB each
commit_delay = 80000            # range 0-100000, in microseconds
commit_siblings = 10            # range 1-1000

checkpoint_segments = 200       # in logfile segments, min 1, 16MB each
checkpoint_timeout = 1800       # range 30-3600, in seconds

effective_cache_size = 1000000  # typically 8KB each
random_page_cost = 1.8          # units are one sequential page fetch cost

default_statistics_target = 150 # range 1-1000

stats_start_collector = true
stats_command_string = true

     |\__/|
     ( oo )    Kari Lavikka -  - (050) 380 3808
__ooO(  )Ooo_______ _____ ___ _ _  _   _    _      _                  _
       ""


pgsql-performance by date:

From: Chris Hoover
Date:
Subject: Fwd: Help with view performance problem
From: Dan Harris
Date:
Subject: Re: Fwd: Help with view performance problem