Re: Intermittent slowdowns, connection delays - Mailing list pgsql-performance
From | Jason Coene |
---|---|
Subject | Re: Intermittent slowdowns, connection delays |
Date | |
Msg-id | 200405120104.i4C14Fam018939@mail.gotfrag.com Whole thread Raw |
In response to | Intermittent slowdowns, connection delays ("Jason Coene" <jcoene@gotfrag.com>) |
List | pgsql-performance |
Hi Paul, Thanks for the valuable feedback. I suspect you're correct about the serialization in some capacity, but the actual cause is eluding me. Basically, every time a registered user checks a page, the site has to authenticate them (with a query against a table with > 200,000 records). It doesn't update this table, however - it updates another table with "user stats" information (last click, last ip, etc). From what I've seen, there doesn't seem to be any serious locking issues. It does make sense when a number of users whose information isn't in cache, it could take a bit longer - but AFAIK this shouldn't prevent other simultaneous queries. What else could cause such serialization? If I look at open locks (this is a view, info from pg tables): relname | mode | numlocks ----------------------+------------------+---------- users | AccessShareLock | 4 userstats | AccessShareLock | 4 pg_statistic | AccessShareLock | 2 users_ix_id | AccessShareLock | 2 countries | AccessShareLock | 2 comments | AccessShareLock | 2 countries_ix_id | AccessShareLock | 2 userstats_ix_id | AccessShareLock | 2 comments_ix_parentid | AccessShareLock | 2 users | RowExclusiveLock | 1 filequeue_ix_id | AccessShareLock | 1 pg_class | AccessShareLock | 1 vopenlocks | AccessShareLock | 1 pg_locks | AccessShareLock | 1 userstats | RowExclusiveLock | 1 filequeue | AccessShareLock | 1 pg_class_oid_index | AccessShareLock | 1 Also of note, executing a random "in the blue" query on our "users" table returns results very fast. While there's no doubt that caching may help, returning a row that is definitely not cached is very fast: < 0.05 sec. Top tells me that the system isn't using much memory - almost always under 100MB (of the 2GB we have). Is there a way to increase the amount of physical RAM that PG uses? It seems there's a lot of room there. Postgresql.conf has: shared_buffers = 16384 sort_mem = 8192 vacuum_mem = 8192 Also, would queries becoming serialized effect connection delays? I think there's still something else at large here... I've attached a vmstat output, while running dd. The RAID array is tw0. It does show the tw0 device getting significantly more work, numbers not seen during normal operation. Thanks, Jason Coene Gotfrag eSports 585-598-6621 Phone 585-598-6633 Fax jcoene@gotfrag.com http://www.gotfrag.com -----Original Message----- From: Paul Tuckfield [mailto:paul@tuckfield.com] Sent: Tuesday, May 11, 2004 7:50 PM To: Jason Coene Subject: Re: [PERFORM] Intermittent slowdowns, connection delays The things you point out suggest a heavy dependence on good cache performance (typical of OLTP mind you) Do not be fooled if a query runs in 2 seconds then the second run takes < .01 secons: the first run put it in cache the second got all cache hits :) But beyond that, in an OLTP system, and typical website backing database, "cache is king". And serialization is the devil So look for reasons why your cache performance might deteriorate during peak, (like large historical tables that users pull up dozens of scattered rows from, flooding cache) or why you may be serializing somewhere inside postgres (ex. if every page hit re-logs in, then theres probably serialization trying to spawn what must be 40 processes/sec assuming your 11hit/sec avg peaks at about 40/sec) Also: I am really surprised you see zero IO in the vmstat you sent, but I'm unfamiliar with BSD version of vmstat. AFAIR, Solaris shows cached filesystem reads as "page faults" which is rather confusing. Since you have 1500 page faults per second, yet no paging (bi bo) does thins mean the 1500 page faults are filesystem IO that pg is doing? do an objective test on an idle system by dd'ing a large file in and watching what vmstat does. On May 11, 2004, at 3:10 PM, Jason Coene wrote: > Hi All, > > We have a Postgres 7.4.1 server running on FreeBSD 5.2. Hardware is a > Dual > Xeon 2.6 (HT enabled), 2 GB Memory, 3Ware SATA RAID-5 w/ 4 7200 RPM > Seagate > disks and gigabit Intel Server Ethernet. The server is dedicated to > serving > data to our web-based CMS. > > We have a few web servers load balanced, and we do around 1M page > impressions per day. Our website is highly personalized, and we've > optimized it to limit the number of queries, but we still see between > 2 and > 3 SELECT's (with JOIN's) and 1 UPDATE per page load, selectively more > - a > fair volume. > > The single UPDATE per page load is updating a timestamp in a small > table > (about 150,000 rows) with only 1 index (on the 1 field that needs to be > matched). > > We're seeing some intermittent spikes in query time as actual > connection > time. I.e., during these seemingly random spikes, our debug output > looks > like this (times from start of HTTP request): > > SQL CONNECTION CREATING 'gf' > 0.0015 - ESTABLISHING CONNECTION > 1.7113 - CONNECTION OK > SQL QUERY ID 1 COST 0.8155 ROWS 1 > SQL QUERY ID 2 COST 0.5607 ROWS 14 > .. etc.. (all queries taking more time than normal, see below) > > Refresh the page 2 seconds later, and we'll get: > > SQL CONNECTION CREATING 'gf' > 0.0017 - ESTABLISHING CONNECTION > 0.0086 - CONNECTION OK > SQL QUERY ID 1 COST 0.0128 ROWS 1 > SQL QUERY ID 2 COST 0.0033 ROWS 14 > .. etc.. (with same queries) > > Indeed, during these types, it takes a moment for "psql" to connect on > the > command line (from the same machine using a local file socket), so > it's not > a network issue or a web-server issue. During these spurts, there's > nothing > too out of the ordinary in vmstat, systat or top. > > These programs show that we're not using much CPU (usually 60-80% > idle), and > disks usage is virtually nil. I've attached 60 seconds of "vmstat 5". > Memory usage looks like this (constantly): > > Mem: 110M Active, 1470M Inact, 206M Wired, 61M Cache, 112M Buf, 26M > Free > > I've cleaned up and tested query after query, and nothing is a "hog". > On an > idle server, every query will execute in < 0.05 sec. Perhaps some of > you > veterans have ideas? > > Thanks, > > Jason Coene > Gotfrag eSports > 585-598-6621 Phone > 585-598-6633 Fax > jcoene@gotfrag.com > http://www.gotfrag.com > > > <vmstat51min.txt> > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
Attachment
pgsql-performance by date: