Thread: Very slow catalog query
Hi, I have a DB with a large number schemas (around 10K) and a large number of tables (400K). The app became slow lately, and logging the slow queries, I see more than a few like this: SELECT: LOG: duration: 169547.424 ms statement: SELECT attr.attname, name.nspname, seq.relname FROM pg_class seq, pg_attribute attr, pg_depend dep, pg_namespace name, pg_constraint cons WHERE seq.oid = dep.objid AND seq.relnamespace = name.oid AND seq.relkind = 'S' AND attr.attrelid = dep.refobjid AND attr.attnum = dep.refobjsubid AND attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1] AND cons.contype = 'p' AND dep.refobjid = 'activities'::regclass Almost all slow queries are of this type, though most of those do finish really fast. From time to time it gets really slow. Some details on the setup: Dual Opteron with 4GB RAM RAID1 for WAL on 10K SCSI RAID10 over 6 x 10K scsi drives for main the rest for the DB files Auto vaccum is on, and in addition I do some vacuuming for specific high use tables nightly Any ideas how to start finding the culprit? Bye, Guy. -- Family management on rails: http://www.famundo.com My development related blog: http://devblog.famundo.com
On Monday 31 March 2008 15:13:25 Just Someone wrote: > Hi, > > I have a DB with a large number schemas (around 10K) and a large > number of tables (400K). The app became slow lately, and logging the > slow queries, I see more than a few like this: > > SELECT: LOG: duration: 169547.424 ms statement: SELECT > attr.attname, name.nspname, seq.relname > FROM pg_class seq, > pg_attribute attr, > pg_depend dep, > pg_namespace name, > pg_constraint cons > WHERE seq.oid = dep.objid > AND seq.relnamespace = name.oid > AND seq.relkind = 'S' > AND attr.attrelid = dep.refobjid > AND attr.attnum = dep.refobjsubid > AND attr.attrelid = cons.conrelid > AND attr.attnum = cons.conkey[1] > AND cons.contype = 'p' > AND dep.refobjid = 'activities'::regclass > > Almost all slow queries are of this type, though most of those do > finish really fast. From time to time it gets really slow. > > Some details on the setup: > Dual Opteron with 4GB RAM > RAID1 for WAL on 10K SCSI > RAID10 over 6 x 10K scsi drives for main the rest for the DB files > > Auto vaccum is on, and in addition I do some vacuuming for specific > high use tables nightly > > Any ideas how to start finding the culprit? > > Bye, > > Guy. > > > -- > Family management on rails: http://www.famundo.com > My development related blog: http://devblog.famundo.com Hi 'Just Someone' I'm wondering... just 4GB of ram? What's the "normal" "hammering" -- a.k.a. user access -- to all of this? PG, as expected, launches a separate process for each connection. this eats up resources quite quickly.... Did you check your system processes with 'top' ? how's it looking for swap usage? Regards, -- Pedro Doria Meunier Ips. da Olaria, Edf. Jardins do Garajau, 4, r/c Y 9125-162 Caniço Madeira - Portugal -------------------------------------------------- Skype : pdoriam Mobile: +351961720188
Attachment
"Just Someone" <just.some@gmail.com> writes: > Any ideas how to start finding the culprit? EXPLAIN ANALYZE? What would be particularly interesting is to compare the results for fast and slow cases of the "same" query. regards, tom lane
Hi, > I'm wondering... just 4GB of ram? > What's the "normal" "hammering" -- a.k.a. user access -- to all of this? > PG, as expected, launches a separate process for each connection. this eats up > resources quite quickly.... > Did you check your system processes with 'top' ? how's it looking for swap > usage? Swap usage is almost nil. And I only have a constant number of connections (about 10-15) as it serves as the backend for a Web application. Transactions run at about 20-40 per second, but mostly very short and simple ones. Thanks, Guy.
Hi Tom, Here is the result of explain analyze (though this one took 1500ms and not 169000): On Mon, Mar 31, 2008 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Just Someone" <just.some@gmail.com> writes: > > Any ideas how to start finding the culprit? > > EXPLAIN ANALYZE? explain analyze SELECT attr.attname, name.nspname, seq.relname FROM pg_class seq, pg_attribute attr, pg_depend dep, pg_namespace name, pg_constraint cons WHERE seq.oid = dep.objid AND seq.relnamespace = name.oid AND seq.relkind = 'S' AND attr.attrelid = dep.refobjid AND attr.attnum = dep.refobjsubid AND attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1] AND cons.contype = 'p' AND dep.refobjid = 'activities'::regclass; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=64956.07..65025.73 rows=31 width=192) (actual time=1547.720..1547.749 rows=1 loops=1) -> Nested Loop (cost=64956.07..64987.48 rows=1 width=166) (actual time=1547.662..1547.684 rows=1 loops=1) -> Nested Loop (cost=64956.07..64981.47 rows=1 width=106) (actual time=1547.616..1547.631 rows=1 loops=1) -> Merge Join (cost=64956.07..64957.36 rows=4 width=42) (actual time=1547.484..1547.502 rows=3 loops=1) Merge Cond: ("outer"."?column3?" = "inner".refobjsubid) -> Sort (cost=4.08..4.08 rows=3 width=30) (actual time=0.149..0.151 rows=1 loops=1) Sort Key: cons.conkey[1] -> Index Scan using pg_constraint_conrelid_index on pg_constraint cons (cost=0.00..4.05 rows=3 width=30) (actual time=0.110..0.112 rows=1 loops=1) Index Cond: (30375069::oid = conrelid) Filter: (contype = 'p'::"char") -> Sort (cost=64951.99..64952.61 rows=247 width=12) (actual time=1547.303..1547.318 rows=9 loops=1) Sort Key: dep.refobjsubid -> Index Scan using pg_depend_reference_index on pg_depend dep (cost=0.00..64942.17 rows=247 width=12) (actual time=396.542..1547.172 rows=22 loops=1) Index Cond: (refobjid = 30375069::oid) -> Index Scan using pg_class_oid_index on pg_class seq (cost=0.00..6.02 rows=1 width=72) (actual time=0.034..0.035 rows=0 loops=3) Index Cond: (seq.oid = "outer".objid) Filter: (relkind = 'S'::"char") -> Index Scan using pg_namespace_oid_index on pg_namespace name (cost=0.00..6.00 rows=1 width=68) (actual time=0.039..0.041 rows=1 loops=1) Index Cond: ("outer".relnamespace = name.oid) -> Index Scan using pg_attribute_relid_attnum_index on pg_attribute attr (cost=0.00..38.00 rows=20 width=70) (actual time=0.050..0.052 rows=1 loops=1) Index Cond: ((30375069::oid = attr.attrelid) AND (attr.attnum = "outer".refobjsubid)) Total runtime: 1548.082 ms Bye, Guy.
"Just Someone" <just.some@gmail.com> writes: > Here is the result of explain analyze (though this one took 1500ms and > not 169000): Well, it's hard to be sure what the problem is when you're not showing us a problem case ... but I notice that this indexscan is estimated awfully high: > -> Index Scan using > pg_depend_reference_index on pg_depend dep (cost=0.00..64942.17 > rows=247 width=12) (actual time=396.542..1547.172 rows=22 loops=1) > Index Cond: (refobjid = 30375069::oid) The reason is not far to seek: the scan is checking only the second index key, meaning that it has to scan the entire index. (I am surprised it didn't use a seqscan instead. Are you using enable_seqscan = off? Not a great idea.) Since you know you are looking for a table, you could improve matters by adding a constraint on refclassid: dep.refclassid = 'pg_class'::regclass regards, tom lane
Hi Tom, > Well, it's hard to be sure what the problem is when you're not showing > us a problem case ... but I notice that this indexscan is estimated > awfully high: Whenever I do it manually it works fast. But in the log I see lots of slow ones. Could it be caused by auto vacuum? Or by check pointing or WAL writing? Are there way to check that? > > -> Index Scan using > > pg_depend_reference_index on pg_depend dep (cost=0.00..64942.17 > > rows=247 width=12) (actual time=396.542..1547.172 rows=22 loops=1) > > Index Cond: (refobjid = 30375069::oid) > > The reason is not far to seek: the scan is checking only the second > index key, meaning that it has to scan the entire index. (I am > surprised it didn't use a seqscan instead. Are you using enable_seqscan > = off? Not a great idea.) Since you know you are looking for a table, > you could improve matters by adding a constraint on refclassid: > > dep.refclassid = 'pg_class'::regclass enable_setscan is on. Is there a way to analyze/vacuum those tables? I will look if I can also improve the query to be more exact. Bye, Guy.