Query performance PLEASE HELP - Mailing list pgsql-general
From | Dmitry Tkach |
---|---|
Subject | Query performance PLEASE HELP |
Date | |
Msg-id | 3E3ACEE6.7060603@openratings.com Whole thread Raw |
Responses |
Re: Query performance PLEASE HELP
|
List | pgsql-general |
Hi, everybody! I have spent the last few days trying to figure out why one (or rather some) of my queries takes so long time, but I am aboutto give up :-( Something seems to be definitely wrong here, but I am out of ideas :-( I was wonderring if somebody could please take a look at this and, perhaps, give me some clue what is going on? I would greatlyappreciate that... There are two tables involved: create table tradestyle ( duns int, type int, name text not null ); create index tradestyle_idx on tradestyle (name); create unique index tradestyle_duns_idx on tradestyle (duns,type); create table managed_supplier ( id serial primary key, duns int not null, subscriber int not null ); create unique index managed_supplier_idx on managed_supplier (subscriber, duns); tradestyle table is huge (about 50 million rows), managed_supplier is much smaller (under 100K) The query I am fighting with is something like: select * from tradestyle ts, managed_supplier ms where ts.duns=ms.duns and ts.name like 'POST%' and ms.subscriber=74 orderby ts.name limit 10; This takes close to 20 minutes (!) to run for the first time (it returns almost instanteneouly if I run it again, but willtake about 20 minutes again if I modify the name parameter). Explain analyze says: Limit (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.50 rows=10 loops=1) -> Sort (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.48 rows=11 loops=1) -> Nested Loop (cost=0.00..61.90 rows=1 width=192) (actual time=7589.68..439423.75 rows=110 loops=1) -> Index Scan using managed_supplier_idx on managed_supplier ms (cost=0.00..22.02 rows=5 width=157) (actualtime=6.72..3009.90 rows=14365 loops=1) -> Index Scan using tradestyle_duns_idx on tradestyle ts (cost=0.00..6.97 rows=1 width=35) (actual time=30.34..30.37rows=0 loops=14365) Total runtime: 439436.45 msec Well... this time it was just 7 minutes, but I've seen it beeing 20 (with a different param for the name) too, and, 7 minutesis still kinda a lot! I figure, what's killing me here, is having to read the large number of rows from disk for the nested loop... But here iswhat I get from the stats: (The total number of blocks read times block size over 439 seconds it took to run): select (sum(heap_blks_read) + sum(idx_blks_read))*8192/439 from pg_statio_user_tables where relname in ('tradestyle', 'managed_supplier'); -[ RECORD 1 ]--------------- ?column? | 574391.6902050114 500K per second???? This is slower than my DSL line!!! I do believe, it is really the physical read that takes most of the time (I sampled it by stopping in the debugger periodically - and every time it was sitting in __libc_read (), and also, if I run the same exact query again, it returns instanteneously), but I can't understand why is it taking THIS long???? It is also NOT a faulty drive problem - I have this same database installed on a few different boxes, and I have the sameproblem everywhere. If you guys have any idea what can possibly be going on here, PLEASE HELP!!!! Thanks a lot! Dima. P.S. Here is one more example, with a different name param, it chooses to use a different plan (that actually makes moresense to me, but whatever), and took just under 5 minutes to run (but, the "improvement" is just an illusion, because I ran this thing after the previousone - so, most of the stuff was cached): Limit (cost=0.00..16.14 rows=1 width=192) (actual time=6926.37..297527.99 rows=10 loops=1) -> Nested Loop (cost=0.00..16.14 rows=1 width=192) (actual time=6926.36..297527.94 rows=11 loops=1) -> Index Scan using tradestyle_name_idx on tradestyle ts (cost=0.00..7.98 rows=1 width=35) (actual time=51.99..295646.78rows=41020 loops=1) -> Index Scan using managed_supplier_idx on managed_supplier ms (cost=0.00..5.82 rows=1 width=157) (actual time=0.04..0.04rows=0 loops=41020) Total runtime: 297528.31 msec I'll greatly appreciate any ideas you could come up with.
pgsql-general by date: