Thread: Query performance PLEASE HELP
Hi, everybody!

I have spent the last few days trying to figure out why one (or rather some) of my queries take so long, but I am about to give up :-( Something seems to be definitely wrong here, but I am out of ideas :-( I was wondering if somebody could please take a look at this and, perhaps, give me some clue what is going on? I would greatly appreciate that...

There are two tables involved:

create table tradestyle
(
    duns int,
    type int,
    name text not null
);
create index tradestyle_idx on tradestyle (name);
create unique index tradestyle_duns_idx on tradestyle (duns, type);

create table managed_supplier
(
    id serial primary key,
    duns int not null,
    subscriber int not null
);
create unique index managed_supplier_idx on managed_supplier (subscriber, duns);

The tradestyle table is huge (about 50 million rows); managed_supplier is much smaller (under 100K).

The query I am fighting with is something like:

select * from tradestyle ts, managed_supplier ms
where ts.duns = ms.duns and ts.name like 'POST%' and ms.subscriber = 74
order by ts.name limit 10;

This takes close to 20 minutes (!) to run the first time (it returns almost instantaneously if I run it again, but will take about 20 minutes again if I modify the name parameter).

Explain analyze says:

Limit  (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.50 rows=10 loops=1)
  ->  Sort  (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.48 rows=11 loops=1)
        ->  Nested Loop  (cost=0.00..61.90 rows=1 width=192) (actual time=7589.68..439423.75 rows=110 loops=1)
              ->  Index Scan using managed_supplier_idx on managed_supplier ms  (cost=0.00..22.02 rows=5 width=157) (actual time=6.72..3009.90 rows=14365 loops=1)
              ->  Index Scan using tradestyle_duns_idx on tradestyle ts  (cost=0.00..6.97 rows=1 width=35) (actual time=30.34..30.37 rows=0 loops=14365)
Total runtime: 439436.45 msec

Well...
This time it was just 7 minutes, but I've seen it being 20 (with a different param for the name) too, and 7 minutes is still kind of a lot!

I figure what's killing me here is having to read a large number of rows from disk for the nested loop... But here is what I get from the stats (the total number of blocks read, times block size, over the 439 seconds it took to run):

select (sum(heap_blks_read) + sum(idx_blks_read)) * 8192 / 439
from pg_statio_user_tables
where relname in ('tradestyle', 'managed_supplier');
-[ RECORD 1 ]---------------
?column? | 574391.6902050114

500K per second???? This is slower than my DSL line!!!

I do believe it really is the physical read that takes most of the time (I sampled it by stopping in the debugger periodically - every time it was sitting in __libc_read(), and also, if I run the same exact query again, it returns instantaneously), but I can't understand why it is taking THIS long????

It is also NOT a faulty drive problem - I have this same database installed on a few different boxes, and I have the same problem everywhere.

If you guys have any idea what can possibly be going on here, PLEASE HELP!!!!

Thanks a lot!

Dima.

P.S.
Here is one more example; with a different name param it chooses a different plan (that actually makes more sense to me, but whatever), and it took just under 5 minutes to run (but the "improvement" is just an illusion, because I ran this thing after the previous one - so most of the stuff was cached):

Limit  (cost=0.00..16.14 rows=1 width=192) (actual time=6926.37..297527.99 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..16.14 rows=1 width=192) (actual time=6926.36..297527.94 rows=11 loops=1)
        ->  Index Scan using tradestyle_name_idx on tradestyle ts  (cost=0.00..7.98 rows=1 width=35) (actual time=51.99..295646.78 rows=41020 loops=1)
        ->  Index Scan using managed_supplier_idx on managed_supplier ms  (cost=0.00..5.82 rows=1 width=157) (actual time=0.04..0.04 rows=0 loops=41020)
Total runtime: 297528.31 msec

I'll greatly appreciate any ideas you could come up with.
Tom Lane wrote:

>Dmitry Tkach <dmitry@openratings.com> writes:
>
>>Explain analyze says:
>>
>>Limit  (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.50 rows=10 loops=1)
>>  ->  Sort  (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.48 rows=11 loops=1)
>>        ->  Nested Loop  (cost=0.00..61.90 rows=1 width=192) (actual time=7589.68..439423.75 rows=110 loops=1)
>>              ->  Index Scan using managed_supplier_idx on managed_supplier ms  (cost=0.00..22.02 rows=5 width=157) (actual time=6.72..3009.90 rows=14365 loops=1)
>>              ->  Index Scan using tradestyle_duns_idx on tradestyle ts  (cost=0.00..6.97 rows=1 width=35) (actual time=30.34..30.37 rows=0 loops=14365)
>>Total runtime: 439436.45 msec
>>
>
>Judging from the tiny cost estimates, the planner thinks these tables
>are tiny.  Have you done a VACUUM ANALYZE lately?
>

Well... Yes. I am doing that daily.

Actually, I was wondering about those estimates too, but that's not my primary concern right now - perhaps it should be, but, as far as I understand, the estimates only matter for query plan selection, and I don't have a problem with the query plan - it seems fairly decent to me (as I said in the 'PS', the other way around does make a little more sense to me, but it doesn't seem to help much either)...

What I am primarily concerned about is the ACTUAL execution time - 30 milliseconds per row in that tradestyle table seems really excessive, and 7 minutes of total time just doesn't make any sense at all...

Are you saying this is the planner's problem? What do you think the correct plan should look like then?

Thanks a lot!

Dima
Dmitry Tkach <dmitry@openratings.com> writes:
> Explain analyze says:

> Limit  (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.50 rows=10 loops=1)
>   ->  Sort  (cost=61.91..61.91 rows=1 width=192) (actual time=439435.47..439435.48 rows=11 loops=1)
>         ->  Nested Loop  (cost=0.00..61.90 rows=1 width=192) (actual time=7589.68..439423.75 rows=110 loops=1)
>               ->  Index Scan using managed_supplier_idx on managed_supplier ms  (cost=0.00..22.02 rows=5 width=157) (actual time=6.72..3009.90 rows=14365 loops=1)
>               ->  Index Scan using tradestyle_duns_idx on tradestyle ts  (cost=0.00..6.97 rows=1 width=35) (actual time=30.34..30.37 rows=0 loops=14365)
> Total runtime: 439436.45 msec

Judging from the tiny cost estimates, the planner thinks these tables
are tiny.  Have you done a VACUUM ANALYZE lately?

			regards, tom lane
Dmitry Tkach <dmitry@openratings.com> writes:
> Well... Yes. I am doing that daily. Actually, I was wondering about
> those estimates too, but that's not my primary concern right now -
> perhaps it should be, but, as far as I understand, the estimates only
> matter for query plan selection, and I don't have a problem with the
> query plan

You should.  If the query can be done any faster, it will be by picking
a different query plan.  I'm not sure what would be a better plan, but
certainly a large part of the problem is that the planner is so far off
about the rowcount estimates.

One thing I'm wondering is if the index on tradestyle.name could be
helpful.  How selective is "ts.name like 'POST%'", exactly --- does that
eliminate a lot of rows, or not?  Is the thing able to use that as an
indexqual (ie, are you in C locale)?

To tell you the truth, I do not believe your assertion that these tables
have been analyzed.  I don't see how the rowcount estimates could be so
small if the planner were aware of the true table statistics.  What does
pg_stats show for the columns used in the query?

			regards, tom lane
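One way to answer both questions, as a sketch (the count can itself be slow on a 50-million-row table, and `show lc_collate` is an assumption about the server version; on older servers, check the locale in the pg_controldata output instead):

```sql
-- How selective is the prefix match?  (This scan may itself take a while.)
select count(*) from tradestyle where name like 'POST%';

-- Is LIKE being turned into an index range scan?  In C locale the planner
-- can rewrite the prefix LIKE as roughly: name >= 'POST' and name < 'POSU'
explain select * from tradestyle where name like 'POST%';

-- Locale check (assumption: the server exposes this as a GUC)
show lc_collate;
```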
Dmitry Tkach <dmitry@openratings.com> writes:
> What does NEGATIVE n_distinct mean (for managed_supplier.duns)? :-)

Fractional estimate --- in particular, -1 means it thinks the column
is unique.  tradestyle.duns is not really unique by itself, is it?

> It lies about many things... For example, n_distinct for
> tradestyle.name = 385825 is about 100 times smaller than the actual
> number (which is a little over 30 million)

It might help to raise the statistics target for these columns and
re-ANALYZE.  Try 100 or so instead of the default 10.

However, even with these stats I'd think it would pick up on the
tradestyle.name index as a likely thing to use.  What was the
database's locale setting, again?

			regards, tom lane
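The mechanics of that, as a sketch (100 per Tom's suggestion; which columns to bump is a judgment call - these are the ones in the join and filter):

```sql
-- Raise the per-column statistics target from the default 10 to 100,
-- then re-gather statistics for both tables.
alter table tradestyle alter column name set statistics 100;
alter table tradestyle alter column duns set statistics 100;
alter table managed_supplier alter column duns set statistics 100;
alter table managed_supplier alter column subscriber set statistics 100;
analyze tradestyle;
analyze managed_supplier;
```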
Tom Lane wrote:

>Dmitry Tkach <dmitry@openratings.com> writes:
>
>>Well... Yes. I am doing that daily. Actually, I was wondering about
>>those estimates too, but that's not my primary concern right now -
>>perhaps it should be, but, as far as I understand, the estimates only
>>matter for query plan selection, and I don't have a problem with the
>>query plan
>>
>
>You should.  If the query can be done any faster, it will be by picking
>a different query plan.  I'm not sure what would be a better plan, but
>certainly a large part of the problem is that the planner is so far off
>about the rowcount estimates.
>

So, what would the alternative query plan be, to make it quicker? I tried forcing it to do the join the other way around (by doing set enable_sort=off; - this forces it to use tradestyle for the outer loop), but that doesn't make it any better... What other possibilities are there that would be likely to improve things?

>One thing I'm wondering is if the index on tradestyle.name could be
>helpful.  How selective is "ts.name like 'POST%'", exactly --- does that
>eliminate a lot of rows, or not?  Is the thing able to use that as an
>indexqual (ie, are you in C locale)?
>

Yes, there is an index on ts.name. I mentioned that in the 'PS' of the original message - if I use another parameter for the name criteria, it decides to use that index, but, as I said, it does not help very much. In that particular case I mentioned, the execution took about 5 minutes - still a lot, although a little better than the 7 I had for the first query, but I believe this "improvement" is because of caching, not really due to a different query plan.

>To tell you the truth, I do not believe your assertion that these tables
>have been analyzed.  I don't see how the rowcount estimates could be so
>small if the planner were aware of the true table statistics.
>

Well... It doesn't actually COUNT all the rows when you do the analyze, right?
Those are just estimates, based on subsampling... Business names aren't something perfectly distributed statistically, so those estimates being off doesn't surprise me much, really...

>What does pg_stats show for the columns used in the query?
>

select * from pg_stats
where tablename in ('tradestyle', 'managed_supplier')
  and attname in ('name', 'duns', 'subscriber');

-[ RECORD 1 ]-----+-----------------------------------------------------------
tablename         | tradestyle
attname           | duns
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {1145200,22833532,49988608,79446436,102189545,135526259,196661250,612015735,806455895,876539727,969917566}
correlation       | -0.18264
-[ RECORD 2 ]-----+-----------------------------------------------------------
tablename         | tradestyle
attname           | name
null_frac         | 0
avg_width         | 24
n_distinct        | 385825
most_common_vals  | {"UNITED STATES POSTAL SERVICE","G N C","H & R BLOCK","CARL'S JR","EDWARD D JONES & CO L P","FARMERS INSURANCE","UNITED STATES DEPARTMENT OF THE AIR FORCE","WOMENS INTERNATIONAL BOWL"}
most_common_freqs | {0.00233333,0.001,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}
histogram_bounds  | {".COM LLC","BEAU OLSON JOHN","COLONIAL LIFE & ACCIDENT INSURANCE COMPANY INC","EISENHOWER PARTNERSHIP","GROWTH & OPPORTUNITY INC","JUNCTION AUTOLAND","MC EXPORT","PANTS HANGER","SALISBURY NEWS","THE BACK STORE II LLC","ZR ENTERPRISES, INC"}
correlation       | 0.010456
-[ RECORD 3 ]-----+-----------------------------------------------------------
tablename         | managed_supplier
attname           | duns
null_frac         | 0
avg_width         | 4
n_distinct        | -0.688082
most_common_vals  | {1021435,1213214,1307974,2190528,2593051,3292620,17543877,27399237,41869355,49591852}
most_common_freqs | {0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}
histogram_bounds  | {1002229,6557045,22423490,50341502,73508350,112086095,168224033,252940564,627776784,847204914,2100000158}
correlation       | -0.151041
-[ RECORD 4 ]-----+-----------------------------------------------------------
tablename         | managed_supplier
attname           | subscriber
null_frac         | 0
avg_width         | 4
n_distinct        | 44
most_common_vals  | {74,81,20,111,67,45,66,108,75,68}
most_common_freqs | {0.181333,0.169667,0.138333,0.124,0.0396667,0.0386667,0.0376667,0.0336667,0.0333333,0.0303333}
histogram_bounds  | {3,63,72,78,98,107,110,112,118,124,1001}
correlation       | -0.912398

What does NEGATIVE n_distinct mean (for managed_supplier.duns)? :-)

It lies about many things... For example, n_distinct for tradestyle.name = 385825 is about 100 times smaller than the actual number (which is a little over 30 million).

Thanks!

Dima
Tom Lane wrote:

>Dmitry Tkach <dmitry@openratings.com> writes:
>
>>What does NEGATIVE n_distinct mean (for managed_supplier.duns)? :-)
>>
>
>Fractional estimate --- in particular, -1 means it thinks the column
>is unique.  tradestyle.duns is not really unique by itself, is it?
>

No. The (duns, type) combination is unique (type is 0 through 5).

>>It lies about many things... For example, n_distinct for
>>tradestyle.name = 385825 is about 100 times smaller than the actual
>>number (which is a little over 30 million)
>>
>
>It might help to raise the statistics target for these columns and
>re-ANALYZE.  Try 100 or so instead of the default 10.
>
>However, even with these stats I'd think it would pick up on the
>tradestyle.name index as a likely thing to use.  What was the
>database's locale setting, again?
>

The locale is C. And, once again, I am afraid we are heading in the wrong direction here - it DOES choose the name index sometimes (for some of the values for the name in the criteria), but it just doesn't seem to make any difference. Here is an example:

Limit  (cost=0.00..16.14 rows=1 width=192) (actual time=6926.37..297527.99 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..16.14 rows=1 width=192) (actual time=6926.36..297527.94 rows=11 loops=1)
        ->  Index Scan using tradestyle_name_idx on tradestyle ts  (cost=0.00..7.98 rows=1 width=35) (actual time=51.99..295646.78 rows=41020 loops=1)
        ->  Index Scan using managed_supplier_idx on managed_supplier ms  (cost=0.00..5.82 rows=1 width=157) (actual time=0.04..0.04 rows=0 loops=41020)
Total runtime: 297528.31 msec

Dima
Dmitry Tkach <dmitry@openratings.com> writes:
>> tradestyle.duns is not really unique by itself, is it?

> No. The (duns, type) combination is unique (type is 0 through 5).

Well, if there are at most six of any duns value, then it's close enough
to unique for planning purposes.  So that's not the problem.

> - it DOES choose the name index sometimes (for some of the values for
> the name in the criteria), but it just doesn't seem to make any
> difference. Here is an example:

The plan's not very intelligible when you don't show us the query ...

			regards, tom lane
Tom Lane wrote:

>Dmitry Tkach <dmitry@openratings.com> writes:
>
>>>tradestyle.duns is not really unique by itself, is it?
>>>
>
>>No. The (duns, type) combination is unique (type is 0 through 5).
>>
>
>Well, if there are at most six of any duns value, then it's close enough
>to unique for planning purposes.  So that's not the problem.
>
>>- it DOES choose the name index sometimes (for some of the values for
>>the name in the criteria), but it just doesn't seem to make any
>>difference. Here is an example:
>>
>
>The plan's not very intelligible when you don't show us the query ...
>
>			regards, tom lane
>

Sorry, it was the same query as before - it just had 'COMP%' instead of 'POST%':

rapidb# explain analyze
select * from tradestyle ts, managed_supplier ms
where ts.duns = ms.duns and ts.name like 'COMP%' and ms.subscriber = 74
order by ts.name limit 10;
NOTICE:  QUERY PLAN:

Limit  (cost=0.00..16.14 rows=1 width=192) (actual time=6926.37..297527.99 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..16.14 rows=1 width=192) (actual time=6926.36..297527.94 rows=11 loops=1)
        ->  Index Scan using tradestyle_name_idx on tradestyle ts  (cost=0.00..7.98 rows=1 width=35) (actual time=51.99..295646.78 rows=41020 loops=1)
        ->  Index Scan using managed_supplier_idx on managed_supplier ms  (cost=0.00..5.82 rows=1 width=157) (actual time=0.04..0.04 rows=0 loops=41020)
Total runtime: 297528.31 msec

Dima.
> Sorry, it was the same query as before - it just had 'COMP%' instead of
> 'POST%':
>
> rapidb# explain analyze select * from tradestyle ts, managed_supplier
> ms where ts.duns=ms.duns and ts.name like 'COMP%' and ms.subscriber=74
> order by ts.name limit 10;
> NOTICE:  QUERY PLAN:
>
> Limit  (cost=0.00..16.14 rows=1 width=192) (actual
> time=6926.37..297527.99 rows=10 loops=1)
>   ->  Nested Loop  (cost=0.00..16.14 rows=1 width=192) (actual
> time=6926.36..297527.94 rows=11 loops=1)
>         ->  Index Scan using tradestyle_name_idx on tradestyle ts
> (cost=0.00..7.98 rows=1 width=35) (actual time=51.99..295646.78
> rows=41020 loops=1)
>         ->  Index Scan using managed_supplier_idx on managed_supplier
> ms  (cost=0.00..5.82 rows=1 width=157) (actual time=0.04..0.04 rows=0
> loops=41020)
> Total runtime: 297528.31 msec

... actually, after some thinking, this plan seems to be WORSE than the other one - it makes about 41000 inner loops through managed_supplier, while there are only about 11000 entries in managed_supplier with subscriber=74, so if it did it the other way around (like in the first case), that would result in only 11K inner loops - 4 times fewer... (There is also the overhead of sorting, but it is negligible, because the intersection is only 110 rows.)

So, I just want to point it out again - the query plan does not seem to be a problem at all. Whichever one it chooses, the performance is much, much worse than I would expect - according to pg_statio_user_tables, it only reads less than 20000 blocks from disk for this query, which totals to about 5K per second... How come it is so slow???

Dima
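For what it's worth, the way I get those block counts is by diffing the counters around the query - roughly this sketch (counter names as in pg_statio_user_tables):

```sql
-- Snapshot the I/O counters, run the query, snapshot again; the difference,
-- times the 8192-byte block size, over the elapsed seconds gives bytes/sec.
select relname, heap_blks_read, idx_blks_read
from pg_statio_user_tables
where relname in ('tradestyle', 'managed_supplier');

-- ... run the problem query here ...

select relname, heap_blks_read, idx_blks_read
from pg_statio_user_tables
where relname in ('tradestyle', 'managed_supplier');
```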
Dmitry Tkach <dmitry@openratings.com> writes:
> Sorry, it was the same query as before - it just had 'COMP%' instead of
> 'POST%':

Oh, I see.  How many tradestyle rows actually match "name like 'COMP%'"
and "name like 'POST%'" ?  The planner seems to be expecting very few.

			regards, tom lane
Tom Lane wrote:

>Dmitry Tkach <dmitry@openratings.com> writes:
>
>>Sorry, it was the same query as before - it just had 'COMP%' instead of
>>'POST%':
>>
>
>Oh, I see.  How many tradestyle rows actually match "name like 'COMP%'"
>and "name like 'POST%'" ?  The planner seems to be expecting very few.
>
>			regards, tom lane
>

Did you receive my last message yet? I said something about it there:

There are about 41000 matches for COMP%, about 11000 for POST%, and there are about 14000 entries in managed_supplier with subscriber=74.

So the query plan performance is (and, I believe, should be) about the same regardless of which table is used for the outer loop. As far as I understand, it is (a little) better to use the tradestyle.name index for the POST% query, and it is (a little more) better to use managed_supplier as the outer table for the COMP% query... For some (mysterious) reason, the optimizer chose to do it in exactly the opposite way, but, as I already said a few times earlier, I don't believe it really matters: I have tried forcing it to use one plan or the other for the same query (by setting enable_sort to false and changing the order by clause if necessary) - it DOES NOT MAKE ANY DIFFERENCE (sometimes one is a little quicker than the others, but I have never seen it take less than 5 minutes, whatever query plan it uses) - there must be something else going on here that makes it crawl.

Dima
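The plan-forcing I mean is nothing fancy - roughly this sketch (session-local planner switches; the exact set of enable_* knobs to flip depends on which plan you want to discourage):

```sql
-- Discourage the sort-based plan, steering the planner toward scanning
-- tradestyle in index order for the outer loop.
set enable_sort = off;
explain analyze
select * from tradestyle ts, managed_supplier ms
where ts.duns = ms.duns and ts.name like 'POST%' and ms.subscriber = 74
order by ts.name limit 10;
set enable_sort = on;   -- restore the default afterwards
```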