Thread: Differences in identical queries
Question for anyone... I have two queries. One runs in about 2 seconds; the other takes upwards of 2 minutes. I have a temp table that is created with 2 columns. This table is joined with the larger database of call detail records. However, these 2 queries are handled very differently.

The queries:

First----

calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.destnum=anitmp.ani AND istf=true;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2026113.09 rows=500908 width=108)
   ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8)
         Filter: (istf = true)
   ->  Index Scan using i_destnum on current  (cost=0.00..2137.36 rows=531 width=108)
         Index Cond: (current.destnum = "outer".ani)
(5 rows)

Second----

calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.orignum=anitmp.ani AND istf=false;
                                QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=35.99..3402035.53 rows=5381529 width=108)
   Hash Cond: ("outer".orignum = "inner".ani)
   ->  Seq Scan on current  (cost=0.00..907191.05 rows=10170805 width=108)
   ->  Hash  (cost=33.62..33.62 rows=945 width=8)
         ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8)
               Filter: (istf = false)
(6 rows)

The tables:

             Table "public.current"
  Column  |            Type             | Modifiers
----------+-----------------------------+-----------
 datetime | timestamp without time zone |
 orignum  | bigint                      |
 destnum  | bigint                      |
 billto   | bigint                      |
 cost     | numeric(6,4)                |
 duration | numeric(8,1)                |
 origcity | character(12)               |
 destcity | character(12)               |
 file     | character varying(30)       |
 linenum  | integer                     |
 carrier  | character(1)                |
Indexes:
    "i_destnum" btree (destnum)
    "i_orignum" btree (orignum)

 Table "public.anitmp"
 Column |  Type   | Modifiers
--------+---------+-----------
 ani    | bigint  |
 istf   | boolean |

Anyone have any ideas for me? I have indexes on each of the necessary columns.

Rob
Rob Schall wrote:
> Question for anyone...
>
> I have two queries. One runs in about 2 seconds. The other takes upwards
> of 2 minutes. I have a temp table that is created with 2 columns. This
> table is joined with the larger database of call detail records.
> However, these 2 queries are handled very differently.

1. They're different queries - the second is expecting 10 times as many rows as the first.

2. Can't tell if that is accurate - you need to supply EXPLAIN ANALYSE output instead of EXPLAIN, so we can see what actually happened.

> calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
> current.destnum=anitmp.ani AND istf=true;
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..2026113.09 rows=500908 width=108)

> calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
> current.orignum=anitmp.ani AND istf=false;
> ---------------------------------------------------------------------------
>  Hash Join  (cost=35.99..3402035.53 rows=5381529 width=108)

--
  Richard Huxton
  Archonet Ltd
When I reanalyzed the anitmp table with just the 4 entries (2 with istf=true and 2 with istf=false), both queries then ran the same way/time.

So it would appear that if you want to do a join or a subselect (IN), the number of items it will be comparing against must be less than a certain number. In my case, the full count of false entries is actually 40. So if you have a table of 2 million items (in current) and want to join it to a table of 40 items, then it must do the longer hash join version and not the nested loop.

Below are the results with the anitmp table with 42 items: 40 set to false, and 2 set to true. Is there a way to rewrite my query to have it run as quickly as the others?

Thanks

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp ON
current.orignum=anitmp.ani AND istf=false;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1.63..1031833.26 rows=256563 width=108) (actual time=1889.469..155380.749 rows=653 loops=1)
   Hash Cond: ("outer".orignum = "inner".ani)
   ->  Seq Scan on current  (cost=0.00..920527.00 rows=10873900 width=108) (actual time=670.402..136192.991 rows=10681150 loops=1)
   ->  Hash  (cost=1.52..1.52 rows=41 width=8) (actual time=0.187..0.187 rows=0 loops=1)
         ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=41 width=8) (actual time=0.014..0.108 rows=40 loops=1)
               Filter: (istf = false)
 Total runtime: 155381.960 ms
(7 rows)

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp ON
current.destnum=anitmp.ani AND istf=true;
                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..5718.45 rows=1413 width=108) (actual time=76.116..1545.856 rows=1382 loops=1)
   ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=3 width=8) (actual time=0.025..0.042 rows=2 loops=1)
         Filter: (istf = true)
   ->  Index Scan using i_destnum on current  (cost=0.00..1899.74 rows=472 width=108) (actual time=60.133..769.442 rows=691 loops=2)
         Index Cond: (current.destnum = "outer".ani)
 Total runtime: 1548.321 ms
(6 rows)
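[Editor's note: one quick diagnostic here, before rewriting the query, is to check whether the plan choice itself is the problem by steering the planner away from sequential scans for the session and re-running the slow query. `enable_seqscan` is a standard PostgreSQL planner setting; the query below simply mirrors the one in the thread. This is a diagnostic only, not a production fix.]

```sql
-- Diagnostic only: discourage seq scans for this session so the planner
-- will prefer the index-driven nested loop if one is possible.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT current.*
FROM current
JOIN anitmp ON current.orignum = anitmp.ani AND istf = false;

-- Restore the default before doing anything else in this session.
RESET enable_seqscan;
```

If the forced plan is dramatically faster (as it turned out to be here), the problem is a misestimate feeding the planner, not the query shape.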
Rob Schall wrote:
> When I reanalyzed the anitmp table with just the 4 entries (2 with
> istf=true and 2 with istf=false), both queries then ran the same way/time.
> So it would appear that if you want to do a join or a subselect (IN), the
> number of items it will be comparing against must be less than a
> certain number.

I'm still not sure how you're concluding that.

> In my case, the full count of false entries is
> actually 40. So if you have a table of 2 million items (in current) and
> want to join it to a table of 40 items, then it must do the longer
> hash join version and not the nested loop.
>
> Below are the results with the anitmp table with 42 items: 40 set to
> false, and 2 set to true. Is there a way to rewrite my query to have it
> run as quickly as the others?

> calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp
> ON current.orignum=anitmp.ani AND istf=false;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=1.63..1031833.26 rows=256563 width=108) (actual
> time=1889.469..155380.749 rows=653 loops=1)

OK - so here's the root of the problem. The planner thinks it'll get back 256,563 rows but actually gets 653. If you actually got more than 200,000 rows back then a seq-scan on current might well make sense.

>    Hash Cond: ("outer".orignum = "inner".ani)
>    ->  Seq Scan on current  (cost=0.00..920527.00 rows=10873900
> width=108) (actual time=670.402..136192.991 rows=10681150 loops=1)
>    ->  Hash  (cost=1.52..1.52 rows=41 width=8) (actual time=0.187..0.187
> rows=0 loops=1)
>          ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=41 width=8)
> (actual time=0.014..0.108 rows=40 loops=1)
>                Filter: (istf = false)

Hmm - what sort of distribution of values do you have in "orignum"? Telephone numbers, so presumably they're quite distinct.

I'd be tempted to up the statistics on that column, reanalyse both tables and see what happens.
  ALTER TABLE current ALTER COLUMN orignum SET STATISTICS 100;

You can set values up to 1000; start at 100 and step up. Not checked the syntax on that last statement btw.

You can see the before and after effects by looking at:

  SELECT * FROM pg_stats WHERE tablename='current' AND attname='orignum';

--
  Richard Huxton
  Archonet Ltd
Rob Schall wrote:
> I noticed the rows bit. I wasn't quite sure why it thought it would get
> so many back, even after doing an analyze. I will alter the table stats
> and see if that helps.
>
> As for the values in orignum, they are all 10 digit numbers
> (nxx,npa,num). I probably could have stored them as a char, etc, but I
> didn't think that would help the cause at all. They are all over the
> board as far as what the numbers are set to, but long story short, they
> are call detail numbers. So if they are our customer, then one of the
> numbers I tossed in anitmp (one of the customer's numbers) could have
> hundreds of entries. Not sure if this is what you were asking me.

Well, it must be thinking it'll get a lot of matches for your 40 rows, so increasing the stats might well help.

--
  Richard Huxton
  Archonet Ltd
The final answer wound up being....

ALTER TABLE current ALTER COLUMN orignum SET STATISTICS 100;

and then an

ANALYZE current;

Then the queries started running like a champ (split seconds, not minutes).

Thanks Richard!

Rob

Richard Huxton wrote:
> Rob Schall wrote:
>> When I reanalyzed the anitmp table with just the 4 entries (2 with
>> istf=true and 2 with istf=false), both queries then ran the same
>> way/time.
>> So it would appear that if you want to do a join or a subselect (IN),
>> the number of items it will be comparing against must be less than a
>> certain number.
>
> I'm still not sure how you're concluding that.
>
>> In my case, the full count of false entries is actually 40. So if you
>> have a table of 2 million items (in current) and want to join it to a
>> table of 40 items, then it must do the longer hash join version and
>> not the nested loop.
>>
>> Below are the results with the anitmp table with 42 items: 40 set to
>> false, and 2 set to true. Is there a way to rewrite my query to have it
>> run as quickly as the others?
>
>> calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp
>> ON current.orignum=anitmp.ani AND istf=false;
>> QUERY PLAN
>> -----------------------------------------------------------------------------------------------------------------------------------
>>
>>  Hash Join  (cost=1.63..1031833.26 rows=256563 width=108) (actual
>> time=1889.469..155380.749 rows=653 loops=1)
>
> OK - so here's the root of the problem. The planner thinks it'll get
> back 256,563 rows but actually gets 653. If you actually got more than
> 200,000 rows back then a seq-scan on current might well make sense.
>
>>    Hash Cond: ("outer".orignum = "inner".ani)
>>    ->  Seq Scan on current  (cost=0.00..920527.00 rows=10873900
>> width=108) (actual time=670.402..136192.991 rows=10681150 loops=1)
>>    ->  Hash  (cost=1.52..1.52 rows=41 width=8) (actual time=0.187..0.187
>> rows=0 loops=1)
>>          ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=41 width=8)
>> (actual time=0.014..0.108 rows=40 loops=1)
>>                Filter: (istf = false)
>
> Hmm - what sort of distribution of values do you have in "orignum"?
> Telephone numbers, so presumably they're quite distinct.
>
> I'd be tempted to up the statistics on that column, reanalyse both
> tables and see what happens.
>   ALTER TABLE current ALTER COLUMN orignum SET STATISTICS 100;
> You can set values up to 1000; start at 100 and step up. Not checked
> the syntax on that last statement btw.
>
> You can see the before and after effects by looking at:
>   SELECT * FROM pg_stats WHERE tablename='current' AND attname='orignum';
>
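[Editor's note: for readers following along, the complete fix can be applied and verified in one session. Table and column names are the ones from the thread; the statistics target of 100 is the value Rob used. `pg_stats` columns `n_distinct` and `null_frac` are standard PostgreSQL; checking them before and after is Richard's suggested verification step.]

```sql
-- Raise the per-column statistics target so ANALYZE samples more
-- distinct orignum values, giving the planner better selectivity
-- estimates for the join against anitmp.
ALTER TABLE current ALTER COLUMN orignum SET STATISTICS 100;
ANALYZE current;

-- Inspect the refreshed statistics the planner will now use.
SELECT n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'current' AND attname = 'orignum';
```

With a more accurate `n_distinct`, the planner's row estimate for the join drops toward the actual 653 rows, and it switches to the fast nested-loop/index-scan plan.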