Thread: Planner incorrectly choosing seq scan over index scan
Hi all,
We're using 8.0.3 and we're seeing a problem where the planner is choosing a seq scan and hash join over an index scan. If I set enable_hashjoin to off, then I get the plan I'm expecting and the query runs a lot faster. I've also tried lowering the random page cost (even to 1) but the planner still chooses to use the hash join.
Does anyone have any thoughts/suggestions? I saw that there was a thread recently in which the planner wasn't correctly estimating the cost for queries using LIMIT. Is it possible that something similar is happening here (perhaps because of the sort) and that the patch Tom proposed would fix it?
Thanks. Here are the various queries and plans:
Normal settings
------------------------
explain analyze
select
c.sourceId,
c.targetId,
abs(c.tr - c.sr) as xmy,
(c.sr - s.ar) * (c.tr - t.ar) as xy,
(c.sr - s.ar) * (c.sr - s.ar) as x2,
(c.tr - t.ar) * (c.tr - t.ar) as y2
from
candidates617004 c,
lte_user s,
lte_user t
where
c.sourceId = s.user_id
and c.targetId = t.user_id
order by
c.sourceId,
c.targetId;
QUERY PLAN
Sort (cost=13430.57..13439.24 rows=3467 width=48) (actual time=1390.000..1390.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Merge Join (cost=9912.07..13226.72 rows=3467 width=48) (actual time=1344.000..1375.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..95.000 rows=50034 loops=1)
-> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual time=1156.000..1156.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Hash Join (cost=8710.44..9708.21 rows=3467 width=40) (actual time=1125.000..1156.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 rows=3467 loops=1)
-> Hash (cost=8011.95..8011.95 rows=279395 width=16) (actual time=1125.000..1125.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..670.000 rows=279395 loops=1)
Total runtime: 1406.000 ms
enable_hashjoin disabled
----------------------------------------
QUERY PLAN
Sort (cost=14355.37..14364.03 rows=3467 width=48) (actual time=391.000..391.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=271.52..14151.51 rows=3467 width=48) (actual time=203.000..359.000 rows=3467 loops=1)
-> Merge Join (cost=271.52..3490.83 rows=3467 width=40) (actual time=203.000..218.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..126.000 rows=50034 loops=1)
-> Sort (cost=271.52..280.19 rows=3467 width=32) (actual time=15.000..30.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.03 rows=1 width=16) (actual time=0.031..0.036 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 406.000 ms
random_page_cost set to 1.5
----------------------------------------------
QUERY PLAN
Sort (cost=12702.62..12711.29 rows=3467 width=48) (actual time=1407.000..1407.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Merge Join (cost=9912.07..12498.77 rows=3467 width=48) (actual time=1391.000..1407.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..46.000 rows=50034 loops=1)
-> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual time=1188.000..1188.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Hash Join (cost=8710.44..9708.21 rows=3467 width=40) (actual time=1157.000..1188.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
-> Hash (cost=8011.95..8011.95 rows=279395 width=16) (actual time=1157.000..1157.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..750.000 rows=279395 loops=1)
Total runtime: 1422.000 ms
random_page_cost set to 1.5 and enable_hashjoin set to false
--------------------------------------------------------------------------------------------------
QUERY PLAN
Sort (cost=13565.58..13574.25 rows=3467 width=48) (actual time=390.000..390.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=271.52..13361.73 rows=3467 width=48) (actual time=203.000..360.000 rows=3467 loops=1)
-> Merge Join (cost=271.52..2762.88 rows=3467 width=40) (actual time=203.000..250.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..48.000 rows=50034 loops=1)
-> Sort (cost=271.52..280.19 rows=3467 width=32) (actual time=15.000..31.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.02 rows=1 width=16) (actual time=0.023..0.023 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 406.000 ms
Thanks,
Meetesh
We're using 8.0.3 and we're seeing a problem where the planner is choosing a seq scan and hash join over an index scan. If I set enable_hashjoin to off, then I get the plan I'm expecting and the query runs a lot faster. I've also tried lowering the random page cost (even to 1) but the planner still chooses to use the hash join.
Does anyone have any thoughts/suggestions? I saw that there was a thread recently in which the planner wasn't correctly estimating the cost for queries using LIMIT. Is it possible that something similar is happening here (perhaps because of the sort) and that the patch Tom proposed would fix it?
Thanks. Here are the various queries and plans:
Normal settings
------------------------
explain analyze
select
c.sourceId,
c.targetId,
abs(c.tr - c.sr) as xmy,
(c.sr - s.ar) * (c.tr - t.ar) as xy,
(c.sr - s.ar) * (c.sr - s.ar) as x2,
(c.tr - t.ar) * (c.tr - t.ar) as y2
from
candidates617004 c,
lte_user s,
lte_user t
where
c.sourceId = s.user_id
and c.targetId = t.user_id
order by
c.sourceId,
c.targetId;
QUERY PLAN
Sort (cost=13430.57..13439.24 rows=3467 width=48) (actual time=1390.000..1390.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Merge Join (cost=9912.07..13226.72 rows=3467 width=48) (actual time=1344.000..1375.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..95.000 rows=50034 loops=1)
-> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual time=1156.000..1156.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Hash Join (cost=8710.44..9708.21 rows=3467 width=40) (actual time=1125.000..1156.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 rows=3467 loops=1)
-> Hash (cost=8011.95..8011.95 rows=279395 width=16) (actual time=1125.000..1125.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..670.000 rows=279395 loops=1)
Total runtime: 1406.000 ms
enable_hashjoin disabled
----------------------------------------
QUERY PLAN
Sort (cost=14355.37..14364.03 rows=3467 width=48) (actual time=391.000..391.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=271.52..14151.51 rows=3467 width=48) (actual time=203.000..359.000 rows=3467 loops=1)
-> Merge Join (cost=271.52..3490.83 rows=3467 width=40) (actual time=203.000..218.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..126.000 rows=50034 loops=1)
-> Sort (cost=271.52..280.19 rows=3467 width=32) (actual time=15.000..30.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.03 rows=1 width=16) (actual time=0.031..0.036 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 406.000 ms
random_page_cost set to 1.5
----------------------------------------------
QUERY PLAN
Sort (cost=12702.62..12711.29 rows=3467 width=48) (actual time=1407.000..1407.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Merge Join (cost=9912.07..12498.77 rows=3467 width=48) (actual time=1391.000..1407.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..46.000 rows=50034 loops=1)
-> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual time=1188.000..1188.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Hash Join (cost=8710.44..9708.21 rows=3467 width=40) (actual time=1157.000..1188.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
-> Hash (cost=8011.95..8011.95 rows=279395 width=16) (actual time=1157.000..1157.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..750.000 rows=279395 loops=1)
Total runtime: 1422.000 ms
random_page_cost set to 1.5 and enable_hashjoin set to false
--------------------------------------------------------------------------------------------------
QUERY PLAN
Sort (cost=13565.58..13574.25 rows=3467 width=48) (actual time=390.000..390.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=271.52..13361.73 rows=3467 width=48) (actual time=203.000..360.000 rows=3467 loops=1)
-> Merge Join (cost=271.52..2762.88 rows=3467 width=40) (actual time=203.000..250.000 rows=3467 loops=1)
Merge Cond: ("outer".user_id = "inner".sourceid)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..48.000 rows=50034 loops=1)
-> Sort (cost=271.52..280.19 rows=3467 width=32) (actual time=15.000..31.000 rows=3467 loops=1)
Sort Key: c.sourceid
-> Seq Scan on candidates617004 c (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.02 rows=1 width=16) (actual time=0.023..0.023 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 406.000 ms
Thanks,
Meetesh
[Meetesh Karia - Tue at 12:19:27AM +0200] > We're using 8.0.3 and we're seeing a problem where the planner is choosing a > seq scan and hash join over an index scan. If I set enable_hashjoin to off, > then I get the plan I'm expecting and the query runs a lot faster. I've also > tried lowering the random page cost (even to 1) but the planner still > chooses to use the hash join. Have you tried increasing the statistics collection? -- Tobias Brox, +47-91700050 Nordicbet, IT dept
Meetesh Karia wrote: > Hi all, > > We're using 8.0.3 and we're seeing a problem where the planner is > choosing a seq scan and hash join over an index scan. If I set > enable_hashjoin to off, then I get the plan I'm expecting and the query > runs a lot faster. I've also tried lowering the random page cost (even > to 1) but the planner still chooses to use the hash join. > > Does anyone have any thoughts/suggestions? I saw that there was a > thread recently in which the planner wasn't correctly estimating the > cost for queries using LIMIT. Is it possible that something similar is > happening here (perhaps because of the sort) and that the patch Tom > proposed would fix it? > > Thanks. Here are the various queries and plans: > > Normal settings ... > QUERY PLAN > Sort (cost=13430.57..13439.24 rows=3467 width=48) (actual > time=1390.000..1390.000 rows=3467 loops=1) > Sort Key: c.sourceid, c.targetid > -> Merge Join (cost=9912.07..13226.72 rows=3467 width=48) (actual > time=1344.000..1375.000 rows=3467 loops=1) > Merge Cond: ("outer".user_id = "inner".sourceid) > -> Index Scan using lte_user_pkey on lte_user s > (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..95.000 > rows=50034 loops=1) This is where the planner is messing up, and mis-estimating the selectivity. It is expecting to get 280k rows, but only needs to get 50k. I assume lte_user is the bigger table, and that candidates617004 has some subset. Has lte_user and candidates617004 been recently ANALYZEd? All estimates, except for the expected number of rows from lte_user seem to be okay. Is user_id the primary key for lte_user? I'm trying to figure out how you can get 50k rows, by searching a primary key, against a 3.5k rows. Is user_id only part of the primary key for lte_user? Can you give us the output of: \d lte_user \d candidates617004 So that we have the description of the tables, and what indexes you have defined? Also, if you could describe the table layouts, that would help. John =:-> > -> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual > time=1156.000..1156.000 rows=3467 loops=1) > Sort Key: c.sourceid > -> Hash Join (cost=8710.44..9708.21 rows=3467 width=40) > (actual time=1125.000..1156.000 rows=3467 loops=1) > Hash Cond: ("outer".targetid = "inner".user_id) > -> Seq Scan on candidates617004 c > (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 > rows=3467 loops=1) > -> Hash (cost=8011.95..8011.95 rows=279395 > width=16) (actual time=1125.000..1125.000 rows=0 loops=1) > -> Seq Scan on lte_user t > (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..670.000 > rows=279395 loops=1) > Total runtime: 1406.000 ms > > enable_hashjoin disabled > ---------------------------------------- > QUERY PLAN > Sort (cost=14355.37..14364.03 rows=3467 width=48) (actual > time=391.000..391.000 rows=3467 loops=1) > Sort Key: c.sourceid, c.targetid > -> Nested Loop (cost=271.52..14151.51 rows=3467 width=48) (actual > time=203.000..359.000 rows=3467 loops=1) > -> Merge Join (cost=271.52..3490.83 rows=3467 width=40) > (actual time=203.000..218.000 rows=3467 loops=1) > Merge Cond: ("outer".user_id = "inner".sourceid) > -> Index Scan using lte_user_pkey on lte_user s > (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..126.000 > rows=50034 loops=1) > -> Sort (cost=271.52..280.19 rows=3467 width=32) (actual > time=15.000..30.000 rows=3467 loops=1) > Sort Key: c.sourceid > -> Seq Scan on candidates617004 c > (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 > rows=3467 loops=1) > -> Index Scan using lte_user_pkey on lte_user t > (cost=0.00..3.03 rows=1 width=16) (actual time=0.031..0.036 rows=1 > loops=3467) > Index Cond: ("outer".targetid = t.user_id) > Total runtime: 406.000 ms > > random_page_cost set to 1.5 > ---------------------------------------------- > QUERY PLAN > Sort (cost=12702.62..12711.29 rows=3467 width=48) (actual > time=1407.000..1407.000 rows=3467 loops=1) > Sort Key: c.sourceid, c.targetid > -> Merge Join (cost=9912.07..12498.77 rows=3467 width=48) (actual > time=1391.000..1407.000 rows=3467 loops=1) > Merge Cond: ("outer".user_id = "inner".sourceid) > -> Index Scan using lte_user_pkey on lte_user s > (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..46.000 > rows=50034 loops=1) > -> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual > time=1188.000..1188.000 rows=3467 loops=1) > Sort Key: c.sourceid > -> Hash Join (cost=8710.44..9708.21 rows=3467 width=40) > (actual time=1157.000..1188.000 rows=3467 loops=1) > Hash Cond: ("outer".targetid = "inner".user_id) > -> Seq Scan on candidates617004 c > (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 > rows=3467 loops=1) > -> Hash (cost=8011.95..8011.95 rows=279395 > width=16) (actual time=1157.000..1157.000 rows=0 loops=1) > -> Seq Scan on lte_user t > (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..750.000 > rows=279395 loops=1) > Total runtime: 1422.000 ms > > random_page_cost set to 1.5 and enable_hashjoin set to false > -------------------------------------------------------------------------------------------------- > QUERY PLAN > Sort (cost=13565.58..13574.25 rows=3467 width=48) (actual > time=390.000..390.000 rows=3467 loops=1) > Sort Key: c.sourceid, c.targetid > -> Nested Loop (cost=271.52..13361.73 rows=3467 width=48) (actual > time=203.000..360.000 rows=3467 loops=1) > -> Merge Join (cost=271.52..2762.88 rows=3467 width=40) > (actual time=203.000..250.000 rows=3467 loops=1) > Merge Cond: ("outer".user_id = "inner".sourceid) > -> Index Scan using lte_user_pkey on lte_user s > (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..48.000 > rows=50034 loops=1) > -> Sort (cost=271.52..280.19 rows=3467 width=32) (actual > time=15.000..31.000 rows=3467 loops=1) > Sort Key: c.sourceid > -> Seq Scan on candidates617004 c > (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 > rows=3467 loops=1) > -> Index Scan using lte_user_pkey on lte_user t > (cost=0.00..3.02 rows=1 width=16) (actual time=0.023..0.023 rows=1 > loops=3467) > Index Cond: ("outer".targetid = t.user_id) > Total runtime: 406.000 ms > > Thanks, > Meetesh
Attachment
Are you referring to the statistics gathering target for ANALYZE? Based on your email, I just tried the following and then re-ran the explain analyze but got the same "incorrect" plan:
alter table candidates617004
alter column sourceId set statistics 1000,
alter column targetId set statistics 1000;
analyze candidates617004;
alter table lte_user
alter column user_id set statistics 1000;
analyze lte_user;
Thanks for your suggestion,
Meetesh
alter table candidates617004
alter column sourceId set statistics 1000,
alter column targetId set statistics 1000;
analyze candidates617004;
alter table lte_user
alter column user_id set statistics 1000;
analyze lte_user;
Thanks for your suggestion,
Meetesh
On 8/2/05, Tobias Brox <tobias@nordicbet.com> wrote:
[Meetesh Karia - Tue at 12:19:27AM +0200]
> We're using 8.0.3 and we're seeing a problem where the planner is choosing a
> seq scan and hash join over an index scan. If I set enable_hashjoin to off,
> then I get the plan I'm expecting and the query runs a lot faster. I've also
> tried lowering the random page cost (even to 1) but the planner still
> chooses to use the hash join.
Have you tried increasing the statistics collection?
--
Tobias Brox, +47-91700050
Nordicbet, IT dept
Thanks John. I've answered your questions below:
I ANALYZEd both tables just before putting together my first email. And, unfortunately, modifying the statistics target didn't help either.
Yes
Hmmm ... I missed that before. But, that surprises me too. Especially since sourceId in the candidates table has only 1 value. Also, user_id is the complete primary key for lte_user.
Sure, here they are:
lte=# \d lte_user
Table "public.lte_user"
Column | Type | Modifiers
---------------+-----------------------------+-----------
user_id | bigint | not null
firstname | character varying(255) |
lastname | character varying(255) |
address1 | character varying(255) |
address2 | character varying(255) |
city | character varying(255) |
state | character varying(255) |
zip | character varying(255) |
phone1 | character varying(255) |
phone2 | character varying(255) |
username | character varying(255) |
password | character varying(255) |
deleted | boolean | not null
ext_cust_id | character varying(255) |
aboutme | character varying(255) |
birthday | timestamp without time zone |
fm_id | bigint |
ar | double precision |
Indexes:
"lte_user_pkey" PRIMARY KEY, btree (user_id)
"idx_user_extid" btree (ext_cust_id)
"idx_user_username" btree (username)
Foreign-key constraints:
"fk_user_fm" FOREIGN KEY (fm_id) REFERENCES fm(fm_id)
lte=# \d candidates617004
Table "public.candidates617004"
Column | Type | Modifiers
--------------+------------------+-----------
fmid | bigint |
sourceid | bigint |
sr | double precision |
targetid | bigint |
tr | double precision |
Sure. The lte_user table is just a collection of users. user_id is assigned uniquely using a sequence. During some processing, we create a candidates table (candidates617004 in our case). This table is usually a temp table. sourceid is a user_id (in this case it is always 617004) and targetid is also a user_id (2860 distinct values out of 3467). The rest of the information is either only used in the select clause or not used at all during this processing.
Did I miss something in the table layout description that would be helpful?
Thanks for your help!
Meetesh
Has lte_user and candidates617004 been recently ANALYZEd? All estimates,
except for the expected number of rows from lte_user seem to be okay.
I ANALYZEd both tables just before putting together my first email. And, unfortunately, modifying the statistics target didn't help either.
Is user_id the primary key for lte_user?
Yes
I'm trying to figure out how you can get 50k rows, by searching a
primary key, against a 3.5k rows. Is user_id only part of the primary
key for lte_user?
Hmmm ... I missed that before. But, that surprises me too. Especially since sourceId in the candidates table has only 1 value. Also, user_id is the complete primary key for lte_user.
Can you give us the output of:
\d lte_user
\d candidates617004
Sure, here they are:
lte=# \d lte_user
Table "public.lte_user"
Column | Type | Modifiers
---------------+-----------------------------+-----------
user_id | bigint | not null
firstname | character varying(255) |
lastname | character varying(255) |
address1 | character varying(255) |
address2 | character varying(255) |
city | character varying(255) |
state | character varying(255) |
zip | character varying(255) |
phone1 | character varying(255) |
phone2 | character varying(255) |
username | character varying(255) |
password | character varying(255) |
deleted | boolean | not null
ext_cust_id | character varying(255) |
aboutme | character varying(255) |
birthday | timestamp without time zone |
fm_id | bigint |
ar | double precision |
Indexes:
"lte_user_pkey" PRIMARY KEY, btree (user_id)
"idx_user_extid" btree (ext_cust_id)
"idx_user_username" btree (username)
Foreign-key constraints:
"fk_user_fm" FOREIGN KEY (fm_id) REFERENCES fm(fm_id)
lte=# \d candidates617004
Table "public.candidates617004"
Column | Type | Modifiers
--------------+------------------+-----------
fmid | bigint |
sourceid | bigint |
sr | double precision |
targetid | bigint |
tr | double precision |
Also, if you could describe the table layouts, that would help.
Sure. The lte_user table is just a collection of users. user_id is assigned uniquely using a sequence. During some processing, we create a candidates table (candidates617004 in our case). This table is usually a temp table. sourceid is a user_id (in this case it is always 617004) and targetid is also a user_id (2860 distinct values out of 3467). The rest of the information is either only used in the select clause or not used at all during this processing.
Did I miss something in the table layout description that would be helpful?
Thanks for your help!
Meetesh
> -> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual
> time=1156.000..1156.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Hash Join (cost=8710.44..9708.21 rows=3467 width=40)
> (actual time=1125.000..1156.000 rows=3467 loops=1)
> Hash Cond: ("outer".targetid = "inner".user_id)
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000
> rows=3467 loops=1)
> -> Hash (cost=8011.95..8011.95 rows=279395
> width=16) (actual time=1125.000..1125.000 rows=0 loops=1)
> -> Seq Scan on lte_user t
> (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..670.000
> rows=279395 loops=1)
> Total runtime: 1406.000 ms
>
> enable_hashjoin disabled
> ----------------------------------------
> QUERY PLAN
> Sort (cost=14355.37..14364.03 rows=3467 width=48) (actual
> time=391.000..391.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Nested Loop (cost=271.52..14151.51 rows=3467 width=48) (actual
> time=203.000..359.000 rows=3467 loops=1)
> -> Merge Join (cost=271.52..3490.83 rows=3467 width=40)
> (actual time=203.000..218.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..126.000
> rows=50034 loops=1)
> -> Sort (cost=271.52..280.19 rows=3467 width=32) (actual
> time=15.000..30.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000
> rows=3467 loops=1)
> -> Index Scan using lte_user_pkey on lte_user t
> (cost=0.00..3.03 rows=1 width=16) (actual time=0.031..0.036 rows=1
> loops=3467)
> Index Cond: ("outer".targetid = t.user_id)
> Total runtime: 406.000 ms
>
> random_page_cost set to 1.5
> ----------------------------------------------
> QUERY PLAN
> Sort (cost= 12702.62..12711.29 rows=3467 width=48) (actual
> time=1407.000..1407.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Merge Join (cost=9912.07..12498.77 rows=3467 width=48) (actual
> time=1391.000..1407.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..46.000
> rows=50034 loops=1)
> -> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual
> time=1188.000..1188.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Hash Join (cost=8710.44..9708.21 rows=3467 width=40)
> (actual time=1157.000..1188.000 rows=3467 loops=1)
> Hash Cond: ("outer".targetid = "inner".user_id)
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000
> rows=3467 loops=1)
> -> Hash (cost=8011.95..8011.95 rows=279395
> width=16) (actual time=1157.000..1157.000 rows=0 loops=1)
> -> Seq Scan on lte_user t
> (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..750.000
> rows=279395 loops=1)
> Total runtime: 1422.000 ms
>
> random_page_cost set to 1.5 and enable_hashjoin set to false
> --------------------------------------------------------------------------------------------------
> QUERY PLAN
> Sort (cost=13565.58..13574.25 rows=3467 width=48) (actual
> time=390.000..390.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Nested Loop (cost=271.52..13361.73 rows=3467 width=48) (actual
> time=203.000..360.000 rows=3467 loops=1)
> -> Merge Join (cost=271.52..2762.88 rows=3467 width=40)
> (actual time=203.000..250.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..48.000
> rows=50034 loops=1)
> -> Sort (cost=271.52..280.19 rows=3467 width=32) (actual
> time=15.000..31.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000
> rows=3467 loops=1)
> -> Index Scan using lte_user_pkey on lte_user t
> (cost= 0.00..3.02 rows=1 width=16) (actual time=0.023..0.023 rows=1
> loops=3467)
> Index Cond: ("outer".targetid = t.user_id)
> Total runtime: 406.000 ms
>
> Thanks,
> Meetesh
Meetesh Karia <meetesh.karia@gmail.com> writes: > Sure. The lte_user table is just a collection of users. user_id is assigned= > uniquely using a sequence. During some processing, we create a candidates= > table (candidates617004 in our case). This table is usually a temp table.= > sourceid is a user_id (in this case it is always 617004) and targetid is=20 > also a user_id (2860 distinct values out of 3467). The rest of the=20 > information is either only used in the select clause or not used at all=20 > during this processing. If you know that sourceid has only a single value, it'd probably be helpful to call out that value in the query, ie, where ... AND c.sourceId = 617004 ... regards, tom lane
Thanks Tom,
That modifies the query plan slightly, but the planner still decides to do a hash join for the lte_user table aliased 't'. Though, if I make this change and set enable_hashjoin to off, the query plan (and execution time) gets even better.
enable_hashjoin = on
----------------------------------
QUERY PLAN
Sort (cost=10113.35..10122.02 rows=3467 width=48) (actual time=1203.000..1203.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=8711.19..9909.50 rows=3467 width=48) (actual time=1156.000..1203.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..3.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (617004 = user_id)
-> Hash Join (cost=8711.19..9776.46 rows=3467 width=40) (actual time=1156.000..1187.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..76.34 rows=3467 width=32) (actual time=0.000..16.000 rows=3467 loops=1)
Filter: (sourceid = 617004)
-> Hash (cost=8012.55..8012.55 rows=279455 width=16) (actual time=1141.000..1141.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8012.55 rows=279455 width=16) (actual time=0.000..720.000 rows=279395 loops=1)
Total runtime: 1218.000 ms
enable_hashjoin = off
-----------------------------------
QUERY PLAN
Sort (cost=10942.56..10951.22 rows=3467 width=48) (actual time=188.000..188.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=0.00..10738.71 rows=3467 width=48) (actual time=0.000..188.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..3.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (617004 = user_id)
-> Nested Loop (cost=0.00..10605.67 rows=3467 width=40) (actual time=0.000..157.000 rows=3467 loops=1)
-> Seq Scan on candidates617004 c (cost=0.00..76.34 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
Filter: (sourceid = 617004)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.02 rows=1 width=16) (actual time=0.028..0.037 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 188.000 ms
Thanks,
Meetesh
That modifies the query plan slightly, but the planner still decides to do a hash join for the lte_user table aliased 't'. Though, if I make this change and set enable_hashjoin to off, the query plan (and execution time) gets even better.
enable_hashjoin = on
----------------------------------
QUERY PLAN
Sort (cost=10113.35..10122.02 rows=3467 width=48) (actual time=1203.000..1203.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=8711.19..9909.50 rows=3467 width=48) (actual time=1156.000..1203.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..3.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (617004 = user_id)
-> Hash Join (cost=8711.19..9776.46 rows=3467 width=40) (actual time=1156.000..1187.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..76.34 rows=3467 width=32) (actual time=0.000..16.000 rows=3467 loops=1)
Filter: (sourceid = 617004)
-> Hash (cost=8012.55..8012.55 rows=279455 width=16) (actual time=1141.000..1141.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8012.55 rows=279455 width=16) (actual time=0.000..720.000 rows=279395 loops=1)
Total runtime: 1218.000 ms
enable_hashjoin = off
-----------------------------------
QUERY PLAN
Sort (cost=10942.56..10951.22 rows=3467 width=48) (actual time=188.000..188.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=0.00..10738.71 rows=3467 width=48) (actual time=0.000..188.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..3.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (617004 = user_id)
-> Nested Loop (cost=0.00..10605.67 rows=3467 width=40) (actual time=0.000..157.000 rows=3467 loops=1)
-> Seq Scan on candidates617004 c (cost=0.00..76.34 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
Filter: (sourceid = 617004)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.02 rows=1 width=16) (actual time=0.028..0.037 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 188.000 ms
Thanks,
Meetesh
On 8/2/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Meetesh Karia <meetesh.karia@gmail.com> writes:
> Sure. The lte_user table is just a collection of users. user_id is assigned=
> uniquely using a sequence. During some processing, we create a candidates=
> table (candidates617004 in our case). This table is usually a temp table.=
> sourceid is a user_id (in this case it is always 617004) and targetid is=20
> also a user_id (2860 distinct values out of 3467). The rest of the=20
> information is either only used in the select clause or not used at all=20
> during this processing.
If you know that sourceid has only a single value, it'd probably be
helpful to call out that value in the query, ie,
where ... AND c.sourceId = 617004 ...
regards, tom lane
Btw - I tried playing around with some of the other planner cost constants but I wasn't able to get the planner to choose the index scan. It seems like the issue is that the estimated cost for fetching one row from the index (3.02) is a little high in my case. Is there any way that I can adjust that cost estimate? Are there any side effects of doing that? Or is my best solution to simple set enable_hashjoin to off for this query?
Thanks,
Meetesh
Thanks,
Meetesh
On 8/2/05, Meetesh Karia <meetesh.karia@gmail.com> wrote:
Thanks Tom,
That modifies the query plan slightly, but the planner still decides to do a hash join for the lte_user table aliased 't'. Though, if I make this change and set enable_hashjoin to off, the query plan (and execution time) gets even better.
enable_hashjoin = on
----------------------------------
QUERY PLAN
Sort (cost=10113.35..10122.02 rows=3467 width=48) (actual time=1203.000..1203.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=8711.19..9909.50 rows=3467 width=48) (actual time=1156.000..1203.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..3.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (617004 = user_id)
-> Hash Join (cost=8711.19..9776.46 rows=3467 width=40) (actual time=1156.000..1187.000 rows=3467 loops=1)
Hash Cond: ("outer".targetid = "inner".user_id)
-> Seq Scan on candidates617004 c (cost=0.00..76.34 rows=3467 width=32) (actual time=0.000..16.000 rows=3467 loops=1)
Filter: (sourceid = 617004)
-> Hash (cost=8012.55..8012.55 rows=279455 width=16) (actual time=1141.000..1141.000 rows=0 loops=1)
-> Seq Scan on lte_user t (cost=0.00..8012.55 rows=279455 width=16) (actual time=0.000..720.000 rows=279395 loops=1)
Total runtime: 1218.000 ms
enable_hashjoin = off
-----------------------------------
QUERY PLAN
Sort (cost=10942.56..10951.22 rows=3467 width=48) (actual time=188.000..188.000 rows=3467 loops=1)
Sort Key: c.sourceid, c.targetid
-> Nested Loop (cost=0.00..10738.71 rows=3467 width=48) (actual time=0.000..188.000 rows=3467 loops=1)
-> Index Scan using lte_user_pkey on lte_user s (cost=0.00..3.02 rows=1 width=16) (actual time=0.000..0.000 rows=1 loops=1)
Index Cond: (617004 = user_id)
-> Nested Loop (cost=0.00..10605.67 rows=3467 width=40) (actual time=0.000..157.000 rows=3467 loops=1)
-> Seq Scan on candidates617004 c (cost=0.00..76.34 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
Filter: (sourceid = 617004)
-> Index Scan using lte_user_pkey on lte_user t (cost=0.00..3.02 rows=1 width=16) (actual time=0.028..0.037 rows=1 loops=3467)
Index Cond: ("outer".targetid = t.user_id)
Total runtime: 188.000 ms
Thanks,
MeeteshOn 8/2/05, Tom Lane < tgl@sss.pgh.pa.us> wrote:Meetesh Karia <meetesh.karia@gmail.com> writes:
> Sure. The lte_user table is just a collection of users. user_id is assigned=
> uniquely using a sequence. During some processing, we create a candidates=
> table (candidates617004 in our case). This table is usually a temp table.=
> sourceid is a user_id (in this case it is always 617004) and targetid is=20
> also a user_id (2860 distinct values out of 3467). The rest of the=20
> information is either only used in the select clause or not used at all=20
> during this processing.
If you know that sourceid has only a single value, it'd probably be
helpful to call out that value in the query, ie,
where ... AND c.sourceId = 617004 ...
regards, tom lane