Thread: BUG #6335: Weird planner decision with exists (a join b) condition
The following bug has been logged on the website:

Bug reference: 6335
Logged by: Maksym Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 9.0.4
Operating system: Linux Ubuntu
Description:

I was exploring the reasons for high DB load and localized the following problem query. This is the correct (fast) version:

EXPLAIN ANALYZE select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699)
and exists (
    SELECT * from resume
    join resume_view_history using (resume_id)
    where
    resume.user_id = aas.user_id
);

                                  QUERY PLAN
------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.00..20.51 rows=1 width=65) (actual time=0.031..0.032 rows=1 loops=1)
   ->  Index Scan using applicant_adv_subscription_user_id_key on applicant_adv_subscription aas  (cost=0.00..0.02 rows=1 width=65) (actual time=0.011..0.012 rows=1 loops=1)
         Index Cond: (user_id = 5112699)
   ->  Nested Loop  (cost=0.00..20.49 rows=3118 width=4) (actual time=0.018..0.018 rows=1 loops=1)
         ->  Index Scan using resume_user_id_key on resume  (cost=0.00..0.13 rows=18 width=8) (actual time=0.008..0.008 rows=1 loops=1)
               Index Cond: (user_id = 5112699)
         ->  Index Scan using resume_view_history_fk73b63ccd36b06a5 on resume_view_history  (cost=0.00..0.95 rows=173 width=4) (actual time=0.009..0.009 rows=1 loops=1)
               Index Cond: (resume_view_history.resume_id = resume.resume_id)
 Total runtime: 0.080 ms

But once I add a second value to the IN list, the plan becomes completely screwed:

EXPLAIN ANALYZE select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699,7995496)
and exists (
    SELECT * from resume
    join resume_view_history using (resume_id)
    where
    resume.user_id = aas.user_id
);

                                  QUERY PLAN
------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=38967.39..735076.65 rows=2 width=65) (actual time=14656.388..389866.211 rows=1 loops=1)
   Join Filter: (aas.user_id = resume.user_id)
   ->  Bitmap Heap Scan on applicant_adv_subscription aas  (cost=0.02..0.04 rows=2 width=65) (actual time=0.018..0.021 rows=2 loops=1)
         Recheck Cond: (user_id = ANY ('{5112699,7995496}'::integer[]))
         ->  Bitmap Index Scan on applicant_adv_subscription_user_id_key  (cost=0.00..0.02 rows=2 width=0) (actual time=0.014..0.014 rows=2 loops=1)
               Index Cond: (user_id = ANY ('{5112699,7995496}'::integer[]))
   ->  Hash Join  (cost=38967.36..726839.23 rows=272203680 width=4) (actual time=13267.456..182842.841 rows=136136926 loops=2)
         Hash Cond: (resume_view_history.resume_id = resume.resume_id)
         ->  Seq Scan on resume_view_history  (cost=0.00..282228.92 rows=272203680 width=4) (actual time=0.004..25574.666 rows=136161776 loops=2)
         ->  Hash  (cost=21737.05..21737.05 rows=16110150 width=8) (actual time=13260.145..13260.145 rows=16114222 loops=2)
               Buckets: 2097152  Batches: 2  Memory Usage: 314776kB
               ->  Seq Scan on resume  (cost=0.00..21737.05 rows=16110150 width=8) (actual time=0.005..8839.480 rows=16114222 loops=2)
 Total runtime: 389866.374 ms

Oops. Changes in *_cost settings have no effect. Disabling seq_scan/merge_join/hash_join does not help either. Nothing special about the tables; all required indexes are in place.
Excerpts from maxim.boguk's message of Wed Dec 14 08:09:38 -0300 2011:

> But once I add second value into IN list plan become completely screwed:

See here:
http://archives.postgresql.org/message-id/1309918036-sup-4092@alvh.no-ip.org

Perhaps it's a similar problem.

Maybe you'd get enthused enough to try to fix the problem?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Here goes a self-contained test case. I tested it on 9.1.2, 9.1.1, 9.0.5, 9.0.4 and 8.4.7; all of them are affected by the problem:

select version();

drop table if exists test1;
drop table if exists test2;
drop table if exists test3;

CREATE TABLE test1 AS SELECT user_id FROM generate_series(1,1000000) AS gs(user_id);
CREATE TABLE test2 AS SELECT user_id, user_id AS resume_id FROM generate_series(1,1000000) AS gs(user_id);
CREATE TABLE test3 AS SELECT user_id AS resume_id FROM generate_series(1,1000000) AS gs(user_id);

create index test1_user_id_key on test1(user_id);
create index test2_user_id_key on test2(user_id);
create index test2_resume_id_key on test2(resume_id);
create index test3_resume_id_key on test3(resume_id);

analyze test1;
analyze test2;
analyze test3;

--good
EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (100)
and exists (
    SELECT * from test2
    join test3 using (resume_id)
    where
    test2.user_id = test1.user_id
);

--bad
EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (100, 101)
and exists (
    SELECT * from test2
    join test3 using (resume_id)
    where
    test2.user_id = test1.user_id
);

On Thu, Dec 15, 2011 at 1:00 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote:

> Excerpts from maxim.boguk's message of Wed Dec 14 08:09:38 -0300 2011:
>
> > But once I add second value into IN list plan become completely screwed:
>
> See here:
> http://archives.postgresql.org/message-id/1309918036-sup-4092@alvh.no-ip.org
> Perhaps it's a similar problem.
>
> Maybe you'd get enthused enough to try to fix the problem?
>
> --
> Álvaro Herrera <alvherre@commandprompt.com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

MoiKrug: http://mboguk.moikrug.ru/
Strength breaks straw, but not everything in our life is straw, and strength is far from everything.
On Wed, Dec 14, 2011 at 4:53 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> Here goes self-contained test case.
>
> I tested it on the 9.1.2, 9.1.1, 9.0.5, 9.0.4, 8.4.7

I just tested on 9.1.2 and see the same issue.

> --bad
> EXPLAIN ANALYZE select *
> from test1
> where
> test1.user_id in (100, 101)
> and exists (
> SELECT * from test2
> join test3 using (resume_id)
> where
> test2.user_id = test1.user_id
> );

Setting enable_hashjoin to false pushes it back to a good plan again.
On Thu, Dec 15, 2011 at 12:00 PM, bricklen <bricklen@gmail.com> wrote:

> On Wed, Dec 14, 2011 at 4:53 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> > Here goes self-contained test case.
> >
> > I tested it on the 9.1.2, 9.1.1, 9.0.5, 9.0.4, 8.4.7
>
> I just tested on 9.1.2 and see the same issue.
>
> > --bad
> > EXPLAIN ANALYZE select *
> > from test1
> > where
> > test1.user_id in (100, 101)
> > and exists (
> > SELECT * from test2
> > join test3 using (resume_id)
> > where
> > test2.user_id = test1.user_id
> > );
>
> Setting enable_hashjoin to false pushes it back to a good plan again.

Could you show EXPLAIN ANALYZE of the good plan, please?

If you are getting a plan like:

                                  QUERY PLAN
------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=3.78..91844.51 rows=2 width=4) (actual time=0.634..0.750 rows=2 loops=1)
   Join Filter: (test1.user_id = test2.user_id)
   ->  Bitmap Heap Scan on test1  (cost=2.54..4.57 rows=2 width=4) (actual time=0.013..0.015 rows=2 loops=1)
         Recheck Cond: (user_id = ANY ('{100,101}'::integer[]))
         ->  Bitmap Index Scan on test1_user_id_key  (cost=0.00..2.54 rows=2 width=0) (actual time=0.009..0.009 rows=2 loops=1)
               Index Cond: (user_id = ANY ('{100,101}'::integer[]))
   ->  Materialize  (cost=1.23..64339.94 rows=1000000 width=4) (actual time=0.018..0.305 rows=100 loops=2)
         ->  Merge Join  (cost=1.23..59339.94 rows=1000000 width=4) (actual time=0.032..0.446 rows=101 loops=1)
               Merge Cond: (test2.resume_id = test3.resume_id)
               ->  Index Scan using test2_resume_id_key on test2  (cost=0.00..22170.28 rows=1000000 width=8) (actual time=0.009..0.076 rows=101 loops=1)
               ->  Index Scan using test3_resume_id_key on test3  (cost=0.00..22170.28 rows=1000000 width=4) (actual time=0.007..0.075 rows=101 loops=1)
 Total runtime: 0.785 ms

then try using high values for the user_id:

EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (90000, 900001)
and exists (
    SELECT * from test2
    join test3 using (resume_id)
    where
    test2.user_id = test1.user_id
);

                                  QUERY PLAN
------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=3.78..91844.51 rows=2 width=4) (actual time=432.266..4457.799 rows=2 loops=1)
   Join Filter: (test1.user_id = test2.user_id)
   ->  Bitmap Heap Scan on test1  (cost=2.54..4.57 rows=2 width=4) (actual time=0.057..0.063 rows=2 loops=1)
         Recheck Cond: (user_id = ANY ('{90000,900001}'::integer[]))
         ->  Bitmap Index Scan on test1_user_id_key  (cost=0.00..2.54 rows=2 width=0) (actual time=0.050..0.050 rows=2 loops=1)
               Index Cond: (user_id = ANY ('{90000,900001}'::integer[]))
   ->  Materialize  (cost=1.23..64339.94 rows=1000000 width=4) (actual time=0.011..1942.046 rows=495000 loops=2)
         ->  Merge Join  (cost=1.23..59339.94 rows=1000000 width=4) (actual time=0.018..2805.842 rows=900001 loops=1)
               Merge Cond: (test2.resume_id = test3.resume_id)
               ->  Index Scan using test2_resume_id_key on test2  (cost=0.00..22170.28 rows=1000000 width=8) (actual time=0.007..571.851 rows=900001 loops=1)
               ->  Index Scan using test3_resume_id_key on test3  (cost=0.00..22170.28 rows=1000000 width=4) (actual time=0.006..594.484 rows=900001 loops=1)
 Total runtime: 4467.887 ms

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?
MoiKrug: http://mboguk.moikrug.ru/
Strength breaks straw, but not everything in our life is straw, and strength is far from everything.
maxim.boguk@gmail.com writes:
> EXPLAIN analyze select *
> from applicant_adv_subscription aas
> where
> aas.user_id in (5112699,7995496)
> and exists (
> SELECT * from resume
> join resume_view_history using (resume_id)
> where
> resume.user_id = aas.user_id
> );

I'm hoping to fix this type of case with the "generalized inner indexscan" work that I've been nattering about for a year or two now. What you need to make this fast, given that resume and resume_view_history are both large, is to push the current value of aas.user_id down into the table scan of resume --- and because the join and semijoin can't be reordered, that's not possible with the planner's current simpleminded idea of what an inner indexscan can be.

The other example you show manages to luck out and get a good plan due to transitive propagation of equality conditions, but that's a narrow special case. Any other form of constraint whatsoever on aas is going to end up with the crummy plan where the whole lower join gets computed.

                        regards, tom lane
On Thu, Dec 15, 2011 at 7:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> maxim.boguk@gmail.com writes:
> > EXPLAIN analyze select *
> > from applicant_adv_subscription aas
> > where
> > aas.user_id in (5112699,7995496)
> > and exists (
> > SELECT * from resume
> > join resume_view_history using (resume_id)
> > where
> > resume.user_id = aas.user_id
> > );
>
> I'm hoping to fix this type of case with the "generalized inner
> indexscan" work that I've been nattering about for a year or two now.
> What you need to make this fast, given that resume and
> resume_view_history are both large, is to push the current value of
> aas.user_id down into the table scan of resume --- and because the join
> and semijoin can't be reordered, that's not possible with the planner's
> current simpleminded idea of what an inner indexscan can be.
>
> The other example you show manages to luck out and get a good plan due
> to transitive propagation of equality conditions, but that's a narrow
> special case. Any other form of constraint whatsoever on aas is going
> to end up with the crummy plan where the whole lower join gets computed.
>
> regards, tom lane

Thank you very much for the information. Rewriting the query did the trick and resolved the performance issues.

Do you plan to create the "generalized inner indexscan" mechanics for the 9.2 version?

--
Maxim Boguk
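[Editor's note: the thread never shows the actual rewrite Maxim used. One plausible form, following Tom's remark that the single-value case only wins via transitive propagation of equality conditions, is to repeat the IN list on the inner side of the EXISTS, so the planner can derive an index condition on test2 directly. The sketch below checks only that the two query forms return the same rows; it uses SQLite via Python purely as a convenient executable harness, since the planner behaviour under discussion is PostgreSQL-specific. Table and column names follow the thread's test case, scaled down.]

```python
import sqlite3

# Miniature version of the thread's test1/test2/test3 schema.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE test1(user_id INTEGER);
    CREATE TABLE test2(user_id INTEGER, resume_id INTEGER);
    CREATE TABLE test3(resume_id INTEGER);
""")
rows = [(i,) for i in range(1, 201)]
cur.executemany("INSERT INTO test1 VALUES (?)", rows)
cur.executemany("INSERT INTO test2 VALUES (?, ?)", [(i, i) for i in range(1, 201)])
cur.executemany("INSERT INTO test3 VALUES (?)", rows)

# The problem query from the thread, as posted.
original = """
    SELECT * FROM test1
    WHERE test1.user_id IN (100, 101)
      AND EXISTS (
        SELECT * FROM test2 JOIN test3 USING (resume_id)
        WHERE test2.user_id = test1.user_id)
    ORDER BY test1.user_id
"""

# Hypothetical rewrite: duplicating the IN list inside the subquery gives
# the planner a direct restriction on test2.user_id, the same effect the
# single-value case got for free from transitive equality.
rewritten = """
    SELECT * FROM test1
    WHERE test1.user_id IN (100, 101)
      AND EXISTS (
        SELECT * FROM test2 JOIN test3 USING (resume_id)
        WHERE test2.user_id = test1.user_id
          AND test2.user_id IN (100, 101))
    ORDER BY test1.user_id
"""

# Both forms must select exactly the same rows.
assert cur.execute(original).fetchall() == cur.execute(rewritten).fetchall()
print(cur.execute(rewritten).fetchall())  # [(100,), (101,)]
```

The duplication is redundant semantically, which is exactly why it is safe; whether it restores the fast plan on a given PostgreSQL version would need to be confirmed with EXPLAIN ANALYZE.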