Re: b1 OR b2 <-> ( CASE WHEN b1 THE true ELSE b2 END ): performance bottleneck on logical OR - Mailing list pgsql-general
From | Cédric Dufour |
---|---|
Subject | Re: b1 OR b2 <-> ( CASE WHEN b1 THE true ELSE b2 END ): performance bottleneck on logical OR |
Date | |
Msg-id | NDBBIFNBODNADCAOFDOAEEJMCDAA.cedric.dufour@freesurf.ch Whole thread Raw |
In response to | Re: b1 OR b2 <-> ( CASE WHEN b1 THE true ELSE b2 END ): performance bottleneck on logical OR (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-general |
OK, TOM, SORRY FOR THE **INCORRECT** CONCLUSIONS ! I tried to reproduce the problem with a simpler scenario than the one that actually made me investigate the matter (the 'explain analyze' output being more than 500 lines long...)... but the problem didn't re-occur. So I tried to understand where the execution plan changed and why (comparing 2000 lines of 'explain analyze' output from my initial scenario). Here is what I got: 1. this delay (I won't call it a problem anylonger) occurs only when NESTLOOP and MERGEJOIN have been disabled **AND** 2. the FROM clause from one of the base view is changed **FROM** FROM tb_Login AS Login_default, vw_Session AS Session INNER JOIN tb_Owner AS Owner ON ( CASE WHEN ( Session.Admin_flag AND ( ( Session.FK_Owner = 0 ) OR ( Session.FK_Owner = Owner.PK ) ) ) THEN true ELSE ( ( ( Session.FK_Owner_screen = Owner.PK ) OR ( ( Session.FK_Owner_screen = 0 ) AND Owner.Share_flag ) ) AND ( Owner.Enable_flag AND ( ( Owner.Enable_date IS NULL ) OR ( Owner.Enable_date <= CURRENT_TIMESTAMP ) ) AND ( ( Owner.Disable_date IS NULL ) OR ( Owner.Disable_date > CURRENT_TIMESTAMP ) ) ) ) END ) LEFT JOIN vw_Owner_prv AS Owner_prv ON ( Owner.PK = Owner_prv.FK_Owner ) LEFT JOIN vw_Owner_loc AS Owner_loc ON ( Owner.PK = Owner_loc.FK_Owner ) WHERE ( Login_default.PK = 0 ) ** TO ** FROM tb_Login AS Login_default, vw_Session AS Session INNER JOIN tb_Owner AS Owner ** ON ( ( Session.Admin_flag AND ( ( Session.FK_Owner = 0 ) OR ( Session.FK_Owner = Owner.PK ) ) ) OR ** ( ** ( ( Session.FK_Owner_screen = Owner.PK ) OR ( ( Session.FK_Owner_screen = 0 ) AND Owner.Share_flag ) ) ** AND ** ( Owner.Enable_flag AND ( ( Owner.Enable_date IS NULL ) OR ( Owner.Enable_date <= CURRENT_TIMESTAMP ) ) AND ( ( Owner.Disable_date IS NULL ) OR ( Owner.Disable_date > CURRENT_TIMESTAMP ) ) ) ** ) ) LEFT JOIN vw_Owner_prv AS Owner_prv ON ( Owner.PK = Owner_prv.FK_Owner ) LEFT JOIN vw_Owner_loc AS Owner_loc ON ( Owner.PK = Owner_loc.FK_Owner ) WHERE ( Login_default.PK = 0 ) Causing the execution plan to been altered **FROM** Limit (cost=800023564.12..800155912.10 rows=100 width=7915) (actual time=3039.00..3148.00 rows=100 loops=1) -> Hash Join (cost=800023564.12..803266109.18 rows=2450 width=7915) (actual time=3039.00..3148.00 rows=101 loops=1) -> Hash Join (cost=700023184.11..703265708.65 rows=2450 width=7879) (actual time=3031.00..3125.00 rows=101 loops=1) -> Hash Join (cost=600022804.96..603265316.19 rows=2450 width=7843) (actual time=3022.00..3097.00 rows=101 loops=1) -> Hash Join (cost=500022424.95..503264643.05 rows=35000 width=7286) (actual time=3013.00..3071.00 rows=101 loops=1) -> Hash Join (cost=400022045.81..403258826.37 rows=1000006 width=6729) (actual time=2997.00..3033.00 rows=101 loops=1) -> Hash Join (cost=400021945.70..403253725.96 rows=1000006 width=6720) (actual time=2995.00..3023.00 rows=101 loops=1) -> Hash Join (cost=400021845.59..403248625.76 rows=1000006 width=4890) (actual time=2993.00..3014.00 rows=101 loops=1) -> Seq Scan on tb_item item (cost=0.00..34708.06 rows=1000006 width=2727) (actual time=921.00..928.00 rows=526 loops=1) -> Hash (cost=400019143.59..400019143.59 rows=10001 width=2163) (actual time=2068.00..2068.00 rows=0 loops=1) -> Hash Join (cost=400000925.60..400019143.59 rows=10001 width=2163) (actual time=909.00..1996.00 rows=10001 loops=1) ********** -> Hash Join (cost=200000562.91..200016130.63 rows=10001 width=1940) (actual time=892.00..1641.00 rows=10001 loops=1) -> Hash Join (cost=200000462.80..200015980.14 rows=10001 width=1931) (actual time=890.00..1433.00 rows=10001 loops=1) -> Hash Join (cost=200000362.69..200015829.97 rows=10001 width=1133) (actual time=888.00..1210.00 rows=10001 loops=1) -> Seq Scan on tb_folder folder (cost=0.00..12817.01 rows=10001 width=1062) (actual time=872.00..957.00 rows=10001 loops=1) -> Hash (cost=200000360.19..200000360.19 rows=1000 width=71) (actual time=16.00..16.00 rows=0 loops=1) -> Nested Loop (cost=200000250.23..200000360.19 rows=1000 width=71) (actual time=15.00..16.00 rows=101 loops=1) -> Index Scan using pk_login on tb_login login_default (cost=0.00..4.82 rows=1 width=8) (actual time=0.00..0.00 rows=1 loops=1) -> Materialize (cost=100000345.37..100000345.37 rows=1000 width=63) (actual time=15.00..15.00 rows=101 loops=1) and so on... ** TO ** Limit (cost=1024749076.40..1024788787.84 rows=1 width=7915) (actual time=18387.00..18490.00 rows=100 loops=1) -> Hash Join (cost=1024749076.40..1024788787.84 rows=1 width=7915) (actual time=18386.00..18489.00 rows=101 loops=1) -> Hash Join (cost=924748696.39..924788407.83 rows=1 width=7879) (actual time=18377.00..18458.00 rows=101 loops=1) -> Hash Join (cost=824748317.24..824788028.67 rows=1 width=7843) (actual time=18369.00..18421.00 rows=101 loops=1) -> Hash Join (cost=724747937.23..724787648.65 rows=1 width=7286) (actual time=18361.00..18397.00 rows=101 loops=1) -> Hash Join (cost=624747558.08..624787269.30 rows=38 width=6729) (actual time=18353.00..18382.00 rows=101 loops=1) -> Hash Join (cost=624747457.98..624787169.01 rows=38 width=4899) (actual time=18350.00..18372.00 rows=101 loops=1) -> Hash Join (cost=624747357.87..624787068.71 rows=38 width=4890) (actual time=18348.00..18365.00 rows=101 loops=1) -> Seq Scan on tb_item item (cost=0.00..34708.06 rows=1000006 width=2727) (actual time=912.00..915.00 rows=101 loops=1) -> Hash (cost=624747357.87..624747357.87 rows=1 width=2163) (actual time=17436.00..17436.00 rows=0 loops=1) -> Hash Join (cost=624734464.94..624747357.87 rows=1 width=2163) (actual time=909.00..17359.00 rows=10001 loops=1) *********** -> Nested Loop (cost=424734074.85..424746967.77 rows=1 width=1940) (actual time=892.00..16318.00 rows=10001 loops=1) -> Hash Join (cost=324733999.82..324746867.61 rows=1 width=1142) (actual time=889.00..2391.00 rows=10001 loops=1) -> Hash Join (cost=324733899.71..324746767.50 rows=1 width=1133) (actual time=887.00..1859.00 rows=10001 loops=1) -> Seq Scan on tb_folder folder (cost=0.00..12817.01 rows=10001 width=1062) (actual time=871.00..1209.00 rows=10001 loops=1) -> Hash (cost=324733899.71..324733899.71 rows=1 width=71) (actual time=16.00..16.00 rows=0 loops=1) -> Nested Loop (cost=324733759.85..324733899.71 rows=1 width=71) (actual time=16.00..16.00 rows=101 loops=1) -> Index Scan using pk_login on tb_login login_default (cost=0.00..4.82 rows=1 width=8) (actual time=0.00..0.00 rows=1 loops=1) -> Materialize (cost=224733894.88..224733894.88 rows=1 width=63) (actual time=16.00..16.00 rows=101 loops=1) and so on... So, to put it bluntly, this is just the result of my messing up with the planner and some fancy querying ! It appears that in the ( CASE WHEN ... THEN true ELSE ... END ) case, the planner uses a 'hash' join that is achieved much quicker than the related 'nest loop' it chooses in the ( ... OR ... ) case. My mis-understanding... Regards, Cedric Dufour > -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Saturday, August 03, 2002 1:41 > To: Cédric Dufour > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] b1 OR b2 <-> ( CASE WHEN b1 THE true ELSE b2 END > ): performance bottleneck on logical OR > > > =?iso-8859-1?Q?C=E9dric_Dufour?= <cedric.dufour@freesurf.ch> writes: > > Regarding the trigger problem, it is exactly as I have > described it in the > > first place: > > > IF ( ( TG_OP = 'INSERT' ) OR ( ( TG_OP = 'UPDATE' ) AND ( old.X != > > new.X ) ) ) THEN > > -- Do actions depending on field X when inserted or when > **changed** (thus > > avoiding useless action if field X didn't change) > > END IF; > > > --> Error on **insert**: 'record old is unassigned yet'. Am I > wrong assuming > > that even though the ( TG_OP = 'INSERT' ) is true and ( TG_OP = > 'UPDATE' ) > > is false, ( old.X != new.X ) seems to be evaluated ? ( which causes the > > error ) > > It wouldn't get evaluated if TG_OP = 'INSERT' ... but plpgsql has to > insert all the parameters of the IF expression before it passes the IF > expression off to the main executor. So you're bombing out at the > parameter-interpretation stage. I think you'll have to divide this into > two plpgsql IF statements. > > > FROM > > owner > > INNER JOIN > > folder > > ON ( folder.PK = folder.FK_owner ) > > Surely that join condition is wrong. > > > WHERE > > owner.admin_bool OR > > ( > > folder.enabled_bool > > AND ( ( folder.enable_date IS NULL ) OR ( > folder.enable_date <= > > CURRENT_TIMESTAMP ) ) > > AND ( ( folder.Disable_date IS NULL ) OR ( > folder.disable_date > > > CURRENT_TIMESTAMP ) ) > > item.enabled_bool > > AND ( ( item.enable_date IS NULL ) OR ( item.enable_date <= > > CURRENT_TIMESTAMP ) ) > > AND ( ( item.disable_date IS NULL ) OR ( item.disable_date > > > CURRENT_TIMESTAMP ) ) > > ) > > I'm having a hard time making sense of this, since both your examples > contain the same typo --- I imagine there's an AND or OR before > item.enabled_bool, but it's hard to guess which. > > However, I suspect the issue is that the planner tries to flatten the > above WHERE into conjunctive normal form, which is normally a good > optimization strategy but perhaps doesn't work real well on this case. > Still that could only affect the boolean-expression evaluation time, > and it's hard to believe that that's a large fraction of the total > join time. > > What does EXPLAIN ANALYZE say about the plans for these queries? > And could we see them in typo-free form? > > regards, tom lane >
pgsql-general by date: