Thread: how to control the execution plan ?
Hi there, I try to execute the following statement: SELECT * FROM ( SELECT "MY_FUNCTION_A"(bp."COL_A", NULL::boolean) AS ALIAS_A FROM "TABLE_A" bp JOIN "TABLE_B" pn ON bp."COL_B"= pn."PK_ID" JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID" WHERE pn."Editor"::text ~~ 'Some%'::text ANDbp."COL_A" IS NOT NULL AND bp."COL_A"::text <> ''::text ) x WHERE (x.ALIAS_A::text ) IS NULL; The problem is the excution plan first make Seq Scan on "TABLE_A", with Filter: (("COL_A" IS NOT NULL) AND (("COL_A")::text <> ''::text) AND (("MY_FUNCTION_A"("COL_A", NULL::boolean))::text IS NULL))". This way, MY_FUNCTION_A crashes for some unsupported data provided by "COL_A". I'd like to get an execution plan which is filtering first the desired rows, and just after that compute te column value "MY_FUNCTION_A"(bp."COL_A", NULL::boolean). I made different combinations, including a subquery like: SELECT * FROM ( SELECT "MY_FUNCTION_A"(y."COL_A", NULL::boolean) AS ALIAS_A FROM ( SELECT bp."COL_A" FROM "TABLE_A"bp JOIN "TABLE_B" pn ON bp."COL_B" = pn."PK_ID" JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID" WHERE pn."COL_E"::text ~~ 'Some%'::text AND bp."COL_A" IS NOT NULL AND bp."COL_A"::text <> ''::text ) y ) x WHERE (x.ALIAS_A::text ) IS NULL; but postgres analyze is too 'smart' and optimize it as in the previous case, with the same Seq Scan on "TABLE_A", and with the same filter. I thought to change the function MY_FUNCTION_A, to support any argument data, but the even that another performance problem will be rised when the function will be computed for any row in join, even those that can be removed by other filter. Do you have a solution please ? TIA Sabin
On Mon, Jul 7, 2008 at 3:14 AM, Sabin Coanda <sabin.coanda@deuromedia.ro> wrote: > Hi there, > > I try to execute the following statement: > > SELECT * > FROM ( > SELECT "MY_FUNCTION_A"(bp."COL_A", NULL::boolean) AS ALIAS_A > FROM "TABLE_A" bp > JOIN "TABLE_B" pn ON bp."COL_B" = pn."PK_ID" > JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID" > WHERE pn."Editor"::text ~~ 'Some%'::text AND bp."COL_A" IS NOT NULL AND > bp."COL_A"::text <> ''::text > ) x > WHERE (x.ALIAS_A::text ) IS NULL; > > The problem is the excution plan first make Seq Scan on "TABLE_A", with > Filter: (("COL_A" IS NOT NULL) AND (("COL_A")::text <> ''::text) AND > (("MY_FUNCTION_A"("COL_A", NULL::boolean))::text IS NULL))". This way, > MY_FUNCTION_A crashes for some unsupported data provided by "COL_A". > > I'd like to get an execution plan which is filtering first the desired rows, > and just after that compute te column value "MY_FUNCTION_A"(bp."COL_A", > NULL::boolean). Just wondering what the query plans look like here, both regular explain, and if you can wait for it to execute, explain analyze. I'm guessing that the function is not indexed / indexable. Is it marked immutable (and is it actually immutable) or stable (and is stable)? If it's immutable then you can create an index on it and that should speed things up. > > I made different combinations, including a subquery like: > > SELECT * > FROM ( > SELECT "MY_FUNCTION_A"(y."COL_A", NULL::boolean) AS ALIAS_A > FROM ( > SELECT bp."COL_A" > FROM "TABLE_A" bp > JOIN "TABLE_B" pn ON bp."COL_B" = pn."PK_ID" > JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID" > WHERE pn."COL_E"::text ~~ 'Some%'::text AND bp."COL_A" IS NOT NULL > AND bp."COL_A"::text <> ''::text > ) y > ) x > WHERE (x.ALIAS_A::text ) IS NULL; > > but postgres analyze is too 'smart' and optimize it as in the previous case, > with the same Seq Scan on "TABLE_A", and with the same filter. > > I thought to change the function MY_FUNCTION_A, to support any argument > data, but the even that another performance problem will be rised when the > function will be computed for any row in join, even those that can be > removed by other filter. > > Do you have a solution please ? If it's still to smart, you can run two queries, one to pull the set you want to work with from the custom function into a temp table, then analyze it, then run the query against that. Not an optimal solution, but it might be the fastest if you can't index your function.
Hi Scott, I add the answers below. > > Just wondering what the query plans look like here, both regular > explain, and if you can wait for it to execute, explain analyze. > Just with explain, because the function craches when it is running: "Merge Join (cost=141.41..188.32 rows=1 width=24)" " Merge Cond: ("TABLE_C"."PK_ID" = bp."COL_C")" " -> Merge Left Join (cost=62.33..96.69 rows=1000 width=44)" " Merge Cond: (s."PK_ID" = "MY_FUNCTION_B".COL_D)" " -> Index Scan using "TABLE_D_pkey" on "TABLE_D" s (cost=0.00..18.49 rows=349 width=4)" " -> Sort (cost=62.33..64.83 rows=1000 width=44)" " Sort Key: "MY_FUNCTION_B".COL_D" " -> Function Scan on "MY_FUNCTION_B" (cost=0.00..12.50 rows=1000 width=44)" " -> Sort (cost=79.08..79.09 rows=1 width=28)" " Sort Key: bp."COL_C"" " -> Hash Join (cost=10.59..79.07 rows=1 width=28)" " Hash Cond: (bp."COL_B" = pn."PK_ID")" " -> Seq Scan on "TABLE_A" bp (cost=0.00..68.46 rows=4 width=32)" " Filter: (("COL_A" IS NOT NULL) AND (("COL_A")::text <> ''::text) AND ((("MY_FUNCTION_A"("COL_A", NULL::boolean))::text || ' '::text) IS NULL))" " -> Hash (cost=10.50..10.50 rows=7 width=4)" " -> Seq Scan on "TABLE_B" pn (cost=0.00..10.50 rows=7 width=4)" " Filter: (("COL_E")::text ~~ 'Some%'::text)" > I'm guessing that the function is not indexed / indexable. Is it > marked immutable (and is it actually immutable) or stable (and is > stable)? > The function is marked stable. > If it's still to smart, you can run two queries, one to pull the set > you want to work with from the custom function into a temp table, then > analyze it, then run the query against that. > Not an optimal solution, but it might be the fastest if you can't > index your function. > In fact I would use that statement to define a permanent view, not in a procedure. Finally I found a trick specifying not just WHERE (x.ALIAS_A::text ) IS NULL;, but combining with a constant and a join with a constant. By the way, it doesn't works just with the constant or with the join :( See the query and the plan below: EXPLAIN SELECT * FROM ( SELECT "MY_FUNCTION_A"(bp."COL_A", NULL::boolean) AS ALIAS_A, MY_AUX FROM "TABLE_A" bp CROSSJOIN ( SELECT '*'::character varying AS MY_AUX ) afp JOIN "TABLE_B" pn ON bp."COL_B"= pn."PK_ID" JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID" WHERE pn."COL_E"::text ~~ 'Some%'::textAND bp."COL_A" IS NOT NULL AND bp."COL_A"::text <> ''::text ) x WHERE (x.ALIAS_A::text || ' ' || MY_AUX) IS NULL; "Merge Join (cost=131.68..178.60 rows=1 width=56)" " Merge Cond: ("TABLE_C"."PK_ID" = bp."COL_C")" " -> Merge Left Join (cost=62.33..96.69 rows=1000 width=44)" " Merge Cond: (s."PK_ID" = "MY_FUNCTION_B".COL_D)" " -> Index Scan using "TABLE_D_pkey" on "TABLE_D" s (cost=0.00..18.49 rows=349 width=4)" " -> Sort (cost=62.33..64.83 rows=1000 width=44)" " Sort Key: "MY_FUNCTION_B".COL_D" " -> Function Scan on "MY_FUNCTION_B" (cost=0.00..12.50 rows=1000 width=44)" " -> Sort (cost=69.36..69.36 rows=1 width=60)" " Sort Key: bp."COL_C"" " -> Nested Loop (cost=10.59..69.34 rows=1 width=60)" " Join Filter: (((("MY_FUNCTION_A"(bp."COL_A", NULL::boolean))::text || ' '::text) || (afp.MY_AUX)::text) IS NULL)" " -> Result (cost=0.00..0.01 rows=1 width=0)" " -> Hash Join (cost=10.59..68.94 rows=22 width=28)" " Hash Cond: (bp."COL_B" = pn."PK_ID")" " -> Seq Scan on "TABLE_A" bp (cost=0.00..54.90 rows=862 width=32)" " Filter: (("COL_A" IS NOT NULL) AND (("COL_A")::text <> ''::text))" " -> Hash (cost=10.50..10.50 rows=7 width=4)" " -> Seq Scan on "TABLE_B" pn (cost=0.00..10.50 rows=7 width=4)" " Filter: (("COL_E")::text ~~ 'Some%'::text)" However I'm not sure there are no circumstances when the execution plan will detect my trick and will optimize the query again :(( Sabin