Thread: enable_incremental_sort changes query behavior
Hi, With sqlsmith I found a query that gives this error: ERROR: ORDER/GROUP BY expression not found in targetlist I noted the query (sql query below, sorry it uses custom tables i couldn't replicate with regression tables) because it doesn't include an ORDER/GROUP BY clause. --- 0 ---- select distinct subq_0.c1 as c0, ref_0.radi_usua_radi as c1, ref_0.radi_nume_asoc as c2, subq_0.c1 as c3, case when (cast(null as pg_lsn) >= pg_catalog.pg_last_wal_receive_lsn()) and (true = pg_catalog.pg_rotate_logfile_old()) then ref_0.radi_usua_rem else ref_0.radi_usua_rem end as c4, cast(nullif((select hist_codi from public.hist_eventos_2 limit 1 offset 4) , pg_catalog.pg_stat_get_buf_alloc()) as int8) as c5 from public.radicado_2 as ref_0, lateral (select ref_0.radi_text_temp as c0, ref_0.radi_usua_actu as c1 from public.hist_eventos_1 as ref_1 where cast(nullif(cast(null as float4), cast(null as float4)) as float4) >= pg_catalog.pi()) as subq_0 where ref_0.radi_usua_dest is not NULL; --- 0 ---- Attached the stack trace produced until de elog that produces the message. But if I set enable_incremental_sort to off the query gets executed without problems (attached the explain produced for that case) -- Jaime Casanova www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova <jaime.casanova@2ndquadrant.com> wrote: > > Hi, > > With sqlsmith I found a query that gives this error: > ERROR: ORDER/GROUP BY expression not found in targetlist > > I noted the query (sql query below, sorry it uses custom tables i > couldn't replicate with regression tables) because it doesn't include > an ORDER/GROUP BY clause. > > --- 0 ---- > select distinct > subq_0.c1 as c0, > ref_0.radi_usua_radi as c1, > ref_0.radi_nume_asoc as c2, > subq_0.c1 as c3, > case when (cast(null as pg_lsn) >= > pg_catalog.pg_last_wal_receive_lsn()) > and (true = pg_catalog.pg_rotate_logfile_old()) then > ref_0.radi_usua_rem else ref_0.radi_usua_rem end > as c4, > cast(nullif((select hist_codi from public.hist_eventos_2 > limit 1 offset 4) > , > pg_catalog.pg_stat_get_buf_alloc()) as int8) as c5 > from > public.radicado_2 as ref_0, > lateral (select > ref_0.radi_text_temp as c0, > ref_0.radi_usua_actu as c1 > from > public.hist_eventos_1 as ref_1 > where cast(nullif(cast(null as float4), > cast(null as float4)) as float4) >= pg_catalog.pi()) as subq_0 > where ref_0.radi_usua_dest is not NULL; > --- 0 ---- > > Attached the stack trace produced until de elog that produces the message. > > But if I set enable_incremental_sort to off the query gets executed > without problems (attached the explain produced for that case) Thanks for the report. Is there by an chance an index on ref_0.radi_text_temp? And if you set enable_hashagg = off what plan do you get (or error)? Thanks, James
On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331@gmail.com> wrote: > > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova > <jaime.casanova@2ndquadrant.com> wrote: > > > > Hi, > > > > With sqlsmith I found a query that gives this error: > > ERROR: ORDER/GROUP BY expression not found in targetlist > > [...] > > > > But if I set enable_incremental_sort to off the query gets executed > > without problems (attached the explain produced for that case) > > Thanks for the report. > Hi, by experiment I reduced the query to this --- 0 --- select distinct subq_0.c1 as c0, case when (true = pg_catalog.pg_rotate_logfile_old()) then ref_0.t else ref_0.t end as c4 from public.ref_0, lateral (select ref_0.i as c1 from generate_series(1, 100) as ref_1) as subq_0 --- 0 --- the only custom table already needed can be created with this commands: --- 0 --- create table ref_0 as select repeat('abcde', (random() * 10)::integer) t, random() * 1000 i from generate_series(1, 500000); create index on ref_0 (i); analyze ref_0 ; --- 0 --- > Is there by an chance an index on ref_0.radi_text_temp? > there is an index involved but not on that field, commands above create the index in the right column... after that, ANALYZE the table > And if you set enable_hashagg = off what plan do you get (or error)? > same error -- Jaime Casanova www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Oct 1, 2020 at 3:09 AM Jaime Casanova <jaime.casanova@2ndquadrant.com> wrote: > > On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331@gmail.com> wrote: > > > > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova > > <jaime.casanova@2ndquadrant.com> wrote: > > > > > > Hi, > > > > > > With sqlsmith I found a query that gives this error: > > > ERROR: ORDER/GROUP BY expression not found in targetlist > > > > [...] > > > > > > But if I set enable_incremental_sort to off the query gets executed > > > without problems (attached the explain produced for that case) > > > > Thanks for the report. > > > > Hi, > > by experiment I reduced the query to this > > --- 0 --- > select distinct > subq_0.c1 as c0, > case when (true = pg_catalog.pg_rotate_logfile_old()) then > ref_0.t else ref_0.t > end > as c4 > from > public.ref_0, > lateral (select > > ref_0.i as c1 > from > generate_series(1, 100) as ref_1) as subq_0 > --- 0 --- > > the only custom table already needed can be created with this commands: > > --- 0 --- > create table ref_0 as select repeat('abcde', (random() * 10)::integer) > t, random() * 1000 i from generate_series(1, 500000); > create index on ref_0 (i); > analyze ref_0 ; > --- 0 --- > > > > Is there by an chance an index on ref_0.radi_text_temp? > > > > there is an index involved but not on that field, commands above > create the index in the right column... after that, ANALYZE the table > > > And if you set enable_hashagg = off what plan do you get (or error)? > > > > same error I was able to reproduce the error without incremental sort enabled (i.e., it happens with a full sort also). The function call in the SELECT doesn't have to be in a case expression; for example I was able to reproduce changing that to `random()::text || ref_0.t`. It looks like the issue happens when: 1. The sort happens within a parallel node. 2. One of the sort keys is an expression containing a volatile function call and a column from the lateral join. Here are the settings I used with your above repro case to show it with regular sort: enable_hashagg=off enable_incremental_sort=off enable_seqscan=off parallel_setup_cost=10 parallel_tuple_cost=0 The plan (obtained by replacing the volatile function with a stable one): Unique -> Nested Loop -> Gather Merge Workers Planned: 2 -> Sort Sort Key: ref_0.i, (md5(ref_0.t)) -> Parallel Index Scan using ref_0_i_idx on ref_0 -> Function Scan on generate_series ref_1 Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. I haven't been able to dig further than that yet, but my intuition is to poke around in the parallel query machinery? James
On Thu, Oct 01, 2020 at 09:02:57AM -0400, James Coleman wrote: >On Thu, Oct 1, 2020 at 3:09 AM Jaime Casanova ><jaime.casanova@2ndquadrant.com> wrote: >> >> On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331@gmail.com> wrote: >> > >> > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova >> > <jaime.casanova@2ndquadrant.com> wrote: >> > > >> > > Hi, >> > > >> > > With sqlsmith I found a query that gives this error: >> > > ERROR: ORDER/GROUP BY expression not found in targetlist >> > > >> [...] >> > > >> > > But if I set enable_incremental_sort to off the query gets executed >> > > without problems (attached the explain produced for that case) >> > >> > Thanks for the report. >> > >> >> Hi, >> >> by experiment I reduced the query to this >> >> --- 0 --- >> select distinct >> subq_0.c1 as c0, >> case when (true = pg_catalog.pg_rotate_logfile_old()) then >> ref_0.t else ref_0.t >> end >> as c4 >> from >> public.ref_0, >> lateral (select >> >> ref_0.i as c1 >> from >> generate_series(1, 100) as ref_1) as subq_0 >> --- 0 --- >> >> the only custom table already needed can be created with this commands: >> >> --- 0 --- >> create table ref_0 as select repeat('abcde', (random() * 10)::integer) >> t, random() * 1000 i from generate_series(1, 500000); >> create index on ref_0 (i); >> analyze ref_0 ; >> --- 0 --- >> >> >> > Is there by an chance an index on ref_0.radi_text_temp? >> > >> >> there is an index involved but not on that field, commands above >> create the index in the right column... after that, ANALYZE the table >> >> > And if you set enable_hashagg = off what plan do you get (or error)? >> > >> >> same error > >I was able to reproduce the error without incremental sort enabled >(i.e., it happens with a full sort also). The function call in the >SELECT doesn't have to be in a case expression; for example I was able >to reproduce changing that to `random()::text || ref_0.t`. > >It looks like the issue happens when: >1. The sort happens within a parallel node. >2. One of the sort keys is an expression containing a volatile >function call and a column from the lateral join. > >Here are the settings I used with your above repro case to show it >with regular sort: > > enable_hashagg=off > enable_incremental_sort=off > enable_seqscan=off > parallel_setup_cost=10 > parallel_tuple_cost=0 > >The plan (obtained by replacing the volatile function with a stable one): > > Unique > -> Nested Loop > -> Gather Merge > Workers Planned: 2 > -> Sort > Sort Key: ref_0.i, (md5(ref_0.t)) > -> Parallel Index Scan using ref_0_i_idx on ref_0 > -> Function Scan on generate_series ref_1 > >Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. > >I haven't been able to dig further than that yet, but my intuition is >to poke around in the parallel query machinery? > Nope. Bisect says this was introduced by this commit: ba3e76cc57 Consider Incremental Sort paths at additional places Looking at the diff, I suspect generate_useful_gather_paths (or maybe get_useful_pathkeys_for_relation) fails to do something with the pathkeys. Of course, that'd explain why it only happens for parallel plans, as this builds gather nodes. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Oct 1, 2020 at 6:08 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Thu, Oct 01, 2020 at 09:02:57AM -0400, James Coleman wrote: > >On Thu, Oct 1, 2020 at 3:09 AM Jaime Casanova > ><jaime.casanova@2ndquadrant.com> wrote: > >> > >> On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331@gmail.com> wrote: > >> > > >> > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova > >> > <jaime.casanova@2ndquadrant.com> wrote: > >> > > > >> > > Hi, > >> > > > >> > > With sqlsmith I found a query that gives this error: > >> > > ERROR: ORDER/GROUP BY expression not found in targetlist > >> > > > >> [...] > >> > > > >> > > But if I set enable_incremental_sort to off the query gets executed > >> > > without problems (attached the explain produced for that case) > >> > > >> > Thanks for the report. > >> > > >> > >> Hi, > >> > >> by experiment I reduced the query to this > >> > >> --- 0 --- > >> select distinct > >> subq_0.c1 as c0, > >> case when (true = pg_catalog.pg_rotate_logfile_old()) then > >> ref_0.t else ref_0.t > >> end > >> as c4 > >> from > >> public.ref_0, > >> lateral (select > >> > >> ref_0.i as c1 > >> from > >> generate_series(1, 100) as ref_1) as subq_0 > >> --- 0 --- > >> > >> the only custom table already needed can be created with this commands: > >> > >> --- 0 --- > >> create table ref_0 as select repeat('abcde', (random() * 10)::integer) > >> t, random() * 1000 i from generate_series(1, 500000); > >> create index on ref_0 (i); > >> analyze ref_0 ; > >> --- 0 --- > >> > >> > >> > Is there by an chance an index on ref_0.radi_text_temp? > >> > > >> > >> there is an index involved but not on that field, commands above > >> create the index in the right column... after that, ANALYZE the table > >> > >> > And if you set enable_hashagg = off what plan do you get (or error)? > >> > > >> > >> same error > > > >I was able to reproduce the error without incremental sort enabled > >(i.e., it happens with a full sort also). The function call in the > >SELECT doesn't have to be in a case expression; for example I was able > >to reproduce changing that to `random()::text || ref_0.t`. > > > >It looks like the issue happens when: > >1. The sort happens within a parallel node. > >2. One of the sort keys is an expression containing a volatile > >function call and a column from the lateral join. > > > >Here are the settings I used with your above repro case to show it > >with regular sort: > > > > enable_hashagg=off > > enable_incremental_sort=off > > enable_seqscan=off > > parallel_setup_cost=10 > > parallel_tuple_cost=0 > > > >The plan (obtained by replacing the volatile function with a stable one): > > > > Unique > > -> Nested Loop > > -> Gather Merge > > Workers Planned: 2 > > -> Sort > > Sort Key: ref_0.i, (md5(ref_0.t)) > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > -> Function Scan on generate_series ref_1 > > > >Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. > > > >I haven't been able to dig further than that yet, but my intuition is > >to poke around in the parallel query machinery? > > > > Nope. Bisect says this was introduced by this commit: > > ba3e76cc57 Consider Incremental Sort paths at additional places > > Looking at the diff, I suspect generate_useful_gather_paths (or maybe > get_useful_pathkeys_for_relation) fails to do something with the > pathkeys. > > Of course, that'd explain why it only happens for parallel plans, as > this builds gather nodes. I should have known I'd regret making a wild guess. I was in the middle of bisecting too, but had to set it aside for a bit and hadn't finished. I'll try to look into that commit (and I agree those functions are the likely places to look) as I get a chance here. James
On Thu, Oct 1, 2020 at 9:10 PM James Coleman <jtc331@gmail.com> wrote: > > On Thu, Oct 1, 2020 at 6:08 PM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > > > On Thu, Oct 01, 2020 at 09:02:57AM -0400, James Coleman wrote: > > >On Thu, Oct 1, 2020 at 3:09 AM Jaime Casanova > > ><jaime.casanova@2ndquadrant.com> wrote: > > >> > > >> On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331@gmail.com> wrote: > > >> > > > >> > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova > > >> > <jaime.casanova@2ndquadrant.com> wrote: > > >> > > > > >> > > Hi, > > >> > > > > >> > > With sqlsmith I found a query that gives this error: > > >> > > ERROR: ORDER/GROUP BY expression not found in targetlist > > >> > > > > >> [...] > > >> > > > > >> > > But if I set enable_incremental_sort to off the query gets executed > > >> > > without problems (attached the explain produced for that case) > > >> > > > >> > Thanks for the report. > > >> > > > >> > > >> Hi, > > >> > > >> by experiment I reduced the query to this > > >> > > >> --- 0 --- > > >> select distinct > > >> subq_0.c1 as c0, > > >> case when (true = pg_catalog.pg_rotate_logfile_old()) then > > >> ref_0.t else ref_0.t > > >> end > > >> as c4 > > >> from > > >> public.ref_0, > > >> lateral (select > > >> > > >> ref_0.i as c1 > > >> from > > >> generate_series(1, 100) as ref_1) as subq_0 > > >> --- 0 --- > > >> > > >> the only custom table already needed can be created with this commands: > > >> > > >> --- 0 --- > > >> create table ref_0 as select repeat('abcde', (random() * 10)::integer) > > >> t, random() * 1000 i from generate_series(1, 500000); > > >> create index on ref_0 (i); > > >> analyze ref_0 ; > > >> --- 0 --- > > >> > > >> > > >> > Is there by an chance an index on ref_0.radi_text_temp? > > >> > > > >> > > >> there is an index involved but not on that field, commands above > > >> create the index in the right column... after that, ANALYZE the table > > >> > > >> > And if you set enable_hashagg = off what plan do you get (or error)? > > >> > > > >> > > >> same error > > > > > >I was able to reproduce the error without incremental sort enabled > > >(i.e., it happens with a full sort also). The function call in the > > >SELECT doesn't have to be in a case expression; for example I was able > > >to reproduce changing that to `random()::text || ref_0.t`. > > > > > >It looks like the issue happens when: > > >1. The sort happens within a parallel node. > > >2. One of the sort keys is an expression containing a volatile > > >function call and a column from the lateral join. > > > > > >Here are the settings I used with your above repro case to show it > > >with regular sort: > > > > > > enable_hashagg=off > > > enable_incremental_sort=off > > > enable_seqscan=off > > > parallel_setup_cost=10 > > > parallel_tuple_cost=0 > > > > > >The plan (obtained by replacing the volatile function with a stable one): > > > > > > Unique > > > -> Nested Loop > > > -> Gather Merge > > > Workers Planned: 2 > > > -> Sort > > > Sort Key: ref_0.i, (md5(ref_0.t)) > > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > > -> Function Scan on generate_series ref_1 > > > > > >Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. > > > > > >I haven't been able to dig further than that yet, but my intuition is > > >to poke around in the parallel query machinery? > > > > > > > Nope. Bisect says this was introduced by this commit: > > > > ba3e76cc57 Consider Incremental Sort paths at additional places > > > > Looking at the diff, I suspect generate_useful_gather_paths (or maybe > > get_useful_pathkeys_for_relation) fails to do something with the > > pathkeys. > > > > Of course, that'd explain why it only happens for parallel plans, as > > this builds gather nodes. > > I should have known I'd regret making a wild guess. I was in the > middle of bisecting too, but had to set it aside for a bit and hadn't > finished. > > I'll try to look into that commit (and I agree those functions are the > likely places to look) as I get a chance here. I've been able to confirm that the problem goes away if we stop adding the gather merge paths in generate_useful_gather_paths(). I'm not sure yet what conclusion that leads us to. It seems to be that the biggest clue remains that all of this works correctly unless one of the selected columns (which happens to be a pathkey at this point because it's a DISTINCT query) contains a volatile expression. James
On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: > > ... > >I've been able to confirm that the problem goes away if we stop adding >the gather merge paths in generate_useful_gather_paths(). > >I'm not sure yet what conclusion that leads us to. It seems to be that >the biggest clue remains that all of this works correctly unless one >of the selected columns (which happens to be a pathkey at this point >because it's a DISTINCT query) contains a volatile expression. > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, which is calling find_em_expr_for_rel and is happy with anything it returns. But this was copied from postgres_fdw, which however does a bit more here: if (pathkey_ec->ec_has_volatile || !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || !is_foreign_expr(root, rel, em_expr)) So not only does it explicitly check volatility of the pathkey, it also calls is_foreign_expr which checks the expression for mutable functions. The attached patch seems to fix this, but it only adds the check for mutable functions. Maybe it should check ec_has_volatile too ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Fri, Oct 2, 2020 at 10:32 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: > > > > ... > > > >I've been able to confirm that the problem goes away if we stop adding > >the gather merge paths in generate_useful_gather_paths(). > > > >I'm not sure yet what conclusion that leads us to. It seems to be that > >the biggest clue remains that all of this works correctly unless one > >of the selected columns (which happens to be a pathkey at this point > >because it's a DISTINCT query) contains a volatile expression. > > > > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, > which is calling find_em_expr_for_rel and is happy with anything it > returns. But this was copied from postgres_fdw, which however does a bit > more here: > > if (pathkey_ec->ec_has_volatile || > !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || > !is_foreign_expr(root, rel, em_expr)) > > So not only does it explicitly check volatility of the pathkey, it also > calls is_foreign_expr which checks the expression for mutable functions. > > The attached patch seems to fix this, but it only adds the check for > mutable functions. Maybe it should check ec_has_volatile too ... We actually discussed the volatility check in that function back on the original thread [1], and we'd concluded that was specifically necessary for the fdw code because the function would execute on two different servers (and thus produce different results), but that in a local server only scenario it should be fine. My understanding (correct me if I'm wrong) is that the volatile function should only be executed once (at the scan level?) to build the tuple and from then on the datum isn't going to change, so I'm not sure why the volatility would matter here. James 1: https://www.postgresql.org/message-id/20200328025830.6v6ogkseohakc32q%40development
On Fri, Oct 2, 2020 at 10:53 AM James Coleman <jtc331@gmail.com> wrote: > > On Fri, Oct 2, 2020 at 10:32 AM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > > > On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: > > > > > > ... > > > > > >I've been able to confirm that the problem goes away if we stop adding > > >the gather merge paths in generate_useful_gather_paths(). > > > > > >I'm not sure yet what conclusion that leads us to. It seems to be that > > >the biggest clue remains that all of this works correctly unless one > > >of the selected columns (which happens to be a pathkey at this point > > >because it's a DISTINCT query) contains a volatile expression. > > > > > > > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, > > which is calling find_em_expr_for_rel and is happy with anything it > > returns. But this was copied from postgres_fdw, which however does a bit > > more here: > > > > if (pathkey_ec->ec_has_volatile || > > !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || > > !is_foreign_expr(root, rel, em_expr)) > > > > So not only does it explicitly check volatility of the pathkey, it also > > calls is_foreign_expr which checks the expression for mutable functions. > > > > The attached patch seems to fix this, but it only adds the check for > > mutable functions. Maybe it should check ec_has_volatile too ... > > We actually discussed the volatility check in that function back on > the original thread [1], and we'd concluded that was specifically > necessary for the fdw code because the function would execute on two > different servers (and thus produce different results), but that in a > local server only scenario it should be fine. > > My understanding (correct me if I'm wrong) is that the volatile > function should only be executed once (at the scan level?) to build > the tuple and from then on the datum isn't going to change, so I'm not > sure why the volatility would matter here. > > James > > 1: https://www.postgresql.org/message-id/20200328025830.6v6ogkseohakc32q%40development Oh, hmm, could what I said all be true, but there still be some rule that you shouldn't compare datums generated from volatile expressions in different backends (i.e., parallel query)? James
On Fri, Oct 02, 2020 at 10:55:14AM -0400, James Coleman wrote: >On Fri, Oct 2, 2020 at 10:53 AM James Coleman <jtc331@gmail.com> wrote: >> >> On Fri, Oct 2, 2020 at 10:32 AM Tomas Vondra >> <tomas.vondra@2ndquadrant.com> wrote: >> > >> > On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: >> > > >> > > ... >> > > >> > >I've been able to confirm that the problem goes away if we stop adding >> > >the gather merge paths in generate_useful_gather_paths(). >> > > >> > >I'm not sure yet what conclusion that leads us to. It seems to be that >> > >the biggest clue remains that all of this works correctly unless one >> > >of the selected columns (which happens to be a pathkey at this point >> > >because it's a DISTINCT query) contains a volatile expression. >> > > >> > >> > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, >> > which is calling find_em_expr_for_rel and is happy with anything it >> > returns. But this was copied from postgres_fdw, which however does a bit >> > more here: >> > >> > if (pathkey_ec->ec_has_volatile || >> > !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || >> > !is_foreign_expr(root, rel, em_expr)) >> > >> > So not only does it explicitly check volatility of the pathkey, it also >> > calls is_foreign_expr which checks the expression for mutable functions. >> > >> > The attached patch seems to fix this, but it only adds the check for >> > mutable functions. Maybe it should check ec_has_volatile too ... >> >> We actually discussed the volatility check in that function back on >> the original thread [1], and we'd concluded that was specifically >> necessary for the fdw code because the function would execute on two >> different servers (and thus produce different results), but that in a >> local server only scenario it should be fine. >> >> My understanding (correct me if I'm wrong) is that the volatile >> function should only be executed once (at the scan level?) to build >> the tuple and from then on the datum isn't going to change, so I'm not >> sure why the volatility would matter here. >> >> James >> >> 1: https://www.postgresql.org/message-id/20200328025830.6v6ogkseohakc32q%40development > >Oh, hmm, could what I said all be true, but there still be some rule >that you shouldn't compare datums generated from volatile expressions >in different backends (i.e., parallel query)? > I'm not sure it's all that related to parallel query - it's more likely that when constructing the paths below the gather merge, this new code fails to do something important. I'm not 100% sure how the grouping and volatile functions work, so let me think aloud here ... The backtrace looks like this: #0 get_sortgroupref_tle #1 0x0000000000808ab9 in prepare_sort_from_pathkeys #2 0x000000000080926c in make_sort_from_pathkeys #3 0x0000000000801032 in create_sort_plan #4 0x00000000007fe7e0 in create_plan_recurse #5 0x0000000000800b2c in create_gather_merge_plan #6 0x00000000007fe94d in create_plan_recurse #7 0x0000000000805328 in create_nestloop_plan #8 0x00000000007ff3c5 in create_join_plan #9 0x00000000007fe5f8 in create_plan_recurse #10 0x0000000000800d68 in create_projection_plan #11 0x00000000007fe662 in create_plan_recurse #12 0x0000000000801252 in create_upper_unique_plan #13 0x00000000007fe760 in create_plan_recurse #14 0x00000000007fe4f2 in create_plan #15 0x000000000081082f in standard_planner and the create_sort_plan works with lefttree that is IndexScan, so the query we're constructing looks like this: Distinct -> Nestloop -> Gather Merge -> Sort -> Index Scan and it's the sort that expects to find the expression in the Index Scan target list. Which seems rather bogus, because clearly the index scan does not include the expression. (I wonder if it's somehow related that indexes can't be built on volatile expressions ...) Anyway, the index scan clearly does not include the expression the sort references, hence the failure. And the index can can't compute it, because we probably need to compute it on top of the join I think (otherwise we might get duplicate values for volatile functions etc.) Looking at this from a slightly different angle, the root cause here seems to be that generate_useful_gather_paths uses the pathkeys it gets from get_useful_pathkeys_for_relation, which means root->query_pathkeys. But all other create_gather_merge_calls use root->sort_pathkeys, so maybe this is the actual problem and get_useful_pathkeys_for_relation should use root->sort_pathkeys instead. That does fix the issue for me too (and it passes all regression tests). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Oct 2, 2020 at 2:25 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Fri, Oct 02, 2020 at 10:55:14AM -0400, James Coleman wrote: > >On Fri, Oct 2, 2020 at 10:53 AM James Coleman <jtc331@gmail.com> wrote: > >> > >> On Fri, Oct 2, 2020 at 10:32 AM Tomas Vondra > >> <tomas.vondra@2ndquadrant.com> wrote: > >> > > >> > On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: > >> > > > >> > > ... > >> > > > >> > >I've been able to confirm that the problem goes away if we stop adding > >> > >the gather merge paths in generate_useful_gather_paths(). > >> > > > >> > >I'm not sure yet what conclusion that leads us to. It seems to be that > >> > >the biggest clue remains that all of this works correctly unless one > >> > >of the selected columns (which happens to be a pathkey at this point > >> > >because it's a DISTINCT query) contains a volatile expression. > >> > > > >> > > >> > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, > >> > which is calling find_em_expr_for_rel and is happy with anything it > >> > returns. But this was copied from postgres_fdw, which however does a bit > >> > more here: > >> > > >> > if (pathkey_ec->ec_has_volatile || > >> > !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || > >> > !is_foreign_expr(root, rel, em_expr)) > >> > > >> > So not only does it explicitly check volatility of the pathkey, it also > >> > calls is_foreign_expr which checks the expression for mutable functions. > >> > > >> > The attached patch seems to fix this, but it only adds the check for > >> > mutable functions. Maybe it should check ec_has_volatile too ... > >> > >> We actually discussed the volatility check in that function back on > >> the original thread [1], and we'd concluded that was specifically > >> necessary for the fdw code because the function would execute on two > >> different servers (and thus produce different results), but that in a > >> local server only scenario it should be fine. > >> > >> My understanding (correct me if I'm wrong) is that the volatile > >> function should only be executed once (at the scan level?) to build > >> the tuple and from then on the datum isn't going to change, so I'm not > >> sure why the volatility would matter here. > >> > >> James > >> > >> 1: https://www.postgresql.org/message-id/20200328025830.6v6ogkseohakc32q%40development > > > >Oh, hmm, could what I said all be true, but there still be some rule > >that you shouldn't compare datums generated from volatile expressions > >in different backends (i.e., parallel query)? > > > > I'm not sure it's all that related to parallel query - it's more likely > that when constructing the paths below the gather merge, this new code > fails to do something important. > > I'm not 100% sure how the grouping and volatile functions work, so let > me think aloud here ... > > The backtrace looks like this: > > #0 get_sortgroupref_tle > #1 0x0000000000808ab9 in prepare_sort_from_pathkeys > #2 0x000000000080926c in make_sort_from_pathkeys > #3 0x0000000000801032 in create_sort_plan > #4 0x00000000007fe7e0 in create_plan_recurse > #5 0x0000000000800b2c in create_gather_merge_plan > #6 0x00000000007fe94d in create_plan_recurse > #7 0x0000000000805328 in create_nestloop_plan > #8 0x00000000007ff3c5 in create_join_plan > #9 0x00000000007fe5f8 in create_plan_recurse > #10 0x0000000000800d68 in create_projection_plan > #11 0x00000000007fe662 in create_plan_recurse > #12 0x0000000000801252 in create_upper_unique_plan > #13 0x00000000007fe760 in create_plan_recurse > #14 0x00000000007fe4f2 in create_plan > #15 0x000000000081082f in standard_planner > > and the create_sort_plan works with lefttree that is IndexScan, so the > query we're constructing looks like this: > > Distinct > -> Nestloop > -> Gather Merge > -> Sort > -> Index Scan > > and it's the sort that expects to find the expression in the Index Scan > target list. Which seems rather bogus, because clearly the index scan > does not include the expression. (I wonder if it's somehow related that > indexes can't be built on volatile expressions ...) The index scan does include values not found in the index though, right? It returns the whole tuple -- though I'm not sure if it includes calculated values (BTW: is this what is meant by a "projection" being added? Or is that something else?) > Anyway, the index scan clearly does not include the expression the sort > references, hence the failure. And the index can can't compute it, > because we probably need to compute it on top of the join I think > (otherwise we might get duplicate values for volatile functions etc.) In this particular query there's no reason why we'd have to wait to compute it until after the join, right? For example, if we take away the parallelism but still force a Unique plan, we get this: Unique -> Sort Sort Key: ref_0.i, (CASE WHEN pg_rotate_logfile_old() THEN ref_0.t ELSE ref_0.t END) -> Nested Loop -> Seq Scan on ref_0 -> Function Scan on generate_series ref_1 And I don't see any reason why the CASE statement couldn't in theory (I don't know the internals enough to know when it actually happens) be done as part of the base relation scan (in this case, the seq scan). It's not dependent on any information from the join. Furthermore the same plan works correctly for a non-volatile expression, so that means (I think) that the relation scan is capable of producing that expression in the tuple it generates (either that or the sort is generating it?). > Looking at this from a slightly different angle, the root cause here > seems to be that generate_useful_gather_paths uses the pathkeys it gets > from get_useful_pathkeys_for_relation, which means root->query_pathkeys. > But all other create_gather_merge_calls use root->sort_pathkeys, so > maybe this is the actual problem and get_useful_pathkeys_for_relation > should use root->sort_pathkeys instead. That does fix the issue for me > too (and it passes all regression tests). So I'm not convinced this is the correct way forward. It seems we want to be able to apply this as broadly as possible, and using sort_pathkeys here means we wouldn't use it for the DISTINCT case at all, right? So far I think excluding volatile expressions (or mutable functions?) is the best way we've discussed so far, though I'm clear yet on why it is that that's a necessary restriction. If the answer is "there are cases where it would be unsafe for parallel query even though it isn't here...", I'm fine with that, but if so, we should make sure that's in the code comments/readmes somewhere. I also verified that the functions I've been playing with (pg_rotate_logfile_old and md5) are marked parallel safe in pg_proc. James
On Fri, Oct 02, 2020 at 04:12:11PM -0400, James Coleman wrote: >On Fri, Oct 2, 2020 at 2:25 PM Tomas Vondra ><tomas.vondra@2ndquadrant.com> wrote: >> >> On Fri, Oct 02, 2020 at 10:55:14AM -0400, James Coleman wrote: >> >On Fri, Oct 2, 2020 at 10:53 AM James Coleman <jtc331@gmail.com> wrote: >> >> >> >> On Fri, Oct 2, 2020 at 10:32 AM Tomas Vondra >> >> <tomas.vondra@2ndquadrant.com> wrote: >> >> > >> >> > On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: >> >> > > >> >> > > ... >> >> > > >> >> > >I've been able to confirm that the problem goes away if we stop adding >> >> > >the gather merge paths in generate_useful_gather_paths(). >> >> > > >> >> > >I'm not sure yet what conclusion that leads us to. It seems to be that >> >> > >the biggest clue remains that all of this works correctly unless one >> >> > >of the selected columns (which happens to be a pathkey at this point >> >> > >because it's a DISTINCT query) contains a volatile expression. >> >> > > >> >> > >> >> > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, >> >> > which is calling find_em_expr_for_rel and is happy with anything it >> >> > returns. But this was copied from postgres_fdw, which however does a bit >> >> > more here: >> >> > >> >> > if (pathkey_ec->ec_has_volatile || >> >> > !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || >> >> > !is_foreign_expr(root, rel, em_expr)) >> >> > >> >> > So not only does it explicitly check volatility of the pathkey, it also >> >> > calls is_foreign_expr which checks the expression for mutable functions. >> >> > >> >> > The attached patch seems to fix this, but it only adds the check for >> >> > mutable functions. Maybe it should check ec_has_volatile too ... >> >> >> >> We actually discussed the volatility check in that function back on >> >> the original thread [1], and we'd concluded that was specifically >> >> necessary for the fdw code because the function would execute on two >> >> different servers (and thus produce different results), but that in a >> >> local server only scenario it should be fine. >> >> >> >> My understanding (correct me if I'm wrong) is that the volatile >> >> function should only be executed once (at the scan level?) to build >> >> the tuple and from then on the datum isn't going to change, so I'm not >> >> sure why the volatility would matter here. >> >> >> >> James >> >> >> >> 1: https://www.postgresql.org/message-id/20200328025830.6v6ogkseohakc32q%40development >> > >> >Oh, hmm, could what I said all be true, but there still be some rule >> >that you shouldn't compare datums generated from volatile expressions >> >in different backends (i.e., parallel query)? >> > >> >> I'm not sure it's all that related to parallel query - it's more likely >> that when constructing the paths below the gather merge, this new code >> fails to do something important. >> >> I'm not 100% sure how the grouping and volatile functions work, so let >> me think aloud here ... >> >> The backtrace looks like this: >> >> #0 get_sortgroupref_tle >> #1 0x0000000000808ab9 in prepare_sort_from_pathkeys >> #2 0x000000000080926c in make_sort_from_pathkeys >> #3 0x0000000000801032 in create_sort_plan >> #4 0x00000000007fe7e0 in create_plan_recurse >> #5 0x0000000000800b2c in create_gather_merge_plan >> #6 0x00000000007fe94d in create_plan_recurse >> #7 0x0000000000805328 in create_nestloop_plan >> #8 0x00000000007ff3c5 in create_join_plan >> #9 0x00000000007fe5f8 in create_plan_recurse >> #10 0x0000000000800d68 in create_projection_plan >> #11 0x00000000007fe662 in create_plan_recurse >> #12 0x0000000000801252 in create_upper_unique_plan >> #13 0x00000000007fe760 in create_plan_recurse >> #14 0x00000000007fe4f2 in create_plan >> #15 0x000000000081082f in standard_planner >> >> and the create_sort_plan works with lefttree that is IndexScan, so the >> query we're constructing looks like this: >> >> Distinct >> -> Nestloop >> -> Gather Merge >> -> Sort >> -> Index Scan >> >> and it's the sort that expects to find the expression in the Index Scan >> target list. Which seems rather bogus, because clearly the index scan >> does not include the expression. (I wonder if it's somehow related that >> indexes can't be built on volatile expressions ...) > >The index scan does include values not found in the index though, >right? It returns the whole tuple -- though I'm not sure if it >includes calculated values (BTW: is this what is meant by a >"projection" being added? Or is that something else?) > AFAIK index scans are projection-capable - at least it seems like that from is_projection_capable_path. But the volatility blocks that, I think (see below). >> Anyway, the index scan clearly does not include the expression the sort >> references, hence the failure. And the index can can't compute it, >> because we probably need to compute it on top of the join I think >> (otherwise we might get duplicate values for volatile functions etc.) > >In this particular query there's no reason why we'd have to wait to >compute it until after the join, right? > >For example, if we take away the parallelism but still force a Unique >plan, we get this: > > Unique > -> Sort > Sort Key: ref_0.i, (CASE WHEN pg_rotate_logfile_old() THEN >ref_0.t ELSE ref_0.t END) > -> Nested Loop > -> Seq Scan on ref_0 > -> Function Scan on generate_series ref_1 > >And I don't see any reason why the CASE statement couldn't in theory >(I don't know the internals enough to know when it actually happens) >be done as part of the base relation scan (in this case, the seq >scan). It's not dependent on any information from the join. > Imagine a query like this: select t1.id, volatile_func() from t1 join t2 using (id); and now assume t2 contains duplicate values. If the volatile_func gets evaluated as part of the t1 scan, then we'll get multiple occurrences in the results, due to the t2 duplicates. I belive volatile functions are not expected to behave like that - the docs say: A query using a volatile function will re-evaluate the function at every row where its value is needed.. And I assume this references to rows at the level where the function is used, i.e. after the join. >Furthermore the same plan works correctly for a non-volatile >expression, so that means (I think) that the relation scan is capable >of producing that expression in the tuple it generates (either that or >the sort is generating it?). > I do believe the reason is exactly that non-volatile expressions can be pushed down, so that we actually find them in the target list of the node below the sort. See explains for the second set of queries, where I used a simple non-volatile expression. >> Looking at this from a slightly different angle, the root cause here >> seems to be that generate_useful_gather_paths uses the pathkeys it gets >> from get_useful_pathkeys_for_relation, which means root->query_pathkeys. >> But all other create_gather_merge_calls use root->sort_pathkeys, so >> maybe this is the actual problem and get_useful_pathkeys_for_relation >> should use root->sort_pathkeys instead. That does fix the issue for me >> too (and it passes all regression tests). > >So I'm not convinced this is the correct way forward. It seems we want >to be able to apply this as broadly as possible, and using >sort_pathkeys here means we wouldn't use it for the DISTINCT case at >all, right? > >So far I think excluding volatile expressions (or mutable functions?) >is the best way we've discussed so far, though I'm clear yet on why it >is that that's a necessary restriction. If the answer is "there are >cases where it would be unsafe for parallel query even though it isn't >here...", I'm fine with that, but if so, we should make sure that's in >the code comments/readmes somewhere. > >I also verified that the functions I've been playing with >(pg_rotate_logfile_old and md5) are marked parallel safe in pg_proc. > More importanly, it does not actually fix the issue - it does fix that particular query, but just replacing the DISTINCT with either ORDER BY or GROUP BY make it fail again :-( Attached is a simple script I used, demonstrating these issues for the three cases that expect to have ressortgroupref != 0 (per the comment before TargetEntry in plannodes.h). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Fri, Oct 02, 2020 at 04:12:11PM -0400, James Coleman wrote: > >On Fri, Oct 2, 2020 at 2:25 PM Tomas Vondra > ><tomas.vondra@2ndquadrant.com> wrote: > >> > >> On Fri, Oct 02, 2020 at 10:55:14AM -0400, James Coleman wrote: > >> >On Fri, Oct 2, 2020 at 10:53 AM James Coleman <jtc331@gmail.com> wrote: > >> >> > >> >> On Fri, Oct 2, 2020 at 10:32 AM Tomas Vondra > >> >> <tomas.vondra@2ndquadrant.com> wrote: > >> >> > > >> >> > On Fri, Oct 02, 2020 at 09:19:44AM -0400, James Coleman wrote: > >> >> > > > >> >> > > ... > >> >> > > > >> >> > >I've been able to confirm that the problem goes away if we stop adding > >> >> > >the gather merge paths in generate_useful_gather_paths(). > >> >> > > > >> >> > >I'm not sure yet what conclusion that leads us to. It seems to be that > >> >> > >the biggest clue remains that all of this works correctly unless one > >> >> > >of the selected columns (which happens to be a pathkey at this point > >> >> > >because it's a DISTINCT query) contains a volatile expression. > >> >> > > > >> >> > > >> >> > Yeah. It seems to me this is a bug in get_useful_pathkeys_for_relation, > >> >> > which is calling find_em_expr_for_rel and is happy with anything it > >> >> > returns. But this was copied from postgres_fdw, which however does a bit > >> >> > more here: > >> >> > > >> >> > if (pathkey_ec->ec_has_volatile || > >> >> > !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) || > >> >> > !is_foreign_expr(root, rel, em_expr)) > >> >> > > >> >> > So not only does it explicitly check volatility of the pathkey, it also > >> >> > calls is_foreign_expr which checks the expression for mutable functions. > >> >> > > >> >> > The attached patch seems to fix this, but it only adds the check for > >> >> > mutable functions. Maybe it should check ec_has_volatile too ... > >> >> > >> >> We actually discussed the volatility check in that function back on > >> >> the original thread [1], and we'd concluded that was specifically > >> >> necessary for the fdw code because the function would execute on two > >> >> different servers (and thus produce different results), but that in a > >> >> local server only scenario it should be fine. > >> >> > >> >> My understanding (correct me if I'm wrong) is that the volatile > >> >> function should only be executed once (at the scan level?) to build > >> >> the tuple and from then on the datum isn't going to change, so I'm not > >> >> sure why the volatility would matter here. > >> >> > >> >> James > >> >> > >> >> 1: https://www.postgresql.org/message-id/20200328025830.6v6ogkseohakc32q%40development > >> > > >> >Oh, hmm, could what I said all be true, but there still be some rule > >> >that you shouldn't compare datums generated from volatile expressions > >> >in different backends (i.e., parallel query)? > >> > > >> > >> I'm not sure it's all that related to parallel query - it's more likely > >> that when constructing the paths below the gather merge, this new code > >> fails to do something important. > >> > >> I'm not 100% sure how the grouping and volatile functions work, so let > >> me think aloud here ... > >> > >> The backtrace looks like this: > >> > >> #0 get_sortgroupref_tle > >> #1 0x0000000000808ab9 in prepare_sort_from_pathkeys > >> #2 0x000000000080926c in make_sort_from_pathkeys > >> #3 0x0000000000801032 in create_sort_plan > >> #4 0x00000000007fe7e0 in create_plan_recurse > >> #5 0x0000000000800b2c in create_gather_merge_plan > >> #6 0x00000000007fe94d in create_plan_recurse > >> #7 0x0000000000805328 in create_nestloop_plan > >> #8 0x00000000007ff3c5 in create_join_plan > >> #9 0x00000000007fe5f8 in create_plan_recurse > >> #10 0x0000000000800d68 in create_projection_plan > >> #11 0x00000000007fe662 in create_plan_recurse > >> #12 0x0000000000801252 in create_upper_unique_plan > >> #13 0x00000000007fe760 in create_plan_recurse > >> #14 0x00000000007fe4f2 in create_plan > >> #15 0x000000000081082f in standard_planner > >> > >> and the create_sort_plan works with lefttree that is IndexScan, so the > >> query we're constructing looks like this: > >> > >> Distinct > >> -> Nestloop > >> -> Gather Merge > >> -> Sort > >> -> Index Scan > >> > >> and it's the sort that expects to find the expression in the Index Scan > >> target list. Which seems rather bogus, because clearly the index scan > >> does not include the expression. (I wonder if it's somehow related that > >> indexes can't be built on volatile expressions ...) > > > >The index scan does include values not found in the index though, > >right? It returns the whole tuple -- though I'm not sure if it > >includes calculated values (BTW: is this what is meant by a > >"projection" being added? Or is that something else?) > > > > AFAIK index scans are projection-capable - at least it seems like that > from is_projection_capable_path. But the volatility blocks that, I think > (see below). > > >> Anyway, the index scan clearly does not include the expression the sort > >> references, hence the failure. And the index can can't compute it, > >> because we probably need to compute it on top of the join I think > >> (otherwise we might get duplicate values for volatile functions etc.) > > > >In this particular query there's no reason why we'd have to wait to > >compute it until after the join, right? > > > >For example, if we take away the parallelism but still force a Unique > >plan, we get this: > > > > Unique > > -> Sort > > Sort Key: ref_0.i, (CASE WHEN pg_rotate_logfile_old() THEN > >ref_0.t ELSE ref_0.t END) > > -> Nested Loop > > -> Seq Scan on ref_0 > > -> Function Scan on generate_series ref_1 > > > >And I don't see any reason why the CASE statement couldn't in theory > >(I don't know the internals enough to know when it actually happens) > >be done as part of the base relation scan (in this case, the seq > >scan). It's not dependent on any information from the join. > > > > Imagine a query like this: > > select t1.id, volatile_func() from t1 join t2 using (id); > > and now assume t2 contains duplicate values. If the volatile_func gets > evaluated as part of the t1 scan, then we'll get multiple occurrences > in the results, due to the t2 duplicates. I belive volatile functions > are not expected to behave like that - the docs say: > > A query using a volatile function will re-evaluate the function at > every row where its value is needed.. > > And I assume this references to rows at the level where the function is > used, i.e. after the join. > > >Furthermore the same plan works correctly for a non-volatile > >expression, so that means (I think) that the relation scan is capable > >of producing that expression in the tuple it generates (either that or > >the sort is generating it?). > > > > I do believe the reason is exactly that non-volatile expressions can be > pushed down, so that we actually find them in the target list of the > node below the sort. > > See explains for the second set of queries, where I used a simple > non-volatile expression. > > >> Looking at this from a slightly different angle, the root cause here > >> seems to be that generate_useful_gather_paths uses the pathkeys it gets > >> from get_useful_pathkeys_for_relation, which means root->query_pathkeys. > >> But all other create_gather_merge_calls use root->sort_pathkeys, so > >> maybe this is the actual problem and get_useful_pathkeys_for_relation > >> should use root->sort_pathkeys instead. That does fix the issue for me > >> too (and it passes all regression tests). > > > >So I'm not convinced this is the correct way forward. It seems we want > >to be able to apply this as broadly as possible, and using > >sort_pathkeys here means we wouldn't use it for the DISTINCT case at > >all, right? > > > >So far I think excluding volatile expressions (or mutable functions?) > >is the best way we've discussed so far, though I'm clear yet on why it > >is that that's a necessary restriction. If the answer is "there are > >cases where it would be unsafe for parallel query even though it isn't > >here...", I'm fine with that, but if so, we should make sure that's in > >the code comments/readmes somewhere. > > > >I also verified that the functions I've been playing with > >(pg_rotate_logfile_old and md5) are marked parallel safe in pg_proc. > > > > More importanly, it does not actually fix the issue - it does fix that > particular query, but just replacing the DISTINCT with either ORDER BY > or GROUP BY make it fail again :-( > > Attached is a simple script I used, demonstrating these issues for the > three cases that expect to have ressortgroupref != 0 (per the comment > before TargetEntry in plannodes.h). So does checking for volatile expressions (if you happened to test that) solve all the cases? If you haven't tested that yet, I can try to do that this evening. James
On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > >And I don't see any reason why the CASE statement couldn't in theory > >(I don't know the internals enough to know when it actually happens) > >be done as part of the base relation scan (in this case, the seq > >scan). It's not dependent on any information from the join. > > > > Imagine a query like this: > > select t1.id, volatile_func() from t1 join t2 using (id); > > and now assume t2 contains duplicate values. If the volatile_func gets > evaluated as part of the t1 scan, then we'll get multiple occurrences > in the results, due to the t2 duplicates. I belive volatile functions > are not expected to behave like that - the docs say: > > A query using a volatile function will re-evaluate the function at > every row where its value is needed.. > > And I assume this references to rows at the level where the function is > used, i.e. after the join. Ah, this makes sense. I was thinking exactly the opposite: that you'd want not to execute volatile functions multiple times for the same base rel row, but now that you describe it it makes sense why they shouldn't be. James
On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra ><tomas.vondra@2ndquadrant.com> wrote: >> >> ... >> >> More importanly, it does not actually fix the issue - it does fix that >> particular query, but just replacing the DISTINCT with either ORDER BY >> or GROUP BY make it fail again :-( >> >> Attached is a simple script I used, demonstrating these issues for the >> three cases that expect to have ressortgroupref != 0 (per the comment >> before TargetEntry in plannodes.h). > >So does checking for volatile expressions (if you happened to test >that) solve all the cases? If you haven't tested that yet, I can try >to do that this evening. > Yes, it does fix all the three queries in the SQL script. The question however is whether this is the root issue, and whether it's the right way to fix it. For example - volatility is not the only reason that may block qual pushdown. If you look at qual_is_pushdown_safe, it also blocks pushdown of leaky functions in security_barrier views. So I wonder if that could cause failures too, somehow. But I haven't managed to create such example. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > ><tomas.vondra@2ndquadrant.com> wrote: > >> > >> ... > >> > >> More importanly, it does not actually fix the issue - it does fix that > >> particular query, but just replacing the DISTINCT with either ORDER BY > >> or GROUP BY make it fail again :-( > >> > >> Attached is a simple script I used, demonstrating these issues for the > >> three cases that expect to have ressortgroupref != 0 (per the comment > >> before TargetEntry in plannodes.h). > > > >So does checking for volatile expressions (if you happened to test > >that) solve all the cases? If you haven't tested that yet, I can try > >to do that this evening. > > > > Yes, it does fix all the three queries in the SQL script. > > The question however is whether this is the root issue, and whether it's > the right way to fix it. For example - volatility is not the only reason > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > also blocks pushdown of leaky functions in security_barrier views. So I > wonder if that could cause failures too, somehow. But I haven't managed > to create such example. I was about to say that the issue here is slightly different from qual etc. pushdown, since we're not concerned about quals here, and so I wonder where we determine what target list entries to put in a given scan path, but then I realized that implies (maybe!) a simpler solution. Instead of duplicating checks on target list entries would be safe, why not check directly in get_useful_pathkeys_for_relation() whether or not the pathkey has a target list entry? I haven't been able to try that out yet, and so maybe I'm missing something, but I need to step out for a bit, so I'll have to look at it later. James
On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > ><tomas.vondra@2ndquadrant.com> wrote: > > >> > > >> ... > > >> > > >> More importanly, it does not actually fix the issue - it does fix that > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > >> or GROUP BY make it fail again :-( > > >> > > >> Attached is a simple script I used, demonstrating these issues for the > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > >> before TargetEntry in plannodes.h). > > > > > >So does checking for volatile expressions (if you happened to test > > >that) solve all the cases? If you haven't tested that yet, I can try > > >to do that this evening. > > > > > > > Yes, it does fix all the three queries in the SQL script. > > > > The question however is whether this is the root issue, and whether it's > > the right way to fix it. For example - volatility is not the only reason > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > also blocks pushdown of leaky functions in security_barrier views. So I > > wonder if that could cause failures too, somehow. But I haven't managed > > to create such example. > > I was about to say that the issue here is slightly different from qual > etc. pushdown, since we're not concerned about quals here, and so I > wonder where we determine what target list entries to put in a given > scan path, but then I realized that implies (maybe!) a simpler > solution. Instead of duplicating checks on target list entries would > be safe, why not check directly in get_useful_pathkeys_for_relation() > whether or not the pathkey has a target list entry? > > I haven't been able to try that out yet, and so maybe I'm missing > something, but I need to step out for a bit, so I'll have to look at > it later. I've started poking at this, but haven't yet found a workable solution. See the attached patch which "solves" the problem by breaking putting the sort under the gather merge, but it at least demonstrates conceptually what I think we're interested in doing. The issue currently is that the comparison of expressions fails -- in our query, the first column selected shows up as a Var (with the same contents) but is a different pointer in the em_expr and the reltarget exprs list. I don't yet see a good way to get the equivalence class for a PathTarget entry. James
Attachment
On Fri, Oct 2, 2020 at 2:25 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > The backtrace looks like this: > > #0 get_sortgroupref_tle > #1 0x0000000000808ab9 in prepare_sort_from_pathkeys > #2 0x000000000080926c in make_sort_from_pathkeys > #3 0x0000000000801032 in create_sort_plan > #4 0x00000000007fe7e0 in create_plan_recurse > #5 0x0000000000800b2c in create_gather_merge_plan > #6 0x00000000007fe94d in create_plan_recurse > #7 0x0000000000805328 in create_nestloop_plan > #8 0x00000000007ff3c5 in create_join_plan > #9 0x00000000007fe5f8 in create_plan_recurse > #10 0x0000000000800d68 in create_projection_plan > #11 0x00000000007fe662 in create_plan_recurse > #12 0x0000000000801252 in create_upper_unique_plan > #13 0x00000000007fe760 in create_plan_recurse > #14 0x00000000007fe4f2 in create_plan > #15 0x000000000081082f in standard_planner > > and the create_sort_plan works with lefttree that is IndexScan, so the > query we're constructing looks like this: > > Distinct > -> Nestloop > -> Gather Merge > -> Sort > -> Index Scan > > and it's the sort that expects to find the expression in the Index Scan > target list. Which seems rather bogus, because clearly the index scan > does not include the expression. (I wonder if it's somehow related that > indexes can't be built on volatile expressions ...) > > Anyway, the index scan clearly does not include the expression the sort > references, hence the failure. And the index can can't compute it, > because we probably need to compute it on top of the join I think > (otherwise we might get duplicate values for volatile functions etc.) > > > Looking at this from a slightly different angle, the root cause here > seems to be that generate_useful_gather_paths uses the pathkeys it gets > from get_useful_pathkeys_for_relation, which means root->query_pathkeys. > But all other create_gather_merge_calls use root->sort_pathkeys, so > maybe this is the actual problem and get_useful_pathkeys_for_relation > should use root->sort_pathkeys instead. That does fix the issue for me > too (and it passes all regression tests). So I've been a bit confused how our error could come from working with root->query_pathkeys when that's what's supposedly being set from the make_pathkeys_for_sortclauses() call in the backtrace Jaime reported, but I just realized that the trace I get when reproducing the error is different -- and matches the one you shared above. Jaime: was the backtrace in the original report by any chance record from breakpointing in the first call to get_sortgroupref_tle() (and one that successfully returned a sort group ref) rather than a call that hit the elog error on line 379? James
On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: > > On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > > > > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > > <tomas.vondra@2ndquadrant.com> wrote: > > > > > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > > ><tomas.vondra@2ndquadrant.com> wrote: > > > >> > > > >> ... > > > >> > > > >> More importanly, it does not actually fix the issue - it does fix that > > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > > >> or GROUP BY make it fail again :-( > > > >> > > > >> Attached is a simple script I used, demonstrating these issues for the > > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > > >> before TargetEntry in plannodes.h). > > > > > > > >So does checking for volatile expressions (if you happened to test > > > >that) solve all the cases? If you haven't tested that yet, I can try > > > >to do that this evening. > > > > > > > > > > Yes, it does fix all the three queries in the SQL script. > > > > > > The question however is whether this is the root issue, and whether it's > > > the right way to fix it. For example - volatility is not the only reason > > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > > also blocks pushdown of leaky functions in security_barrier views. So I > > > wonder if that could cause failures too, somehow. But I haven't managed > > > to create such example. > > > > I was about to say that the issue here is slightly different from qual > > etc. pushdown, since we're not concerned about quals here, and so I > > wonder where we determine what target list entries to put in a given > > scan path, but then I realized that implies (maybe!) a simpler > > solution. Instead of duplicating checks on target list entries would > > be safe, why not check directly in get_useful_pathkeys_for_relation() > > whether or not the pathkey has a target list entry? > > > > I haven't been able to try that out yet, and so maybe I'm missing > > something, but I need to step out for a bit, so I'll have to look at > > it later. > > I've started poking at this, but haven't yet found a workable > solution. See the attached patch which "solves" the problem by > breaking putting the sort under the gather merge, but it at least > demonstrates conceptually what I think we're interested in doing. > > The issue currently is that the comparison of expressions fails -- in > our query, the first column selected shows up as a Var (with the same > contents) but is a different pointer in the em_expr and the reltarget > exprs list. I don't yet see a good way to get the equivalence class > for a PathTarget entry. Hmm, I think I was looking to do is the attached patch. I didn't realize until I did a lot of reading through source that we have an equal() function that can compare exprs. That (plus the realization in [1] the originally reported backtrace wasn't where the error actually came from) convinced me that what we need is to confirm not only that the all of the vars in the ec member come from the relids in the rel but also that the expr is actually represented in the target of the rel. With the gucs changed as I mentioned earlier both of the plans (with and without a volatile call in the 2nd select entry) now look like: Unique -> Gather Merge Workers Planned: 2 -> Sort Sort Key: ref_0.i, (md5(ref_0.t)) -> Nested Loop -> Parallel Index Scan using ref_0_i_idx on ref_0 -> Function Scan on generate_series ref_1 Without the gucs changed the minimal repro case now doesn't error, but results in this plan: HashAggregate Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t ELSE ref_0.t END -> Nested Loop -> Seq Scan on ref_0 -> Function Scan on generate_series ref_1 Similarly in your six queries I now only see parallel query showing up in the last one. I created an entirely new function because adding the target expr lookup to the existing find_em_expr_for_rel() function broke a bunch of postgres_fdw tests. That maybe raises questions about whether that code also could have problems in theory/in the future, but I didn't investigate further. In any case we already know it excludes volatile...so maybe it's fine because in practice that's actually a broader exclusion than what we're doing here. This seems to fix the issue, but I'd like feedback on whether it's too strict. We could of course just check em_has_volatile, but I'm wondering if that's simultaneously too strict (by not allowing the volatile expression to be computed in the gather merge supposing there's no join) and too loose (maybe there are other cases we should care about?). It also just strikes me as re-encoding rules that should have already been applied (and thus we should be able to look up in the data we have if it's safe to use the expr/pathkey). Those are mostly intuitions though. James 1: https://www.postgresql.org/message-id/CAAaqYe8zqDAv0Sfak5Riu%2BDKsm-i3ARPursn5v6qTwiCXmkXKQ%40mail.gmail.com
Attachment
On Sat, 3 Oct 2020 at 08:15, James Coleman <jtc331@gmail.com> wrote: > > Jaime: was the backtrace in the original report by any chance record > from breakpointing in the first call to get_sortgroupref_tle() (and > one that successfully returned a sort group ref) rather than a call > that hit the elog error on line 379? > I remember the first backtrace a made for this was on the line of the elog but then a second backtrace was with the function, now that repeat it with line of the function for the elog (and the original query) i see again create_incrementalsort_plan() which made me try with enable_incremental_sort disabled... sorry for that, attached a better backtrace -- Jaime Casanova www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote: >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: >> >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: >> > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra >> > <tomas.vondra@2ndquadrant.com> wrote: >> > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra >> > > ><tomas.vondra@2ndquadrant.com> wrote: >> > > >> >> > > >> ... >> > > >> >> > > >> More importanly, it does not actually fix the issue - it does fix that >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY >> > > >> or GROUP BY make it fail again :-( >> > > >> >> > > >> Attached is a simple script I used, demonstrating these issues for the >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment >> > > >> before TargetEntry in plannodes.h). >> > > > >> > > >So does checking for volatile expressions (if you happened to test >> > > >that) solve all the cases? If you haven't tested that yet, I can try >> > > >to do that this evening. >> > > > >> > > >> > > Yes, it does fix all the three queries in the SQL script. >> > > >> > > The question however is whether this is the root issue, and whether it's >> > > the right way to fix it. For example - volatility is not the only reason >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it >> > > also blocks pushdown of leaky functions in security_barrier views. So I >> > > wonder if that could cause failures too, somehow. But I haven't managed >> > > to create such example. >> > >> > I was about to say that the issue here is slightly different from qual >> > etc. pushdown, since we're not concerned about quals here, and so I >> > wonder where we determine what target list entries to put in a given >> > scan path, but then I realized that implies (maybe!) a simpler >> > solution. Instead of duplicating checks on target list entries would >> > be safe, why not check directly in get_useful_pathkeys_for_relation() >> > whether or not the pathkey has a target list entry? >> > >> > I haven't been able to try that out yet, and so maybe I'm missing >> > something, but I need to step out for a bit, so I'll have to look at >> > it later. >> >> I've started poking at this, but haven't yet found a workable >> solution. See the attached patch which "solves" the problem by >> breaking putting the sort under the gather merge, but it at least >> demonstrates conceptually what I think we're interested in doing. >> >> The issue currently is that the comparison of expressions fails -- in >> our query, the first column selected shows up as a Var (with the same >> contents) but is a different pointer in the em_expr and the reltarget >> exprs list. I don't yet see a good way to get the equivalence class >> for a PathTarget entry. > >Hmm, I think I was looking to do is the attached patch. I didn't >realize until I did a lot of reading through source that we have an >equal() function that can compare exprs. That (plus the realization in >[1] the originally reported backtrace wasn't where the error actually >came from) convinced me that what we need is to confirm not only that >the all of the vars in the ec member come from the relids in the rel >but also that the expr is actually represented in the target of the >rel. > >With the gucs changed as I mentioned earlier both of the plans (with >and without a volatile call in the 2nd select entry) now look like: > > Unique > -> Gather Merge > Workers Planned: 2 > -> Sort > Sort Key: ref_0.i, (md5(ref_0.t)) > -> Nested Loop > -> Parallel Index Scan using ref_0_i_idx on ref_0 > -> Function Scan on generate_series ref_1 > >Without the gucs changed the minimal repro case now doesn't error, but >results in this plan: > > HashAggregate > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t >ELSE ref_0.t END > -> Nested Loop > -> Seq Scan on ref_0 > -> Function Scan on generate_series ref_1 > >Similarly in your six queries I now only see parallel query showing up >in the last one. > OK, that seems reasonable I think. >I created an entirely new function because adding the target expr >lookup to the existing find_em_expr_for_rel() function broke a bunch >of postgres_fdw tests. That maybe raises questions about whether that >code also could have problems in theory/in the future, but I didn't >investigate further. In any case we already know it excludes >volatile...so maybe it's fine because in practice that's actually a >broader exclusion than what we're doing here. > I don't think postgres_fdw needs these new checks, because FDW scan's are not allowed in parallel part of the plan - if that changes in the future, I'm sure that'll require fixes in plenty other places. OTOH it's interesting that it triggers those failures - I wonder if we could learn something from them? AFAICS those paths can't be built by generate_useful_gather_paths (because of the postgres_fdw vs. parallel query restrictions), so how do these plans look? >This seems to fix the issue, but I'd like feedback on whether it's too >strict. We could of course just check em_has_volatile, but I'm >wondering if that's simultaneously too strict (by not allowing the >volatile expression to be computed in the gather merge supposing >there's no join) and too loose (maybe there are other cases we should >care about?). It also just strikes me as re-encoding rules that should >have already been applied (and thus we should be able to look up in >the data we have if it's safe to use the expr/pathkey). Those are >mostly intuitions though. > I don't know :-( As I mentioned before, I suspect checking just the volatility may not be enough in some cases (leakyness, etc.) and I think you're right it may be too strict in other cases. Not sure I understand which rules you think we're re-encoding, but I have a nagging feeling there's a piece of code somewhere earlier in the query planner meant to prevent such cases (sort on path without all the pathkeys), and that fixing it here is just a band aid :-( I'm sure we're building plans with Sort on top of Index Scan paths, so how come those don't fail? How come the non-parallel version of these queries don't fail? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Oct 3, 2020 at 5:44 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote: > >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: > >> > >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > >> > > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > >> > <tomas.vondra@2ndquadrant.com> wrote: > >> > > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > >> > > ><tomas.vondra@2ndquadrant.com> wrote: > >> > > >> > >> > > >> ... > >> > > >> > >> > > >> More importanly, it does not actually fix the issue - it does fix that > >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY > >> > > >> or GROUP BY make it fail again :-( > >> > > >> > >> > > >> Attached is a simple script I used, demonstrating these issues for the > >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment > >> > > >> before TargetEntry in plannodes.h). > >> > > > > >> > > >So does checking for volatile expressions (if you happened to test > >> > > >that) solve all the cases? If you haven't tested that yet, I can try > >> > > >to do that this evening. > >> > > > > >> > > > >> > > Yes, it does fix all the three queries in the SQL script. > >> > > > >> > > The question however is whether this is the root issue, and whether it's > >> > > the right way to fix it. For example - volatility is not the only reason > >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > >> > > also blocks pushdown of leaky functions in security_barrier views. So I > >> > > wonder if that could cause failures too, somehow. But I haven't managed > >> > > to create such example. > >> > > >> > I was about to say that the issue here is slightly different from qual > >> > etc. pushdown, since we're not concerned about quals here, and so I > >> > wonder where we determine what target list entries to put in a given > >> > scan path, but then I realized that implies (maybe!) a simpler > >> > solution. Instead of duplicating checks on target list entries would > >> > be safe, why not check directly in get_useful_pathkeys_for_relation() > >> > whether or not the pathkey has a target list entry? > >> > > >> > I haven't been able to try that out yet, and so maybe I'm missing > >> > something, but I need to step out for a bit, so I'll have to look at > >> > it later. > >> > >> I've started poking at this, but haven't yet found a workable > >> solution. See the attached patch which "solves" the problem by > >> breaking putting the sort under the gather merge, but it at least > >> demonstrates conceptually what I think we're interested in doing. > >> > >> The issue currently is that the comparison of expressions fails -- in > >> our query, the first column selected shows up as a Var (with the same > >> contents) but is a different pointer in the em_expr and the reltarget > >> exprs list. I don't yet see a good way to get the equivalence class > >> for a PathTarget entry. > > > >Hmm, I think I was looking to do is the attached patch. I didn't > >realize until I did a lot of reading through source that we have an > >equal() function that can compare exprs. That (plus the realization in > >[1] the originally reported backtrace wasn't where the error actually > >came from) convinced me that what we need is to confirm not only that > >the all of the vars in the ec member come from the relids in the rel > >but also that the expr is actually represented in the target of the > >rel. > > > >With the gucs changed as I mentioned earlier both of the plans (with > >and without a volatile call in the 2nd select entry) now look like: > > > > Unique > > -> Gather Merge > > Workers Planned: 2 > > -> Sort > > Sort Key: ref_0.i, (md5(ref_0.t)) > > -> Nested Loop > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > -> Function Scan on generate_series ref_1 > > > >Without the gucs changed the minimal repro case now doesn't error, but > >results in this plan: > > > > HashAggregate > > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t > >ELSE ref_0.t END > > -> Nested Loop > > -> Seq Scan on ref_0 > > -> Function Scan on generate_series ref_1 > > > >Similarly in your six queries I now only see parallel query showing up > >in the last one. > > > > OK, that seems reasonable I think. If we proceed with this patch I'd like to tweak it to check for the relids subset first on the theory that it will be cheaper than the expr equality check loop; that change is attached. > >I created an entirely new function because adding the target expr > >lookup to the existing find_em_expr_for_rel() function broke a bunch > >of postgres_fdw tests. That maybe raises questions about whether that > >code also could have problems in theory/in the future, but I didn't > >investigate further. In any case we already know it excludes > >volatile...so maybe it's fine because in practice that's actually a > >broader exclusion than what we're doing here. > > > > I don't think postgres_fdw needs these new checks, because FDW scan's > are not allowed in parallel part of the plan - if that changes in the > future, I'm sure that'll require fixes in plenty other places. > > OTOH it's interesting that it triggers those failures - I wonder if we > could learn something from them? AFAICS those paths can't be built by > generate_useful_gather_paths (because of the postgres_fdw vs. parallel > query restrictions), so how do these plans look? Well the fdw code doesn't trigger the errors, but both generate_useful_gather_paths() and the fdw code originally relied on find_em_expr_for_rel(), but now they're diverging. IIRC the fdw code also uses it for determining what path keys are valid -- in that case which ones are safe to be sent to the foreign server (so all of the vars have to be from rels on the foreign server). The fdw code has additional checks too though, for example no volatile expressions may be pushed to the remote. > >This seems to fix the issue, but I'd like feedback on whether it's too > >strict. We could of course just check em_has_volatile, but I'm > >wondering if that's simultaneously too strict (by not allowing the > >volatile expression to be computed in the gather merge supposing > >there's no join) and too loose (maybe there are other cases we should > >care about?). It also just strikes me as re-encoding rules that should > >have already been applied (and thus we should be able to look up in > >the data we have if it's safe to use the expr/pathkey). Those are > >mostly intuitions though. > > > > I don't know :-( As I mentioned before, I suspect checking just the > volatility may not be enough in some cases (leakyness, etc.) and I think > you're right it may be too strict in other cases. > > Not sure I understand which rules you think we're re-encoding, but I > have a nagging feeling there's a piece of code somewhere earlier in > the query planner meant to prevent such cases (sort on path without all > the pathkeys), and that fixing it here is just a band aid :-( I'm getting at exactly what you're saying below: there's got to be some existing rules (and presumably code already enforcing it in some places) at what level in the path tree a given pathkey is safe/correct, and we don't want to reinvent those rules (or ideally that code). Verifying the expr is in the rel seems to me to be intuitively correct, but I wish there was another place in the code that could validate that. > I'm sure we're building plans with Sort on top of Index Scan paths, so > how come those don't fail? How come the non-parallel version of these > queries don't fail? Yeah, exactly. Something has to be doing something similar -- I don't think everywhere else using sort_pathkeys instead of query_pathkeys really changes the problem. I've read through all of the places we use generate_useful_gather_paths() as well as root->sort_pathkeys to try to get a better understanding of this. Most of the places I convinced myself it's already safe -- for example places like create_orderd_paths are working with the upper rel, and the full target list is available. A few other places I added comments (see the "questions" patch in the attached series) where I'm not sure why we apply projections *after* we create a sort path; given the issue we're discussing here I would have thought you'd want to do exactly the opposite. I haven't yet tried to read through the code in other places where we build an explicit sort to see if I can find any hints as to what existing code does to ensure this situation doesn't occur, but so far I haven't found anything else obvious. Is there someone who might be able to point us in the right direction/validate what we're thinking? James
Attachment
On Sat, Oct 3, 2020 at 10:10 PM James Coleman <jtc331@gmail.com> wrote: > > On Sat, Oct 3, 2020 at 5:44 PM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > > > On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote: > > >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: > > >> > > >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > > >> > > > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > > >> > <tomas.vondra@2ndquadrant.com> wrote: > > >> > > > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > >> > > ><tomas.vondra@2ndquadrant.com> wrote: > > >> > > >> > > >> > > >> ... > > >> > > >> > > >> > > >> More importanly, it does not actually fix the issue - it does fix that > > >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > >> > > >> or GROUP BY make it fail again :-( > > >> > > >> > > >> > > >> Attached is a simple script I used, demonstrating these issues for the > > >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > >> > > >> before TargetEntry in plannodes.h). > > >> > > > > > >> > > >So does checking for volatile expressions (if you happened to test > > >> > > >that) solve all the cases? If you haven't tested that yet, I can try > > >> > > >to do that this evening. > > >> > > > > > >> > > > > >> > > Yes, it does fix all the three queries in the SQL script. > > >> > > > > >> > > The question however is whether this is the root issue, and whether it's > > >> > > the right way to fix it. For example - volatility is not the only reason > > >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > >> > > also blocks pushdown of leaky functions in security_barrier views. So I > > >> > > wonder if that could cause failures too, somehow. But I haven't managed > > >> > > to create such example. > > >> > > > >> > I was about to say that the issue here is slightly different from qual > > >> > etc. pushdown, since we're not concerned about quals here, and so I > > >> > wonder where we determine what target list entries to put in a given > > >> > scan path, but then I realized that implies (maybe!) a simpler > > >> > solution. Instead of duplicating checks on target list entries would > > >> > be safe, why not check directly in get_useful_pathkeys_for_relation() > > >> > whether or not the pathkey has a target list entry? > > >> > > > >> > I haven't been able to try that out yet, and so maybe I'm missing > > >> > something, but I need to step out for a bit, so I'll have to look at > > >> > it later. > > >> > > >> I've started poking at this, but haven't yet found a workable > > >> solution. See the attached patch which "solves" the problem by > > >> breaking putting the sort under the gather merge, but it at least > > >> demonstrates conceptually what I think we're interested in doing. > > >> > > >> The issue currently is that the comparison of expressions fails -- in > > >> our query, the first column selected shows up as a Var (with the same > > >> contents) but is a different pointer in the em_expr and the reltarget > > >> exprs list. I don't yet see a good way to get the equivalence class > > >> for a PathTarget entry. > > > > > >Hmm, I think I was looking to do is the attached patch. I didn't > > >realize until I did a lot of reading through source that we have an > > >equal() function that can compare exprs. That (plus the realization in > > >[1] the originally reported backtrace wasn't where the error actually > > >came from) convinced me that what we need is to confirm not only that > > >the all of the vars in the ec member come from the relids in the rel > > >but also that the expr is actually represented in the target of the > > >rel. > > > > > >With the gucs changed as I mentioned earlier both of the plans (with > > >and without a volatile call in the 2nd select entry) now look like: > > > > > > Unique > > > -> Gather Merge > > > Workers Planned: 2 > > > -> Sort > > > Sort Key: ref_0.i, (md5(ref_0.t)) > > > -> Nested Loop > > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > > -> Function Scan on generate_series ref_1 > > > > > >Without the gucs changed the minimal repro case now doesn't error, but > > >results in this plan: > > > > > > HashAggregate > > > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t > > >ELSE ref_0.t END > > > -> Nested Loop > > > -> Seq Scan on ref_0 > > > -> Function Scan on generate_series ref_1 > > > > > >Similarly in your six queries I now only see parallel query showing up > > >in the last one. > > > > > > > OK, that seems reasonable I think. > > If we proceed with this patch I'd like to tweak it to check for the > relids subset first on the theory that it will be cheaper than the > expr equality check loop; that change is attached. > > > >I created an entirely new function because adding the target expr > > >lookup to the existing find_em_expr_for_rel() function broke a bunch > > >of postgres_fdw tests. That maybe raises questions about whether that > > >code also could have problems in theory/in the future, but I didn't > > >investigate further. In any case we already know it excludes > > >volatile...so maybe it's fine because in practice that's actually a > > >broader exclusion than what we're doing here. > > > > > > > I don't think postgres_fdw needs these new checks, because FDW scan's > > are not allowed in parallel part of the plan - if that changes in the > > future, I'm sure that'll require fixes in plenty other places. > > > > OTOH it's interesting that it triggers those failures - I wonder if we > > could learn something from them? AFAICS those paths can't be built by > > generate_useful_gather_paths (because of the postgres_fdw vs. parallel > > query restrictions), so how do these plans look? > > Well the fdw code doesn't trigger the errors, but both > generate_useful_gather_paths() and the fdw code originally relied on > find_em_expr_for_rel(), but now they're diverging. IIRC the fdw code > also uses it for determining what path keys are valid -- in that case > which ones are safe to be sent to the foreign server (so all of the > vars have to be from rels on the foreign server). The fdw code has > additional checks too though, for example no volatile expressions may > be pushed to the remote. > > > >This seems to fix the issue, but I'd like feedback on whether it's too > > >strict. We could of course just check em_has_volatile, but I'm > > >wondering if that's simultaneously too strict (by not allowing the > > >volatile expression to be computed in the gather merge supposing > > >there's no join) and too loose (maybe there are other cases we should > > >care about?). It also just strikes me as re-encoding rules that should > > >have already been applied (and thus we should be able to look up in > > >the data we have if it's safe to use the expr/pathkey). Those are > > >mostly intuitions though. > > > > > > > I don't know :-( As I mentioned before, I suspect checking just the > > volatility may not be enough in some cases (leakyness, etc.) and I think > > you're right it may be too strict in other cases. > > > > Not sure I understand which rules you think we're re-encoding, but I > > have a nagging feeling there's a piece of code somewhere earlier in > > the query planner meant to prevent such cases (sort on path without all > > the pathkeys), and that fixing it here is just a band aid :-( > > I'm getting at exactly what you're saying below: there's got to be > some existing rules (and presumably code already enforcing it in some > places) at what level in the path tree a given pathkey is > safe/correct, and we don't want to reinvent those rules (or ideally > that code). Verifying the expr is in the rel seems to me to be > intuitively correct, but I wish there was another place in the code > that could validate that. > > > I'm sure we're building plans with Sort on top of Index Scan paths, so > > how come those don't fail? How come the non-parallel version of these > > queries don't fail? > > Yeah, exactly. Something has to be doing something similar -- I don't > think everywhere else using sort_pathkeys instead of query_pathkeys > really changes the problem. I've read through all of the places we use > generate_useful_gather_paths() as well as root->sort_pathkeys to try > to get a better understanding of this. Most of the places I convinced > myself it's already safe -- for example places like > create_orderd_paths are working with the upper rel, and the full > target list is available. A few other places I added comments (see the > "questions" patch in the attached series) where I'm not sure why we > apply projections *after* we create a sort path; given the issue we're > discussing here I would have thought you'd want to do exactly the > opposite. > > I haven't yet tried to read through the code in other places where we > build an explicit sort to see if I can find any hints as to what > existing code does to ensure this situation doesn't occur, but so far > I haven't found anything else obvious. I did a lot more reading through code, and I concluded that generally (maybe exclusively?) create_sort_path() is only called on upper rels (e.g., from create_ordered_paths()), so we already have the full target list available. I also stepped through in the debugger and confirmed that non-var exprs generally don't seem to show up in the pathtarget of paths either on base rels or on join rels. Finally, the last piece of relevant data is that we don't actually push down sorts on non-parallel paths even when it would be beneficial. For example, this query: select i, md5(i::text) from t2, generate_series(1,2) order by i; generates this plan: Sort Sort Key: t2.i -> Nested Loop -> Function Scan on generate_series -> Materialize -> Seq Scan on t2 but it seems obvious to me that it would be better to move the sort to immediately above the seq scan. So I conclude from all of this that the real reason we didn't see this problem before was that we didn't have anything that was trying to generate useful sort paths lower in the query, so it just worked. To state it from the other direction, I don't see any indication that there actually is any code trying to determine what's safe to put in the target at various join levels; from what I can tell only vars are distributed at those points. Thinking about it some more, this actually makes sense, since get_useful_pathkeys() is new code, and something like it would have been needed already if we wanted to push sorts down further into the plan. As a bit of a side note: combined with the above idea about pushing sort down, it could also be beneficial to try to distribute expressions lower too. So I think the previously attached patch is probably correct; the second patch with the comment questions shows I still don't fully understand how/when projections are applied (and when they're needed), though it might have something to do with what I said about only operating on upper rels. James
On Sun, Oct 4, 2020 at 9:40 PM James Coleman <jtc331@gmail.com> wrote: > > On Sat, Oct 3, 2020 at 10:10 PM James Coleman <jtc331@gmail.com> wrote: > > > > On Sat, Oct 3, 2020 at 5:44 PM Tomas Vondra > > <tomas.vondra@2ndquadrant.com> wrote: > > > > > > On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote: > > > >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: > > > >> > > > >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > > > >> > > > > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > > > >> > <tomas.vondra@2ndquadrant.com> wrote: > > > >> > > > > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > > >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > > >> > > ><tomas.vondra@2ndquadrant.com> wrote: > > > >> > > >> > > > >> > > >> ... > > > >> > > >> > > > >> > > >> More importanly, it does not actually fix the issue - it does fix that > > > >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > > >> > > >> or GROUP BY make it fail again :-( > > > >> > > >> > > > >> > > >> Attached is a simple script I used, demonstrating these issues for the > > > >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > > >> > > >> before TargetEntry in plannodes.h). > > > >> > > > > > > >> > > >So does checking for volatile expressions (if you happened to test > > > >> > > >that) solve all the cases? If you haven't tested that yet, I can try > > > >> > > >to do that this evening. > > > >> > > > > > > >> > > > > > >> > > Yes, it does fix all the three queries in the SQL script. > > > >> > > > > > >> > > The question however is whether this is the root issue, and whether it's > > > >> > > the right way to fix it. For example - volatility is not the only reason > > > >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > > >> > > also blocks pushdown of leaky functions in security_barrier views. So I > > > >> > > wonder if that could cause failures too, somehow. But I haven't managed > > > >> > > to create such example. > > > >> > > > > >> > I was about to say that the issue here is slightly different from qual > > > >> > etc. pushdown, since we're not concerned about quals here, and so I > > > >> > wonder where we determine what target list entries to put in a given > > > >> > scan path, but then I realized that implies (maybe!) a simpler > > > >> > solution. Instead of duplicating checks on target list entries would > > > >> > be safe, why not check directly in get_useful_pathkeys_for_relation() > > > >> > whether or not the pathkey has a target list entry? > > > >> > > > > >> > I haven't been able to try that out yet, and so maybe I'm missing > > > >> > something, but I need to step out for a bit, so I'll have to look at > > > >> > it later. > > > >> > > > >> I've started poking at this, but haven't yet found a workable > > > >> solution. See the attached patch which "solves" the problem by > > > >> breaking putting the sort under the gather merge, but it at least > > > >> demonstrates conceptually what I think we're interested in doing. > > > >> > > > >> The issue currently is that the comparison of expressions fails -- in > > > >> our query, the first column selected shows up as a Var (with the same > > > >> contents) but is a different pointer in the em_expr and the reltarget > > > >> exprs list. I don't yet see a good way to get the equivalence class > > > >> for a PathTarget entry. > > > > > > > >Hmm, I think I was looking to do is the attached patch. I didn't > > > >realize until I did a lot of reading through source that we have an > > > >equal() function that can compare exprs. That (plus the realization in > > > >[1] the originally reported backtrace wasn't where the error actually > > > >came from) convinced me that what we need is to confirm not only that > > > >the all of the vars in the ec member come from the relids in the rel > > > >but also that the expr is actually represented in the target of the > > > >rel. > > > > > > > >With the gucs changed as I mentioned earlier both of the plans (with > > > >and without a volatile call in the 2nd select entry) now look like: > > > > > > > > Unique > > > > -> Gather Merge > > > > Workers Planned: 2 > > > > -> Sort > > > > Sort Key: ref_0.i, (md5(ref_0.t)) > > > > -> Nested Loop > > > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > > > -> Function Scan on generate_series ref_1 > > > > > > > >Without the gucs changed the minimal repro case now doesn't error, but > > > >results in this plan: > > > > > > > > HashAggregate > > > > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t > > > >ELSE ref_0.t END > > > > -> Nested Loop > > > > -> Seq Scan on ref_0 > > > > -> Function Scan on generate_series ref_1 > > > > > > > >Similarly in your six queries I now only see parallel query showing up > > > >in the last one. > > > > > > > > > > OK, that seems reasonable I think. > > > > If we proceed with this patch I'd like to tweak it to check for the > > relids subset first on the theory that it will be cheaper than the > > expr equality check loop; that change is attached. > > > > > >I created an entirely new function because adding the target expr > > > >lookup to the existing find_em_expr_for_rel() function broke a bunch > > > >of postgres_fdw tests. That maybe raises questions about whether that > > > >code also could have problems in theory/in the future, but I didn't > > > >investigate further. In any case we already know it excludes > > > >volatile...so maybe it's fine because in practice that's actually a > > > >broader exclusion than what we're doing here. > > > > > > > > > > I don't think postgres_fdw needs these new checks, because FDW scan's > > > are not allowed in parallel part of the plan - if that changes in the > > > future, I'm sure that'll require fixes in plenty other places. > > > > > > OTOH it's interesting that it triggers those failures - I wonder if we > > > could learn something from them? AFAICS those paths can't be built by > > > generate_useful_gather_paths (because of the postgres_fdw vs. parallel > > > query restrictions), so how do these plans look? > > > > Well the fdw code doesn't trigger the errors, but both > > generate_useful_gather_paths() and the fdw code originally relied on > > find_em_expr_for_rel(), but now they're diverging. IIRC the fdw code > > also uses it for determining what path keys are valid -- in that case > > which ones are safe to be sent to the foreign server (so all of the > > vars have to be from rels on the foreign server). The fdw code has > > additional checks too though, for example no volatile expressions may > > be pushed to the remote. > > > > > >This seems to fix the issue, but I'd like feedback on whether it's too > > > >strict. We could of course just check em_has_volatile, but I'm > > > >wondering if that's simultaneously too strict (by not allowing the > > > >volatile expression to be computed in the gather merge supposing > > > >there's no join) and too loose (maybe there are other cases we should > > > >care about?). It also just strikes me as re-encoding rules that should > > > >have already been applied (and thus we should be able to look up in > > > >the data we have if it's safe to use the expr/pathkey). Those are > > > >mostly intuitions though. > > > > > > > > > > I don't know :-( As I mentioned before, I suspect checking just the > > > volatility may not be enough in some cases (leakyness, etc.) and I think > > > you're right it may be too strict in other cases. > > > > > > Not sure I understand which rules you think we're re-encoding, but I > > > have a nagging feeling there's a piece of code somewhere earlier in > > > the query planner meant to prevent such cases (sort on path without all > > > the pathkeys), and that fixing it here is just a band aid :-( > > > > I'm getting at exactly what you're saying below: there's got to be > > some existing rules (and presumably code already enforcing it in some > > places) at what level in the path tree a given pathkey is > > safe/correct, and we don't want to reinvent those rules (or ideally > > that code). Verifying the expr is in the rel seems to me to be > > intuitively correct, but I wish there was another place in the code > > that could validate that. > > > > > I'm sure we're building plans with Sort on top of Index Scan paths, so > > > how come those don't fail? How come the non-parallel version of these > > > queries don't fail? > > > > Yeah, exactly. Something has to be doing something similar -- I don't > > think everywhere else using sort_pathkeys instead of query_pathkeys > > really changes the problem. I've read through all of the places we use > > generate_useful_gather_paths() as well as root->sort_pathkeys to try > > to get a better understanding of this. Most of the places I convinced > > myself it's already safe -- for example places like > > create_orderd_paths are working with the upper rel, and the full > > target list is available. A few other places I added comments (see the > > "questions" patch in the attached series) where I'm not sure why we > > apply projections *after* we create a sort path; given the issue we're > > discussing here I would have thought you'd want to do exactly the > > opposite. > > > > I haven't yet tried to read through the code in other places where we > > build an explicit sort to see if I can find any hints as to what > > existing code does to ensure this situation doesn't occur, but so far > > I haven't found anything else obvious. > > I did a lot more reading through code, and I concluded that generally > (maybe exclusively?) create_sort_path() is only called on upper rels > (e.g., from create_ordered_paths()), so we already have the full > target list available. I also stepped through in the debugger and > confirmed that non-var exprs generally don't seem to show up in the > pathtarget of paths either on base rels or on join rels. Finally, the > last piece of relevant data is that we don't actually push down sorts > on non-parallel paths even when it would be beneficial. For example, > this query: > > select i, md5(i::text) from t2, generate_series(1,2) order by i; > > generates this plan: > > Sort > Sort Key: t2.i > -> Nested Loop > -> Function Scan on generate_series > -> Materialize > -> Seq Scan on t2 > > but it seems obvious to me that it would be better to move the sort to > immediately above the seq scan. > > So I conclude from all of this that the real reason we didn't see this > problem before was that we didn't have anything that was trying to > generate useful sort paths lower in the query, so it just worked. To > state it from the other direction, I don't see any indication that > there actually is any code trying to determine what's safe to put in > the target at various join levels; from what I can tell only vars are > distributed at those points. Thinking about it some more, this > actually makes sense, since get_useful_pathkeys() is new code, and > something like it would have been needed already if we wanted to push > sorts down further into the plan. > > As a bit of a side note: combined with the above idea about pushing > sort down, it could also be beneficial to try to distribute > expressions lower too. > > So I think the previously attached patch is probably correct; the > second patch with the comment questions shows I still don't fully > understand how/when projections are applied (and when they're needed), > though it might have something to do with what I said about only > operating on upper rels. Just FYI in case the patch seems mergeable; I'm adding test cases now, and I think I might make some minor tweaks to the patch. I hope to have a new version out ~this evening. James
On Mon, Oct 5, 2020 at 11:33 AM James Coleman <jtc331@gmail.com> wrote: > > On Sun, Oct 4, 2020 at 9:40 PM James Coleman <jtc331@gmail.com> wrote: > > > > On Sat, Oct 3, 2020 at 10:10 PM James Coleman <jtc331@gmail.com> wrote: > > > > > > On Sat, Oct 3, 2020 at 5:44 PM Tomas Vondra > > > <tomas.vondra@2ndquadrant.com> wrote: > > > > > > > > On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote: > > > > >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: > > > > >> > > > > >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > > > > >> > > > > > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > > > > >> > <tomas.vondra@2ndquadrant.com> wrote: > > > > >> > > > > > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > > > >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > > > >> > > ><tomas.vondra@2ndquadrant.com> wrote: > > > > >> > > >> > > > > >> > > >> ... > > > > >> > > >> > > > > >> > > >> More importanly, it does not actually fix the issue - it does fix that > > > > >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > > > >> > > >> or GROUP BY make it fail again :-( > > > > >> > > >> > > > > >> > > >> Attached is a simple script I used, demonstrating these issues for the > > > > >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > > > >> > > >> before TargetEntry in plannodes.h). > > > > >> > > > > > > > >> > > >So does checking for volatile expressions (if you happened to test > > > > >> > > >that) solve all the cases? If you haven't tested that yet, I can try > > > > >> > > >to do that this evening. > > > > >> > > > > > > > >> > > > > > > >> > > Yes, it does fix all the three queries in the SQL script. > > > > >> > > > > > > >> > > The question however is whether this is the root issue, and whether it's > > > > >> > > the right way to fix it. For example - volatility is not the only reason > > > > >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > > > >> > > also blocks pushdown of leaky functions in security_barrier views. So I > > > > >> > > wonder if that could cause failures too, somehow. But I haven't managed > > > > >> > > to create such example. > > > > >> > > > > > >> > I was about to say that the issue here is slightly different from qual > > > > >> > etc. pushdown, since we're not concerned about quals here, and so I > > > > >> > wonder where we determine what target list entries to put in a given > > > > >> > scan path, but then I realized that implies (maybe!) a simpler > > > > >> > solution. Instead of duplicating checks on target list entries would > > > > >> > be safe, why not check directly in get_useful_pathkeys_for_relation() > > > > >> > whether or not the pathkey has a target list entry? > > > > >> > > > > > >> > I haven't been able to try that out yet, and so maybe I'm missing > > > > >> > something, but I need to step out for a bit, so I'll have to look at > > > > >> > it later. > > > > >> > > > > >> I've started poking at this, but haven't yet found a workable > > > > >> solution. See the attached patch which "solves" the problem by > > > > >> breaking putting the sort under the gather merge, but it at least > > > > >> demonstrates conceptually what I think we're interested in doing. > > > > >> > > > > >> The issue currently is that the comparison of expressions fails -- in > > > > >> our query, the first column selected shows up as a Var (with the same > > > > >> contents) but is a different pointer in the em_expr and the reltarget > > > > >> exprs list. I don't yet see a good way to get the equivalence class > > > > >> for a PathTarget entry. > > > > > > > > > >Hmm, I think I was looking to do is the attached patch. I didn't > > > > >realize until I did a lot of reading through source that we have an > > > > >equal() function that can compare exprs. That (plus the realization in > > > > >[1] the originally reported backtrace wasn't where the error actually > > > > >came from) convinced me that what we need is to confirm not only that > > > > >the all of the vars in the ec member come from the relids in the rel > > > > >but also that the expr is actually represented in the target of the > > > > >rel. > > > > > > > > > >With the gucs changed as I mentioned earlier both of the plans (with > > > > >and without a volatile call in the 2nd select entry) now look like: > > > > > > > > > > Unique > > > > > -> Gather Merge > > > > > Workers Planned: 2 > > > > > -> Sort > > > > > Sort Key: ref_0.i, (md5(ref_0.t)) > > > > > -> Nested Loop > > > > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > > > > -> Function Scan on generate_series ref_1 > > > > > > > > > >Without the gucs changed the minimal repro case now doesn't error, but > > > > >results in this plan: > > > > > > > > > > HashAggregate > > > > > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t > > > > >ELSE ref_0.t END > > > > > -> Nested Loop > > > > > -> Seq Scan on ref_0 > > > > > -> Function Scan on generate_series ref_1 > > > > > > > > > >Similarly in your six queries I now only see parallel query showing up > > > > >in the last one. > > > > > > > > > > > > > OK, that seems reasonable I think. > > > > > > If we proceed with this patch I'd like to tweak it to check for the > > > relids subset first on the theory that it will be cheaper than the > > > expr equality check loop; that change is attached. > > > > > > > >I created an entirely new function because adding the target expr > > > > >lookup to the existing find_em_expr_for_rel() function broke a bunch > > > > >of postgres_fdw tests. That maybe raises questions about whether that > > > > >code also could have problems in theory/in the future, but I didn't > > > > >investigate further. In any case we already know it excludes > > > > >volatile...so maybe it's fine because in practice that's actually a > > > > >broader exclusion than what we're doing here. > > > > > > > > > > > > > I don't think postgres_fdw needs these new checks, because FDW scan's > > > > are not allowed in parallel part of the plan - if that changes in the > > > > future, I'm sure that'll require fixes in plenty other places. > > > > > > > > OTOH it's interesting that it triggers those failures - I wonder if we > > > > could learn something from them? AFAICS those paths can't be built by > > > > generate_useful_gather_paths (because of the postgres_fdw vs. parallel > > > > query restrictions), so how do these plans look? > > > > > > Well the fdw code doesn't trigger the errors, but both > > > generate_useful_gather_paths() and the fdw code originally relied on > > > find_em_expr_for_rel(), but now they're diverging. IIRC the fdw code > > > also uses it for determining what path keys are valid -- in that case > > > which ones are safe to be sent to the foreign server (so all of the > > > vars have to be from rels on the foreign server). The fdw code has > > > additional checks too though, for example no volatile expressions may > > > be pushed to the remote. > > > > > > > >This seems to fix the issue, but I'd like feedback on whether it's too > > > > >strict. We could of course just check em_has_volatile, but I'm > > > > >wondering if that's simultaneously too strict (by not allowing the > > > > >volatile expression to be computed in the gather merge supposing > > > > >there's no join) and too loose (maybe there are other cases we should > > > > >care about?). It also just strikes me as re-encoding rules that should > > > > >have already been applied (and thus we should be able to look up in > > > > >the data we have if it's safe to use the expr/pathkey). Those are > > > > >mostly intuitions though. > > > > > > > > > > > > > I don't know :-( As I mentioned before, I suspect checking just the > > > > volatility may not be enough in some cases (leakyness, etc.) and I think > > > > you're right it may be too strict in other cases. > > > > > > > > Not sure I understand which rules you think we're re-encoding, but I > > > > have a nagging feeling there's a piece of code somewhere earlier in > > > > the query planner meant to prevent such cases (sort on path without all > > > > the pathkeys), and that fixing it here is just a band aid :-( > > > > > > I'm getting at exactly what you're saying below: there's got to be > > > some existing rules (and presumably code already enforcing it in some > > > places) at what level in the path tree a given pathkey is > > > safe/correct, and we don't want to reinvent those rules (or ideally > > > that code). Verifying the expr is in the rel seems to me to be > > > intuitively correct, but I wish there was another place in the code > > > that could validate that. > > > > > > > I'm sure we're building plans with Sort on top of Index Scan paths, so > > > > how come those don't fail? How come the non-parallel version of these > > > > queries don't fail? > > > > > > Yeah, exactly. Something has to be doing something similar -- I don't > > > think everywhere else using sort_pathkeys instead of query_pathkeys > > > really changes the problem. I've read through all of the places we use > > > generate_useful_gather_paths() as well as root->sort_pathkeys to try > > > to get a better understanding of this. Most of the places I convinced > > > myself it's already safe -- for example places like > > > create_orderd_paths are working with the upper rel, and the full > > > target list is available. A few other places I added comments (see the > > > "questions" patch in the attached series) where I'm not sure why we > > > apply projections *after* we create a sort path; given the issue we're > > > discussing here I would have thought you'd want to do exactly the > > > opposite. > > > > > > I haven't yet tried to read through the code in other places where we > > > build an explicit sort to see if I can find any hints as to what > > > existing code does to ensure this situation doesn't occur, but so far > > > I haven't found anything else obvious. > > > > I did a lot more reading through code, and I concluded that generally > > (maybe exclusively?) create_sort_path() is only called on upper rels > > (e.g., from create_ordered_paths()), so we already have the full > > target list available. I also stepped through in the debugger and > > confirmed that non-var exprs generally don't seem to show up in the > > pathtarget of paths either on base rels or on join rels. Finally, the > > last piece of relevant data is that we don't actually push down sorts > > on non-parallel paths even when it would be beneficial. For example, > > this query: > > > > select i, md5(i::text) from t2, generate_series(1,2) order by i; > > > > generates this plan: > > > > Sort > > Sort Key: t2.i > > -> Nested Loop > > -> Function Scan on generate_series > > -> Materialize > > -> Seq Scan on t2 > > > > but it seems obvious to me that it would be better to move the sort to > > immediately above the seq scan. > > > > So I conclude from all of this that the real reason we didn't see this > > problem before was that we didn't have anything that was trying to > > generate useful sort paths lower in the query, so it just worked. To > > state it from the other direction, I don't see any indication that > > there actually is any code trying to determine what's safe to put in > > the target at various join levels; from what I can tell only vars are > > distributed at those points. Thinking about it some more, this > > actually makes sense, since get_useful_pathkeys() is new code, and > > something like it would have been needed already if we wanted to push > > sorts down further into the plan. > > > > As a bit of a side note: combined with the above idea about pushing > > sort down, it could also be beneficial to try to distribute > > expressions lower too. > > > > So I think the previously attached patch is probably correct; the > > second patch with the comment questions shows I still don't fully > > understand how/when projections are applied (and when they're needed), > > though it might have something to do with what I said about only > > operating on upper rels. > > Just FYI in case the patch seems mergeable; I'm adding test cases now, > and I think I might make some minor tweaks to the patch. > > I hope to have a new version out ~this evening. All right, here's a modified patch. I did a bit of surgery: the concept is the same, but I decided to explicitly not the parallels to (and follow as possible) prepare_sort_from_pathkeys(). That means we only have to do the expression equality check if the expr is volatile, and the relids bms check doesn't have to bail if it's empty. This properly allows us to push the sort all the way down to the base rel when the only things in the base rel are referenced (including stable expressions) but pushes the sort up to the upper rel when a volatile expression is used (technically it would be safe for it to go lower, but in the current planner that's the only time it appears in the target list, so allowing that would be a larger functional change). I left the questions patch here, though obviously it doesn't need to be committed. If someone has answers to the questions, I'd be interested though, but I don't think it affects the correctness of this patch. James
Attachment
On Mon, Oct 5, 2020 at 10:38 PM James Coleman <jtc331@gmail.com> wrote: > > On Mon, Oct 5, 2020 at 11:33 AM James Coleman <jtc331@gmail.com> wrote: > > > > On Sun, Oct 4, 2020 at 9:40 PM James Coleman <jtc331@gmail.com> wrote: > > > > > > On Sat, Oct 3, 2020 at 10:10 PM James Coleman <jtc331@gmail.com> wrote: > > > > > > > > On Sat, Oct 3, 2020 at 5:44 PM Tomas Vondra > > > > <tomas.vondra@2ndquadrant.com> wrote: > > > > > > > > > > On Sat, Oct 03, 2020 at 10:50:06AM -0400, James Coleman wrote: > > > > > >On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc331@gmail.com> wrote: > > > > > >> > > > > > >> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc331@gmail.com> wrote: > > > > > >> > > > > > > >> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > > > > > >> > <tomas.vondra@2ndquadrant.com> wrote: > > > > > >> > > > > > > > >> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > > > > >> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > > > > >> > > ><tomas.vondra@2ndquadrant.com> wrote: > > > > > >> > > >> > > > > > >> > > >> ... > > > > > >> > > >> > > > > > >> > > >> More importanly, it does not actually fix the issue - it does fix that > > > > > >> > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > > > > >> > > >> or GROUP BY make it fail again :-( > > > > > >> > > >> > > > > > >> > > >> Attached is a simple script I used, demonstrating these issues for the > > > > > >> > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > > > > >> > > >> before TargetEntry in plannodes.h). > > > > > >> > > > > > > > > >> > > >So does checking for volatile expressions (if you happened to test > > > > > >> > > >that) solve all the cases? If you haven't tested that yet, I can try > > > > > >> > > >to do that this evening. > > > > > >> > > > > > > > > >> > > > > > > > >> > > Yes, it does fix all the three queries in the SQL script. > > > > > >> > > > > > > > >> > > The question however is whether this is the root issue, and whether it's > > > > > >> > > the right way to fix it. For example - volatility is not the only reason > > > > > >> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > > > > >> > > also blocks pushdown of leaky functions in security_barrier views. So I > > > > > >> > > wonder if that could cause failures too, somehow. But I haven't managed > > > > > >> > > to create such example. > > > > > >> > > > > > > >> > I was about to say that the issue here is slightly different from qual > > > > > >> > etc. pushdown, since we're not concerned about quals here, and so I > > > > > >> > wonder where we determine what target list entries to put in a given > > > > > >> > scan path, but then I realized that implies (maybe!) a simpler > > > > > >> > solution. Instead of duplicating checks on target list entries would > > > > > >> > be safe, why not check directly in get_useful_pathkeys_for_relation() > > > > > >> > whether or not the pathkey has a target list entry? > > > > > >> > > > > > > >> > I haven't been able to try that out yet, and so maybe I'm missing > > > > > >> > something, but I need to step out for a bit, so I'll have to look at > > > > > >> > it later. > > > > > >> > > > > > >> I've started poking at this, but haven't yet found a workable > > > > > >> solution. See the attached patch which "solves" the problem by > > > > > >> breaking putting the sort under the gather merge, but it at least > > > > > >> demonstrates conceptually what I think we're interested in doing. > > > > > >> > > > > > >> The issue currently is that the comparison of expressions fails -- in > > > > > >> our query, the first column selected shows up as a Var (with the same > > > > > >> contents) but is a different pointer in the em_expr and the reltarget > > > > > >> exprs list. I don't yet see a good way to get the equivalence class > > > > > >> for a PathTarget entry. > > > > > > > > > > > >Hmm, I think I was looking to do is the attached patch. I didn't > > > > > >realize until I did a lot of reading through source that we have an > > > > > >equal() function that can compare exprs. That (plus the realization in > > > > > >[1] the originally reported backtrace wasn't where the error actually > > > > > >came from) convinced me that what we need is to confirm not only that > > > > > >the all of the vars in the ec member come from the relids in the rel > > > > > >but also that the expr is actually represented in the target of the > > > > > >rel. > > > > > > > > > > > >With the gucs changed as I mentioned earlier both of the plans (with > > > > > >and without a volatile call in the 2nd select entry) now look like: > > > > > > > > > > > > Unique > > > > > > -> Gather Merge > > > > > > Workers Planned: 2 > > > > > > -> Sort > > > > > > Sort Key: ref_0.i, (md5(ref_0.t)) > > > > > > -> Nested Loop > > > > > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > > > > > -> Function Scan on generate_series ref_1 > > > > > > > > > > > >Without the gucs changed the minimal repro case now doesn't error, but > > > > > >results in this plan: > > > > > > > > > > > > HashAggregate > > > > > > Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t > > > > > >ELSE ref_0.t END > > > > > > -> Nested Loop > > > > > > -> Seq Scan on ref_0 > > > > > > -> Function Scan on generate_series ref_1 > > > > > > > > > > > >Similarly in your six queries I now only see parallel query showing up > > > > > >in the last one. > > > > > > > > > > > > > > > > OK, that seems reasonable I think. > > > > > > > > If we proceed with this patch I'd like to tweak it to check for the > > > > relids subset first on the theory that it will be cheaper than the > > > > expr equality check loop; that change is attached. > > > > > > > > > >I created an entirely new function because adding the target expr > > > > > >lookup to the existing find_em_expr_for_rel() function broke a bunch > > > > > >of postgres_fdw tests. That maybe raises questions about whether that > > > > > >code also could have problems in theory/in the future, but I didn't > > > > > >investigate further. In any case we already know it excludes > > > > > >volatile...so maybe it's fine because in practice that's actually a > > > > > >broader exclusion than what we're doing here. > > > > > > > > > > > > > > > > I don't think postgres_fdw needs these new checks, because FDW scan's > > > > > are not allowed in parallel part of the plan - if that changes in the > > > > > future, I'm sure that'll require fixes in plenty other places. > > > > > > > > > > OTOH it's interesting that it triggers those failures - I wonder if we > > > > > could learn something from them? AFAICS those paths can't be built by > > > > > generate_useful_gather_paths (because of the postgres_fdw vs. parallel > > > > > query restrictions), so how do these plans look? > > > > > > > > Well the fdw code doesn't trigger the errors, but both > > > > generate_useful_gather_paths() and the fdw code originally relied on > > > > find_em_expr_for_rel(), but now they're diverging. IIRC the fdw code > > > > also uses it for determining what path keys are valid -- in that case > > > > which ones are safe to be sent to the foreign server (so all of the > > > > vars have to be from rels on the foreign server). The fdw code has > > > > additional checks too though, for example no volatile expressions may > > > > be pushed to the remote. > > > > > > > > > >This seems to fix the issue, but I'd like feedback on whether it's too > > > > > >strict. We could of course just check em_has_volatile, but I'm > > > > > >wondering if that's simultaneously too strict (by not allowing the > > > > > >volatile expression to be computed in the gather merge supposing > > > > > >there's no join) and too loose (maybe there are other cases we should > > > > > >care about?). It also just strikes me as re-encoding rules that should > > > > > >have already been applied (and thus we should be able to look up in > > > > > >the data we have if it's safe to use the expr/pathkey). Those are > > > > > >mostly intuitions though. > > > > > > > > > > > > > > > > I don't know :-( As I mentioned before, I suspect checking just the > > > > > volatility may not be enough in some cases (leakyness, etc.) and I think > > > > > you're right it may be too strict in other cases. > > > > > > > > > > Not sure I understand which rules you think we're re-encoding, but I > > > > > have a nagging feeling there's a piece of code somewhere earlier in > > > > > the query planner meant to prevent such cases (sort on path without all > > > > > the pathkeys), and that fixing it here is just a band aid :-( > > > > > > > > I'm getting at exactly what you're saying below: there's got to be > > > > some existing rules (and presumably code already enforcing it in some > > > > places) at what level in the path tree a given pathkey is > > > > safe/correct, and we don't want to reinvent those rules (or ideally > > > > that code). Verifying the expr is in the rel seems to me to be > > > > intuitively correct, but I wish there was another place in the code > > > > that could validate that. > > > > > > > > > I'm sure we're building plans with Sort on top of Index Scan paths, so > > > > > how come those don't fail? How come the non-parallel version of these > > > > > queries don't fail? > > > > > > > > Yeah, exactly. Something has to be doing something similar -- I don't > > > > think everywhere else using sort_pathkeys instead of query_pathkeys > > > > really changes the problem. I've read through all of the places we use > > > > generate_useful_gather_paths() as well as root->sort_pathkeys to try > > > > to get a better understanding of this. Most of the places I convinced > > > > myself it's already safe -- for example places like > > > > create_orderd_paths are working with the upper rel, and the full > > > > target list is available. A few other places I added comments (see the > > > > "questions" patch in the attached series) where I'm not sure why we > > > > apply projections *after* we create a sort path; given the issue we're > > > > discussing here I would have thought you'd want to do exactly the > > > > opposite. > > > > > > > > I haven't yet tried to read through the code in other places where we > > > > build an explicit sort to see if I can find any hints as to what > > > > existing code does to ensure this situation doesn't occur, but so far > > > > I haven't found anything else obvious. > > > > > > I did a lot more reading through code, and I concluded that generally > > > (maybe exclusively?) create_sort_path() is only called on upper rels > > > (e.g., from create_ordered_paths()), so we already have the full > > > target list available. I also stepped through in the debugger and > > > confirmed that non-var exprs generally don't seem to show up in the > > > pathtarget of paths either on base rels or on join rels. Finally, the > > > last piece of relevant data is that we don't actually push down sorts > > > on non-parallel paths even when it would be beneficial. For example, > > > this query: > > > > > > select i, md5(i::text) from t2, generate_series(1,2) order by i; > > > > > > generates this plan: > > > > > > Sort > > > Sort Key: t2.i > > > -> Nested Loop > > > -> Function Scan on generate_series > > > -> Materialize > > > -> Seq Scan on t2 > > > > > > but it seems obvious to me that it would be better to move the sort to > > > immediately above the seq scan. > > > > > > So I conclude from all of this that the real reason we didn't see this > > > problem before was that we didn't have anything that was trying to > > > generate useful sort paths lower in the query, so it just worked. To > > > state it from the other direction, I don't see any indication that > > > there actually is any code trying to determine what's safe to put in > > > the target at various join levels; from what I can tell only vars are > > > distributed at those points. Thinking about it some more, this > > > actually makes sense, since get_useful_pathkeys() is new code, and > > > something like it would have been needed already if we wanted to push > > > sorts down further into the plan. > > > > > > As a bit of a side note: combined with the above idea about pushing > > > sort down, it could also be beneficial to try to distribute > > > expressions lower too. > > > > > > So I think the previously attached patch is probably correct; the > > > second patch with the comment questions shows I still don't fully > > > understand how/when projections are applied (and when they're needed), > > > though it might have something to do with what I said about only > > > operating on upper rels. > > > > Just FYI in case the patch seems mergeable; I'm adding test cases now, > > and I think I might make some minor tweaks to the patch. > > > > I hope to have a new version out ~this evening. > > All right, here's a modified patch. I did a bit of surgery: the > concept is the same, but I decided to explicitly not the parallels to > (and follow as possible) prepare_sort_from_pathkeys(). That means we > only have to do the expression equality check if the expr is volatile, > and the relids bms check doesn't have to bail if it's empty. > > This properly allows us to push the sort all the way down to the base > rel when the only things in the base rel are referenced (including > stable expressions) but pushes the sort up to the upper rel when a > volatile expression is used (technically it would be safe for it to go > lower, but in the current planner that's the only time it appears in > the target list, so allowing that would be a larger functional > change). > > I left the questions patch here, though obviously it doesn't need to > be committed. If someone has answers to the questions, I'd be > interested though, but I don't think it affects the correctness of > this patch. And with a typo fixed. James
Attachment
On Tue, 6 Oct 2020 at 06:46, James Coleman <jtc331@gmail.com> wrote: > > > All right, here's a modified patch. I did a bit of surgery: the > > concept is the same, but I decided to explicitly not the parallels to > > (and follow as possible) prepare_sort_from_pathkeys(). That means we > > only have to do the expression equality check if the expr is volatile, > > and the relids bms check doesn't have to bail if it's empty. > > > > This properly allows us to push the sort all the way down to the base > > rel when the only things in the base rel are referenced (including > > stable expressions) but pushes the sort up to the upper rel when a > > volatile expression is used (technically it would be safe for it to go > > lower, but in the current planner that's the only time it appears in > > the target list, so allowing that would be a larger functional > > change). > > > > I left the questions patch here, though obviously it doesn't need to > > be committed. If someone has answers to the questions, I'd be > > interested though, but I don't think it affects the correctness of > > this patch. > > And with a typo fixed. > Hi James, Thanks for working on this, I haven't tried the latest patch but the previous single patch worked fine in the cases you and Tomas found and with my original example. I haven't been able to reproduce the same problem with other queries (I tried RLS and views with barriers, i also wanted to test with LEAK functions but didn't have the time yet). Can you please create an entry in the commitfest for this one so we don't lose track of it? -- Jaime Casanova www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Oct 6, 2020 at 9:28 AM Jaime Casanova <jaime.casanova@2ndquadrant.com> wrote: > Can you please create an entry in the commitfest for this one so we > don't lose track of it? Done.
On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331@gmail.com> wrote: > The plan (obtained by replacing the volatile function with a stable one): > > Unique > -> Nested Loop > -> Gather Merge > Workers Planned: 2 > -> Sort > Sort Key: ref_0.i, (md5(ref_0.t)) > -> Parallel Index Scan using ref_0_i_idx on ref_0 > -> Function Scan on generate_series ref_1 > > Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. As far as I remember this stuff, target lists for any nodes below the top of the join tree were previously always just Var nodes. Any expressions that needed to be computed were computed at the level of the topmost join, before adding upper planner steps like Agg, WindowAgg, Limit, etc. But, here you've got a complex expression -- md5(ref_0.t) -- begin computed somewhere in the middle of the plan tree so that the sort can be performed at that point. However, that seems likely to break a bunch of assumptions all over the planner, because, unless a lot of work and surgery has been done since I looked at this last, that md5(ref_0.t) value is not going to "bubble up" to higher levels of the plan through the tlists of the individual nodes. That poses a risk that it will be recomputed multiple times, which is particularly bad for volatile functions because you might not always get the same answer, but it seems like it might be bad even for non-volatile functions, because it could be slow if the value is used at multiple places in the query. It feels like the sort of thing that might have broken some other assumptions, too, although I can't say exactly what. I wonder if whatever part of the incremental sort patch allowed computation of tlist expressions below the top level of the join tree ought to just be reverted outright, but that might be overkill. I can't say that I really understand exactly what's going on here, but it seems like a pretty significant behavior change to make as part of another project, and I wonder if we're going to keep finding other problems with it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 7, 2020 at 3:52 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331@gmail.com> wrote: > > The plan (obtained by replacing the volatile function with a stable one): > > > > Unique > > -> Nested Loop > > -> Gather Merge > > Workers Planned: 2 > > -> Sort > > Sort Key: ref_0.i, (md5(ref_0.t)) > > -> Parallel Index Scan using ref_0_i_idx on ref_0 > > -> Function Scan on generate_series ref_1 > > > > Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. > > As far as I remember this stuff, target lists for any nodes below the > top of the join tree were previously always just Var nodes. Any > expressions that needed to be computed were computed at the level of > the topmost join, before adding upper planner steps like Agg, > WindowAgg, Limit, etc. But, here you've got a complex expression -- > md5(ref_0.t) -- begin computed somewhere in the middle of the plan > tree so that the sort can be performed at that point. However, that > seems likely to break a bunch of assumptions all over the planner, > because, unless a lot of work and surgery has been done since I looked > at this last, that md5(ref_0.t) value is not going to "bubble up" to > higher levels of the plan through the tlists of the individual nodes. > That poses a risk that it will be recomputed multiple times, which is > particularly bad for volatile functions because you might not always > get the same answer, but it seems like it might be bad even for > non-volatile functions, because it could be slow if the value is used > at multiple places in the query. It feels like the sort of thing that > might have broken some other assumptions, too, although I can't say > exactly what. The distinction between volatile and stable expressions here has been discussed at length in this thread; I think it'd be worth reading through the rest of the thread before offering sledgehammer ideas below like reverting the entire piece. In particular, most of the discussion in the thread has been about what is actually safe to reference at lower levels in the plan. It would be extremely easy tomake this only allow sorting by Vars at lower levels in the plan, though the current revision of the patch follows what prepare_sort_from_pathkeys allows and disallows. Obviously it'd be nice to be able to sort by non-Var expressions at lower levels too. But again, it'd be easy to change that if it's not safe, so it'd be helpful if you could point to areas that don't bubble non-Vars up so that we can actually comment about that in the source and discuss specifics. > I wonder if whatever part of the incremental sort patch allowed > computation of tlist expressions below the top level of the join tree > ought to just be reverted outright, but that might be overkill. I > can't say that I really understand exactly what's going on here, but > it seems like a pretty significant behavior change to make as part of > another project, and I wonder if we're going to keep finding other > problems with it. James
On Wed, Oct 07, 2020 at 04:01:27PM -0400, James Coleman wrote: >On Wed, Oct 7, 2020 at 3:52 PM Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331@gmail.com> wrote: >> > The plan (obtained by replacing the volatile function with a stable one): >> > >> > Unique >> > -> Nested Loop >> > -> Gather Merge >> > Workers Planned: 2 >> > -> Sort >> > Sort Key: ref_0.i, (md5(ref_0.t)) >> > -> Parallel Index Scan using ref_0_i_idx on ref_0 >> > -> Function Scan on generate_series ref_1 >> > >> > Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. >> >> As far as I remember this stuff, target lists for any nodes below the >> top of the join tree were previously always just Var nodes. Any >> expressions that needed to be computed were computed at the level of >> the topmost join, before adding upper planner steps like Agg, >> WindowAgg, Limit, etc. But, here you've got a complex expression -- >> md5(ref_0.t) -- begin computed somewhere in the middle of the plan >> tree so that the sort can be performed at that point. However, that >> seems likely to break a bunch of assumptions all over the planner, >> because, unless a lot of work and surgery has been done since I looked >> at this last, that md5(ref_0.t) value is not going to "bubble up" to >> higher levels of the plan through the tlists of the individual nodes. >> That poses a risk that it will be recomputed multiple times, which is >> particularly bad for volatile functions because you might not always >> get the same answer, but it seems like it might be bad even for >> non-volatile functions, because it could be slow if the value is used >> at multiple places in the query. It feels like the sort of thing that >> might have broken some other assumptions, too, although I can't say >> exactly what. > >The distinction between volatile and stable expressions here has been >discussed at length in this thread; I think it'd be worth reading >through the rest of the thread before offering sledgehammer ideas >below like reverting the entire piece. > Um, that's a tad too harsh, I guess ... We've discussed the volatile/stable expressions before, but AFAIK we failed to notice the planner assumes such expressions are expected to be computed only in the upper parts of the plan. If this assumption really exists, it probably relies on simply calling create_sort_path just about right, and the new code looking at useful pathkeys below gather merge might be breaking that simply by doing it too early. In that case we can either rip this code out, effectively reverting most of ba3e76cc571, or consider only pathkeys that we can find in the tlist of the child path. I'm still not entirely sure I understand what's happening, or what the exact rule is. Consider this query: explain (verbose) select distinct i, t, md5(t) from ref_0; which on PG12 (i.e. before incremental sort) is planned like this: QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=78120.92..83120.92 rows=500000 width=65) Output: i, t, (md5(t)) -> Sort (cost=78120.92..79370.92 rows=500000 width=65) Output: i, t, (md5(t)) Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t)) -> Seq Scan on public.ref_0 (cost=0.00..10282.00 rows=500000 width=65) Output: i, t, md5(t) (7 rows) i.e. the (stable) function is pushed all the way to the scan node. And even if we replace it with a volatile expression it gets pushed down: explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=83120.92..88120.92 rows=500000 width=65) Output: i, t, (md5(((random())::text || t))) -> Sort (cost=83120.92..84370.92 rows=500000 width=65) Output: i, t, (md5(((random())::text || t))) Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t))) -> Seq Scan on public.ref_0 (cost=0.00..15282.00 rows=500000 width=65) Output: i, t, md5(((random())::text || t)) (7 rows) But perhaps I just don't understand the assumption correctly? >In particular, most of the discussion in the thread has been about >what is actually safe to reference at lower levels in the plan. It >would be extremely easy tomake this only allow sorting by Vars at >lower levels in the plan, though the current revision of the patch >follows what prepare_sort_from_pathkeys allows and disallows. > >Obviously it'd be nice to be able to sort by non-Var expressions at >lower levels too. But again, it'd be easy to change that if it's not >safe, so it'd be helpful if you could point to areas that don't bubble >non-Vars up so that we can actually comment about that in the source >and discuss specifics. > I think the proble here might be "what we think is theoretically safe" vs. "what the planner expects". I wouldn't be surprised if this was safe in theoretical sense, but planner not expecting that. >> I wonder if whatever part of the incremental sort patch allowed >> computation of tlist expressions below the top level of the join tree >> ought to just be reverted outright, but that might be overkill. I >> can't say that I really understand exactly what's going on here, but >> it seems like a pretty significant behavior change to make as part of >> another project, and I wonder if we're going to keep finding other >> problems with it. > It certainly was not my intention to make such change in this patch. I don't think the earlier parts of the incremental sort have this issue because all those places were already constructing regular sort before. But this code is different because it considers adding sort node (both regular and incremental) below a gather merge, which was not happening before. I suspect this is the problem, somehow. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Oct 07, 2020 at 03:52:04PM -0400, Robert Haas wrote: >On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331@gmail.com> wrote: >> The plan (obtained by replacing the volatile function with a stable one): >> >> Unique >> -> Nested Loop >> -> Gather Merge >> Workers Planned: 2 >> -> Sort >> Sort Key: ref_0.i, (md5(ref_0.t)) >> -> Parallel Index Scan using ref_0_i_idx on ref_0 >> -> Function Scan on generate_series ref_1 >> >> Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. > >As far as I remember this stuff, target lists for any nodes below the >top of the join tree were previously always just Var nodes. Any >expressions that needed to be computed were computed at the level of >the topmost join, before adding upper planner steps like Agg, >WindowAgg, Limit, etc. But, here you've got a complex expression -- >md5(ref_0.t) -- begin computed somewhere in the middle of the plan >tree so that the sort can be performed at that point. However, that >seems likely to break a bunch of assumptions all over the planner, >because, unless a lot of work and surgery has been done since I looked >at this last, that md5(ref_0.t) value is not going to "bubble up" to >higher levels of the plan through the tlists of the individual nodes. > >That poses a risk that it will be recomputed multiple times, which is >particularly bad for volatile functions because you might not always >get the same answer, but it seems like it might be bad even for >non-volatile functions, because it could be slow if the value is used >at multiple places in the query. It feels like the sort of thing that >might have broken some other assumptions, too, although I can't say >exactly what. I agree this might be a clear correctness issue for volatile functions. Not sure about stable functions, though - it's easy to construct cases where pushing the function down results in much faster execution. Consider this: create table t (id int, val text); insert into t select 1, repeat('x', 1000) from generate_series(1,100) s(i); -- no pushdown select id, md5(t1.val) from t t1 join t t2 using (id) join t t3 using (id); Time: 5222.895 ms (00:05.223) -- manual pushdown with x as (select id, md5(t.val) from t) select id, t1.md5 from x t1 join x t2 using (id) join x t3 using (id); Time: 486.455 ms Of course, this is independent of what the planner assumes correctly, which is what's being discussed in this thread. And I'm sure there are plenty counter-examples too, so we'd need to cost this properly. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Oct 7, 2020 at 6:22 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Wed, Oct 07, 2020 at 04:01:27PM -0400, James Coleman wrote: > >On Wed, Oct 7, 2020 at 3:52 PM Robert Haas <robertmhaas@gmail.com> wrote: > >> > >> On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331@gmail.com> wrote: > >> > The plan (obtained by replacing the volatile function with a stable one): > >> > > >> > Unique > >> > -> Nested Loop > >> > -> Gather Merge > >> > Workers Planned: 2 > >> > -> Sort > >> > Sort Key: ref_0.i, (md5(ref_0.t)) > >> > -> Parallel Index Scan using ref_0_i_idx on ref_0 > >> > -> Function Scan on generate_series ref_1 > >> > > >> > Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error. > >> > >> As far as I remember this stuff, target lists for any nodes below the > >> top of the join tree were previously always just Var nodes. Any > >> expressions that needed to be computed were computed at the level of > >> the topmost join, before adding upper planner steps like Agg, > >> WindowAgg, Limit, etc. But, here you've got a complex expression -- > >> md5(ref_0.t) -- begin computed somewhere in the middle of the plan > >> tree so that the sort can be performed at that point. However, that > >> seems likely to break a bunch of assumptions all over the planner, > >> because, unless a lot of work and surgery has been done since I looked > >> at this last, that md5(ref_0.t) value is not going to "bubble up" to > >> higher levels of the plan through the tlists of the individual nodes. > >> That poses a risk that it will be recomputed multiple times, which is > >> particularly bad for volatile functions because you might not always > >> get the same answer, but it seems like it might be bad even for > >> non-volatile functions, because it could be slow if the value is used > >> at multiple places in the query. It feels like the sort of thing that > >> might have broken some other assumptions, too, although I can't say > >> exactly what. > > > >The distinction between volatile and stable expressions here has been > >discussed at length in this thread; I think it'd be worth reading > >through the rest of the thread before offering sledgehammer ideas > >below like reverting the entire piece. > > > > Um, that's a tad too harsh, I guess ... Yes, it was too harsh. So, my apologies, Robert. Nonetheless I do think that reverting would be a surprisingly strong reaction for this because I don't think the problem scope is particularly dire. Two reasons why: 1. As noted earlier it'd be pretty easy to limit this sort pushdown only to sorts on Vars. But I don't think that;s necessary, because: 2. From my time in the debugger on this issue, I don't see how the planner can be assuming these things won't bubble up. For one thing, your Tomas's example below shows the stable expression being computed at the scan node level and bubbling up through. For another thing, earlier in this thread [1] I have a patch that compares expressions in the path target to the expression in the pathkey, and even under a gather merge the stable expression (md5) is already in the path target at that level of the path. > We've discussed the volatile/stable expressions before, but AFAIK we > failed to notice the planner assumes such expressions are expected to be > computed only in the upper parts of the plan. If this assumption really > exists, it probably relies on simply calling create_sort_path just about > right, and the new code looking at useful pathkeys below gather merge > might be breaking that simply by doing it too early. I don't think such an assumption exists, as explained above, or at least I can't find an indication of it, and it's already violated in pre-incremental sort code. > In that case we can either rip this code out, effectively reverting most > of ba3e76cc571, or consider only pathkeys that we can find in the tlist > of the child path. This is exactly what the patch in [1] does. The current patch effectively does it also, but I believe it does so in a more refined way. > I'm still not entirely sure I understand what's happening, or what the > exact rule is. Consider this query: > > explain (verbose) select distinct i, t, md5(t) from ref_0; > > which on PG12 (i.e. before incremental sort) is planned like this: > > QUERY PLAN > ---------------------------------------------------------------------------------- > Unique (cost=78120.92..83120.92 rows=500000 width=65) > Output: i, t, (md5(t)) > -> Sort (cost=78120.92..79370.92 rows=500000 width=65) > Output: i, t, (md5(t)) > Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t)) > -> Seq Scan on public.ref_0 (cost=0.00..10282.00 rows=500000 width=65) > Output: i, t, md5(t) > (7 rows) > > i.e. the (stable) function is pushed all the way to the scan node. And > even if we replace it with a volatile expression it gets pushed down: > > explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0; > > QUERY PLAN > ---------------------------------------------------------------------------------- > Unique (cost=83120.92..88120.92 rows=500000 width=65) > Output: i, t, (md5(((random())::text || t))) > -> Sort (cost=83120.92..84370.92 rows=500000 width=65) > Output: i, t, (md5(((random())::text || t))) > Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t))) > -> Seq Scan on public.ref_0 (cost=0.00..15282.00 rows=500000 width=65) > Output: i, t, md5(((random())::text || t)) > (7 rows) > > > But perhaps I just don't understand the assumption correctly? > > >In particular, most of the discussion in the thread has been about > >what is actually safe to reference at lower levels in the plan. It > >would be extremely easy tomake this only allow sorting by Vars at > >lower levels in the plan, though the current revision of the patch > >follows what prepare_sort_from_pathkeys allows and disallows. > > > >Obviously it'd be nice to be able to sort by non-Var expressions at > >lower levels too. But again, it'd be easy to change that if it's not > >safe, so it'd be helpful if you could point to areas that don't bubble > >non-Vars up so that we can actually comment about that in the source > >and discuss specifics. > > > > I think the proble here might be "what we think is theoretically safe" > vs. "what the planner expects". I wouldn't be surprised if this was safe > in theoretical sense, but planner not expecting that. Agreed as a general rule, but as above I'm having a hard time matching up pre-this-code behavior with the claim that the planner isn't expecting it. > >> I wonder if whatever part of the incremental sort patch allowed > >> computation of tlist expressions below the top level of the join tree > >> ought to just be reverted outright, but that might be overkill. I > >> can't say that I really understand exactly what's going on here, but > >> it seems like a pretty significant behavior change to make as part of > >> another project, and I wonder if we're going to keep finding other > >> problems with it. > > > > It certainly was not my intention to make such change in this patch. I > don't think the earlier parts of the incremental sort have this issue > because all those places were already constructing regular sort before. > > But this code is different because it considers adding sort node (both > regular and incremental) below a gather merge, which was not happening > before. I suspect this is the problem, somehow. If my above understanding holds, then it's not a significant behavior change, but if it is, we have a simple workaround (looking only at pathkeys referencing a Var in the path target list). James 1: https://www.postgresql.org/message-id/CAAaqYe9Q5TL-d0Fcs1m9bes7YbfkrizmEVFF%2BLdTLE6ZEXCwhg%40mail.gmail.com
On Tue, Oct 06, 2020 at 09:37:31AM -0400, James Coleman wrote: >On Tue, Oct 6, 2020 at 9:28 AM Jaime Casanova ><jaime.casanova@2ndquadrant.com> wrote: >> Can you please create an entry in the commitfest for this one so we >> don't lose track of it? > We're not too far from the next minor release, so I've been looking at this fix again and I intend to get it committed shortly (on Monday or so). Attached is a slightly modified version of the v4 patch that: (a) Removes comments about projections added to code that is not directly related to the fix. I'm not against adding such comments separately. (b) Fixes comment in expected output of incremental_sort test. (c) Removes else branch from find_em_expr_usable_for_sorting_rel. It's not quite needed thanks to the "return" in the "if" branch. IMO this makes it more elegant. I do have two questions about find_em_expr_usable_for_sorting_rel. (a) Where does the em_is_const check come from? The comment claims we should not be trying to sort by equivalence class members, but I can't convince myself it's actually true and I don't see it discussed in this thread. (b) In find_em_expr_for_rel, which was what was used before, the condition on relids was this: if (bms_is_subset(em->em_relids, rel->relids) && !bms_is_empty(em->em_relids)) { return em->em_expr; } but here we're using only if (!bms_is_subset(em->em_relids, rel->relids)) continue; Isn't this missing the second bms_is_empty condition? Of course, an alternative to this fix would be reverting ba3e76cc571 (completely or just the part introducing generate_useful_gather_paths). If anyone thinks that's what we should do, please speak now. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Thu, Oct 29, 2020 at 6:06 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Tue, Oct 06, 2020 at 09:37:31AM -0400, James Coleman wrote: > >On Tue, Oct 6, 2020 at 9:28 AM Jaime Casanova > ><jaime.casanova@2ndquadrant.com> wrote: > >> Can you please create an entry in the commitfest for this one so we > >> don't lose track of it? > > > > We're not too far from the next minor release, so I've been looking at > this fix again and I intend to get it committed shortly (on Monday or > so). Attached is a slightly modified version of the v4 patch that: > > (a) Removes comments about projections added to code that is not > directly related to the fix. I'm not against adding such comments > separately. Seems fine. Do you want to commit them at the same time (just a separate commit)? Or have a separate patch? It seems a bit overkill to start a new thread just for those. > (b) Fixes comment in expected output of incremental_sort test. Thanks. > (c) Removes else branch from find_em_expr_usable_for_sorting_rel. It's > not quite needed thanks to the "return" in the "if" branch. IMO this > makes it more elegant. No objection. > I do have two questions about find_em_expr_usable_for_sorting_rel. > > (a) Where does the em_is_const check come from? The comment claims we > should not be trying to sort by equivalence class members, but I can't > convince myself it's actually true and I don't see it discussed in this > thread. That comes from find_ec_member_for_tle which is called by prepare_sort_from_pathkeys which we note in the comments contains the set of rules we need to mirror. > (b) In find_em_expr_for_rel, which was what was used before, the > condition on relids was this: > > if (bms_is_subset(em->em_relids, rel->relids) && > !bms_is_empty(em->em_relids)) > { > return em->em_expr; > } > > but here we're using only > > if (!bms_is_subset(em->em_relids, rel->relids)) > continue; > > Isn't this missing the second bms_is_empty condition? I definitely remember intentionally removing that condition. For one, it's not present in prepare_sort_from_pathkeys. My memory is a bit fuzzy on all of the whys, but isn't it the case that if we've already excluded const expressions, that we must have vars (and therefore rels) present, and therefore the relids bms must be non-empty? Probably more importantly, we're going to check that an actual em expr matches, which means the bms subset check is really just an optimization to avoid unnecessarily scanning the exprs for equality. Perhaps it's worth adding a comment saying as such? By the way, the fact that this is parallel to prepare_sort_from_pathkeys is the reason the XXX comment is still there about possibly needing to remove relabel types (since prepare_sort_from_pathkeys does that). I don't think it's a hard requirement: the worst case is that by not digging into relabel types we're being unnecessarily strict. Both of these I can add if desired. James
On Fri, Oct 30, 2020 at 01:26:08PM -0400, James Coleman wrote: >On Thu, Oct 29, 2020 at 6:06 PM Tomas Vondra ><tomas.vondra@2ndquadrant.com> wrote: >> >> On Tue, Oct 06, 2020 at 09:37:31AM -0400, James Coleman wrote: >> >On Tue, Oct 6, 2020 at 9:28 AM Jaime Casanova >> ><jaime.casanova@2ndquadrant.com> wrote: >> >> Can you please create an entry in the commitfest for this one so we >> >> don't lose track of it? >> > >> >> We're not too far from the next minor release, so I've been looking at >> this fix again and I intend to get it committed shortly (on Monday or >> so). Attached is a slightly modified version of the v4 patch that: >> >> (a) Removes comments about projections added to code that is not >> directly related to the fix. I'm not against adding such comments >> separately. > >Seems fine. Do you want to commit them at the same time (just a >separate commit)? Or have a separate patch? It seems a bit overkill to >start a new thread just for those. > Probably sometime later, as a separate patch. I haven't thought very much about those comments, it just seemed unrelated to the fix so I've removed that for now. Let's not bother with this until after the minor release. >> (b) Fixes comment in expected output of incremental_sort test. > >Thanks. > >> (c) Removes else branch from find_em_expr_usable_for_sorting_rel. It's >> not quite needed thanks to the "return" in the "if" branch. IMO this >> makes it more elegant. > >No objection. > >> I do have two questions about find_em_expr_usable_for_sorting_rel. >> >> (a) Where does the em_is_const check come from? The comment claims we >> should not be trying to sort by equivalence class members, but I can't >> convince myself it's actually true and I don't see it discussed in this >> thread. > >That comes from find_ec_member_for_tle which is called by >prepare_sort_from_pathkeys which we note in the comments contains the >set of rules we need to mirror. > Thanks for the pointer. I'll take a look. >> (b) In find_em_expr_for_rel, which was what was used before, the >> condition on relids was this: >> >> if (bms_is_subset(em->em_relids, rel->relids) && >> !bms_is_empty(em->em_relids)) >> { >> return em->em_expr; >> } >> >> but here we're using only >> >> if (!bms_is_subset(em->em_relids, rel->relids)) >> continue; >> >> Isn't this missing the second bms_is_empty condition? > >I definitely remember intentionally removing that condition. For one, >it's not present in prepare_sort_from_pathkeys. My memory is a bit >fuzzy on all of the whys, but isn't it the case that if we've already >excluded const expressions, that we must have vars (and therefore >rels) present, and therefore the relids bms must be non-empty? >Probably more importantly, we're going to check that an actual em expr >matches, which means the bms subset check is really just an >optimization to avoid unnecessarily scanning the exprs for equality. >Perhaps it's worth adding a comment saying as such? > >By the way, the fact that this is parallel to >prepare_sort_from_pathkeys is the reason the XXX comment is still >there about possibly needing to remove relabel types (since >prepare_sort_from_pathkeys does that). I don't think it's a hard >requirement: the worst case is that by not digging into relabel types >we're being unnecessarily strict. > >Both of these I can add if desired. > Yeah, it'd be good to explain the reasoning why it's fine to have the conditions like this. Thanks. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Oct 30, 2020 at 5:03 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Fri, Oct 30, 2020 at 01:26:08PM -0400, James Coleman wrote: > >On Thu, Oct 29, 2020 at 6:06 PM Tomas Vondra > ><tomas.vondra@2ndquadrant.com> wrote: > >> > >> On Tue, Oct 06, 2020 at 09:37:31AM -0400, James Coleman wrote: > >> >On Tue, Oct 6, 2020 at 9:28 AM Jaime Casanova > >> ><jaime.casanova@2ndquadrant.com> wrote: > >> >> Can you please create an entry in the commitfest for this one so we > >> >> don't lose track of it? > >> > > >> > >> We're not too far from the next minor release, so I've been looking at > >> this fix again and I intend to get it committed shortly (on Monday or > >> so). Attached is a slightly modified version of the v4 patch that: > >> > >> (a) Removes comments about projections added to code that is not > >> directly related to the fix. I'm not against adding such comments > >> separately. > > > >Seems fine. Do you want to commit them at the same time (just a > >separate commit)? Or have a separate patch? It seems a bit overkill to > >start a new thread just for those. > > > > Probably sometime later, as a separate patch. I haven't thought very > much about those comments, it just seemed unrelated to the fix so I've > removed that for now. Let's not bother with this until after the minor > release. For whatever it's worth, I didn't originally consider them unrelated; the purpose was to explain why those places were safe relative to the same kinds of issues under discussion here: whether an expr will be in the target and when it might need to be added. EIther way I've split it into a separate patch in this series. > >> (b) Fixes comment in expected output of incremental_sort test. > > > >Thanks. > > > >> (c) Removes else branch from find_em_expr_usable_for_sorting_rel. It's > >> not quite needed thanks to the "return" in the "if" branch. IMO this > >> makes it more elegant. > > > >No objection. > > > >> I do have two questions about find_em_expr_usable_for_sorting_rel. > >> > >> (a) Where does the em_is_const check come from? The comment claims we > >> should not be trying to sort by equivalence class members, but I can't > >> convince myself it's actually true and I don't see it discussed in this > >> thread. > > > >That comes from find_ec_member_for_tle which is called by > >prepare_sort_from_pathkeys which we note in the comments contains the > >set of rules we need to mirror. > > > > Thanks for the pointer. I'll take a look. > > >> (b) In find_em_expr_for_rel, which was what was used before, the > >> condition on relids was this: > >> > >> if (bms_is_subset(em->em_relids, rel->relids) && > >> !bms_is_empty(em->em_relids)) > >> { > >> return em->em_expr; > >> } > >> > >> but here we're using only > >> > >> if (!bms_is_subset(em->em_relids, rel->relids)) > >> continue; > >> > >> Isn't this missing the second bms_is_empty condition? > > > >I definitely remember intentionally removing that condition. For one, > >it's not present in prepare_sort_from_pathkeys. My memory is a bit > >fuzzy on all of the whys, but isn't it the case that if we've already > >excluded const expressions, that we must have vars (and therefore > >rels) present, and therefore the relids bms must be non-empty? > >Probably more importantly, we're going to check that an actual em expr > >matches, which means the bms subset check is really just an > >optimization to avoid unnecessarily scanning the exprs for equality. > >Perhaps it's worth adding a comment saying as such? > > > >By the way, the fact that this is parallel to > >prepare_sort_from_pathkeys is the reason the XXX comment is still > >there about possibly needing to remove relabel types (since > >prepare_sort_from_pathkeys does that). I don't think it's a hard > >requirement: the worst case is that by not digging into relabel types > >we're being unnecessarily strict. > > > >Both of these I can add if desired. > > > > Yeah, it'd be good to explain the reasoning why it's fine to have the > conditions like this. Thanks. I was looking at this some more, and I'm still convinced that this is correct, but I don't think a comment about it being an optimization (though I suspect that that its major benefit), since it came from, and must parallel, find_ec_member_for_tle, and there is no such explanation there. I don't want to backfill reasoning onto that decision, but I do think it's important to parallel it since it's ultimately what is going to cause errors if we're out of sync with it. As to why find_em_expr_for_rel is different, I think the answer is that it's used by the FDW code, and it seems to me to make sense that in that case we'd only send sort conditions to the foreign server if the em actually contains rels. The new series attached includes the primary fix for this issue, the additional comments that could either go in at the same time or not, and my patch with semi-related questions (which isn't intended to be committed, but to keep track of the questions). James
Attachment
On Fri, Oct 30, 2020 at 07:37:33PM -0400, James Coleman wrote: > > ... > >I was looking at this some more, and I'm still convinced that this is >correct, but I don't think a comment about it being an optimization >(though I suspect that that its major benefit), since it came from, >and must parallel, find_ec_member_for_tle, and there is no such >explanation there. I don't want to backfill reasoning onto that >decision, but I do think it's important to parallel it since it's >ultimately what is going to cause errors if we're out of sync with it. > >As to why find_em_expr_for_rel is different, I think the answer is >that it's used by the FDW code, and it seems to me to make sense that >in that case we'd only send sort conditions to the foreign server if >the em actually contains rels. > >The new series attached includes the primary fix for this issue, the >additional comments that could either go in at the same time or not, >and my patch with semi-related questions (which isn't intended to be >committed, but to keep track of the questions). > Thanks. Attached are two patches that I plan to get committed 0001 is what you sent, with slightly reworded commit message. This is the actual fix. I'm still thinking about Robert's comment that perhaps we should be considering only expressions already in the relation's target, which would imply checking for volatile expressions is not enough. But I've been unable to convince myself it's correct or incorrect. In any case 0001 is a clear improvement - in the worst case we'll have to make it stricter in the future. 0002 partially reverts ba3e76cc571 by moving find_em_expr_for_rel back to postgres_fdw.c. We've moved it to equivclass.c so that it could be called from both the FDW and allpaths.c, but now that the functions diverged it's only called from the FDW again. So maybe we should move it back, but I guess that's not a good thing in a minor release. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Tue, Nov 03, 2020 at 03:37:43AM +0100, Tomas Vondra wrote: >On Fri, Oct 30, 2020 at 07:37:33PM -0400, James Coleman wrote: >> >>... >> >>I was looking at this some more, and I'm still convinced that this is >>correct, but I don't think a comment about it being an optimization >>(though I suspect that that its major benefit), since it came from, >>and must parallel, find_ec_member_for_tle, and there is no such >>explanation there. I don't want to backfill reasoning onto that >>decision, but I do think it's important to parallel it since it's >>ultimately what is going to cause errors if we're out of sync with it. >> >>As to why find_em_expr_for_rel is different, I think the answer is >>that it's used by the FDW code, and it seems to me to make sense that >>in that case we'd only send sort conditions to the foreign server if >>the em actually contains rels. >> >>The new series attached includes the primary fix for this issue, the >>additional comments that could either go in at the same time or not, >>and my patch with semi-related questions (which isn't intended to be >>committed, but to keep track of the questions). >> > >Thanks. Attached are two patches that I plan to get committed > >0001 is what you sent, with slightly reworded commit message. This is >the actual fix. > >I'm still thinking about Robert's comment that perhaps we should be >considering only expressions already in the relation's target, which >would imply checking for volatile expressions is not enough. But I've >been unable to convince myself it's correct or incorrect. In any case >0001 is a clear improvement - in the worst case we'll have to make it >stricter in the future. > > >0002 partially reverts ba3e76cc571 by moving find_em_expr_for_rel back >to postgres_fdw.c. We've moved it to equivclass.c so that it could be >called from both the FDW and allpaths.c, but now that the functions >diverged it's only called from the FDW again. So maybe we should move >it back, but I guess that's not a good thing in a minor release. > I've pushed the 0001 part, i.e. the main fix. Not sure about the other parts (comments and moving the code back to postgres_fdw) yet. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I've pushed the 0001 part, i.e. the main fix. Not sure about the other
parts (comments and moving the code back to postgres_fdw) yet.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, Tomas!
I'm still having trouble with this problem. Tom commited something [1] to avoid core dumps, but all I get now is: subplan "SubPlan 1" was not initialized
I upgraded to 13.1 over the weekend but the problem still occurs.
This is the topic I started: https://www.postgresql.org/message-id/flat/CAAaqYe--qpnBOUiqnHSAc71K2HpXetnGedRAX0Z04zx%3DY0ExHA%40mail.gmail.com#8cb6c7c5e787d2d3d2dc7aaf00d1cae8
[1] https://git.postgresql.org/pg/commitdiff/936043c9eacb9e9c7356a8190a410d2c4e4ea03a
Hmm, I missed that other thread. That indeed seems like a bug in the same area already tweaked by ebb7ae839d033d0f2 for similar cases. The attached patch fixes this simply by adding is_parallel_safe to get_useful_pathkeys_for_relation - that does fix the reproducer, but I'm not entirely sure that's the right place. Maybe it should be done in find_em_expr_usable_for_sorting_rel (which might make a difference if an EC clause can contain a mix of parallel safe and unsafe expressions). Or maybe we should do it in the caller (which would allow using get_useful_pathkeys_for_relation in contexts not requiring parallel safety in the future). Anyway, while this is not an "incremental sort" bug, it seems like the root cause is the same as for ebb7ae839d033d0f2 - one of the incremental sort patches started considering sorting below gather nodes, not realizing these possible consequences. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
On Mon, Nov 16, 2020 at 11:23 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > Hmm, I missed that other thread. That indeed seems like a bug in the > same area already tweaked by ebb7ae839d033d0f2 for similar cases. I meant to bring it up in this thread before we got the other patch committed, but I just ended up not having time to look into it. > The attached patch fixes this simply by adding is_parallel_safe to > get_useful_pathkeys_for_relation - that does fix the reproducer, but I'm > not entirely sure that's the right place. Maybe it should be done in > find_em_expr_usable_for_sorting_rel (which might make a difference if an > EC clause can contain a mix of parallel safe and unsafe expressions). Or > maybe we should do it in the caller (which would allow using > get_useful_pathkeys_for_relation in contexts not requiring parallel > safety in the future). > > Anyway, while this is not an "incremental sort" bug, it seems like the > root cause is the same as for ebb7ae839d033d0f2 - one of the incremental > sort patches started considering sorting below gather nodes, not > realizing these possible consequences. Yeah. I'd like to investigate a bit if really we should also be adding it to prepare_sort_from_pathkeys (which find_em_expr_usable_for_sorting_rel parallels) or similar, since this seems to be a broader concern that's been previously missed (even if the bug was otherwise not yet exposed). In particular, as I understand it, we did do sorts below gather nodes before, just in fewer places, which meant that "in theory" the code was already broken (or at least not complete) for parallel queries. James
On 11/17/20 3:03 PM, James Coleman wrote: > On Mon, Nov 16, 2020 at 11:23 PM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> Hmm, I missed that other thread. That indeed seems like a bug in the >> same area already tweaked by ebb7ae839d033d0f2 for similar cases. > > I meant to bring it up in this thread before we got the other patch > committed, but I just ended up not having time to look into it. > >> The attached patch fixes this simply by adding is_parallel_safe to >> get_useful_pathkeys_for_relation - that does fix the reproducer, but I'm >> not entirely sure that's the right place. Maybe it should be done in >> find_em_expr_usable_for_sorting_rel (which might make a difference if an >> EC clause can contain a mix of parallel safe and unsafe expressions). Or >> maybe we should do it in the caller (which would allow using >> get_useful_pathkeys_for_relation in contexts not requiring parallel >> safety in the future). >> >> Anyway, while this is not an "incremental sort" bug, it seems like the >> root cause is the same as for ebb7ae839d033d0f2 - one of the incremental >> sort patches started considering sorting below gather nodes, not >> realizing these possible consequences. > > Yeah. I'd like to investigate a bit if really we should also be adding > it to prepare_sort_from_pathkeys (which > find_em_expr_usable_for_sorting_rel parallels) or similar, since this > seems to be a broader concern that's been previously missed (even if > the bug was otherwise not yet exposed). In particular, as I understand > it, we did do sorts below gather nodes before, just in fewer places, > which meant that "in theory" the code was already broken (or at least > not complete) for parallel queries. > True. It's possible the bug was there before, but the affected plans were just very unlikely for some reason, and the new plans introduced by incremental sort just made it easier to trigger. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 17, 2020 at 11:20 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > > > On 11/17/20 3:03 PM, James Coleman wrote: > > On Mon, Nov 16, 2020 at 11:23 PM Tomas Vondra > > <tomas.vondra@enterprisedb.com> wrote: > >> > >> Hmm, I missed that other thread. That indeed seems like a bug in the > >> same area already tweaked by ebb7ae839d033d0f2 for similar cases. > > > > I meant to bring it up in this thread before we got the other patch > > committed, but I just ended up not having time to look into it. > > > >> The attached patch fixes this simply by adding is_parallel_safe to > >> get_useful_pathkeys_for_relation - that does fix the reproducer, but I'm > >> not entirely sure that's the right place. Maybe it should be done in > >> find_em_expr_usable_for_sorting_rel (which might make a difference if an > >> EC clause can contain a mix of parallel safe and unsafe expressions). Or > >> maybe we should do it in the caller (which would allow using > >> get_useful_pathkeys_for_relation in contexts not requiring parallel > >> safety in the future). > >> > >> Anyway, while this is not an "incremental sort" bug, it seems like the > >> root cause is the same as for ebb7ae839d033d0f2 - one of the incremental > >> sort patches started considering sorting below gather nodes, not > >> realizing these possible consequences. > > > > Yeah. I'd like to investigate a bit if really we should also be adding > > it to prepare_sort_from_pathkeys (which > > find_em_expr_usable_for_sorting_rel parallels) or similar, since this > > seems to be a broader concern that's been previously missed (even if > > the bug was otherwise not yet exposed). In particular, as I understand > > it, we did do sorts below gather nodes before, just in fewer places, > > which meant that "in theory" the code was already broken (or at least > > not complete) for parallel queries. > > > > True. It's possible the bug was there before, but the affected plans > were just very unlikely for some reason, and the new plans introduced by > incremental sort just made it easier to trigger. For additional patches on the broader topic (e.g., parallel safety) is it preferable to keep them here or in the thread Luis started (or even a new one besides that -- since that's on -bugs not -hackers)? James
On Wed, Oct 7, 2020 at 6:22 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > I'm still not entirely sure I understand what's happening, or what the > exact rule is. Consider this query: > > explain (verbose) select distinct i, t, md5(t) from ref_0; > > which on PG12 (i.e. before incremental sort) is planned like this: > > QUERY PLAN > ---------------------------------------------------------------------------------- > Unique (cost=78120.92..83120.92 rows=500000 width=65) > Output: i, t, (md5(t)) > -> Sort (cost=78120.92..79370.92 rows=500000 width=65) > Output: i, t, (md5(t)) > Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t)) > -> Seq Scan on public.ref_0 (cost=0.00..10282.00 rows=500000 width=65) > Output: i, t, md5(t) > (7 rows) > > i.e. the (stable) function is pushed all the way to the scan node. And > even if we replace it with a volatile expression it gets pushed down: > > explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0; > > QUERY PLAN > ---------------------------------------------------------------------------------- > Unique (cost=83120.92..88120.92 rows=500000 width=65) > Output: i, t, (md5(((random())::text || t))) > -> Sort (cost=83120.92..84370.92 rows=500000 width=65) > Output: i, t, (md5(((random())::text || t))) > Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t))) > -> Seq Scan on public.ref_0 (cost=0.00..15282.00 rows=500000 width=65) > Output: i, t, md5(((random())::text || t)) > (7 rows) > > > But perhaps I just don't understand the assumption correctly? This isn't a counterexample, because there's no join tree here -- or, well, there is, but it's trivial, because there's only one relation involved. You can't have a non-Var expression computed before you finish all the joins, because there are no joins. What I said was: "target lists for any nodes below the top of the join tree were previously always just Var nodes." The topmost join allowed non-Var nodes before, but not lower levels. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Nov 20, 2020 at 12:06 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Oct 7, 2020 at 6:22 PM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > I'm still not entirely sure I understand what's happening, or what the > > exact rule is. Consider this query: > > > > explain (verbose) select distinct i, t, md5(t) from ref_0; > > > > which on PG12 (i.e. before incremental sort) is planned like this: > > > > QUERY PLAN > > ---------------------------------------------------------------------------------- > > Unique (cost=78120.92..83120.92 rows=500000 width=65) > > Output: i, t, (md5(t)) > > -> Sort (cost=78120.92..79370.92 rows=500000 width=65) > > Output: i, t, (md5(t)) > > Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t)) > > -> Seq Scan on public.ref_0 (cost=0.00..10282.00 rows=500000 width=65) > > Output: i, t, md5(t) > > (7 rows) > > > > i.e. the (stable) function is pushed all the way to the scan node. And > > even if we replace it with a volatile expression it gets pushed down: > > > > explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0; > > > > QUERY PLAN > > ---------------------------------------------------------------------------------- > > Unique (cost=83120.92..88120.92 rows=500000 width=65) > > Output: i, t, (md5(((random())::text || t))) > > -> Sort (cost=83120.92..84370.92 rows=500000 width=65) > > Output: i, t, (md5(((random())::text || t))) > > Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t))) > > -> Seq Scan on public.ref_0 (cost=0.00..15282.00 rows=500000 width=65) > > Output: i, t, md5(((random())::text || t)) > > (7 rows) > > > > > > But perhaps I just don't understand the assumption correctly? > > This isn't a counterexample, because there's no join tree here -- or, > well, there is, but it's trivial, because there's only one relation > involved. You can't have a non-Var expression computed before you > finish all the joins, because there are no joins. > > What I said was: "target lists for any nodes below the top of the join > tree were previously always just Var nodes." The topmost join allowed > non-Var nodes before, but not lower levels. As I understand what you're saying, the attached (from the repro case in [1]'s discussion about parallel safety here) is a counterexample. Specifically we have a plan like: Merge Right Join -> Unique -> Gather Merge -> Sort -> Nested Loop The pathtarget of the nested loop contains non-var expressions (in this case a CASE expression). Am I misunderstanding what you're saying? I've attached verbose output (and the query). James
Attachment
On Fri, Nov 20, 2020 at 1:51 PM James Coleman <jtc331@gmail.com> wrote: > > This isn't a counterexample, because there's no join tree here -- or, > > well, there is, but it's trivial, because there's only one relation > > involved. You can't have a non-Var expression computed before you > > finish all the joins, because there are no joins. > > > > What I said was: "target lists for any nodes below the top of the join > > tree were previously always just Var nodes." The topmost join allowed > > non-Var nodes before, but not lower levels. > > As I understand what you're saying, the attached (from the repro case > in [1]'s discussion about parallel safety here) is a counterexample. > > Specifically we have a plan like: > > Merge Right Join > -> Unique > -> Gather Merge > -> Sort > -> Nested Loop > > The pathtarget of the nested loop contains non-var expressions (in > this case a CASE expression). Well, in this case there are two join trees, one for the subquery and one for the outer query. The expressions for the subquery are computed at the top of the plan tree for that subquery. I guess I didn't specify before that I meant "at the top of the join tree for a subquery level," but I did mean that. On principle, the planner can't very realistically postpone evaluating a subquery's expressions until the top of the outer query's join tree, because, as in your example here, the subquery might contain an ORDER BY or DISTINCT clause. And anyway, if you look at the planner code, you'll see that every subquery is basically planned separately, unless it gets flattened into the parent query, so even if it were possible to postpone expression evaluation to outer subquery levels in some case, the current code structure definitely would fail to achieve that goal. However, apart from incremental sort, you can postpone evaluating a join tree's expressions until you reach the top of the join tree, and I think that's what we have always done in the past. Each baserel in the join tree gets a target list containing a list of vars which corresponds to its column list, and the joinrels get lists of vars which correspond to the columns from the inputs are needed at higher levels of the plan tree. At the top of the join tree, we stick in the expressions, replacing the target list of the top node in the plan tree; the expressions have to be computable from the inputs because the inputs contain all the columns that we figured out were needed at this level at the beginning of planning. But, if there are scans or joins below the top level of the plan tree, then do they bubble up to higher levels of the plan? Should they? It's pretty complicated. Theoretically if I've computed length(somelongvar) then perhaps I can just bubble the computed value up the plan tree, rather than the original column, which might be cheaper. But that only works if the higher levels only need length(somelongvar) and not somelongvar itself. I don't think there's any logic to work stuff like this out, unless the incremental sort patch added some, because I don't think we ever did it before. And it may be that it doesn't really matter, at least when there aren't volatile functions: perhaps the worst case is that we evaluate a non-volatile function twice, and maybe that's just a performance consequence that we can live with. But on the other hand, maybe it breaks stuff. Or, on the third hand, maybe I'm wrong about what the rule was in the first place. This is certainly a complicated topic. I don't think I'm wrong, or I wouldn't be bothering to write emails about it, but that doesn't mean I'm not wrong anyway. -- Robert Haas EDB: http://www.enterprisedb.com
Thanks much for the detailed followup; this is super helpful. On Fri, Nov 20, 2020 at 2:57 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Nov 20, 2020 at 1:51 PM James Coleman <jtc331@gmail.com> wrote: > > > This isn't a counterexample, because there's no join tree here -- or, > > > well, there is, but it's trivial, because there's only one relation > > > involved. You can't have a non-Var expression computed before you > > > finish all the joins, because there are no joins. > > > > > > What I said was: "target lists for any nodes below the top of the join > > > tree were previously always just Var nodes." The topmost join allowed > > > non-Var nodes before, but not lower levels. > > > > As I understand what you're saying, the attached (from the repro case > > in [1]'s discussion about parallel safety here) is a counterexample. > > > > Specifically we have a plan like: > > > > Merge Right Join > > -> Unique > > -> Gather Merge > > -> Sort > > -> Nested Loop > > > > The pathtarget of the nested loop contains non-var expressions (in > > this case a CASE expression). > > Well, in this case there are two join trees, one for the subquery and > one for the outer query. The expressions for the subquery are computed > at the top of the plan tree for that subquery. > > I guess I didn't specify before that I meant "at the top of the join > tree for a subquery level," but I did mean that. On principle, the > planner can't very realistically postpone evaluating a subquery's > expressions until the top of the outer query's join tree, because, as > in your example here, the subquery might contain an ORDER BY or > DISTINCT clause. And anyway, if you look at the planner code, you'll > see that every subquery is basically planned separately, unless it > gets flattened into the parent query, so even if it were possible to > postpone expression evaluation to outer subquery levels in some case, > the current code structure definitely would fail to achieve that goal. Ah, the "for a subquery level" clarification is helpful. > However, apart from incremental sort, you can postpone evaluating a > join tree's expressions until you reach the top of the join tree, and > I think that's what we have always done in the past. Each baserel in > the join tree gets a target list containing a list of vars which > corresponds to its column list, and the joinrels get lists of vars > which correspond to the columns from the inputs are needed at higher > levels of the plan tree. At the top of the join tree, we stick in the > expressions, replacing the target list of the top node in the plan > tree; the expressions have to be computable from the inputs because > the inputs contain all the columns that we figured out were needed at > this level at the beginning of planning. But, if there are scans or > joins below the top level of the plan tree, then do they bubble up to > higher levels of the plan? Should they? It's pretty complicated. > Theoretically if I've computed length(somelongvar) then perhaps I can > just bubble the computed value up the plan tree, rather than the > original column, which might be cheaper. But that only works if the > higher levels only need length(somelongvar) and not somelongvar > itself. Apologies for some rambling/overlapping thoughts below; I'm try to absorb all of this. A join rel (at the top of a subquery join tree) would have to somehow know that it needs to pass up both length(somelongvar) and somelongvar right? Since both might be needed by a higher level? If I understand correctly you're saying that: 1. Target list entries for join rels contain vars and non-var expressions 2. Target list entries for base rels contain only vars 3. By implication the planner knows how to propagate non-var expression results up from join rels, but not from base rels. Point 3 is the one that would surprise me, or at least, I'd be interested in knowing what makes the two different in that case (is it an artificial restriction, a result of how things are structured in code, something else?). I feel like I'm still missing context on why this is inherently a problem. Does this mean a simple "SELECT bar * 2 FROM foo" ends up with a join rel at the top? Or that a projection handles it (maybe this is the wrong question -- perhaps the project results in a join rel? I don't know enough here to tell if this ends up being an odd question)? And if a projection, then would inserting a projection above a base rel but before the top of the join tree solve the problem while allowing us to push expression evaluation down? Perhaps the key (to what I'm missing) is just saying something like: "the work hasn't been down to push expression evaluation down and ensure it remains in the target lists at every level of the join tree so as to be propagated up, and so adding it to the base rel will almost certainly mean duplicate evaluation". That still leaves my question about how propagating it up from a subquery join tree works (maybe that's an already handled "special" case). > I don't think there's any logic to work stuff like this out, > unless the incremental sort patch added some, because I don't think we > ever did it before. And it may be that it doesn't really matter, at > least when there aren't volatile functions: perhaps the worst case is > that we evaluate a non-volatile function twice, and maybe that's just > a performance consequence that we can live with. But on the other > hand, maybe it breaks stuff. It seems to me that (given the new restriction on volatile expressions) it'd likely not be the end of the world from the unnecessary execution perspective; after all, we already do lots of duplicate expression evaluation in other areas like filter clauses. > Or, on the third hand, maybe I'm wrong about what the rule was in the > first place. This is certainly a complicated topic. I don't think I'm > wrong, or I wouldn't be bothering to write emails about it, but that > doesn't mean I'm not wrong anyway. Well, I for one definitely didn't understand what you were trying to point out earlier, so I appreciate your continuing the discussion. It seems to me a (likely?) outcome of this is requiring that the expressions be in the target list to be considered for sorting here. And indeed that fixes the non-parallel subplan issue we're also looking at. But I think that still leaves something missing: after all, prepare_sort_from_pathkeys does know how to insert new target list entries, so something either there (or in the caller/how its called) has to be enforcing an apparently implicit rule about what point in the tree it's safe to do such. Or even just no path generation had ever considered it before (that feels unsatisfactory as an explanation to me, because it feels more implicit than I'd like, but...) James
James Coleman <jtc331@gmail.com> writes: > But I think that still leaves something missing: after all, > prepare_sort_from_pathkeys does know how to insert new target list > entries, so something either there (or in the caller/how its called) > has to be enforcing an apparently implicit rule about what point in > the tree it's safe to do such. Or even just no path generation had > ever considered it before (that feels unsatisfactory as an explanation > to me, because it feels more implicit than I'd like, but...) Hi guys, I hadn't been paying any attention to this thread, but Robert asked me to take a look. A few comments: 1. James was wondering, far upthread, why we would do projections pre-sort or post-sort. I think the answers can be found by studying planner.c's make_sort_input_target(), which separates out what we want to do pre-sort and post-sort. (There, the "sort" is envisioned as a standalone Sort node atop all the scan/join steps; but its decisions nonetheless constrain what will be considered for sorting that's implemented further down.) It has a *very* long introductory comment laying out all the considerations. 2. Robert is correct that under normal circumstances, the targetlists of both baserels and join rels contain only Vars. Any computations we have to do are postponed to the final top-level targetlist, whether they are volatile or not. The fact that this policy is applied regardless of volatility may explain why James isn't seeing volatility checks where he expected them. The core (and, I think, only) exception to this is that an expression that is a sort or group key has to be evaluated earlier. But even then, it won't be pushed down as far as the reltargets of any scan or join relations; it'll be computed after the final join. 3. If we have a volatile sort/group key, that constrains the set of plans we can validly generate, because the key expression mustn't be evaluated redundantly. With the addition of parallelism, a non-parallel-safe sort/group key expression likewise constrains the set of valid plans, even if it's not considered volatile. 4. In the past, the only way we'd consider non-Var sort keys in any way during scan/join planning was if (a) a relation had an expression index matching a requested sort key; of course, that's a-fortiori going to be a non-volatile expression. Or (b) we needed to sort in order to provide suitable input for a merge join ... but again, volatile expressions aren't considered candidates for mergejoin. So there basically wasn't any need to consider sorting by volatiles below the top level sort/group processing, and that again contributes to why you don't see explicit tests in that area. I'm not sure offhand whether the parallel-query patches added anything to guard against nonvolatile-but-parallel-unsafe sort expressions. If not, there might be live bugs there independently of incremental sort; or that might be unreachable because of overall limitations on parallelism. 5. While I've not dug far enough into the code to identify the exact issue, the impression I have about this bug and the parallel-unsafe subplan bug is that the incremental sort code is generating Paths without due regard to point 3. If it is making paths based on explicit sorts for final-stage pathkeys, independently of either expression indexes or mergejoin requirements, that could well explain why it's got problems that weren't seen before. 6. I did look at ebb7ae839, and I suspect that the last part of find_em_expr_usable_for_sorting_rel(), ie the search of the reltarget list, is dead code. There is no case where a volatile expression would appear there, per point 2. The committed regression test cases certainly do not provide a counterexample: a look at the code coverage report will show you that that loop never finds a match in any regression test case. I'd be inclined to flush that code and just say that we won't return volatile expressions here. regards, tom lane
On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > James Coleman <jtc331@gmail.com> writes: > > But I think that still leaves something missing: after all, > > prepare_sort_from_pathkeys does know how to insert new target list > > entries, so something either there (or in the caller/how its called) > > has to be enforcing an apparently implicit rule about what point in > > the tree it's safe to do such. Or even just no path generation had > > ever considered it before (that feels unsatisfactory as an explanation > > to me, because it feels more implicit than I'd like, but...) > > Hi guys, > > I hadn't been paying any attention to this thread, but Robert asked > me to take a look. A few comments: Thanks for jumping in (and thanks Robert for asking for Tom to take a look); I appreciate the input. > 1. James was wondering, far upthread, why we would do projections > pre-sort or post-sort. I think the answers can be found by studying > planner.c's make_sort_input_target(), which separates out what we want > to do pre-sort and post-sort. (There, the "sort" is envisioned as a > standalone Sort node atop all the scan/join steps; but its decisions > nonetheless constrain what will be considered for sorting that's > implemented further down.) It has a *very* long introductory comment > laying out all the considerations. That comment is helpful; I wish I'd discovered it sooner. Does it imply we need to intentionally avoid SRFs also? > 2. Robert is correct that under normal circumstances, the targetlists of > both baserels and join rels contain only Vars. Any computations we have > to do are postponed to the final top-level targetlist, whether they are > volatile or not. The fact that this policy is applied regardless of > volatility may explain why James isn't seeing volatility checks where he > expected them. The core (and, I think, only) exception to this is that > an expression that is a sort or group key has to be evaluated earlier. > But even then, it won't be pushed down as far as the reltargets of any > scan or join relations; it'll be computed after the final join. Is that primarily a project policy? Or a limitation of our current planner (just can't push things down/carry the results back up as much as we'd like)? Or something else? In particular, it seems plausible there are cases where pushing down the sort on a non-Var expression to the base rel could improve plans, so I'm wondering if there's a reason to intentionally avoid that in the long or short run (or both). > 3. If we have a volatile sort/group key, that constrains the set of plans > we can validly generate, because the key expression mustn't be evaluated > redundantly. With the addition of parallelism, a non-parallel-safe > sort/group key expression likewise constrains the set of valid plans, > even if it's not considered volatile. This makes a lot of sense (just unfortunate we had to re-discover it). > 4. In the past, the only way we'd consider non-Var sort keys in any way > during scan/join planning was if (a) a relation had an expression index > matching a requested sort key; of course, that's a-fortiori going to be > a non-volatile expression. Or (b) we needed to sort in order to provide > suitable input for a merge join ... but again, volatile expressions > aren't considered candidates for mergejoin. So there basically wasn't > any need to consider sorting by volatiles below the top level sort/group > processing, and that again contributes to why you don't see explicit > tests in that area. I'm not sure offhand whether the parallel-query > patches added anything to guard against nonvolatile-but-parallel-unsafe > sort expressions. If not, there might be live bugs there independently > of incremental sort; or that might be unreachable because of overall > limitations on parallelism. Interesting: so merge joins are an example of us pushing down sorts, which I assume means (part of) the answer to my question on (2) is that there's nothing inherently wrong/broken with evaluating expressions lower down the tree as long as we're careful about what is safe/unsafe with respect to volatility and parallelism? > 5. While I've not dug far enough into the code to identify the exact > issue, the impression I have about this bug and the parallel-unsafe > subplan bug is that the incremental sort code is generating Paths > without due regard to point 3. If it is making paths based on explicit > sorts for final-stage pathkeys, independently of either expression > indexes or mergejoin requirements, that could well explain why it's > got problems that weren't seen before. Yes, that's correct. Tomas already pushed the fix for the volatility case, and there's a new thread [1] for the parallel safety concerns. > 6. I did look at ebb7ae839, and I suspect that the last part of > find_em_expr_usable_for_sorting_rel(), ie the search of the reltarget > list, is dead code. There is no case where a volatile expression would > appear there, per point 2. The committed regression test cases > certainly do not provide a counterexample: a look at the code coverage > report will show you that that loop never finds a match in any > regression test case. I'd be inclined to flush that code and just > say that we won't return volatile expressions here. Hmm, I see what you mean. It's theoretically valuable if we ended up with volatile expressions there, but...that seems at best unlikely. I have wondered if we should strictly require the expression to be in the target list even if nonvolatile, but prepare_sort_from_pathkeys doesn't seem to think that's necessary, so I'm comfortable with that unless there's something you know we haven't considered. Alternatively, if we decide we need to be more strict (whether because of discussion under (1) or because we otherwise decide only Vars can be allowed here), then would we still potentially need/want the relabel stripping? Depending on what we conclude on that, I'll plan to address that in the thread in [1]. James 1: https://www.postgresql.org/message-id/flat/CAAaqYe8cK3g5CfLC4w7bs%3DhC0mSksZC%3DH5M8LSchj5e5OxpTAg%40mail.gmail.com
James Coleman <jtc331@gmail.com> writes: > On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 1. James was wondering, far upthread, why we would do projections >> pre-sort or post-sort. I think the answers can be found by studying >> planner.c's make_sort_input_target(), which separates out what we want >> to do pre-sort and post-sort. > Does it imply we need to intentionally avoid SRFs also? It's sort of a wart that we allow SRFs in ORDER BY at all, but my expectation is that make_sort_input_target would prevent lower levels of the planner from needing to think about that. We don't allow SRFs in index expressions, nor in WHERE clauses (so they'll never come up as mergejoin sort keys). So really there's no way that scan/join processing should need to consider such cases. Such a sort would only ever be implemented via a final Sort atop ProjectSet. >> 2. Robert is correct that under normal circumstances, the targetlists of >> both baserels and join rels contain only Vars. > Is that primarily a project policy? Or a limitation of our current > planner (just can't push things down/carry the results back up as much > as we'd like)? Or something else? In particular, it seems plausible > there are cases where pushing down the sort on a non-Var expression to > the base rel could improve plans, so I'm wondering if there's a reason > to intentionally avoid that in the long or short run (or both). I think you've just rediscovered Joe Hellerstein's thesis topic [1]. We ripped out the remnants of that code ages ago (search very early git states for "JMH" if you're interested), because the complexity vs. benefit ratio seemed pretty bad. Maybe there'll be a case for putting it back some day, but I'm dubious. Note that we have the ability to push down sorts-on-expressions anyway; that's not constrained by what is in the relation targetlists. > Interesting: so merge joins are an example of us pushing down sorts, > which I assume means (part of) the answer to my question on (2) is > that there's nothing inherently wrong/broken with evaluating > expressions lower down the tree as long as we're careful about what is > safe/unsafe with respect to volatility and parallelism? Right, I don't see any fundamental problem with that, we just have to be careful about these constraints. > I have wondered if we should strictly require the expression to be in > the target list even if nonvolatile, but prepare_sort_from_pathkeys > doesn't seem to think that's necessary, so I'm comfortable with that > unless there's something you know we haven't considered. No, prepare_sort_from_pathkeys is happy to build a sort expression if it can't find it already computed in the input. The secret here is that we should never get to that code with a "dangerous" sort expression, because we should never have made a Path that would request such a thing in the first place. It's fairly obvious to me how we deal with the consideration for volatile sortkeys. We cannot restrict parallel-unsafe sortkeys quite the same way, because the restriction only applies to parallel not non-parallel plans. Maybe it's sufficient to mark Paths as parallel unsafe if they sort by parallel-unsafe sortkeys? Is there anyplace that is Just Assuming that paths it builds will be parallel-safe? regards, tom lane [1] https://www.postgresql.org/message-id/28216.1023340706%40sss.pgh.pa.us
On Tue, Nov 24, 2020 at 2:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > James Coleman <jtc331@gmail.com> writes: > > On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> 1. James was wondering, far upthread, why we would do projections > >> pre-sort or post-sort. I think the answers can be found by studying > >> planner.c's make_sort_input_target(), which separates out what we want > >> to do pre-sort and post-sort. > > > Does it imply we need to intentionally avoid SRFs also? > > It's sort of a wart that we allow SRFs in ORDER BY at all, but my > expectation is that make_sort_input_target would prevent lower levels > of the planner from needing to think about that. We don't allow SRFs > in index expressions, nor in WHERE clauses (so they'll never come up > as mergejoin sort keys). So really there's no way that scan/join > processing should need to consider such cases. Such a sort would > only ever be implemented via a final Sort atop ProjectSet. In this case though we're sorting based on "interesting" pathkeys, which means we don't don't necessarily have index expressions or mergejoin sort keys. For example, even with the bugfix patch (from the parallel fix thread I've linked to previously) applied, I'm able to generate this (with of course some GUCs "tweaked"): select unique1 from tenk1 order by unnest('{1}'::int[]); Gather Merge Workers Planned: 2 -> Sort Sort Key: (unnest('{1}'::integer[])) -> ProjectSet -> Parallel Index Only Scan using tenk1_unique1 on tenk1 If I understand correctly, that's a violation of the spirit of what the comments above make_sort_input_target(). If that's true, then it should be pretty easy to disallow them from being considered. Given the existing restrictions on where SRFs can be placed in a SELECT (e.g., no CASE/COALESCE) and my assumption (without having thoroughly verified this) that SRFs aren't allowed as arguments to functions or as arguments to any other expression (I assume only scalars are allowed), would it be sufficient to check the pathkey expression (without recursion) to see if it's a FuncExpr that returns a set? > >> 2. Robert is correct that under normal circumstances, the targetlists of > >> both baserels and join rels contain only Vars. > > > Is that primarily a project policy? Or a limitation of our current > > planner (just can't push things down/carry the results back up as much > > as we'd like)? Or something else? In particular, it seems plausible > > there are cases where pushing down the sort on a non-Var expression to > > the base rel could improve plans, so I'm wondering if there's a reason > > to intentionally avoid that in the long or short run (or both). > > I think you've just rediscovered Joe Hellerstein's thesis topic [1]. > We ripped out the remnants of that code ages ago (search very early > git states for "JMH" if you're interested), because the complexity > vs. benefit ratio seemed pretty bad. Maybe there'll be a case for > putting it back some day, but I'm dubious. Note that we have the > ability to push down sorts-on-expressions anyway; that's not constrained > by what is in the relation targetlists. I'm doing some grepping now to see what that work entailed, but I'm not sure that what I'm describing is the same thing. By "pushing down a non-Var expression to the base rel" I mean what (to my ears) sounds like what you're saying by "we have the ability to push down sorts-on-expressions anyway; that's not constrained by what is in the relation targetlists". The target list entries I'm talking about for these expressions are the ones inserted by prepare_sort_from_pathkeys(), which in PG13 happens just a bit more frequently since, at least in parallel query, consider sort paths lower than we did before based on interesting pathkeys (for now just query_pathkeys) in generate_useful_gather_paths(). > > Interesting: so merge joins are an example of us pushing down sorts, > > which I assume means (part of) the answer to my question on (2) is > > that there's nothing inherently wrong/broken with evaluating > > expressions lower down the tree as long as we're careful about what is > > safe/unsafe with respect to volatility and parallelism? > > Right, I don't see any fundamental problem with that, we just have > to be careful about these constraints. Great. That's exactly what I was looking for. To summarize: the constraints are on volatility, parallel safety, and SRFs. > > I have wondered if we should strictly require the expression to be in > > the target list even if nonvolatile, but prepare_sort_from_pathkeys > > doesn't seem to think that's necessary, so I'm comfortable with that > > unless there's something you know we haven't considered. > > No, prepare_sort_from_pathkeys is happy to build a sort expression if it > can't find it already computed in the input. The secret here is that > we should never get to that code with a "dangerous" sort expression, > because we should never have made a Path that would request such a thing > in the first place. It's fairly obvious to me how we deal with the > consideration for volatile sortkeys. We cannot restrict parallel-unsafe > sortkeys quite the same way, because the restriction only applies to > parallel not non-parallel plans. Maybe it's sufficient to mark Paths > as parallel unsafe if they sort by parallel-unsafe sortkeys? Is there > anyplace that is Just Assuming that paths it builds will be > parallel-safe? It seems actually quite simple as I understand it: in my proposed patch in [1] we know in find_em_expr_usable_for_sorting_rel() whether or not we're working on parallel paths, and so we're able to rely simply on is_parallel_safe() for the expr under consideration. James 1: https://www.postgresql.org/message-id/flat/CAAaqYe8cK3g5CfLC4w7bs%3DhC0mSksZC%3DH5M8LSchj5e5OxpTAg%40mail.gmail.com
On Wed, Nov 25, 2020 at 2:53 PM James Coleman <jtc331@gmail.com> wrote: > > On Tue, Nov 24, 2020 at 2:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > James Coleman <jtc331@gmail.com> writes: > > > On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > >> 1. James was wondering, far upthread, why we would do projections > > >> pre-sort or post-sort. I think the answers can be found by studying > > >> planner.c's make_sort_input_target(), which separates out what we want > > >> to do pre-sort and post-sort. > > > > > Does it imply we need to intentionally avoid SRFs also? > > > > It's sort of a wart that we allow SRFs in ORDER BY at all, but my > > expectation is that make_sort_input_target would prevent lower levels > > of the planner from needing to think about that. We don't allow SRFs > > in index expressions, nor in WHERE clauses (so they'll never come up > > as mergejoin sort keys). So really there's no way that scan/join > > processing should need to consider such cases. Such a sort would > > only ever be implemented via a final Sort atop ProjectSet. > > In this case though we're sorting based on "interesting" pathkeys, > which means we don't don't necessarily have index expressions or > mergejoin sort keys. For example, even with the bugfix patch (from the > parallel fix thread I've linked to previously) applied, I'm able to > generate this (with of course some GUCs "tweaked"): > > select unique1 from tenk1 order by unnest('{1}'::int[]); > > Gather Merge > Workers Planned: 2 > -> Sort > Sort Key: (unnest('{1}'::integer[])) > -> ProjectSet > -> Parallel Index Only Scan using tenk1_unique1 on tenk1 > > If I understand correctly, that's a violation of the spirit of what > the comments above make_sort_input_target(). If that's true, then it > should be pretty easy to disallow them from being considered. Given > the existing restrictions on where SRFs can be placed in a SELECT > (e.g., no CASE/COALESCE) and my assumption (without having thoroughly > verified this) that SRFs aren't allowed as arguments to functions or > as arguments to any other expression (I assume only scalars are > allowed), would it be sufficient to check the pathkey expression > (without recursion) to see if it's a FuncExpr that returns a set? Here's the plan with a change to restrict SRFs here: Sort Sort Key: (unnest('{1,2}'::integer[])) -> Gather Workers Planned: 2 -> ProjectSet -> Parallel Index Only Scan using tenk1_unique1 on tenk1 I'm a bit surprised the ProjectSet is above the Index Scan rather than above the Gather, but maybe this is still more correct? James
On Tue, Nov 3, 2020 at 4:39 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > I've pushed the 0001 part, i.e. the main fix. Not sure about the other > parts (comments and moving the code back to postgres_fdw) yet. I noticed the CF entry [1] got moved to the next CF; I'm thinking this entry should be marked as committed since the fix for the initial bug reported on this thread has been pushed. We have the parallel safety issue outstanding, but there's a separate thread and patch for that, so I'll make a new CF entry for that. I can mark it as committed, but I'm not sure how to "undo" (or if that's desirable) the move to the next CF. Thoughts? James 1: https://commitfest.postgresql.org/30/2754/
On 01.12.2020 03:08, James Coleman wrote: > On Tue, Nov 3, 2020 at 4:39 PM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: >> I've pushed the 0001 part, i.e. the main fix. Not sure about the other >> parts (comments and moving the code back to postgres_fdw) yet. > I noticed the CF entry [1] got moved to the next CF; I'm thinking this > entry should be marked as committed since the fix for the initial bug > reported on this thread has been pushed. We have the parallel safety > issue outstanding, but there's a separate thread and patch for that, > so I'll make a new CF entry for that. > > I can mark it as committed, but I'm not sure how to "undo" (or if > that's desirable) the move to the next CF. > > Thoughts? > > James > > 1: https://commitfest.postgresql.org/30/2754/ > > Oops... I must have rushed with this one, thank you for noticing. I don't see how to move it back either. I think it's fine to mark it as Committed where it is now. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Tue, Dec 1, 2020 at 4:08 AM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote: > > On 01.12.2020 03:08, James Coleman wrote: > > On Tue, Nov 3, 2020 at 4:39 PM Tomas Vondra > > <tomas.vondra@2ndquadrant.com> wrote: > >> I've pushed the 0001 part, i.e. the main fix. Not sure about the other > >> parts (comments and moving the code back to postgres_fdw) yet. > > I noticed the CF entry [1] got moved to the next CF; I'm thinking this > > entry should be marked as committed since the fix for the initial bug > > reported on this thread has been pushed. We have the parallel safety > > issue outstanding, but there's a separate thread and patch for that, > > so I'll make a new CF entry for that. > > > > I can mark it as committed, but I'm not sure how to "undo" (or if > > that's desirable) the move to the next CF. > > > > Thoughts? > > > > James > > > > 1: https://commitfest.postgresql.org/30/2754/ > > > > > Oops... > I must have rushed with this one, thank you for noticing. > I don't see how to move it back either. I think it's fine to mark it as > Committed where it is now. > BTW, I still see this one as needs review -- Jaime Casanova Professional PostgreSQL: Soporte 24x7 y capacitación
On 12/16/20 1:51 AM, Jaime Casanova wrote: > On Tue, Dec 1, 2020 at 4:08 AM Anastasia Lubennikova > <a.lubennikova@postgrespro.ru> wrote: >> >> On 01.12.2020 03:08, James Coleman wrote: >>> On Tue, Nov 3, 2020 at 4:39 PM Tomas Vondra >>> <tomas.vondra@2ndquadrant.com> wrote: >>>> I've pushed the 0001 part, i.e. the main fix. Not sure about the other >>>> parts (comments and moving the code back to postgres_fdw) yet. >>> I noticed the CF entry [1] got moved to the next CF; I'm thinking this >>> entry should be marked as committed since the fix for the initial bug >>> reported on this thread has been pushed. We have the parallel safety >>> issue outstanding, but there's a separate thread and patch for that, >>> so I'll make a new CF entry for that. >>> >>> I can mark it as committed, but I'm not sure how to "undo" (or if >>> that's desirable) the move to the next CF. >>> >>> Thoughts? >>> >>> James >>> >>> 1: https://commitfest.postgresql.org/30/2754/ >>> >>> >> Oops... >> I must have rushed with this one, thank you for noticing. >> I don't see how to move it back either. I think it's fine to mark it as >> Committed where it is now. >> > > BTW, I still see this one as needs review > Hmm, not sure what was the plan, but I'll mark it as committed once I push the remaining patches (for parallel-safe checks, SRFs etc.) in a couple days. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company