Thread: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Ranier Vilela
Date:
Hi.

IMO, I think that commit a61b1f7, has an oversight.
Currently is losing the result of recursion of function translate_col_privs_multilevel.

Once the variable result (Bitmapset pointer) is reassigned.

Without a test case for this patch.
But also, do not have a test case for the current thinko in head.

Pass regress check.

regards,
Ranier Vilela
Attachment

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
David Rowley
Date:
On Wed, 21 Dec 2022 at 13:15, Ranier Vilela <ranier.vf@gmail.com> wrote:
> IMO, I think that commit a61b1f7, has an oversight.
> Currently is losing the result of recursion of function translate_col_privs_multilevel.
>
> Once the variable result (Bitmapset pointer) is reassigned.
>
> Without a test case for this patch.
> But also, do not have a test case for the current thinko in head.

hmm, that code looks a bit suspect to me too.

Are you able to write a test that shows the bug which fails before
your change and passes after applying it? I don't think it's quite
enough to claim that your changes pass make check given that didn't
fail before your change.

Also, I think it might be better to take the opportunity to rewrite
the function to not use recursion. I don't quite see the need for it
here and it looks like that might have helped contribute to the
reported issue.  Can't we just write this as a while loop instead of
having the function call itself? It's not as if we need stack space
for keeping track of multiple parents. A child relation can only have
1 parent. It seems to me that we can just walk there by looping.

David



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Ranier Vilela
Date:
Thanks David, for looking at this.

Em ter., 20 de dez. de 2022 às 22:45, David Rowley <dgrowleyml@gmail.com> escreveu:
On Wed, 21 Dec 2022 at 13:15, Ranier Vilela <ranier.vf@gmail.com> wrote:
> IMO, I think that commit a61b1f7, has an oversight.
> Currently is losing the result of recursion of function translate_col_privs_multilevel.
>
> Once the variable result (Bitmapset pointer) is reassigned.
>
> Without a test case for this patch.
> But also, do not have a test case for the current thinko in head.

hmm, that code looks a bit suspect to me too.

Are you able to write a test that shows the bug which fails before
your change and passes after applying it? I don't think it's quite
enough to claim that your changes pass make check given that didn't
fail before your change.
No, unfortunately not yet. Of course that test case would be very nice.
But my time for postgres is very limited.
For voluntary work, without any payment, I think what I have contributed is good.


Also, I think it might be better to take the opportunity to rewrite
the function to not use recursion. I don't quite see the need for it
here and it looks like that might have helped contribute to the
reported issue.  Can't we just write this as a while loop instead of
having the function call itself? It's not as if we need stack space
for keeping track of multiple parents. A child relation can only have
1 parent. It seems to me that we can just walk there by looping.
 I took a look at the code that deals with the array (append_rel_array) and
all the loops seem different from each other and out of any pattern.
Unfortunately, I still can't get this loop to work correctly,
I need to learn more about Postgres structures and the correct way to process them.
If you can do it, I'd be happy to learn the right way.

regards,
Ranier Vilela

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Richard Guo
Date:

On Wed, Dec 21, 2022 at 9:45 AM David Rowley <dgrowleyml@gmail.com> wrote:
Also, I think it might be better to take the opportunity to rewrite
the function to not use recursion. I don't quite see the need for it
here and it looks like that might have helped contribute to the
reported issue.  Can't we just write this as a while loop instead of
having the function call itself? It's not as if we need stack space
for keeping track of multiple parents. A child relation can only have
1 parent. It seems to me that we can just walk there by looping.
 
My best guess is that this function is intended to share the same code
pattern as in adjust_appendrel_attrs_multilevel.  The recursion is
needed as 'rel' can be more than one inheritance level below the top
parent.  I think we can keep the recursion, as in other similar
functions, as long as we make it right, as in attached patch.

Thanks
Richard
Attachment

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
David Rowley
Date:
On Thu, 22 Dec 2022 at 21:18, Richard Guo <guofenglinux@gmail.com> wrote:
> My best guess is that this function is intended to share the same code
> pattern as in adjust_appendrel_attrs_multilevel.  The recursion is
> needed as 'rel' can be more than one inheritance level below the top
> parent.  I think we can keep the recursion, as in other similar
> functions, as long as we make it right, as in attached patch.

I still think we should have a test to ensure this is actually
working. Do you want to write one?

David



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Richard Guo
Date:

On Thu, Dec 22, 2022 at 5:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 22 Dec 2022 at 21:18, Richard Guo <guofenglinux@gmail.com> wrote:
> My best guess is that this function is intended to share the same code
> pattern as in adjust_appendrel_attrs_multilevel.  The recursion is
> needed as 'rel' can be more than one inheritance level below the top
> parent.  I think we can keep the recursion, as in other similar
> functions, as long as we make it right, as in attached patch.

I still think we should have a test to ensure this is actually
working. Do you want to write one?
 
I agree that we should have a test.  According to the code coverage
report, the recursion part of this function is never tested.  I will
have a try to see if I can come up with a proper test case.

Thanks
Richard

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
David Rowley
Date:
On Fri, 23 Dec 2022 at 15:21, Richard Guo <guofenglinux@gmail.com> wrote:
>
> On Thu, Dec 22, 2022 at 5:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
>> I still think we should have a test to ensure this is actually
>> working. Do you want to write one?
>
>
> I agree that we should have a test.  According to the code coverage
> report, the recursion part of this function is never tested.  I will
> have a try to see if I can come up with a proper test case.

I started having a go at writing one yesterday. I only got as far as
the following.
It looks like it'll need a trigger or something added to the foreign
table to hit the code path that would be needed to trigger the issue,
so it'll need more work. It might be a worthy starting point.

CREATE EXTENSION postgres_fdw;

-- this will need to work the same way as it does in postgres_fdw.sql
by using current_database()
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS
(dbname 'postgres', port '5432');

create table t_gc (a int, b int, c int);
create table t_c (b int, c int, a int) partition by list(a);
create table t_tlp (c int, a int, b int) partition by list (a);

CREATE FOREIGN TABLE ft_tlp (
c int,
a int,
b int
) SERVER loopback OPTIONS (schema_name 'public', table_name 't_tlp');


alter table t_c attach partition t_gc for values in (1);
alter table t_tlp attach partition t_c for values in (1);

create role perm_check login;

CREATE USER MAPPING FOR perm_check SERVER loopback OPTIONS (user
'perm_check', password_required 'false');

grant update (b) on t_tlp to perm_check;
grant update (b) on ft_tlp to perm_check;

set session authorization perm_check;

-- should pass
update ft_tlp set b = 1;

-- should fail
update ft_tlp set a = 1;
update ft_tlp set c = 1;

-- cleanup

drop foreign table ft_tlp cascade;
drop table t_tlp cascade;
drop role perm_check;
drop server loopback cascade;
drop extension postgres_fdw;

David



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Amit Langote
Date:
Hi,

Thanks everyone for noticing this.  It is indeed very obviously broken. :(

On Fri, Dec 23, 2022 at 11:26 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 23 Dec 2022 at 15:21, Richard Guo <guofenglinux@gmail.com> wrote:
> >
> > On Thu, Dec 22, 2022 at 5:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
> >> I still think we should have a test to ensure this is actually
> >> working. Do you want to write one?
> >
> >
> > I agree that we should have a test.  According to the code coverage
> > report, the recursion part of this function is never tested.  I will
> > have a try to see if I can come up with a proper test case.
>
> I started having a go at writing one yesterday. I only got as far as
> the following.
> It looks like it'll need a trigger or something added to the foreign
> table to hit the code path that would be needed to trigger the issue,
> so it'll need more work. It might be a worthy starting point.

I was looking at this last night and found that having a generated
column in the table, but not a trigger, helps hit the buggy code.
Having a generated column in the foreign partition prevents a direct
update and thus hitting the following block of
postgresPlanForeignModify():

    else if (operation == CMD_UPDATE)
    {
        int         col;
        RelOptInfo *rel = find_base_rel(root, resultRelation);
        Bitmapset  *allUpdatedCols = get_rel_all_updated_cols(root, rel);

        col = -1;
        while ((col = bms_next_member(allUpdatedCols, col)) >= 0)
        {
            /* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
            AttrNumber  attno = col + FirstLowInvalidHeapAttributeNumber;

            if (attno <= InvalidAttrNumber) /* shouldn't happen */
                elog(ERROR, "system-column update is not supported");
            targetAttrs = lappend_int(targetAttrs, attno);
        }
    }

If you add a trigger, which does help with getting a non-direct
update, the code block above this one is executed, so
get_rel_all_updated_cols() isn't called.

Attached shows a test case I was able to come up with that I can see
is broken by a61b1f74 though passes after applying Richard's patch.
What's broken is that deparseUpdateSql() outputs a remote UPDATE
statement with the wrong SET column list, because the wrong attribute
numbers would be added to the targetAttrs list by the above code block
after the buggy multi-level translation in ger_rel_all_updated_cols().

Thanks for writing the patch, Richard.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Attachment

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Amit Langote
Date:
On Fri, Dec 23, 2022 at 12:22 PM Amit Langote <amitlangote09@gmail.com> wrote:
> Attached shows a test case I was able to come up with that I can see
> is broken by a61b1f74 though passes after applying Richard's patch.

BTW, I couldn't help but notice in the output of the test case I wrote
that a generated column of a foreign table is not actually generated
locally, neither when inserting into the foreign table nor when
updating it, so it is left NULL when passing the NEW row to the remote
server.  Behavior is the same irrespective of whether the
insert/update is performed directly on the foreign table or indirectly
via an insert/update on the parent. If that's documented behavior of
postgres_fdw, maybe we are fine, but just wanted to mention that it's
not related to the bug being discussed here.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
David Rowley
Date:
On Fri, 23 Dec 2022 at 16:22, Amit Langote <amitlangote09@gmail.com> wrote:
> Attached shows a test case I was able to come up with that I can see
> is broken by a61b1f74 though passes after applying Richard's patch.

Thanks for the test case.  I'll look at this now.

+UPDATE rootp SET b = b || 'd' RETURNING a, b, c, d;
+ a |  b   |  c  | d
+---+------+-----+---
+ 1 | food | 1.1 |

Coding on an empty stomach I see! :)

David



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Richard Guo
Date:

On Fri, Dec 23, 2022 at 11:22 AM Amit Langote <amitlangote09@gmail.com> wrote:
Attached shows a test case I was able to come up with that I can see
is broken by a61b1f74 though passes after applying Richard's patch.
What's broken is that deparseUpdateSql() outputs a remote UPDATE
statement with the wrong SET column list, because the wrong attribute
numbers would be added to the targetAttrs list by the above code block
after the buggy multi-level translation in ger_rel_all_updated_cols().
 
Thanks for the test!  I looked at this and found that with WCO
constraints we can also hit the buggy code.  Based on David's test case,
I came up with the following in the morning.

CREATE FOREIGN TABLE ft_gc (
a int,
b int,
c int
) SERVER loopback OPTIONS (schema_name 'public', table_name 't_gc');

alter table t_c attach partition ft_gc for values in (1);
alter table t_tlp attach partition t_c for values in (1);

CREATE VIEW rw_view AS SELECT * FROM t_tlp where a < b WITH CHECK OPTION;

explain (verbose, costs off) update rw_view set c = 42;

Currently on HEAD, we can see something wrong in the plan.

                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Update on public.t_tlp
   Foreign Update on public.ft_gc t_tlp_1
     Remote SQL: UPDATE public.t_gc SET b = $2 WHERE ctid = $1 RETURNING a, b
   ->  Foreign Scan on public.ft_gc t_tlp_1
         Output: 42, t_tlp_1.tableoid, t_tlp_1.ctid, t_tlp_1.*
         Remote SQL: SELECT a, b, c, ctid FROM public.t_gc WHERE ((a < b)) FOR UPDATE
(6 rows)

Note that this is wrong: 'UPDATE public.t_gc SET b = $2'.

Thanks
Richard

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Amit Langote
Date:
On Fri, Dec 23, 2022 at 1:04 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Fri, Dec 23, 2022 at 11:22 AM Amit Langote <amitlangote09@gmail.com> wrote:
>> Attached shows a test case I was able to come up with that I can see
>> is broken by a61b1f74 though passes after applying Richard's patch.
>> What's broken is that deparseUpdateSql() outputs a remote UPDATE
>> statement with the wrong SET column list, because the wrong attribute
>> numbers would be added to the targetAttrs list by the above code block
>> after the buggy multi-level translation in ger_rel_all_updated_cols().
>
> Thanks for the test!  I looked at this and found that with WCO
> constraints we can also hit the buggy code.

Ah, yes.

        /*
         * Try to modify the foreign table directly if (1) the FDW provides
         * callback functions needed for that and (2) there are no local
         * structures that need to be run for each modified row: row-level
         * triggers on the foreign table, stored generated columns, WITH CHECK
         * OPTIONs from parent views.
         */
        direct_modify = false;
        if (fdwroutine != NULL &&
            fdwroutine->PlanDirectModify != NULL &&
            fdwroutine->BeginDirectModify != NULL &&
            fdwroutine->IterateDirectModify != NULL &&
            fdwroutine->EndDirectModify != NULL &&
            withCheckOptionLists == NIL &&
            !has_row_triggers(root, rti, operation) &&
            !has_stored_generated_columns(root, rti))
            direct_modify = fdwroutine->PlanDirectModify(root, node, rti, i);


>  Based on David's test case,
> I came up with the following in the morning.
>
> CREATE FOREIGN TABLE ft_gc (
> a int,
> b int,
> c int
> ) SERVER loopback OPTIONS (schema_name 'public', table_name 't_gc');
>
> alter table t_c attach partition ft_gc for values in (1);
> alter table t_tlp attach partition t_c for values in (1);
>
> CREATE VIEW rw_view AS SELECT * FROM t_tlp where a < b WITH CHECK OPTION;
>
> explain (verbose, costs off) update rw_view set c = 42;
>
> Currently on HEAD, we can see something wrong in the plan.
>
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Update on public.t_tlp
>    Foreign Update on public.ft_gc t_tlp_1
>      Remote SQL: UPDATE public.t_gc SET b = $2 WHERE ctid = $1 RETURNING a, b
>    ->  Foreign Scan on public.ft_gc t_tlp_1
>          Output: 42, t_tlp_1.tableoid, t_tlp_1.ctid, t_tlp_1.*
>          Remote SQL: SELECT a, b, c, ctid FROM public.t_gc WHERE ((a < b)) FOR UPDATE
> (6 rows)
>
> Note that this is wrong: 'UPDATE public.t_gc SET b = $2'.

Right, because of the mangled targetAttrs in postgresPlanForeignModify().

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
David Rowley
Date:
On Fri, 23 Dec 2022 at 17:04, Richard Guo <guofenglinux@gmail.com> wrote:
> Thanks for the test!  I looked at this and found that with WCO
> constraints we can also hit the buggy code.  Based on David's test case,
> I came up with the following in the morning.

I ended up going with a WCO option test in the end.  I wanted to steer
clear of having a test that has expected broken results from the
generated column code.  Also, I just couldn't help thinking the
generated column test felt like it was just glued on the end and not
really in the correct place in the file.

I've put the new WCO test in along with the existing one.  I also
considered modifying one of the existing tests to add another
partitioning level, but I ended up staying clear of that as I felt
like it caused a bit more churn than I wanted with an existing test.
The test I put together tests for the bug and also checks the WCO
works by not updating the row that's outside of the scope of the WCO
view and updating the row that is in the scope of the view.

I've now pushed your fix plus that test.

It feels a bit like famine to feast when it comes to tests for this bug today.

David



Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Ranier Vilela
Date:
Em sex., 23 de dez. de 2022 às 09:10, David Rowley <dgrowleyml@gmail.com> escreveu:

I've now pushed your fix plus that test.
Thank you all for getting involved to resolve this.

regards,
Ranier Vilela

Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

From
Amit Langote
Date:
On Fri, Dec 23, 2022 at 9:10 PM David Rowley <dgrowleyml@gmail.com> wrote
> On Fri, 23 Dec 2022 at 17:04, Richard Guo <guofenglinux@gmail.com> wrote:
> > Thanks for the test!  I looked at this and found that with WCO
> > constraints we can also hit the buggy code.  Based on David's test case,
> > I came up with the following in the morning.
>
> I ended up going with a WCO option test in the end.  I wanted to steer
> clear of having a test that has expected broken results from the
> generated column code.  Also, I just couldn't help thinking the
> generated column test felt like it was just glued on the end and not
> really in the correct place in the file.
>
> I've put the new WCO test in along with the existing one.  I also
> considered modifying one of the existing tests to add another
> partitioning level, but I ended up staying clear of that as I felt
> like it caused a bit more churn than I wanted with an existing test.
> The test I put together tests for the bug and also checks the WCO
> works by not updating the row that's outside of the scope of the WCO
> view and updating the row that is in the scope of the view.
>
> I've now pushed your fix plus that test.
>
> It feels a bit like famine to feast when it comes to tests for this bug today.

Thanks for working on this.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com