Thread: Free indexed_tlist memory explicitly within set_plan_refs()

Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

25 May 2015, 04:17:45

While trying to fix a largely unrelated bug, I noticed that the new
build_tlist_index() call for the "excluded" targetlist (used by ON
CONFLICT DO UPDATE queries) does not have its memory subsequently
freed by the caller. Since every other call to build_tlist_index()
does this, and comments above build_tlist_index() encourage it, I
think the new caller should do the same.

Attached patch adds such a pfree() call.
--
Peter Geoghegan

Attachment

add-pfree-indexed_tlist.patch
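
For readers unfamiliar with the convention discussed above, here is a minimal standalone sketch of the build/use/free pattern. It is not PostgreSQL code: every toy_* name is hypothetical, malloc()/free() stand in for palloc()/pfree(), and the single-allocation layout is only assumed to resemble the real indexed targetlist.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one targetlist column. */
typedef struct toy_tlist_entry {
    int varno;      /* which relation the column belongs to */
    int varattno;   /* attribute number within that relation */
    int resno;      /* position in the targetlist */
} toy_tlist_entry;

/* Hypothetical indexed targetlist: one allocation, so one free suffices. */
typedef struct toy_indexed_tlist {
    int nentries;
    toy_tlist_entry entries[];  /* C99 flexible array member */
} toy_indexed_tlist;

/* Build the lookup structure. The caller owns the result. */
toy_indexed_tlist *toy_build_tlist_index(const toy_tlist_entry *tlist, int n)
{
    toy_indexed_tlist *itlist;

    assert(n >= 0);
    itlist = malloc(sizeof(*itlist) + (size_t) n * sizeof(toy_tlist_entry));
    assert(itlist != NULL);     /* toy error handling only */
    itlist->nentries = n;
    for (int i = 0; i < n; i++)
        itlist->entries[i] = tlist[i];
    return itlist;
}

/* Return the resno of the matching entry, or 0 if there is none. */
int toy_search_tlist_index(const toy_indexed_tlist *itlist,
                           int varno, int varattno)
{
    for (int i = 0; i < itlist->nentries; i++)
    {
        const toy_tlist_entry *e = &itlist->entries[i];

        if (e->varno == varno && e->varattno == varattno)
            return e->resno;
    }
    return 0;
}

/* Caller pattern: build the index, use it, then free it explicitly. */
int toy_resolve_column(const toy_tlist_entry *tlist, int n,
                       int varno, int varattno)
{
    toy_indexed_tlist *itlist = toy_build_tlist_index(tlist, n);
    int resno = toy_search_tlist_index(itlist, varno, varattno);

    free(itlist);               /* the step the new caller was missing */
    return resno;
}
```

Calling toy_resolve_column() with a two-entry targetlist returns the matching resno, or 0 when nothing matches, and leaves no allocation behind either way.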

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Michael Paquier

Date:

25 May 2015, 05:45:37

On Mon, May 25, 2015 at 10:17 AM, Peter Geoghegan <pg@heroku.com> wrote:
> While trying to fix a largely unrelated bug, I noticed that the new
> build_tlist_index() call for the "excluded" targetlist (used by ON
> CONFLICT DO UPDATE queries) does not have its memory subsequently
> freed by the caller. Since every other call to build_tlist_index()
> does this, and comments above build_tlist_index() encourage it, I
> think the new caller should do the same.
>
> Attached patch adds such a pfree() call.

Yep. This looks correct to me.
-- 
Michael

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

29 May 2015, 00:38:06

On Sun, May 24, 2015 at 6:17 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached patch adds such a pfree() call.

Attached, revised version incorporates this small fix, while adding an
additional big fix, and a number of small style tweaks.

This is mainly concerned with fixing the bug I was trying to fix when
I spotted the minor pfree() issue:

postgres=# insert into upsert (key, val) values('Foo', 'Bar') on
conflict (key) do update set val = excluded.val where excluded.* is
not null;
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2003
postgres=# insert into upsert (key, val) values('Foo', 'Bar') on
conflict (key) do update set val = excluded.val where excluded.ctid is
not null;
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2003

The first query shown should clearly finish processing by the
optimizer without raising this error message (execution should work
correctly too, of course). The second query shown should fail with a
user visible error message about referencing the excluded
pseudo-relation's ctid column not making sense.

The basic problem is that there wasn't much thought put into how the
excluded pseudo-relation's "reltargetlist" is generated -- it
currently comes from a call to expandRelAttrs() during parse analysis,
which, on its own, doesn't allow whole row Vars to work.

One approach to fixing this is to follow the example of RETURNING
lists with references to more than one relation:
preprocess_targetlist() handles this by calling pull_var_clause() and
making new TargetEntry entries for Vars found to not be referencing
the target (and not already in the targetlist for some other reason).
Another approach, preferred by Andres, is to have query_planner() do
more. I understand that the idea there is to make excluded.* closer to
a regular table, in that it can be expected to have a baserel, and
within the executor we have something closer to a bona-fide scan
reltargetlist, that we can expect to have all Vars appear in. This
should be enough to make the reltargetlist have the appropriate Vars
more or less in the regular course of planning, including excluded.*
whole row Vars.  To make this work we could call
add_vars_to_targetlist(), and call add_base_rels_to_query() and then
build_base_rel_tlists() within query_planner() (while moving a few
other things around).

However, the ordering dependencies within query_planner() seemed quite
complicated to me, and I didn't want to modify such an important
routine just to fix this bug. RETURNING seemed like a perfectly good
precedent to follow, so that's what I did. Now, it might have been
that I misunderstood Andres when we discussed this problem on
Jabber/IM, but ISTM that the second idea doesn't have much advantage
over the first (I'm sure that Andres will be able to explain what he'd
like to do better here -- it was a quick conversation). I did
prototype the second approach, and the code footprint of what I have
here (the first approach) seems lower than it would have to be with
the second. Besides, I didn't see a convenient choke point to reject
system column references with the second approach. Attached patch
fixes the bug using the first approach. Tests were added demonstrating
that the cases above are fixed.

A second attached patch fixes another, largely independent bug. I
noticed array assignment with ON CONFLICT DO UPDATE was broken. See
commit message for full details.

Thoughts?
--
Peter Geoghegan

Attachment

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

29 May 2015, 04:32:03

On Thu, May 28, 2015 at 2:37 PM, Peter Geoghegan <pg@heroku.com> wrote:
> A second attached patch fixes another, largely independent bug. I
> noticed array assignment with ON CONFLICT DO UPDATE was broken. See
> commit message for full details.

Finally, here is a third patch, fixing the final bug that I discussed
with you privately. There are now fixes for all bugs that I'm
currently aware of.

This concerns a thinko in unique index inference. See the commit
message for full details.

--
Peter Geoghegan

Attachment

0003-Fix-bug-in-unique-index-inference.patch

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

29 May 2015, 13:04:37

On Thu, May 28, 2015 at 6:31 PM, Peter Geoghegan <pg@heroku.com> wrote:
> This concerns a thinko in unique index inference. See the commit
> message for full details.

It seems I missed a required defensive measure here. Attached patch
adds it, too.

--
Peter Geoghegan

Attachment

0004-Additional-defensive-measure.patch

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

30 May 2015, 11:07:20

On Thu, May 28, 2015 at 2:37 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached, revised version incorporates this small fix, while adding an
> additional big fix, and a number of small style tweaks.
>
> This is mainly concerned with fixing the bug I was trying to fix when
> I spotted the minor pfree() issue:
>
> postgres=# insert into upsert (key, val) values('Foo', 'Bar') on
> conflict (key) do update set val = excluded.val where excluded.* is
> not null;
> ERROR:  XX000: variable not found in subplan target lists
> LOCATION:  fix_join_expr_mutator, setrefs.c:2003

My fix for this issue
(0001-Fix-bug-with-whole-row-Vars-in-excluded-targetlist.patch) still
missed something. There needs to be additional handling in
ruleutils.c:

postgres=# explain insert into upsert as u values (1, 'fooz') on
conflict (key) do update set val = excluded.val where excluded.* is
not null;
ERROR:  XX000: bogus varattno for INNER_VAR var: 0
LOCATION:  get_variable, ruleutils.c:5904

I'll look for a fix for this additional issue tomorrow.
-- 
Peter Geoghegan

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

31 May 2015, 01:12:29

On Sat, May 30, 2015 at 1:07 AM, Peter Geoghegan <pg@heroku.com> wrote:
> My fix for this issue
> (0001-Fix-bug-with-whole-row-Vars-in-excluded-targetlist.patch) still
> missed something. There needs to be additional handling in
> ruleutils.c:

Debugging this allowed me to come up with a significantly simplified
approach. Attached is a new version of the original fix. Details are
in commit message -- there is no actual need to have
search_indexed_tlist_for_var() care about Vars being resjunk in a
special way, which is a good thing. There is also no need for further
ruleutils.c specialization, as I implied before.

Some deparsing tests are now included on top of what was already in
the first version.
--
Peter Geoghegan

Attachment

0001-Fix-bug-with-whole-row-Vars-in-excluded-targetlist.patch

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

31 May 2015, 01:30:50

On Sat, May 30, 2015 at 3:12 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Debugging this allowed me to come up with a significantly simplified
> approach. Attached is a new version of the original fix. Details are
> in commit message -- there is no actual need to have
> search_indexed_tlist_for_var() care about Vars being resjunk in a
> special way, which is a good thing.

It feels wrong not to have an additional, paranoid IsVar() check
within pull_var_targetlist_clause() (added in the most recent
revision), even though it should not be necessary. Attached delta patch
adds this check.

I need to stop working on weekends...
--
Peter Geoghegan

Attachment

additional-paranoid-check.patch

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Andres Freund

Date:

12 July 2015, 23:45:42

On 2015-05-28 14:37:43 -0700, Peter Geoghegan wrote:
> To fix, allow ParseState to reflect that an individual statement can be
> both p_is_insert and p_is_update at the same time.

>      /* Process DO UPDATE */
>      if (onConflictClause->action == ONCONFLICT_UPDATE)
>      {
> +        /* p_is_update must be set here, after INSERT targetlist processing */
> +        pstate->p_is_update = true;
> +

It's not particularly pretty that you document in the commit message
that both is_insert and is_update can be set at the same time, and then
it has constraints like the above.

But that's more crummy API's fault than yours.

I'm right now not really coming up with a better idea how to fix
this. So I guess I'll apply something close to this tomorrow.

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

13 July 2015, 00:41:32

On Sun, Jul 12, 2015 at 1:45 PM, Andres Freund <andres@anarazel.de> wrote:
> But that's more crummy API's fault than yours.

As you probably noticed, the only reason the p_is_update and
p_is_insert fields exist is for transformAssignedExpr() -- in fact, in
the master branch, nothing checks the value of p_is_update (although I
suppose a hook into a third-party module could see that module test
ParseState.p_is_update, so that isn't quite true).

> I'm right now not really coming up with a better idea how to fix
> this. So I guess I'll apply something close to this tomorrow.

Sounds good.

-- 
Peter Geoghegan

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Andres Freund

Date:

24 July 2015, 12:58:08

On 2015-07-12 22:45:18 +0200, Andres Freund wrote:
> I'm right now not really coming up with a better idea how to fix
> this. So I guess I'll apply something close to this tomorrow.

Took a bit longer than that :(

I've pushed a version of your patch. I just opted to remove p_is_update
instead of allowing both to be set at the same time. To me that seemed
simpler.

Thanks for the fix!

Andres

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Andres Freund

Date:

24 July 2015, 13:08:48

On 2015-05-28 18:31:56 -0700, Peter Geoghegan wrote:
> Subject: [PATCH 3/3] Fix bug in unique index inference
> 
> ON CONFLICT unique index inference had a thinko that could affect cases
> where the user-supplied inference clause required that an attribute
> match a particular (user named) collation and/or opclass.
> 
> Firstly, infer_collation_opclass_match() matched on opclass and/or
> collation.  Secondly, the attribute had to be in the list of attributes
> or expressions known to be in the definition of the index under
> consideration.  The second step wasn't correct though, because having
> some match doesn't necessarily mean that the second step found the same
> index attribute as the (collation/opclass wise) match from the first
> step.

Yes, makes sense.

> +        else
> +        {
> +            Node       *nattExpr = list_nth(idxExprs, (natt - 1) - nplain);
> +
> +            /*
> +             * Note that unlike routines like match_index_to_operand(), we're
> +             * unconcerned about RelabelType.  An exact match is required.
> +             */
> +            if (equal(elem->expr, nattExpr))
> +                return true;

Why is that?

Regards,

Andres

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

24 July 2015, 20:57:47

On Fri, Jul 24, 2015 at 2:58 AM, Andres Freund <andres@anarazel.de> wrote:
> I've pushed a version of your patch. I just opted to remove p_is_update
> instead of allowing both to be set at the same time. To me that seemed
> simpler.

I would be hesitant to remove a struct field from a struct that
appears as a hook argument. Someone's extension (that uses parser
hooks) might have been relying on that. Perhaps not a big deal.

Thanks
-- 
Peter Geoghegan

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Andres Freund

Date:

24 July 2015, 20:58:31

On July 24, 2015 7:57:43 PM GMT+02:00, Peter Geoghegan <pg@heroku.com> wrote:
>On Fri, Jul 24, 2015 at 2:58 AM, Andres Freund <andres@anarazel.de> wrote:
>> I've pushed a version of your patch. I just opted to remove p_is_update
>> instead of allowing both to be set at the same time. To me that seemed
>> simpler.
>
>I would be hesitant to remove a struct field from a struct that
>appears as a hook argument. Someone's extension (that uses parser
>hooks) might have been relying on that. Perhaps not a big deal.

They'd also be affected by the change in meaning...

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Re: Free indexed_tlist memory explicitly within set_plan_refs()

From

Peter Geoghegan

Date:

24 July 2015, 21:55:35

On Fri, Jul 24, 2015 at 3:08 AM, Andres Freund <andres@anarazel.de> wrote:
>> +             else
>> +             {
>> +                     Node       *nattExpr = list_nth(idxExprs, (natt - 1) - nplain);
>> +
>> +                     /*
>> +                      * Note that unlike routines like match_index_to_operand(), we're
>> +                      * unconcerned about RelabelType.  An exact match is required.
>> +                      */
>> +                     if (equal(elem->expr, nattExpr))
>> +                             return true;
>
> Why is that?

No very strong reason. RelabelType exists to represent a dummy
coercion between two binary-compatible types. I think that a unique
index inference specification (which is novel in some ways) does not
need to do anything special for this case.

Each inference specification attribute that is an expression should
match some attribute in some index's cataloged definition. The
inference specification looks very much like the CREATE UNIQUE INDEX
that created the unique index that is inferred (usually, they'll be
identical). No need to make it any more complicated than that.

In fact, I don't think it's possible to construct a case where it
could even be argued that it matters. I'm not very caffeinated at the
moment, so I'm not sure of that.

-- 
Peter Geoghegan
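
To make the list_nth(idxExprs, (natt - 1) - nplain) arithmetic in the quoted patch easier to follow, here is a small standalone sketch. It assumes (this is not shown in the excerpt) that nplain counts the plain-column attributes seen up to and including attribute natt, and that the expression list holds only the expression attributes in index order. All toy_* names are hypothetical, and strcmp() merely stands in for an exact equal() comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy index definition.  indkey[i] != 0 names a plain table column;
 * indkey[i] == 0 marks an expression attribute whose text is stored in
 * exprs[], in order of appearance (a stand-in for idxExprs).
 */
typedef struct toy_index {
    int natts;
    const int *indkey;
    const char *const *exprs;
} toy_index;

/*
 * Return the expression behind 1-based index attribute natt, or NULL if
 * that attribute is a plain column.
 */
const char *toy_index_attr_expr(const toy_index *idx, int natt)
{
    int nplain = 0;

    assert(natt >= 1 && natt <= idx->natts);

    /* Count plain columns among attributes 1..natt. */
    for (int i = 1; i <= natt; i++)
    {
        if (idx->indkey[i - 1] != 0)
            nplain++;
    }

    if (idx->indkey[natt - 1] != 0)
        return NULL;

    /* Plain columns occupy no slot in exprs[], so skip over them. */
    return idx->exprs[(natt - 1) - nplain];
}

/*
 * Exact match against this specific attribute only.  No relabel-style
 * normalization is attempted, mirroring the "exact match is required"
 * comment in the patch.
 */
int toy_attr_matches_expr(const toy_index *idx, int natt, const char *expr)
{
    const char *nattExpr = toy_index_attr_expr(idx, natt);

    return nattExpr != NULL && strcmp(expr, nattExpr) == 0;
}
```

For a toy index on (a, lower(b), c, upper(d)), attribute 2 maps to slot 0 of the expression list and attribute 4 to slot 1. Testing lower(b) against attribute 4 fails, which is the "same index attribute" requirement described in the commit message.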

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

20 September 2015, 03:11:31

Hi,

To recap for other readers: There's a problem with ON CONFLICT when the
SET or ON CONFLICT ... WHERE clause references excluded.* (i.e. as a
whole row var).  The problem is that setrefs.c in
fix_join_expr_mutator() currently won't find a matching entry in the
indexed tlist and thus errors out with
    elog(ERROR, "variable not found in subplan target lists");

The reason is that the targetlist we build the index list on just
contains the attributes in excluded.*.

Peter's patch upthread fixes this by pulling expressions from
onConflictSet/Where into the targetlist. I disliked this - much less
than initially - a bit because that seems a bit crufty given that we're
not actually getting data from a child node.  This is different to
RETURNING where the targetlist massaging is actually important to get
the data up the tree.

An actually trivial, although not all that pretty, fix is to simply
accept wholerow references in fix_join_expr_mutator(), even if not in
the targetlist. As far as I can see the problem right now really can
only be hit for whole row references.

A variant of the second approach is to have a fix_onconflict_expr()
mutator that has such special handler.

Any opinions on either approach?

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

20 September 2015, 04:40:20

On Sat, Sep 19, 2015 at 5:11 PM, Andres Freund <andres@anarazel.de> wrote:
> Peter's patch upthread fixes this by pulling expressions from
> onConflictSet/Where into the targetlist. I disliked this - much less
> than initially - a bit because that seems a bit crufty given that we're
> not actually getting data from a child node.  This is different to
> RETURNING where the targetlist massaging is actually important to get
> the data up the tree.

Maybe the massaging is somewhat justified by the fact that it's just
as good a place as any to reject system columns, and that needs to
happen somewhere.  I know that you suggested that this be done during
parse analysis, but not sure how attached you are to that. Might also
be a good choke point for detecting when unexpected vars/expressions
appear in the targetlist due to unforeseen circumstances/bugs. I
actually cover a couple of "can't happen" cases at the same time,
IIRC.

Continuing to follow RETURNING may have some value, even if the
analogy is a bit more forced here.

> An actually trivial, although not all that pretty, fix is to simply
> accept wholerow references in fix_join_expr_mutator(), even if not in
> the targetlist. As far as I can see the problem right now really can
> only be hit for whole row references.

I am concerned about the risk of adding bugs to unrelated code paths
that this could create. I must admit that this concern is mostly
driven by paranoia, and not a seasoned appreciation of problems that
could arise from ordinary post-processing of join expressions.

> A variant of the second approach is to have a fix_onconflict_expr()
> mutator that has such special handler.

I suppose you could add a fix_join_expr_context field that had
fix_join_expr_mutator() avoid the special handler for post-processing
of join expressions. That might be a bit ugly too, but would involve
less code duplication.

> Any opinions on either approach?

I think that I favor my original solution, although only by a tiny
margin. I will avoid offering either a -1 or a +1 to any proposal
here, although they all sound basically reasonable to me. A more
complete targetlist representation would have been something that I'd
probably vote against, since it seems complex and invasive, but that
doesn't matter now. In short, I defer to others here.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

24 September 2015, 18:25:25

On 2015-09-19 18:40:14 -0700, Peter Geoghegan wrote:
> > An actually trivial, although not all that pretty, fix is to simply
> > accept wholerow references in fix_join_expr_mutator(), even if not in
> > the targetlist. As far as I can see the problem right now really can
> > only be hit for whole row references.
> 
> I am concerned about the risk of adding bugs to unrelated code paths
> that this could create.

How? This is a must-not-reach code path currently?

Stuff I want to fix by tomorrow:
* Whole row var references to excluded
* wrong offsets for columns after dropped ones
* INSTEAD DO UPDATE for tables with oids

Do you know of anything else?

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

25 September 2015, 11:43:12

On Thu, Sep 24, 2015 at 8:25 AM, Andres Freund <andres@anarazel.de> wrote:
> Stuff I want to fix by tomorrow:
> * Whole row var references to excluded
> * wrong offsets for columns after dropped ones
> * INSTEAD DO UPDATE for tables with oids
>
> Do you know of anything else?

You said something in Dallas about the test case developed by Amit
Langote touching on a different bug to the regression test I came up
with. If that is the case, then you didn't list that one separately.
Otherwise, no.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

29 September 2015, 18:24:52

On 2015-09-24 17:25:21 +0200, Andres Freund wrote:
> Stuff I want to fix by tomorrow:
> * Whole row var references to excluded
> * wrong offsets for columns after dropped ones
> * INSTEAD DO UPDATE for tables with oids
>
> Do you know of anything else?

So, took a bit longer than "tomorrow". I fought for a long while with a
mysterious issue, which turned out to be a separate bug: The excluded
relation was affected by row level security policies, which doesn't make
sense.

My proposal in this WIP patch is to make it a bit clearer that
'EXCLUDED' isn't a real relation. I played around with adding a
different rtekind, but that's too heavy a hammer. What I instead did was
to set relkind to composite - which seems to signal pretty well that
we're not dealing with a real relation. That immediately fixes the RLS
issue as fireRIRrules has the following check:
        if (rte->rtekind != RTE_RELATION ||
            rte->relkind != RELKIND_RELATION)
            continue;
It also makes it relatively straightforward to fix the system column
issue by adding an additional relkind check to scanRTEForColumn's system
column handling.

WRT the wholerow issue: There are currently two reasons we need a
targetlist entry for excluded wholerow vars: 1) setrefs.c errors out
without - that can relatively easily be worked around 2) ruleutils.c
expects an entry in the child tlist. That could also be worked around,
but it's a bit more verbose.  I'm inclined to not go the pullup route
but instead simply unconditionally add a wholerow var to the excluded
tlist.

Peter, what do you think?

Andres

Attachment

0001-wip-upsert.patch
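
As a rough illustration of why the lookup fails and what "unconditionally add a wholerow var to the excluded tlist" would change, here is a standalone toy model. It is not the WIP patch: the toy_* names are hypothetical, and the only fact borrowed from PostgreSQL is that attribute number 0 denotes a whole-row reference (as the "bogus varattno ... 0" error upthread suggests).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TLIST 16
#define TOY_WHOLE_ROW_ATTNO 0   /* attribute number 0 = whole-row reference */

/* Toy targetlist: just the attribute numbers it exposes, in order. */
typedef struct toy_tlist {
    int nentries;
    int varattnos[TOY_MAX_TLIST];
} toy_tlist;

/*
 * Build the targetlist for a relation with natts user columns, the way a
 * column-by-column expansion of excluded.* would.  With add_whole_row
 * set, a whole-row entry is appended unconditionally.
 */
toy_tlist toy_expand_rel_attrs(int natts, bool add_whole_row)
{
    toy_tlist tl = {0};

    assert(natts >= 0 && natts + 1 <= TOY_MAX_TLIST);
    for (int attno = 1; attno <= natts; attno++)
        tl.varattnos[tl.nentries++] = attno;
    if (add_whole_row)
        tl.varattnos[tl.nentries++] = TOY_WHOLE_ROW_ATTNO;
    return tl;
}

/*
 * Return the 1-based position of varattno in the targetlist, or 0 when
 * it is absent (the "variable not found" case).
 */
int toy_search_tlist(const toy_tlist *tl, int varattno)
{
    for (int i = 0; i < tl->nentries; i++)
    {
        if (tl->varattnos[i] == varattno)
            return i + 1;
    }
    return 0;
}
```

With add_whole_row false, a search for attribute 0 returns 0, which corresponds to the error path. With it true, the whole-row reference resolves to the last targetlist position and ordinary column lookups are unaffected.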

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

29 September 2015, 21:52:25

On Tue, Sep 29, 2015 at 8:24 AM, Andres Freund <andres@anarazel.de> wrote:
> So, took a bit longer than "tomorrow". I fought for a long while with a
> mysterious issue, which turned out to be a separate bug: The excluded
> relation was affected by row level security policies, which doesn't make
> sense.

Why? You certainly thought that it made sense for conventional column
permissions due to potential problems with before row insert triggers.
Why would RLS be different? Some of my concerns with RLS were that it
is different to the existing permissions model in a way that seems a
bit arbitrary. I don't think that they were changed to do anything
special with SELECT ... FOR UPDATE, which has always required some
amount of conventional UPDATE privilege.

I specifically remember discussing this with you off list (on IM,
roughly a couple of weeks prior to initial commit). I recommended that
we err towards a more restrictive behavior in the absence of any
strong principle pushing us one way or the other. You seemed to agree.

> My proposal in this WIP patch is to make it a bit clearer that
> 'EXCLUDED' isn't a real relation. I played around with adding a
> different rtekind, but that's too heavy a hammer. What I instead did was
> to set relkind to composite - which seems to signal pretty well that
> we're not dealing with a real relation. That immediately fixes the RLS
> issue as fireRIRrules has the following check:
>                 if (rte->rtekind != RTE_RELATION ||
>                         rte->relkind != RELKIND_RELATION)
>                         continue;

Well, not sure that that's a good thing. Let's discuss.

> It also makes it relatively straightforward to fix the system column
> issue by adding an additional relkind check to scanRTEForColumn's system
> column handling.

That seems fine.

> WRT the wholerow issue: There are currently two reasons we need a
> targetlist entry for excluded wholerow vars: 1) setrefs.c errors out
> without - that can relatively easily be worked around 2) ruleutils.c
> expects an entry in the child tlist. That could also be worked around,
> but it's a bit more verbose.  I'm inclined to not go the pullup route
> but instead simply unconditionally add a wholerow var to the excluded
> tlist.

I suppose that we have a tight enough grip on the targetlist that it's
hard to imagine anything else being introduced there spuriously. I had
thought that the pull-up did allow useful additional
defense/sanitization, but that may not be an excellent argument. The
only remaining argument is that my approach is closer to RETURNING,
but that doesn't seem like an excellent argument.

Basically, I think that this is fine.

However, there were a number of small stylistic tweaks made in passing
within my original patch -- minor things around consistency. Please
either restore these, or commit them separately.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

29 September 2015, 22:35:24

On September 29, 2015 8:52:14 PM GMT+02:00, Peter Geoghegan <pg@heroku.com> wrote:
>On Tue, Sep 29, 2015 at 8:24 AM, Andres Freund <andres@anarazel.de> wrote:
>> So, took a bit longer than "tomorrow". I fought for a long while with a
>> mysterious issue, which turned out to be a separate bug: The excluded
>> relation was affected by row level security policies, which doesn't make
>> sense.
>
>Why? You certainly thought that it made sense for conventional column
>permissions due to potential problems with before row insert triggers.
>Why would RLS be different?

What would it mean? And why would it make sense to apply rls to a values list?

nodeModify already has the necessary rls invocations, no?

Andres


Re: ON CONFLICT issues around whole row vars,

From

Stephen Frost

Date:

29 September 2015, 22:49:36

* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Sep 29, 2015 at 8:24 AM, Andres Freund <andres@anarazel.de> wrote:
> > So, took a bit longer than "tomorrow". I fought for a long while with a
> > mysterious issue, which turned out to be a separate bug: The excluded
> > relation was affected by row level security policies, which doesn't make
> > sense.
>
> Why? You certainly thought that it made sense for conventional column
> permissions due to potential problems with before row insert triggers.
> Why would RLS be different? Some of my concerns with RLS were that it
> is different to the existing permissions model in a way that seems a
> bit arbitrary. I don't think that they were changed to do anything
> special with SELECT ... FOR UPDATE, which has always required some
> amount of conventional UPDATE privilege.

I'm just about to push a patch to address exactly the SELECT .. FOR
UPDATE case, actually, and to try and make sure that RLS is more in line
with the existing permissions model..  I admit that I've not been
entirely following this thread though, so I'm not quite sure how that's
relevant to this discussion.

From Andres' reply, it looks like this is about the EXCLUDED pseudo
relation which comes from the INSERT'd values themselves, in which case,
I tend to agree with his assessment that it doesn't make sense for those
to be subject to RLS policies, given that it's all user-provided data,
as long as the USING check is done on the row found to be conflicting
and the CHECK constraints are dealt with correctly for any row added,
which I believe is what we had agreed was the correct way to handle this
case in prior discussions.

Thanks!

Stephen

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

01 October 2015, 13:42:12

On 2015-09-29 15:49:28 -0400, Stephen Frost wrote:
> From Andres' reply, it looks like this is about the EXCLUDED pseudo
> relation which comes from the INSERT'd values themselves

Right.

> in which case, I tend to agree with his assessment that it doesn't
> make sense for those to be subject to RLS policies, given that it's
> all user-provided data, as long as the USING check is done on the row
> found to be conflicting and the CHECK constraints are dealt with
> correctly for any row added, which I believe is what we had agreed was
> the correct way to handle this case in prior discussions.

Yes, that's what I think as well.  At this point we'll already have
executed insert rls stuff on the EXCLUDED tuple:
    /*
     * Check any RLS INSERT WITH CHECK policies
     *
     * ExecWithCheckOptions() will skip any WCOs which are not of the kind
     * we are looking for at this point.
     */
    if (resultRelInfo->ri_WithCheckOptions != NIL)
        ExecWithCheckOptions(WCO_RLS_INSERT_CHECK,
                             resultRelInfo, slot, estate);

and before executing the actual projection we also checked the existing
tuple:
    ExecWithCheckOptions(WCO_RLS_CONFLICT_CHECK, resultRelInfo,
                         mtstate->mt_existing,
                         mtstate->ps.state);

after the update triggers have, if applicable, run, we run the normal
checks there as well because it's just ExecUpdate()
    if (resultRelInfo->ri_WithCheckOptions != NIL)
        ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK,
                             resultRelInfo, slot, estate);

so I do indeed think that there's no point in layering RLS above
EXCLUDED.

Greetings,

Andres Freund
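
The three check sites listed above can be modeled with a small standalone sketch. It only illustrates the "skip any WCOs which are not of the kind we are looking for" behavior mentioned in the quoted comment. Everything named toy_* is hypothetical, and the toy function returns false where the real ExecWithCheckOptions() would presumably raise an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counterparts of the three WCO kinds named in this message. */
typedef enum toy_wco_kind {
    TOY_WCO_RLS_INSERT_CHECK,
    TOY_WCO_RLS_CONFLICT_CHECK,
    TOY_WCO_RLS_UPDATE_CHECK
} toy_wco_kind;

typedef struct toy_tuple {
    int owner_id;
} toy_tuple;

/* One check option: a kind plus a qual the tuple must satisfy. */
typedef struct toy_wco {
    toy_wco_kind kind;
    bool (*qual)(const toy_tuple *tuple);
} toy_wco;

/* Two hypothetical policy quals, used only for illustration. */
bool toy_owner_nonzero(const toy_tuple *tuple)
{
    return tuple->owner_id != 0;
}

bool toy_owner_is_one(const toy_tuple *tuple)
{
    return tuple->owner_id == 1;
}

/*
 * Evaluate only the check options of the requested kind; entries of any
 * other kind are skipped.  Returns false on the first failing qual.
 */
bool toy_exec_with_check_options(toy_wco_kind kind,
                                 const toy_wco *wcos, int nwcos,
                                 const toy_tuple *tuple)
{
    for (int i = 0; i < nwcos; i++)
    {
        if (wcos[i].kind != kind)
            continue;
        if (!wcos[i].qual(tuple))
            return false;
    }
    return true;
}
```

In this model the same list of check options is consulted at each site, and the kind argument decides which entries apply to the tuple at hand: the excluded tuple with the insert kind, the existing tuple with the conflict kind, and the new row version with the update kind.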

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

01 October 2015, 13:53:30

On 2015-09-29 11:52:14 -0700, Peter Geoghegan wrote:
> On Tue, Sep 29, 2015 at 8:24 AM, Andres Freund <andres@anarazel.de> wrote:
> > So, took a bit longer than "tomorrow". I fought for a long while with a
> > mysterious issue, which turned out to be a separate bug: The excluded
> > relation was affected by row level security policies, which doesn't make
> > sense.
> 
> Why? You certainly thought that it made sense for conventional column
> permissions due to potential problems with before row insert triggers.

I don't see how those compare:

> I specifically remember discussing this with you off list (on IM,
> roughly a couple of weeks prior to initial commit). I recommended that
> we err towards a more restrictive behavior in the absence of any
> strong principle pushing us one way or the other. You seemed to agree.

I don't think this really is comparable. Comparing this with a plain
INSERT or UPDATE this would be akin to running RLS on the RETURNING
tuple - which we currently don't.

I think this was just a bug.

> I suppose that we have a tight enough grip on the targetlist that it's
> hard to imagine anything else being introduced there spuriously. I had
> thought that the pull-up did allow useful additional
> defense/sanitization, but that may not be an excellent argument. The
> only remaining argument is that my approach is closer to RETURNING,
> but that doesn't seem like an excellent argument.

I indeed don't think this is comparable to RETURNING - the pullup there
is into an actual querytree above unrelated relations.

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

02 October 2015, 02:13:17

On Thu, Oct 1, 2015 at 3:42 AM, Andres Freund <andres@anarazel.de> wrote:
> Yes, that what I think as well.  At this point we'll already have
> executed insert rls stuff on the EXCLUDED tuple:
>                 /*
>                  * Check any RLS INSERT WITH CHECK policies
>                  *
>                  * ExecWithCheckOptions() will skip any WCOs which are not of the kind
>                  * we are looking for at this point.
>                  */
>                 if (resultRelInfo->ri_WithCheckOptions != NIL)
>                         ExecWithCheckOptions(WCO_RLS_INSERT_CHECK,
>                                                                  resultRelInfo, slot, estate);
> and before executing the actual projection we also checked the existing
> tuple:
>                 ExecWithCheckOptions(WCO_RLS_CONFLICT_CHECK, resultRelInfo,
>                                                          mtstate->mt_existing,
>                                                          mtstate->ps.state);
>
> after the update triggers have run, if applicable, we run the normal
> checks there as well because it's just ExecUpdate()
>                 if (resultRelInfo->ri_WithCheckOptions != NIL)
>                         ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK,
>                                                                  resultRelInfo, slot, estate);
>
> so I do indeed think that there's no point in layering RLS above
> EXCLUDED.

I see your point, I think. It might be a problem if we weren't already
making the statement error out, but we are.

However, we're checking the excluded tuple (the might-be-inserted,
might-be-excluded tuple that reflects before row insert trigger
effects) with WCO_RLS_INSERT_CHECK, not WCO_RLS_UPDATE_CHECK. The
WCO_RLS_UPDATE_CHECK applies to the tuple to be appended to the
relation (the tuple that an UPDATE makes supersede some existing
tuple, a new row version).
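
The distinction can be sketched with a small, self-contained model of kind-filtered check options. This is an illustration under simplifying assumptions, not the executor code: `Tuple`, `WithCheckOption`, `check_wcos()` and the two predicates are hypothetical stand-ins, and only the `WCO_*` kind names come from the snippets quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kinds of check option, named after the ones in the quoted code. */
typedef enum {
    WCO_VIEW_CHECK,
    WCO_RLS_INSERT_CHECK,
    WCO_RLS_UPDATE_CHECK,
    WCO_RLS_CONFLICT_CHECK
} WCOKind;

/* Hypothetical one-column tuple, for illustration only. */
typedef struct { int val; } Tuple;

/* Hypothetical check option: a kind plus a predicate over the tuple. */
typedef struct {
    WCOKind kind;
    bool (*qual)(const Tuple *);
} WithCheckOption;

/*
 * Evaluate only the options of the requested kind and skip the rest.
 * Returns the index of the first failing option, or -1 if every option
 * of that kind passes (or none of that kind exists).
 */
int check_wcos(WCOKind kind, const WithCheckOption *wcos, size_t n,
               const Tuple *tuple)
{
    assert(wcos != NULL && tuple != NULL);
    for (size_t i = 0; i < n; i++) {
        if (wcos[i].kind != kind)
            continue;               /* not the kind we are looking for */
        if (!wcos[i].qual(tuple))
            return (int) i;         /* the real code would raise an error */
    }
    return -1;
}

/* Hypothetical policies: an insert-applicable and an update-applicable one. */
bool val_is_positive(const Tuple *t) { return t->val > 0; }
bool val_is_small(const Tuple *t)    { return t->val < 100; }
```

In this model, a tuple with `val = 500` passes a `WCO_RLS_INSERT_CHECK` pass even though it would fail the update-applicable predicate, which is the gap described above: the excluded tuple is only ever tested against insert-applicable options.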

We all seem to be in agreement that excluded.* ought to be subject to
column-level privilege enforcement, mostly due to possible leaks with
before row insert triggers (these could be SoD; a malicious UPSERT
could be written a certain way). None of the checks in the code above
are the exact RLS equivalent of the principle we have for column
privileges, AFAICT, because update-applicable policies (everything but
insert-applicable policies, actually) are not checked against the
excluded tuple. Shouldn't select-applicable policies also be applied
to the excluded tuples, just as with UPDATE ... FROM "join from"
tables, which excluded is kinda similar to?

I'm not trying to be pedantic; I just don't grok the underlying
principles here. Couldn't a malicious WHERE clause leak the excluded.*
tuple contents (and cause the UPDATE to not proceed) before the
WCO_RLS_CONFLICT_CHECK call site was reached, while also preventing it
from ever actually being reached (with a malicious function that
returns false after stashing excluded.* elsewhere)? You can put
volatile functions in UPDATE WHERE clauses, even if it is generally a
bad idea.

Perhaps I'm simply not following you here, though. I think that this
is one challenge with having per-command policies with a system that
checks permissions dynamically (not during parse analysis). Note that
I'm not defending the status quo of the master branch -- I'm just a
little uneasy about what the ideal, least surprising behavior is here.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

02 October 2015, 02:18:03

On 2015-10-01 16:13:12 -0700, Peter Geoghegan wrote:
> However, we're checking the excluded tuple (the might-be-inserted,
> might-be-excluded tuple that reflects before row insert trigger
> effects) with WCO_RLS_INSERT_CHECK, not WCO_RLS_UPDATE_CHECK. The
> WCO_RLS_UPDATE_CHECK applies to the tuple to be appended to the
> relation (the tuple that an UPDATE makes supersede some existing
> tuple, a new row version).

You can already see the effects of an INSERT modified by before triggers
via RETURNING. No?

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

02 October 2015, 02:20:04

On Thu, Oct 1, 2015 at 3:53 AM, Andres Freund <andres@anarazel.de> wrote:
>> I specifically remember discussing this with you off list (on IM,
>> roughly a couple of weeks prior to initial commit). I recommended that
>> we err towards a more restrictive behavior in the absence of any
>> strong principle pushing us one way or the other. You seemed to agree.
>
> I don't think this really is comparable. Comparing this with a plain
> INSERT or UPDATE this would be akin to running RLS on the RETURNING
> tuple - which we currently don't.
>
> I think this was just a bug.

Maybe that's the problem here; I still thought that we were planning
on changing RLS in this regard, but it actually seems we changed
course, looking at the 9.5 open items list.

I would say that that's a clear divergence between RLS and column
privileges. That might be fine, but it doesn't match my prior
understanding of RLS (or, more accurately, how it was likely to change
pre-release).

If that's the design that we want for RLS across the board, then I'm
happy to defer to that decision.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

02 October 2015, 02:26:11

On Thu, Oct 1, 2015 at 4:17 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-10-01 16:13:12 -0700, Peter Geoghegan wrote:
>> However, we're checking the excluded tuple (the might-be-inserted,
>> might-be-excluded tuple that reflects before row insert trigger
>> effects) with WCO_RLS_INSERT_CHECK, not WCO_RLS_UPDATE_CHECK. The
>> WCO_RLS_UPDATE_CHECK applies to the tuple to be appended to the
>> relation (the tuple that an UPDATE makes supersede some existing
>> tuple, a new row version).
>
> You can already see the effects of an INSERT modified by before triggers
> via RETURNING. No?

I'm not saying that I agree with the decision to not do anything
special with RLS + RETURNING in general. I'm also not going to say
that I disagree with it. As I said, I missed that decision until just
now. I agree that it's obviously true that what you propose is
consistent with what is now considered ideal behavior for RLS (that's
what I get from the wiki page comments on RLS + RETURNING).

FWIW, I think that this technically wasn't a bug, because it was based
on a deliberate design decision that I thought (not without
justification) was consistent with what we wanted for RLS across the
board. But, again, happy to go along with what you say in light of
this new information.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

02 October 2015, 02:49:45

On 2015-10-01 16:26:07 -0700, Peter Geoghegan wrote:
> FWIW, I think that this technically wasn't a bug

Meh. In which scenario would a policy applied to EXCLUDED actually do
anything reasonable?

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

02 October 2015, 02:53:20

On Thu, Oct 1, 2015 at 4:26 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> You can already see the effects of an INSERT modified by before triggers
>> via RETURNING. No?
>
> I'm not saying that I agree with the decision to not do anything
> special with RLS + RETURNING in general. I'm also not going to say
> that I disagree with it. As I said, I missed that decision until just
> now. I agree that it's obviously true that what you propose is
> consistent with what is now considered ideal behavior for RLS (that's
> what I get from the wiki page comments on RLS + RETURNING).

I see now that commit 4f3b2a8883 changed things for UPDATE and DELETE
statements, but not INSERT statements. I guess my unease is because
that isn't entirely consistent with INSERT + RETURNING and the GRANT
system. Logically, the only possible justification for our long
standing INSERT and RETURNING behavior with GRANT (the fact that it
requires SELECT privilege for rows returned, just like UPDATE and
DELETE) is that before row insert triggers could do something secret
(e.g. they could be security definer). It doesn't seem to be too much
of a stretch to suppose the same should apply with RLS.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

02 October 2015, 02:55:26

On Thu, Oct 1, 2015 at 4:49 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-10-01 16:26:07 -0700, Peter Geoghegan wrote:
>> FWIW, I think that this technically wasn't a bug
>
> Meh. In which scenario would a policy applied to EXCLUDED actually do
> anything reasonable?

I agree that it's very unlikely to matter. Consistency is something
that is generally valued, though.

I'm not going to object if you want to continue with committing
something that changes excluded + RLS. I was just explaining my view
of the matter.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

02 October 2015, 03:12:40

On 2015-10-01 16:55:23 -0700, Peter Geoghegan wrote:
> On Thu, Oct 1, 2015 at 4:49 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2015-10-01 16:26:07 -0700, Peter Geoghegan wrote:
> >> FWIW, I think that this technically wasn't a bug
> >
> > Meh. In which scenario would a policy applied to EXCLUDED actually do
> > anything reasonable?
> 
> I agree that it's very unlikely to matter. Consistency is something
> that is generally valued, though.

I don't think you get my gist.

I can't see how the current code can do anything sensible at all. What
do you think is going to be the effect of an excluded row that doesn't
meet security quals? Even if it worked in the sense that the correct
data were accessed and everything - which I doubt is completely the case
as things stand, given there's no actual scan node and stuff - you'd
still have EXCLUDED.* being used in the projection for the new version
of the tuple.

As far as I can see the only correct thing you could do in that
situation is error out.

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Peter Geoghegan

Date:

02 October 2015, 03:41:33

On Thu, Oct 1, 2015 at 5:12 PM, Andres Freund <andres@anarazel.de> wrote:
> I can't see how the current code can do anything sensible at all. What
> do you think is going to be the effect of an excluded row that doesn't
> meet security quals? Even if it worked in the sense that the correct
> data were accessed and everything - which I doubt is completely the case
> as things stand, given there's no actual scan node and stuff - you'd
> still have EXCLUDED.* being used in the projection for the new version
> of the tuple.
>
> As far as I can see the only correct thing you could do in that
> situation is error out.

I agree. I wasn't defending the current code (although that might have
been made unclear by the "technically wasn't a bug" remark).

Note that I'm not telling you what I think needs to happen. I'm just
explaining my understanding of what has happened.

-- 
Peter Geoghegan

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

03 October 2015, 14:21:46

> My proposal in this WIP patch is to make it a bit clearer that
> 'EXCLUDED' isn't a real relation. I played around with adding a
> different rtekind, but that's too heavy a hammer. What I instead did was
> to set relkind to composite - which seems to signal pretty well that
> we're not dealing with a real relation. That immediately fixes the RLS
> issue as fireRIRrules has the following check:
>         if (rte->rtekind != RTE_RELATION ||
>             rte->relkind != RELKIND_RELATION)
>             continue;
> It also makes it relatively straightforward to fix the system column
> issue by adding an additional relkind check to scanRTEForColumn's system
> column handling.

That works, but also precludes referencing 'oid' in a WITH OIDs table
via EXCLUDED.oid - to me that looks correct since a to-be-inserted row
can't yet have an oid assigned. Differing opinions?
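
A minimal sketch of why marking the EXCLUDED entry as composite sidesteps the RLS rewrite, assuming simplified stand-in types rather than the real parse-node definitions. `rls_rewrite_applies()` is a hypothetical helper that only mirrors the quoted fireRIRrules condition; the relkind letters follow the usual pg_class values ('r' for an ordinary table, 'c' for a composite type).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins, for illustration only. */
typedef enum { RTE_RELATION, RTE_SUBQUERY } RTEKind;

#define RELKIND_RELATION        'r'   /* ordinary table */
#define RELKIND_COMPOSITE_TYPE  'c'   /* composite type */

typedef struct {
    RTEKind rtekind;
    char    relkind;
} RangeTblEntry;

/*
 * Hypothetical helper mirroring the quoted check: entries that are not
 * plain relations are skipped, so no policies get attached to them.
 */
bool rls_rewrite_applies(const RangeTblEntry *rte)
{
    assert(rte != NULL);
    if (rte->rtekind != RTE_RELATION ||
        rte->relkind != RELKIND_RELATION)
        return false;               /* corresponds to "continue" above */
    return true;
}
```

Under these assumptions, an entry with `rtekind = RTE_RELATION` and `relkind = RELKIND_COMPOSITE_TYPE` is skipped, while the target table's own entry is still processed.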

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Stephen Frost

Date:

05 October 2015, 15:01:04

Peter, all,

* Peter Geoghegan (pg@heroku.com) wrote:
> I see now that commit 4f3b2a8883 changed things for UPDATE and DELETE
> statements, but not INSERT statements. I guess my unease is because
> that isn't entirely consistent with INSERT + RETURNING and the GRANT
> system. Logically, the only possible justification for our long
> standing INSERT and RETURNING behavior with GRANT (the fact that it
> requires SELECT privilege for rows returned, just like UPDATE and
> DELETE) is that before row insert triggers could do something secret
> (e.g. they could be security definer). It doesn't seem to be too much
> of a stretch to suppose the same should apply with RLS.

I had intended to address with policies what is addressed through
permissions with 7d8db3e, but the coverage for INSERT+RETURNING was only
done when ON CONFLICT was in use.

I've fixed that by applying the SELECT policies as WCOs for both the
INSERT and UPDATE RETURNING cases.  This matches the permissions system,
where we require SELECT rights on the table for an INSERT RETURNING
query.

Thanks!

Stephen

Re: ON CONFLICT issues around whole row vars,

From

Andres Freund

Date:

05 October 2015, 15:14:28

On 2015-10-05 08:01:00 -0400, Stephen Frost wrote:
> Peter, all,
> I had intended to address with policies what is addressed through
> permissions with 7d8db3e, but the coverage for INSERT+RETURNING was only
> done when ON CONFLICT was in use.
> 
> I've fixed that by applying the SELECT policies as WCOs for both the
> INSERT and UPDATE RETURNING cases.  This matches the permissions system,
> where we require SELECT rights on the table for an INSERT RETURNING
> query.

This really needs tests verifying the behaviour...

Greetings,

Andres Freund

Re: ON CONFLICT issues around whole row vars,

From

Stephen Frost

Date:

05 October 2015, 15:17:22

Andres,

On Monday, October 5, 2015, Andres Freund <andres@anarazel.de> wrote:

> On 2015-10-05 08:01:00 -0400, Stephen Frost wrote:
> > Peter, all,
> > I had intended to address with policies what is addressed through
> > permissions with 7d8db3e, but the coverage for INSERT+RETURNING was only
> > done when ON CONFLICT was in use.
> >
> > I've fixed that by applying the SELECT policies as WCOs for both the
> > INSERT and UPDATE RETURNING cases. This matches the permissions system,
> > where we require SELECT rights on the table for an INSERT RETURNING
> > query.
>
> This really needs tests verifying the behaviour...

Good point, will add.

Thanks!

Stephen

Re: ON CONFLICT issues around whole row vars,

From

Tom Lane

Date:

05 October 2015, 16:50:37

Stephen Frost <sfrost@snowman.net> writes:
> I had intended to address with policies what is addressed through
> permissions with 7d8db3e, but the coverage for INSERT+RETURNING was only
> done when ON CONFLICT was in use.

> I've fixed that by applying the SELECT policies as WCOs for both the
> INSERT and UPDATE RETURNING cases.  This matches the permissions system,
> where we require SELECT rights on the table for an INSERT RETURNING
> query.

What of DELETE RETURNING?
        regards, tom lane

Re: ON CONFLICT issues around whole row vars,

From

Stephen Frost

Date:

05 October 2015, 16:58:34

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I had intended to address with policies what is addressed through
> > permissions with 7d8db3e, but the coverage for INSERT+RETURNING was only
> > done when ON CONFLICT was in use.
>
> > I've fixed that by applying the SELECT policies as WCOs for both the
> > INSERT and UPDATE RETURNING cases.  This matches the permissions system,
> > where we require SELECT rights on the table for an INSERT RETURNING
> > query.
>
> What of DELETE RETURNING?

That was handled in 7d8db3e.

Per previous discussion, UPDATE and DELETE RETURNING apply SELECT
policies as security quals, meaning only the records visible through the
SELECT policy are eligible for consideration.  INSERT+RETURNING has only
WithCheckOptions, no security quals, which is what makes it different
from the other cases.  The INSERT+ON CONFLICT+RETURNING case had been
covered already and I had mistakenly thought it was also covering
INSERT+RETURNING.  In fixing that, I realized that Peter makes a good
point that UPDATE+RETURNING should also have SELECT policies applied as
WithCheckOptions.
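
The difference between the two enforcement styles can be sketched as follows. This is a toy model under stated assumptions, not the actual implementation: `Row`, `SelectPolicy`, `PolicyOutcome` and both `apply_*` functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical row and policy predicate, for illustration only. */
typedef struct { int owner; } Row;
typedef bool (*SelectPolicy)(const Row *);

typedef enum { ROW_VISIBLE, ROW_FILTERED, ROW_ERROR } PolicyOutcome;

/*
 * Security-qual style (existing rows, as with UPDATE/DELETE): a row that
 * fails the policy is silently not eligible for consideration.
 */
PolicyOutcome apply_as_security_qual(SelectPolicy policy, const Row *row)
{
    assert(policy != NULL && row != NULL);
    return policy(row) ? ROW_VISIBLE : ROW_FILTERED;
}

/*
 * WithCheckOption style (new rows, as with INSERT): there is nothing to
 * filter, so a row that fails the policy makes the statement error out.
 */
PolicyOutcome apply_as_with_check_option(SelectPolicy policy, const Row *row)
{
    assert(policy != NULL && row != NULL);
    return policy(row) ? ROW_VISIBLE : ROW_ERROR;
}

/* Hypothetical SELECT policy: only rows owned by user 1 are visible. */
bool owned_by_user_1(const Row *row) { return row->owner == 1; }
```

The same predicate yields a quietly skipped row in one mode and a failed statement in the other, which is why INSERT+RETURNING needed separate coverage.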

I'm about to push updated regression tests, as suggested by Andres.

Thanks!

Stephen

Re: ON CONFLICT issues around whole row vars,

From

Stephen Frost

Date:

05 October 2015, 17:15:19

* Andres Freund (andres@anarazel.de) wrote:
> On 2015-10-05 08:01:00 -0400, Stephen Frost wrote:
> > Peter, all,
> > I had intended to address with policies what is addressed through
> > permissions with 7d8db3e, but the coverage for INSERT+RETURNING was only
> > done when ON CONFLICT was in use.
> >
> > I've fixed that by applying the SELECT policies as WCOs for both the
> > INSERT and UPDATE RETURNING cases.  This matches the permissions system,
> > where we require SELECT rights on the table for an INSERT RETURNING
> > query.
>
> This really needs tests verifying the behaviour...

Added.

Thanks!

Stephen