Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] MERGE SQL Statement for PG11
Date
Msg-id CABOikdPxjUak5Q6gAh+p7vNAUx0tGX8ZCGChstYjfNoew2Gixg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] MERGE SQL Statement for PG11  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
Hi Stephen,


On Fri, Mar 16, 2018 at 7:28 AM, Stephen Frost <sfrost@snowman.net> wrote:
Greetings Pavan, all,

* Pavan Deolasee (pavan.deolasee@gmail.com) wrote:
> On 9 March 2018 at 08:29, Peter Geoghegan <pg@bowt.ie> wrote:
> >  My #1 concern has become RLS, and
> > perhaps only because I haven't studied it in enough detail.
>
> Sure. I've done what I thought is the right thing to do, but please check.
> Stephen also wanted to review RLS related code; I don't know if he had
> chance to do so.

I've started looking through the code from an RLS perspective and, at
least on my initial review, it looks alright.


Thanks for taking time out to review the patch. It certainly helps a lot.
 

A couple test that aren't included that I think should be in the
regression suire are where both tables have RLS policies and
where the RLS tables have actual SELECT policies beyond just 'true'.

I certainly see SELECT policies which limits the records that
individuals are allowed to see very frequently and it's an
important case to ensure works correctly.  I did a few tests here myself
and they behaved as I expected, and reading through the code it looks
reasonable, but they should be simple to write tests which run very
quickly.

Ok. I will add those tests.
 

I'm a bit on the fence about it, but as MERGE is added in quite a few
places which previously mentioned UPDATE and DELETE throughout the
system, I wonder if we shouldn't do better than this:

=*# create policy p1 on t1 for merge using ((c1 % 2) = 0);
ERROR:  syntax error at or near "merge"
LINE 1: create policy p1 on t1 for merge using ((c1 % 2) = 0);

Specifically, perhaps we should change that to pick up on MERGE being
asked for there and return an error saying that policies for the MERGE
command aren't supported with a HINT that MERGE respects
INSERT/UPDATE/DELETE policies and that users should use those instead.

Hmm. I am not sure if that would be a good idea just for RLS. We might then also want to change several other places in the grammar to accept INSERT/UPDATE/DELETE keyword and handle that during parse analysis. We can certainly do that, but I am not sure if it adds a lot of value and certainly adds a lot more code. We should surely document these things, if we are not already. 
 

Also, in nodeModifyTable.c, there's a comment:

 * The WITH CHECK quals are applied in ExecUpdate() and hence we need
 * not do anything special to handle them.

Which I believe is actually getting at the fact that ExecUpdate() will
run ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK ...) and therefore in
ExecMergeMatched() we don't need to check WCO_RLS_UPDATE_CHECK, but we
do still need to check WCO_RLS_MERGE_UPDATE_CHECK (and that's what the
code does).

Right.
 
  One thing I wonder about there though is if we really need
to segregate those..?  What's actually making WCO_RLS_MERGE_UPDATE_CHECK
different from WCO_RLS_UPDATE_CHECK?  I get that the DELETE case is
different, because in a regular DELETE we'll never even see the row, but
for MERGE, we will see the row (assuming it passes SELECT policies, of
course) and then will check if it matches and that's when we realize
that we've been asked to run a DELETE, so we have the special-case of
WCO_RLS_MERGE_DELETE_CHECK, but that's not true of UPDATE, so I think
this might be adding a bit of unnecessary complication by introducing
WCO_RLS_MERGE_UPDATE_CHECK.

I've modelled this code on the lines of ON CONFLICT DO NOTHING. And quite similar to that, I believe we need separate handling for MERGE as well because we don't (and can't) push down the USING security quals of UPDATE/DELETE to the scan. So we need to separately check that the target row actually passes the USING quals. WCO_RLS_MERGE_UPDATE_CHECK and WCO_RLS_MERGE_DELETE_CHECK are used for that purpose.

One point to deliberate though is whether it's a good idea to throw an error when the USING quals fail or should we silently ignore such rows. Regular UPDATE/DELETE does the latter and ON CONFLICT DO UPDATE does the former. I chose to throw an error because otherwise it may get confusing to the user since a row would neither be updated (meaning, it will be seen as a case of NOT MATCHED), but nor be inserted. I hope no problem there.

Thanks,
Pavan 


--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: "Tsunakawa, Takayuki"
Date:
Subject: RE: [bug fix] Cascaded standby cannot start after a clean shutdown
Next
From: Michael Paquier
Date:
Subject: Re: [bug fix] Cascaded standby cannot start after a clean shutdown