Thread: Adding OLD/NEW support to RETURNING

Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
I have been playing around with the idea of adding support for OLD/NEW
to RETURNING, partly motivated by the discussion on the MERGE
RETURNING thread [1], but also because I think it would be a very
useful addition for other commands (UPDATE in particular).

This was discussed a long time ago [2], but that previous discussion
didn't lead to a workable patch, and so I have taken a different
approach here.

My first thought was that this would only really make sense for UPDATE
and MERGE, since OLD/NEW are pretty pointless for INSERT/DELETE
respectively. However...

1. For an INSERT with an ON CONFLICT ... DO UPDATE clause, returning
OLD might be very useful, since it provides a way to see which rows
conflicted, and return the old conflicting values.

2. If a DELETE is turned into an UPDATE by a rule (e.g., to mark rows
as deleted, rather than actually deleting them), then returning NEW
can also be useful. (I admit, this is a somewhat obscure use case, but
it's still possible.)

3. In a MERGE, we need to be able to handle all 3 command types anyway.

4. It really isn't any extra effort to support INSERT and DELETE.

So in the attached very rough patch (no docs, minimal testing) I have
just allowed OLD/NEW in RETURNING for all command types (except, I
haven't done MERGE here - I think that's best kept as a separate
patch). If there is no OLD/NEW row in a particular context, it just
returns NULLs. The regression tests contain examples of  1 & 2 above.


Based on Robert Haas' suggestion in [2], the patch works by adding a
new "varreturningtype" field to Var nodes. This field is set during
parse analysis of the returning clause, which adds new namespace
aliases for OLD and NEW, if tables with those names/aliases are not
already present. So the resulting Var nodes have the same
varno/varattno as they would normally have had, but a different
varreturningtype.

For the most part, the rewriter and parser are then untouched, except
for a couple of places necessary to ensure that the new field makes it
through correctly. In particular, none of this affects the shape of
the final plan produced. All of the work to support the new Var
returning type is done in the executor.

This turns out to be relatively straightforward, except for
cross-partition updates, which was a little trickier since the tuple
format of the old row isn't necessarily compatible with the new row,
which is in a different partition table and so might have a different
column order.

One thing that I've explicitly disallowed is returning OLD/NEW for
updates to foreign tables. It's possible that could be added in a
later patch, but I have no plans to support that right now.


One difficult question is what names to use for the new aliases. I
think OLD and NEW are the most obvious and natural choices. However,
there is a problem - if they are used in a trigger function, they will
conflict. In PL/pgSQL, this leads to an error like the following:

ERROR:  column reference "new.f1" is ambiguous
LINE 3:     RETURNING new.f1, new.f4
                      ^
DETAIL:  It could refer to either a PL/pgSQL variable or a table column.

That's the same error that you'd get if a different alias name had
been chosen, and it happened to conflict with a user-defined PL/pgSQL
variable, except that in that case, the user could just change their
variable name to fix the problem, which is not possible with the
automatically-added OLD/NEW trigger variables. As a way round that, I
added a way to optionally change the alias used in the RETURNING list,
using the following syntax:

  RETURNING [ WITH ( { OLD | NEW } AS output_alias [, ...] ) ]
    * | output_expression [ [ AS ] output_name ] [, ...]

for example:

  RETURNING WITH (OLD AS o) o.id, o.val, ...

I'm not sure how good a solution that is, but the syntax doesn't look
too bad to me (somewhat reminiscent of a WITH-query), and it's only
necessary in cases where there is a name conflict.

The simpler solution would be to just pick different alias names to
start with. The previous thread seemed to settle on BEFORE/AFTER, but
I don't find those names particularly intuitive or appealing. Over on
[1], PREVIOUS/CURRENT was suggested, which I prefer, but they still
don't seem as natural as OLD/NEW.

So, as is often the case, naming things turns out to be the hardest
problem, which is why I quite like the idea of letting the user pick
their own name, if they need to. In most contexts, OLD and NEW will
work, so they won't need to.

Thoughts?

Regards,
Dean

[1] https://www.postgresql.org/message-id/flat/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/51822C0F.5030807%40gmail.com

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Mon, Dec 4, 2023 at 8:15 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> I have been playing around with the idea of adding support for OLD/NEW
> to RETURNING, partly motivated by the discussion on the MERGE
> RETURNING thread [1], but also because I think it would be a very
> useful addition for other commands (UPDATE in particular).
>
> This was discussed a long time ago [2], but that previous discussion
> didn't lead to a workable patch, and so I have taken a different
> approach here.
>
> Thoughts?
>


  /* get the tuple from the relation being scanned */
- scratch.opcode = EEOP_ASSIGN_SCAN_VAR;
+ switch (variable->varreturningtype)
+ {
+ case VAR_RETURNING_OLD:
+ scratch.opcode = EEOP_ASSIGN_OLD_VAR;
+ break;
+ case VAR_RETURNING_NEW:
+ scratch.opcode = EEOP_ASSIGN_NEW_VAR;
+ break;
+ default:
+ scratch.opcode = EEOP_ASSIGN_SCAN_VAR;
+ break;
+ }
I have roughly an idea of what this code is doing. but do you need to
refactor the above comment?


/* for EEOP_INNER/OUTER/SCAN_FETCHSOME */
in src/backend/executor/execExpr.c, do you need to update the comment?

create temp table foo (f1 int, f2 int);
insert into foo values (1,2), (3,4);
INSERT INTO foo select 11, 22  RETURNING WITH (old AS new, new AS old)
new.*, old.*;
--this works. which is fine.

create or replace function stricttest1() returns void as $$
declare x record;
begin
    insert into foo values(5,6) returning new.* into x;
    raise notice 'x.f1 = % x.f2 %', x.f1, x.f2;
end$$ language plpgsql;
select * from stricttest1();
--this works.

create or replace function stricttest2() returns void as $$
declare x record; y record;
begin
    INSERT INTO foo select 11, 22  RETURNING WITH (old AS o, new AS n)
o into x, n into y;
    raise notice 'x.f1: % x.f2 % y.f1 % y.f2 %', x.f1,x.f2, y.f1, y.f2;
end$$ language plpgsql;
--this does not work.
--because
https://www.postgresql.org/message-id/flat/CAFj8pRB76FE2MVxJYPc1RvXmsf2upoTgoPCC9GsvSAssCM2APQ%40mail.gmail.com

create or replace function stricttest3() returns void as $$
declare x record; y record;
begin
    INSERT INTO foo select 11, 22  RETURNING WITH (old AS o, new AS n) o.*,n.*
    into x;
    raise notice 'x.f1 % x.f2 %, % %', x.f1, x.f2, x.f1,x.f2;
end$$ language plpgsql;
select * from stricttest3();
--this is not what we want. because old and new share the same column name
--so here you cannot get the "new" content.

create or replace function stricttest4() returns void as $$
declare x record; y record;
begin
    INSERT INTO foo select 11, 22
    RETURNING WITH (old AS o, new AS n)
    o.f1 as of1,o.f2 as of2,n.f1 as nf1, n.f2 as nf2
    into x;
    raise notice 'x.0f1 % x.of2 % nf1 % nf2 %', x.of1, x.of2, x.nf1, x.nf2;
end$$ language plpgsql;
--kind of verbose, but works, which is fine.

create or replace function stricttest5() returns void as $$
declare x record; y record;
      a foo%ROWTYPE; b foo%ROWTYPE;
begin
  INSERT INTO foo select 11, 22
  RETURNING WITH (old AS o, new AS n) o into a, n into b;
end$$ language plpgsql;
-- expect this to work.



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Sat, 16 Dec 2023 at 13:04, jian he <jian.universality@gmail.com> wrote:
>
>   /* get the tuple from the relation being scanned */
> I have roughly an idea of what this code is doing. but do you need to
> refactor the above comment?
>
> /* for EEOP_INNER/OUTER/SCAN_FETCHSOME */
> in src/backend/executor/execExpr.c, do you need to update the comment?
>

Thanks for looking at this.

Attached is a new version with some updated comments. In addition, I
fixed a couple of issues:

In raw_expression_tree_walker(), I had missed one of the new node types.

When "old" or "new" are specified by themselves in the RETURNING list
to return the whole old/new row, the parser was generating a RowExpr
node, which appeared to work OK, but failed if there were any dropped
columns in the relation. I have changed this to generate a wholerow
Var instead, which deals with that issue, and seems better for
efficiency and consistency with existing code.

In addition, I have added code during executor startup to record
whether or not the RETURNING list actually has any references to
OLD/NEW values. This allows the building of old/new tuple slots to be
skipped when they're not actually needed, reducing per-row overheads.

I still haven't written any docs yet.


> create or replace function stricttest2() returns void as $$
> declare x record; y record;
> begin
>     INSERT INTO foo select 11, 22  RETURNING WITH (old AS o, new AS n)
> o into x, n into y;
>     raise notice 'x.f1: % x.f2 % y.f1 % y.f2 %', x.f1,x.f2, y.f1, y.f2;
> end$$ language plpgsql;
> --this does not work.
> --because
https://www.postgresql.org/message-id/flat/CAFj8pRB76FE2MVxJYPc1RvXmsf2upoTgoPCC9GsvSAssCM2APQ%40mail.gmail.com
>
> create or replace function stricttest5() returns void as $$
> declare x record; y record;
>       a foo%ROWTYPE; b foo%ROWTYPE;
> begin
>   INSERT INTO foo select 11, 22
>   RETURNING WITH (old AS o, new AS n) o into a, n into b;
> end$$ language plpgsql;
> -- expect this to work.

Yeah, but note that multiple INTO clauses aren't allowed. An
alternative is to create a custom type to hold the old and new
records, e.g.:

CREATE TYPE foo_delta AS (old foo, new foo);

then you can just do "RETURNING old, new INTO delta" where delta is a
variable of type foo_delta, and you can extract individual fields
using expressions like "(delta.old).f1".

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Tomas Vondra
Date:
On 12/4/23 13:14, Dean Rasheed wrote:
> I have been playing around with the idea of adding support for OLD/NEW
> to RETURNING, partly motivated by the discussion on the MERGE
> RETURNING thread [1], but also because I think it would be a very
> useful addition for other commands (UPDATE in particular).
> 

Sounds reasonable ...

> This was discussed a long time ago [2], but that previous discussion
> didn't lead to a workable patch, and so I have taken a different
> approach here.
> 

Presumably the 2013 thread went nowhere because of some implementation
problems, not simply because the author lost interest and disappeared?
Would it be helpful for this new patch to briefly summarize what the
main issues were and how this new approach deals with that? (It's hard
to say if reading the old thread is necessary/helpful for understanding
this new patch, and time is a scarce resource.)

> My first thought was that this would only really make sense for UPDATE
> and MERGE, since OLD/NEW are pretty pointless for INSERT/DELETE
> respectively. However...
> 
> 1. For an INSERT with an ON CONFLICT ... DO UPDATE clause, returning
> OLD might be very useful, since it provides a way to see which rows
> conflicted, and return the old conflicting values.
> 
> 2. If a DELETE is turned into an UPDATE by a rule (e.g., to mark rows
> as deleted, rather than actually deleting them), then returning NEW
> can also be useful. (I admit, this is a somewhat obscure use case, but
> it's still possible.)
> 
> 3. In a MERGE, we need to be able to handle all 3 command types anyway.
> 
> 4. It really isn't any extra effort to support INSERT and DELETE.
> 
> So in the attached very rough patch (no docs, minimal testing) I have
> just allowed OLD/NEW in RETURNING for all command types (except, I
> haven't done MERGE here - I think that's best kept as a separate
> patch). If there is no OLD/NEW row in a particular context, it just
> returns NULLs. The regression tests contain examples of  1 & 2 above.
> 
> 
> Based on Robert Haas' suggestion in [2], the patch works by adding a
> new "varreturningtype" field to Var nodes. This field is set during
> parse analysis of the returning clause, which adds new namespace
> aliases for OLD and NEW, if tables with those names/aliases are not
> already present. So the resulting Var nodes have the same
> varno/varattno as they would normally have had, but a different
> varreturningtype.
> 

No opinion on whether varreturningtype is the right approach - it sounds
like it's working better than the 2013 patch, but I won't pretend my
knowledge of this code is sufficient to make judgments beyond that.

> For the most part, the rewriter and parser are then untouched, except
> for a couple of places necessary to ensure that the new field makes it
> through correctly. In particular, none of this affects the shape of
> the final plan produced. All of the work to support the new Var
> returning type is done in the executor.
> 
> This turns out to be relatively straightforward, except for
> cross-partition updates, which was a little trickier since the tuple
> format of the old row isn't necessarily compatible with the new row,
> which is in a different partition table and so might have a different
> column order.
> 

So we just "remap" the attributes, right?

> One thing that I've explicitly disallowed is returning OLD/NEW for
> updates to foreign tables. It's possible that could be added in a
> later patch, but I have no plans to support that right now.
> 

Sounds like an acceptable restriction, as long as it's documented.

What are the challenges for supporting OLD/NEW for foreign tables? I
guess we'd need to ask the FDW handler to tell us if it can support
OLD/NEW for this table (and only allow it for postgres_fdw with
sufficiently new server version), and then deparse the SQL.

I'm asking because this seems like a nice first patch idea, but if I
don't see some major obstacle that I don't see ...

> 
> One difficult question is what names to use for the new aliases. I
> think OLD and NEW are the most obvious and natural choices. However,
> there is a problem - if they are used in a trigger function, they will
> conflict. In PL/pgSQL, this leads to an error like the following:
> 
> ERROR:  column reference "new.f1" is ambiguous
> LINE 3:     RETURNING new.f1, new.f4
>                       ^
> DETAIL:  It could refer to either a PL/pgSQL variable or a table column.
> 
> That's the same error that you'd get if a different alias name had
> been chosen, and it happened to conflict with a user-defined PL/pgSQL
> variable, except that in that case, the user could just change their
> variable name to fix the problem, which is not possible with the
> automatically-added OLD/NEW trigger variables. As a way round that, I
> added a way to optionally change the alias used in the RETURNING list,
> using the following syntax:
> 
>   RETURNING [ WITH ( { OLD | NEW } AS output_alias [, ...] ) ]
>     * | output_expression [ [ AS ] output_name ] [, ...]
> 
> for example:
> 
>   RETURNING WITH (OLD AS o) o.id, o.val, ...
> 
> I'm not sure how good a solution that is, but the syntax doesn't look
> too bad to me (somewhat reminiscent of a WITH-query), and it's only
> necessary in cases where there is a name conflict.
> 
> The simpler solution would be to just pick different alias names to
> start with. The previous thread seemed to settle on BEFORE/AFTER, but
> I don't find those names particularly intuitive or appealing. Over on
> [1], PREVIOUS/CURRENT was suggested, which I prefer, but they still
> don't seem as natural as OLD/NEW.
> 
> So, as is often the case, naming things turns out to be the hardest
> problem, which is why I quite like the idea of letting the user pick
> their own name, if they need to. In most contexts, OLD and NEW will
> work, so they won't need to.
> 

I think OLD/NEW with a way to define a custom alias when needed seems
acceptable. Or at least I can't think of a clearly better solution. Yes,
using some other name might not have this problem, but I guess we'd have
to pick an existing keyword or add one. And Tom didn't seem thrilled
with reserving a keyword in 2013 ...

Plus I think there's value in consistency, and OLD/NEW seems way more
natural that BEFORE/AFTER.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Sat, 24 Feb 2024 at 17:52, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Presumably the 2013 thread went nowhere because of some implementation
> problems, not simply because the author lost interest and disappeared?
> Would it be helpful for this new patch to briefly summarize what the
> main issues were and how this new approach deals with that? (It's hard
> to say if reading the old thread is necessary/helpful for understanding
> this new patch, and time is a scarce resource.)

Thanks for looking!

The 2013 patch got fairly far down a particular implementation path
(adding a new kind of RTE called RTE_ALIAS) before Robert reviewed it
[1]. He pointed out various specific issues, as well as questioning
the overall approach, and suggesting a different approach that would
have involved significant rewriting (this is essentially the approach
that I have taken, adding a new field to Var nodes).

[1] https://www.postgresql.org/message-id/CA%2BTgmoY5EXE-YKMV7CsdSFj-noyZz%3D2z45sgyJX5Y84rO3RnWQ%40mail.gmail.com

The thread kind-of petered out shortly after that, with the conclusion
that the patch needed a pretty significant redesign and rewrite.


> No opinion on whether varreturningtype is the right approach - it sounds
> like it's working better than the 2013 patch, but I won't pretend my
> knowledge of this code is sufficient to make judgments beyond that.
>
> > For the most part, the rewriter and parser are then untouched, except
> > for a couple of places necessary to ensure that the new field makes it
> > through correctly. In particular, none of this affects the shape of
> > the final plan produced. All of the work to support the new Var
> > returning type is done in the executor.

(Of course, I meant the rewriter and the *planner* are largely untouched.)

I think this is one of the main advantages of this approach. The 2013
design, adding a new RTE kind, required changes all over the place,
including lots of hacking in the planner.


> > This turns out to be relatively straightforward, except for
> > cross-partition updates, which was a little trickier since the tuple
> > format of the old row isn't necessarily compatible with the new row,
> > which is in a different partition table and so might have a different
> > column order.
>
> So we just "remap" the attributes, right?

Right. That's what the majority of the new code in ExecDelete() and
ExecInsert() is for. It's not that complicated, but it did require a
bit of care.


> What are the challenges for supporting OLD/NEW for foreign tables?

I didn't really look at that in any detail, but I don't think it
should be too hard. It's not something I want to tackle now though,
because the patch is big enough already.


> I think OLD/NEW with a way to define a custom alias when needed seems
> acceptable. Or at least I can't think of a clearly better solution. Yes,
> using some other name might not have this problem, but I guess we'd have
> to pick an existing keyword or add one. And Tom didn't seem thrilled
> with reserving a keyword in 2013 ...
>
> Plus I think there's value in consistency, and OLD/NEW seems way more
> natural that BEFORE/AFTER.

Yes, I think OLD/NEW are much nicer too.

Attached is a new patch, now with docs (no other code changes).

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Sat, Mar 9, 2024 at 3:53 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
>
> Attached is a new patch, now with docs (no other code changes).
>

Hi,
some issues I found, while playing around with
support-returning-old-new-v2.patch

doc/src/sgml/ref/update.sgml:
    [ RETURNING [ WITH ( { OLD | NEW } AS <replaceable
class="parameter">output_alias</replaceable> [, ...] ) ]
                * | <replaceable
class="parameter">output_expression</replaceable> [ [ AS ]
<replaceable class="parameter">output_name</replaceable> ] [, ...] ]
</synopsis>

There is no parameter explanation for `*`.
so, I think the synopsis may not cover cases like:
`
update foo set f3 = 443 RETURNING new.*;
`
I saw the explanation at output_alias, though.

-----------------------------------------------------------------------------
insert into foo select 1, 2 RETURNING old.*, new.f2, old.f1();
ERROR:  function old.f1() does not exist
LINE 1: ...sert into foo select 1, 2 RETURNING old.*, new.f2, old.f1();
                                                              ^
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.

I guess that's ok, slightly different context evaluation. if you say
"old.f1", old refers to the virtual table "old",
but "old.f1()", the "old" , reevaluate to the schema "old".
you need privilege to schema "old", you also need execution privilege
to function "old.f1()" to execute the above query.
so seems no security issue after all.
-----------------------------------------------------------------------------
I found a fancy expression:
`
CREATE  TABLE foo (f1 serial, f2 text, f3 int default 42);
insert into foo select 1, 2 union select 11, 22 RETURNING old.*,
new.f2, (select sum(new.f1) over());
`
is this ok?

also the following works on PG16, not sure it's a bug.
`
 insert into foo select 1, 2 union select 11, 22 RETURNING  (select count(*));
`
but not these
`
insert into foo select 1, 2 union select 11, 22 RETURNING  (select
count(old.*));
insert into foo select 1, 2 union select 11, 22 RETURNING  (select sum(f1));
`
-----------------------------------------------------------------------------
I found another interesting case, while trying to add some tests on
for new code in createplan.c.
in postgres_fdw.sql, right after line `MERGE ought to fail cleanly`

--this will work
insert into itrtest select 1, 'foo' returning new.*,old.*;
--these two will fail
insert into remp1 select 1, 'foo' returning new.*;
insert into remp1 select 1, 'foo' returning old.*;

itrtest is the partitioned non-foreign table.
remp1 is the partition of itrtest, foreign table.

------------------------------------------------------------------------------------------
I did find a segment fault bug.
insert into foo select 1, 2 RETURNING (select sum(old.f1) over());

This information is set in a subplan node.
/* update the ExprState's flags if Var refers to OLD/NEW */
if (variable->varreturningtype == VAR_RETURNING_OLD)
state->flags |= EEO_FLAG_HAS_OLD;
else if (variable->varreturningtype == VAR_RETURNING_NEW)
state->flags |= EEO_FLAG_HAS_NEW;

but in ExecInsert:
`
else if (resultRelInfo->ri_projectReturning->pi_state.flags & EEO_FLAG_HAS_OLD)
{
oldSlot = ExecGetReturningSlot(estate, resultRelInfo);
ExecStoreAllNullTuple(oldSlot);
oldSlot->tts_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
}
`
it didn't use subplan node state->flags information. so the ExecInsert
above code, never called, and should be executed.
however
`
insert into foo select 1, 2 RETURNING (select sum(new.f1)over());`
works

Similarly this
 `
delete from foo RETURNING (select sum(new.f1) over());
`
also causes segmentation fault.
------------------------------------------------------------------------------------------
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
new file mode 100644
index 6133dbc..c9d3661
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -411,12 +411,21 @@ slot_getsysattr(TupleTableSlot *slot, in
 {
  Assert(attnum < 0); /* caller error */

+ /*
+ * If the tid is not valid, there is no physical row, and all system
+ * attributes are deemed to be NULL, except for the tableoid.
+ */
  if (attnum == TableOidAttributeNumber)
  {
  *isnull = false;
  return ObjectIdGetDatum(slot->tts_tableOid);
  }
- else if (attnum == SelfItemPointerAttributeNumber)
+ if (!ItemPointerIsValid(&slot->tts_tid))
+ {
+ *isnull = true;
+ return PointerGetDatum(NULL);
+ }
+ if (attnum == SelfItemPointerAttributeNumber)
  {
  *isnull = false;
  return PointerGetDatum(&slot->tts_tid);

These changes is slot_getsysattr is somehow independ of this feature?



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Sun, 10 Mar 2024 at 23:41, jian he <jian.universality@gmail.com> wrote:
>
> Hi,
> some issues I found, while playing around with
> support-returning-old-new-v2.patch
>

Thanks for testing. This is very useful.


> doc/src/sgml/ref/update.sgml:
>
> There is no parameter explanation for `*`.
> so, I think the synopsis may not cover cases like:
> `
> update foo set f3 = 443 RETURNING new.*;
> `
> I saw the explanation at output_alias, though.

"*" is documented under output_alias and output_expression. I'm not
sure that it makes sense to have a separate top-level parameter
section for it, because "*" is also something that can appear after
table_name, meaning something completely different, so it might get
confusing. Perhaps the explanation under output_expression can be
expanded a bit. I'll think about it some more.


> insert into foo select 1, 2 RETURNING old.*, new.f2, old.f1();
> ERROR:  function old.f1() does not exist
> LINE 1: ...sert into foo select 1, 2 RETURNING old.*, new.f2, old.f1();
>                                                               ^
> HINT:  No function matches the given name and argument types. You
> might need to add explicit type casts.

Yes, that's consistent with current behaviour. You can also write
foo.f1() or something_else.f1(). Anything of that form, with
parentheses, is interpreted as schema_name.function_name(), not as a
column reference.


> I found a fancy expression:
> `
> CREATE  TABLE foo (f1 serial, f2 text, f3 int default 42);
> insert into foo select 1, 2 union select 11, 22 RETURNING old.*,
> new.f2, (select sum(new.f1) over());
> `
> is this ok?

Yes, I guess it's OK, though not really useful in practice.

"new.f1" is 1 for the first row and 11 for the second. When you write
"(select sum(new.f1) over())", with no FROM clause, you're implicitly
evaluating over a table with 1 row in the subquery, so it just returns
new.f1.

This is the same as the standalone query

SELECT sum(11) OVER();
 sum
-----
  11
(1 row)

So it's likely that any window function can be used in a FROM-less
subquery inside a RETURNING expression. I can't think of any practical
use for it though. In any case, this isn't something new to this
patch.


> also the following works on PG16, not sure it's a bug.
> `
>  insert into foo select 1, 2 union select 11, 22 RETURNING  (select count(*));
> `

This is OK, because that subquery is an uncorrelated aggregate query
that doesn't reference the outer query. In this case, it's not very
interesting, because it lacks a FROM clause, so it just returns 1. But
you could also write "(SELECT count(*) FROM some_other_table WHERE
...)", and it would work because the aggregate function is evaluated
over the rows of the table in the subquery. That's more useful if the
subquery is made into a correlated subquery by referring to columns
from the outer query. The rules for that are documented here:


https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES:~:text=When%20an%20aggregate%20expression%20appears%20in%20a%20subquery


> but not these
> `
> insert into foo select 1, 2 union select 11, 22 RETURNING  (select
> count(old.*));
> insert into foo select 1, 2 union select 11, 22 RETURNING  (select sum(f1));
> `

In these cases, since the aggregate's arguments are all outer-level
variables, it is associated with the outer query, so it is rejected on
the grounds that aggregate functions aren't allowed in RETURNING.

It is allowed if that subquery has a FROM clause, since the aggregated
arguments are then treated as constants over the rows in the subquery,
so arguably the same could be made to happen without a FROM clause,
but there really is no practical use case for allowing that. Again,
this isn't something new to this patch.


> I found another interesting case, while trying to add some tests on
> for new code in createplan.c.
> in postgres_fdw.sql, right after line `MERGE ought to fail cleanly`
>
> --this will work
> insert into itrtest select 1, 'foo' returning new.*,old.*;
> --these two will fail
> insert into remp1 select 1, 'foo' returning new.*;
> insert into remp1 select 1, 'foo' returning old.*;
>
> itrtest is the partitioned non-foreign table.
> remp1 is the partition of itrtest, foreign table.

Hmm, I was a little surprised that that first example worked, but I
can see why now.

I was content to just say that RETURNING old/new wasn't supported for
foreign tables in this first version, but looking at it more closely,
the only tricky part is direct-modify updates. So if we just disable
direct-modify when there are OLD/NEW variables in the the RETURNING
list, then it "just works".

So I've done that, and added a few additional tests to
postgres_fdw.sql, and removed the doc notes about foreign tables not
being supported. I really thought that there would be more to it than
that, but it seems to work fine.


> I did find a segment fault bug.
> insert into foo select 1, 2 RETURNING (select sum(old.f1) over());
>
> This information is set in a subplan node.
> /* update the ExprState's flags if Var refers to OLD/NEW */
> if (variable->varreturningtype == VAR_RETURNING_OLD)
> state->flags |= EEO_FLAG_HAS_OLD;
> else if (variable->varreturningtype == VAR_RETURNING_NEW)
> state->flags |= EEO_FLAG_HAS_NEW;
>
> but in ExecInsert it didn't use subplan node state->flags information

Ah, good catch!

When recursively initialising a SubPlan, if any of its expressions is
found to contain OLD/NEW Vars, it needs to update the flags on the
parent ExprState. Fixed in the new version.


> @@ -411,12 +411,21 @@ slot_getsysattr(TupleTableSlot *slot, in
>  {
>   Assert(attnum < 0); /* caller error */
>
> + /*
> + * If the tid is not valid, there is no physical row, and all system
> + * attributes are deemed to be NULL, except for the tableoid.
> + */
>   if (attnum == TableOidAttributeNumber)
>   {
>   *isnull = false;
>   return ObjectIdGetDatum(slot->tts_tableOid);
>   }
> - else if (attnum == SelfItemPointerAttributeNumber)
> + if (!ItemPointerIsValid(&slot->tts_tid))
> + {
> + *isnull = true;
> + return PointerGetDatum(NULL);
> + }
> + if (attnum == SelfItemPointerAttributeNumber)
>   {
>   *isnull = false;
>   return PointerGetDatum(&slot->tts_tid);
>
> These changes is slot_getsysattr is somehow independ of this feature?

This is necessary because under some circumstances, when returning
old/new, the corresponding table slot may contain all NULLs and an
invalid ctid. For example, the old slot in an INSERT which didn't do
an ON CONFLICT UPDATE. So we need to guard against that, in case the
user tries to return old.ctid, for example. It's useful to always
return a non-NULL tableoid though, because that's a property of the
table, rather than the row.

Thanks for testing.

Attached is an updated patch, fixing the seg-fault and now with
support for foreign tables.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 11 Mar 2024 at 14:03, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Attached is an updated patch, fixing the seg-fault and now with
> support for foreign tables.
>

Updated version attached tidying up a couple of things and fixing another bug:

1). Tidied up the code in createplan.c that was testing for old/new
Vars in the returning list, by adding a separate function --
contain_vars_returning_old_or_new() -- making it more reusable and
efficient.

2). Updated the deparsing code for EXPLAIN so that old/new Vars are
always prefixed with the alias, so that it's possible to tell them
apart in the EXPLAIN output.

3). Updated rewriteRuleAction() to preserve the old/new alias names in
the rewritten query. I think this was only relevant to the EXPLAIN
output.

4). Fixed a bug in assign_param_for_var() -- this needs to compare the
varreturningtype of the Vars, otherwise 2 different Vars could get
assigned the same Param. As the comment said, this needs to compare
everything that _equalVar() compares, except for the specific fields
listed. Otherwise a subquery like (select old.a = new.a) in the
returning list would only generate one Param for the two up-level
Vars, and produce the wrong result.

5). Removed the ParseState fields p_returning_old and p_returning_new
that weren't being used anymore.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Tue, 12 Mar 2024 at 18:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Updated version attached tidying up a couple of things and fixing another bug:
>

Rebased version attached, on top of c649fa24a4 (MERGE ... RETURNING support).

This just extends the previous version to work with MERGE, adding a
few extra tests, which is all fairly straightforward.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Mon, Mar 18, 2024 at 6:48 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Tue, 12 Mar 2024 at 18:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> >
> > Updated version attached tidying up a couple of things and fixing another bug:
> >
>
> Rebased version attached, on top of c649fa24a4 (MERGE ... RETURNING support).
>


hi, some minor issues I found out.

+/*
+ * ReplaceReturningVarsFromTargetList -
+ * replace RETURNING list Vars with items from a targetlist
+ *
+ * This is equivalent to calling ReplaceVarsFromTargetList() with a
+ * nomatch_option of REPLACEVARS_REPORT_ERROR, but with the added effect of
+ * copying varreturningtype onto any Vars referring to new_result_relation,
+ * allowing RETURNING OLD/NEW to work in the rewritten query.
+ */
+
+typedef struct
+{
+ ReplaceVarsFromTargetList_context rv_con;
+ int new_result_relation;
+} ReplaceReturningVarsFromTargetList_context;
+
+static Node *
+ReplaceReturningVarsFromTargetList_callback(Var *var,
+ replace_rte_variables_context *context)
+{
+ ReplaceReturningVarsFromTargetList_context *rcon =
(ReplaceReturningVarsFromTargetList_context *) context->callback_arg;
+ Node   *newnode;
+
+ newnode = ReplaceVarsFromTargetList_callback(var, context);
+
+ if (var->varreturningtype != VAR_RETURNING_DEFAULT)
+ SetVarReturningType((Node *) newnode, rcon->new_result_relation,
+ var->varlevelsup, var->varreturningtype);
+
+ return newnode;
+}
+
+Node *
+ReplaceReturningVarsFromTargetList(Node *node,
+   int target_varno, int sublevels_up,
+   RangeTblEntry *target_rte,
+   List *targetlist,
+   int new_result_relation,
+   bool *outer_hasSubLinks)
+{
+ ReplaceReturningVarsFromTargetList_context context;
+
+ context.rv_con.target_rte = target_rte;
+ context.rv_con.targetlist = targetlist;
+ context.rv_con.nomatch_option = REPLACEVARS_REPORT_ERROR;
+ context.rv_con.nomatch_varno = 0;
+ context.new_result_relation = new_result_relation;
+
+ return replace_rte_variables(node, target_varno, sublevels_up,
+ ReplaceReturningVarsFromTargetList_callback,
+ (void *) &context,
+ outer_hasSubLinks);
+}

the ReplaceReturningVarsFromTargetList related comment
should be placed right above the function ReplaceReturningVarsFromTargetList,
not above ReplaceReturningVarsFromTargetList_context?

struct ReplaceReturningVarsFromTargetList_context adds some comments
about new_result_relation would be great.


/* INDEX_VAR is handled by default case */
this comment appears in execExpr.c and execExprInterp.c.
need to move to default case's switch default case?


/*
 * set_deparse_context_plan - Specify Plan node containing expression
 *
 * When deparsing an expression in a Plan tree, we might have to resolve
 * OUTER_VAR, INNER_VAR, or INDEX_VAR references.  To do this, the caller must
 * provide the parent Plan node.
...
*/
does the comment in set_deparse_context_plan need to be updated?

+ * buildNSItemForReturning -
+ * add a ParseNamespaceItem for the OLD or NEW alias in RETURNING.
+ */
+static void
+addNSItemForReturning(ParseState *pstate, const char *aliasname,
+  VarReturningType returning_type)
comment "buildNSItemForReturning" should be "addNSItemForReturning"?


  * results.  If include_dropped is true then empty strings and NULL constants
  * (not Vars!) are returned for dropped columns.
  *
- * rtindex, sublevels_up, and location are the varno, varlevelsup, and location
- * values to use in the created Vars.  Ordinarily rtindex should match the
- * actual position of the RTE in its rangetable.
+ * rtindex, sublevels_up, returning_type, and location are the varno,
+ * varlevelsup, varreturningtype, and location values to use in the created
+ * Vars.  Ordinarily rtindex should match the actual position of the RTE in
+ * its rangetable.
we already updated the comment in expandRTE.
but it seems we only do RTE_RELATION, some part of RTE_FUNCTION.
do we need
`
varnode->varreturningtype = returning_type;
`
for other `rte->rtekind` when there is a makeVar?

(I don't understand this part, in the case where rte->rtekind is
RTE_SUBQUERY, if I add  `varnode->varreturningtype = returning_type;`
the tests still pass.



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 18 Mar 2024 at 10:48, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Rebased version attached, on top of c649fa24a4 (MERGE ... RETURNING support).
>

I have been doing more testing of this and I realised that there was a
problem -- the previous patch worked fine when updating a regular
table, so that old/new.colname is just a Var, but when updating an
auto-updatable view, "colname" could end up being replaced by an
arbitrary expression. In the cases I had tested before, that appeared
to work OK, but actually it wasn't right in all cases where the result
should have been NULL, due to the old/new row being absent (e.g., the
old row in an INSERT).

After thinking about that for a while, the best solution seemed to be
to add a new executable node, which I've called ReturningExpr. This
evaluates the old/new expression if the old/new row exists, but skips
it and returns NULL if the old/new row doesn't exist. The simplest
example is a query like this, which now returns what I would expect:

DROP TABLE IF EXISTS tt CASCADE;
CREATE TABLE tt (a int PRIMARY KEY, b text);
INSERT INTO tt VALUES (1, 'R1');
CREATE VIEW tv AS SELECT a, b, 'Const' c FROM tt;

INSERT INTO tv VALUES (1, 'Row 1'), (2, 'Row 2')
  ON CONFLICT (a) DO UPDATE SET b = excluded.b
  RETURNING old.*, new.*;

 a | b  |   c   | a |   b   |   c
---+----+-------+---+-------+-------
 1 | R1 | Const | 1 | Row 1 | Const
   |    |       | 2 | Row 2 | Const
(2 rows)

(Previously that was returning old.c = 'Const' in both rows, because
the Const node has no old/new qualification.)

In EXPLAIN, I opted to display this as "old/new.(expression)", to make
it clear that the expression is being evaluated in the context of the
old/new row, even if it doesn't directly refer to old/new values from
the table. So, for example, the plan for the above query looks like
this:

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Insert on public.tt
   Output: old.a, old.b, old.('Const'::text), new.a, new.b, new.('Const'::text)
   Conflict Resolution: UPDATE
   Conflict Arbiter Indexes: tt_pkey
   ->  Values Scan on "*VALUES*"
         Output: "*VALUES*".column1, "*VALUES*".column2

(It can't output "old.c" or "new.c" because all knowledge of the view
column "c" is gone by the time it has been through the rewriter, and
in any case, the details of the expression being evaluated are likely
to be useful in general.)

Things get more complicated when subqueries are involved. For example,
given this view definition:

CREATE VIEW tv AS SELECT a, b, (SELECT concat('b=',b)) c FROM tt;

the INSERT above produces this:

 a | b  |  c   | a |   b   |    c
---+----+------+---+-------+---------
 1 | R1 | b=R1 | 1 | Row 1 | b=Row 1
   |    |      | 2 | Row 2 | b=Row 2
(2 rows)

which is as expected. This uses the following query plan:

                                 QUERY PLAN
----------------------------------------------------------------------------
 Insert on public.tt
   Output: old.a, old.b, old.((SubPlan 1)), new.a, new.b, new.((SubPlan 2))
   Conflict Resolution: UPDATE
   Conflict Arbiter Indexes: tt_pkey
   ->  Values Scan on "*VALUES*"
         Output: "*VALUES*".column1, "*VALUES*".column2
   SubPlan 1
     ->  Result
           Output: concat('b=', old.b)
   SubPlan 2
     ->  Result
           Output: concat('b=', new.b)

In this case "b" in the view subquery becomes "old.b" in SubPlan 1 and
"new.b" in SubPlan 2 (each with varlevelsup = 1, and therefore
evaluated as input params to the subplans). The concat() result would
normally always be non-NULL, but it (or rather the SubLink subquery
containing it) is wrapped in a ReturningExpr. As a result, SubPlan 1
is skipped in the second row, for which old does not exist, and ends
up only being executed once in that query, whereas SubPlan 2 is
executed twice.

Things get even more fiddly when the old/new expression itself appears
in a subquery. For example, given the following query:

INSERT INTO tv VALUES (1, 'Row 1'), (2, 'Row 2')
  ON CONFLICT (a) DO UPDATE SET b = excluded.b
  RETURNING old.a, old.b, (SELECT old.c), new.*;

the result is the same, but the query plan is now

                              QUERY PLAN
----------------------------------------------------------------------
 Insert on public.tt
   Output: old.a, old.b, (SubPlan 2), new.a, new.b, new.((SubPlan 3))
   Conflict Resolution: UPDATE
   Conflict Arbiter Indexes: tt_pkey
   ->  Values Scan on "*VALUES*"
         Output: "*VALUES*".column1, "*VALUES*".column2
   SubPlan 1
     ->  Result
           Output: concat('b=', old.b)
   SubPlan 2
     ->  Result
           Output: (old.((SubPlan 1)))
   SubPlan 3
     ->  Result
           Output: concat('b=', new.b)

The ReturningExpr nodes belong to the query level containing the
RETURNING list (hence they have a "levelsup" field, like Var,
PlaceHolderVar, etc.). So in this example, one of the ReturningExpr
nodes is in SubPlan 2, with "levelsup" = 1, wrapping SubPlan 1, i.e.,
it only executes SubPlan 1 if the old row exists.

Although that all sounds quite complicated, all the individual pieces
are quite simple.

Attached is an updated patch in which I have also tidied up a few
other things, but I haven't read your latest review comments yet. I'll
respond to those separately.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 25 Mar 2024 at 00:00, jian he <jian.universality@gmail.com> wrote:
>
> hi, some minor issues I found out.
>
> the ReplaceReturningVarsFromTargetList related comment
> should be placed right above the function ReplaceReturningVarsFromTargetList,
> not above ReplaceReturningVarsFromTargetList_context?

Hmm, well there are a mix of possible styles for this kind of
function. Sometimes the outer function comes first, immediately after
the function comment, and then the callback function comes after that.
That has the advantage that all documentation comments related to the
top-level input arguments are next to the function that takes them.
Also, this ordering means that you naturally read it in the order in
which it is initially executed.

The other style, putting the callback function first has the advantage
that you can more immediately see what the function does, since it's
usually the callback that contains the interesting logic.

rewriteManip.c has examples of both styles, but in this case, since
ReplaceReturningVarsFromTargetList() is similar to
ReplaceVarsFromTargetList(), I opted to copy its style.

> struct ReplaceReturningVarsFromTargetList_context adds some comments
> about new_result_relation would be great.

I substantially rewrote that function in the v6 patch. As part of
that, I renamed "new_result_relation" to "new_target_varno", which
more closely matches the existing "target_varno" argument, and I added
comments about what it's for to the top-level function comment block.

> /* INDEX_VAR is handled by default case */
> this comment appears in execExpr.c and execExprInterp.c.
> need to move to default case's switch default case?

No, I think it's fine as it is. Its current placement is where you
might otherwise expect to find a "case INDEX_VAR:" block of code, and
it's explaining why there isn't one there, and where to look instead.

Moving it into the switch default case would lose that effect, and I
think it would reduce the code's readability.

> /*
>  * set_deparse_context_plan - Specify Plan node containing expression
>  *
>  * When deparsing an expression in a Plan tree, we might have to resolve
>  * OUTER_VAR, INNER_VAR, or INDEX_VAR references.  To do this, the caller must
>  * provide the parent Plan node.
> ...
> */
> does the comment in set_deparse_context_plan need to be updated?

In the v6 patch, I moved the code change from
set_deparse_context_plan() down into set_deparse_plan(), because I
thought that would catch more cases, but thinking about it some more,
that wasn't necessary, since it won't change when moving up and down
the ancestor tree. So in v7, I've moved it back and updated the
comment.

> + * buildNSItemForReturning -
> + * add a ParseNamespaceItem for the OLD or NEW alias in RETURNING.
> + */
> +static void
> +addNSItemForReturning(ParseState *pstate, const char *aliasname,
> +  VarReturningType returning_type)
> comment "buildNSItemForReturning" should be "addNSItemForReturning"?

Yes, well spotted. Fixed in v7.

> [in expandRTE()]
>
> - * rtindex, sublevels_up, and location are the varno, varlevelsup, and location
> - * values to use in the created Vars.  Ordinarily rtindex should match the
> - * actual position of the RTE in its rangetable.
> + * rtindex, sublevels_up, returning_type, and location are the varno,
> + * varlevelsup, varreturningtype, and location values to use in the created
> + * Vars.  Ordinarily rtindex should match the actual position of the RTE in
> + * its rangetable.
> we already updated the comment in expandRTE.
> but it seems we only do RTE_RELATION, some part of RTE_FUNCTION.
> do we need
> `
> varnode->varreturningtype = returning_type;
> `
> for other `rte->rtekind` when there is a makeVar?
>
> (I don't understand this part, in the case where rte->rtekind is
> RTE_SUBQUERY, if I add  `varnode->varreturningtype = returning_type;`
> the tests still pass.

In the v6 patch, I already added code to ensure that it's set in all
cases, though I don't think it's strictly necessary. returning_type
can only have a non-default value for the target RTE, which can't
currently be any of those other RTE kinds, but nonetheless it seemed
better from a consistency point-of-view, and to make it more
future-proof.

v7 patch attached, with those updates.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 25 Mar 2024 at 14:04, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> v7 patch attached, with those updates.
>

Rebased version attached, forced by 87985cc925.

The changes made in that commit didn't entirely make sense to me, but
the ExecDelete() change, copying data between slots, broke this patch,
because it wasn't setting the slot's tableoid. That copying seemed to
be unnecessary anyway, so I got rid of it, and it works fine. While at
it, I also removed the extra "oldslot" argument added to ExecDelete(),
which didn't seem necessary, and wasn't documented clearly. Those
changes could perhaps be extracted and applied separately.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Jeff Davis
Date:
On Tue, 2024-03-26 at 18:49 +0000, Dean Rasheed wrote:
> On Mon, 25 Mar 2024 at 14:04, Dean Rasheed <dean.a.rasheed@gmail.com>
> wrote:
> >
> > v7 patch attached, with those updates.
> >
>
> Rebased version attached, forced by 87985cc925.

This isn't a complete review, but I spent a while looking at this, and
it looks like it's in good shape.

I like the syntax, and I think the solution for renaming the alias
("RETURNING WITH (new as n, old as o)") is a good one.

The implementation touches quite a few areas. How did you identify all
of the potential problem areas? It seems the primary sources of
complexity came from rules, partitioning, and updatable views, is that
right?

Regards,
    Jeff Davis




Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Wed, 27 Mar 2024 at 07:47, Jeff Davis <pgsql@j-davis.com> wrote:
>
> This isn't a complete review, but I spent a while looking at this, and
> it looks like it's in good shape.

Thanks for looking.

> I like the syntax, and I think the solution for renaming the alias
> ("RETURNING WITH (new as n, old as o)") is a good one.

Thanks, that's good to know. Settling on a good syntax can be
difficult, so it's good to know that people are generally supportive
of this.

> The implementation touches quite a few areas. How did you identify all
> of the potential problem areas?

Hmm, well that's one of the hardest parts, and it's really difficult
to be sure that I have.

Initially, when I was just adding a new field to Var, I just tried to
look at all the existing code that made Vars, or copied other
non-default fields like varnullingrels around. I still managed to miss
the necessary change in assign_param_for_var() on my first attempt,
but fortunately that was an easy fix.

More worrying was the fact that I managed to completely overlook the
fact that I needed to worry about non-updatable columns in
auto-updatable views until v6, which added the ReturningExpr node.
Once I realised that I needed that, and that it needed to be tied to a
particular query level, and so needed a "levelsup" field, I just
looked at GroupingFunc to identify the places in code that needed to
be updated to do the right thing for a query-level-aware node.

What I'm most worried about now is that there are other areas of
functionality like that, that I'm overlooking, and which will interact
with this feature in non-trivial ways.

> It seems the primary sources of
> complexity came from rules, partitioning, and updatable views, is that
> right?

Foreign tables looked like it would be tricky at first, but then
turned out to be trivial, after disallowing direct-modify when
returning old/new.

Rules are a whole area that I wish I didn't have to worry about (I
wish we had deprecated them a long time ago). In practice though, I
haven't done much beyond what seemed like the most obvious (and
simplest) thing.

Nonetheless, there are some interesting interactions that probably
need more careful examination. For example, the fact that the
RETURNING clause in a RULE already has its own "special table names"
OLD and NEW, which are actually references to different RTEs, unlike
the OLD and NEW that this patch introduces, which are references to
the result relation. This leads to a number of different cases:

Case 1
======

In the simplest case, the rule can simply contain "RETURNING *". This
leads to what I think is the most obvious and intuitive behaviour:

DROP TABLE IF EXISTS t1, t2 CASCADE;
CREATE TABLE t1 (val1 text);
INSERT INTO t1 VALUES ('Old value 1');
CREATE TABLE t2 (val2 text);
INSERT INTO t2 VALUES ('Old value 2');

CREATE RULE r2 AS ON UPDATE TO t2
  DO INSTEAD UPDATE t1 SET val1 = NEW.val2
  RETURNING *;

UPDATE t2 SET val2 = 'New value 2'
  RETURNING old.val2 AS old_val2, new.val2 AS new_val2,
            t2.val2 AS t2_val2, val2;

  old_val2   |  new_val2   |   t2_val2   |    val2
-------------+-------------+-------------+-------------
 Old value 1 | New value 2 | New value 2 | New value 2
(1 row)

So someone using the table with the rule can access old and new values
in the obvious way, and they will get new values by default for an
UPDATE.

The query plan for this is pretty-much what you'd expect:

                      QUERY PLAN
-------------------------------------------------------
 Update on public.t1
   Output: old.val1, new.val1, t1.val1, t1.val1
   ->  Nested Loop
         Output: 'New value 2'::text, t1.ctid, t2.ctid
         ->  Seq Scan on public.t1
               Output: t1.ctid
         ->  Materialize
               Output: t2.ctid
               ->  Seq Scan on public.t2
                     Output: t2.ctid

Case 2
======

If the rule contains "RETURNING OLD.*", it means that the RETURNING
list of the rewritten query contains Vars that no longer refer to the
result relation, but instead refer to the old data in t2. This leads
the the following behaviour:

DROP TABLE IF EXISTS t1, t2 CASCADE;
CREATE TABLE t1 (val1 text);
INSERT INTO t1 VALUES ('Old value 1');
CREATE TABLE t2 (val2 text);
INSERT INTO t2 VALUES ('Old value 2');

CREATE RULE r2 AS ON UPDATE TO t2
  DO INSTEAD UPDATE t1 SET val1 = NEW.val2
  RETURNING OLD.*;

UPDATE t2 SET val2 = 'New value 2'
  RETURNING old.val2 AS old_val2, new.val2 AS new_val2,
            t2.val2 AS t2_val2, val2;

  old_val2   |  new_val2   |   t2_val2   |    val2
-------------+-------------+-------------+-------------
 Old value 2 | Old value 2 | Old value 2 | Old value 2
(1 row)

The reason this happens is that the Vars in the returning list don't
refer to the result relation, and so setting varreturningtype on them
has no effect, and is simply ignored. This can be seen by looking at
the query plan:

                           QUERY PLAN
----------------------------------------------------------------
 Update on public.t1
   Output: old.(t2.val2), new.(t2.val2), t2.val2, t2.val2
   ->  Nested Loop
         Output: 'New value 2'::text, t1.ctid, t2.ctid, t2.val2
         ->  Seq Scan on public.t1
               Output: t1.ctid
         ->  Materialize
               Output: t2.ctid, t2.val2
               ->  Seq Scan on public.t2
                     Output: t2.ctid, t2.val2

So all the final output values come from t2, not the result relation t1.

Case 3
======

Similarly, if the rule contains "RETURNING NEW.*", the effect is
similar, because again, the Vars in the RETURNING list don't refer to
the result relation in the rewritten query:

DROP TABLE IF EXISTS t1, t2 CASCADE;
CREATE TABLE t1 (val1 text);
INSERT INTO t1 VALUES ('Old value 1');
CREATE TABLE t2 (val2 text);
INSERT INTO t2 VALUES ('Old value 2');

CREATE RULE r2 AS ON UPDATE TO t2
  DO INSTEAD UPDATE t1 SET val1 = NEW.val2
  RETURNING NEW.*;

UPDATE t2 SET val2 = 'New value 2'
  RETURNING old.val2 AS old_val2, new.val2 AS new_val2,
            t2.val2 AS t2_val2, val2;

  old_val2   |  new_val2   |   t2_val2   |    val2
-------------+-------------+-------------+-------------
 New value 2 | New value 2 | New value 2 | New value 2
(1 row)

This time, the query plan shows that the result values are coming from
the new source values:

                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Update on public.t1
   Output: old.('New value 2'::text), new.('New value 2'::text), 'New
value 2'::text, 'New value 2'::text
   ->  Nested Loop
         Output: 'New value 2'::text, t1.ctid, t2.ctid
         ->  Seq Scan on public.t1
               Output: t1.ctid
         ->  Materialize
               Output: t2.ctid
               ->  Seq Scan on public.t2
                     Output: t2.ctid

Case 4
======

It's also possible to use the new returning old/new syntax in the
rule, by defining custom aliases. This has a subtly different meaning,
because it indicates that Vars in the rewritten query should refer to
the result relation, with varreturningtype set accordingly. So, for
example, returning old in the rule using this technique leads to the
following behaviour:

DROP TABLE IF EXISTS t1, t2 CASCADE;
CREATE TABLE t1 (val1 text);
INSERT INTO t1 VALUES ('Old value 1');
CREATE TABLE t2 (val2 text);
INSERT INTO t2 VALUES ('Old value 2');

CREATE RULE r2 AS ON UPDATE TO t2
  DO INSTEAD UPDATE t1 SET val1 = NEW.val2
  RETURNING WITH (OLD AS o) o.*;

UPDATE t2 SET val2 = 'New value 2'
  RETURNING old.val2 AS old_val2, new.val2 AS new_val2,
            t2.val2 AS t2_val2, val2;

  old_val2   |  new_val2   |   t2_val2   |    val2
-------------+-------------+-------------+-------------
 Old value 1 | New value 2 | Old value 1 | Old value 1
(1 row)

The query plan for this indicates that all returned values now come
from the result relation, but the default is to return old values
rather than new values, and it now allows that default to be
overridden:

                      QUERY PLAN
-------------------------------------------------------
 Update on public.t1
   Output: old.val1, new.val1, old.val1, old.val1
   ->  Nested Loop
         Output: 'New value 2'::text, t1.ctid, t2.ctid
         ->  Seq Scan on public.t1
               Output: t1.ctid
         ->  Materialize
               Output: t2.ctid
               ->  Seq Scan on public.t2
                     Output: t2.ctid

Case 5
======

Similarly, the rule can use the new syntax to return new values:

DROP TABLE IF EXISTS t1, t2 CASCADE;
CREATE TABLE t1 (val1 text);
INSERT INTO t1 VALUES ('Old value 1');
CREATE TABLE t2 (val2 text);
INSERT INTO t2 VALUES ('Old value 2');

CREATE RULE r2 AS ON UPDATE TO t2
  DO INSTEAD UPDATE t1 SET val1 = NEW.val2
  RETURNING WITH (NEW AS n) n.*;

UPDATE t2 SET val2 = 'New value 2'
  RETURNING old.val2 AS old_val2, new.val2 AS new_val2,
            t2.val2 AS t2_val2, val2;

  old_val2   |  new_val2   |   t2_val2   |    val2
-------------+-------------+-------------+-------------
 Old value 1 | New value 2 | New value 2 | New value 2
(1 row)

which is the same result as case 1, but with a slightly different query plan:

                      QUERY PLAN
-------------------------------------------------------
 Update on public.t1
   Output: old.val1, new.val1, new.val1, new.val1
   ->  Nested Loop
         Output: 'New value 2'::text, t1.ctid, t2.ctid
         ->  Seq Scan on public.t1
               Output: t1.ctid
         ->  Materialize
               Output: t2.ctid
               ->  Seq Scan on public.t2
                     Output: t2.ctid

This explicitly sets the defaults for "t2.val2" and "val2"
unqualified, whereas in case 1 they were the implicit defaults for an
UPDATE command.

I think that all that is probably reasonable, but it definitely needs
documenting, which I haven't attempted yet.

Overall, I'm pretty hesitant to try to commit this to v17. Aside from
the fact that there's a lot of new code that hasn't had much in the
way of review or discussion, I also feel that I probably haven't fully
considered all areas where additional complexity might arise. It
doesn't seem like that long ago that this was just a prototype, and
it's certainly not that long ago that I had to add a substantial
amount of new code to deal with the auto-updatable view case that I
had completely overlooked.

So on reflection, rather than trying to rush to get this into v17, I
think it would be better to leave it to v18.

Regards,
Dean



Re: Adding OLD/NEW support to RETURNING

From
Jeff Davis
Date:
On Wed, 2024-03-27 at 13:19 +0000, Dean Rasheed wrote:

> What I'm most worried about now is that there are other areas of
> functionality like that, that I'm overlooking, and which will
> interact
> with this feature in non-trivial ways.

Agreed. I'm not sure exactly how we'd find those other areas (if they
exist) aside from just having more eyes on the code.

>
> So on reflection, rather than trying to rush to get this into v17, I
> think it would be better to leave it to v18.

Thank you for letting me know. That allows some time for others to have
a look.

Regards,
    Jeff Davis




Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
Rebased version attached, on top of 0294df2f1f (MERGE .. WHEN NOT
MATCHED BY SOURCE), with a few additional tests. No code changes, just
keeping it up to date.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Sat, 30 Mar 2024 at 15:31, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Rebased version attached, on top of 0294df2f1f (MERGE .. WHEN NOT
> MATCHED BY SOURCE), with a few additional tests. No code changes, just
> keeping it up to date.
>

New version attached, rebased following the revert of 87985cc925, but
also with a few other changes:

I've added a note to rules.sgml explaining how this interacts with rules.

I've redone the way old/new system attributes are evaluated -- the
previous code changed slot_getsysattr() to try to decide when to
return NULL, but that didn't work correctly if the CTID was invalid
but non-NULL, something I hadn't anticipated, but which shows up in
the new tests added by 6572bd55b0. Instead, ExecEvalSysVar() now
checks if the OLD/NEW row exists, so there's no need to change
slot_getsysattr(), which seems much better.

I've added a new elog() error check to
adjust_appendrel_attrs_mutator(), similar to the existing one for
varnullingrels, to report if we ever attempt to apply a non-default
varreturningtype to a non-Var, which should never be possible, but
seems worth checking. (non-Var expressions should only occur if we've
flattened a UNION ALL query, so shouldn't apply to the target relation
of a data-modifying query with RETURNING.)

The previous patch added a new rewriter function
ReplaceReturningVarsFromTargetList() to rewrite the RETURNING list,
but that duplicated a lot of code from ReplaceVarsFromTargetList(), so
I've now just merged them together, which looks a lot neater.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Wed, 26 Jun 2024 at 12:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> I've added a new elog() error check to
> adjust_appendrel_attrs_mutator(), similar to the existing one for
> varnullingrels, to report if we ever attempt to apply a non-default
> varreturningtype to a non-Var, which should never be possible, but
> seems worth checking. (non-Var expressions should only occur if we've
> flattened a UNION ALL query, so shouldn't apply to the target relation
> of a data-modifying query with RETURNING.)
>

New version attached, updating an earlier comment in
adjust_appendrel_attrs_mutator() that I had missed.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Sat, Jul 13, 2024 at 1:22 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Wed, 26 Jun 2024 at 12:18, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> >
> > I've added a new elog() error check to
> > adjust_appendrel_attrs_mutator(), similar to the existing one for
> > varnullingrels, to report if we ever attempt to apply a non-default
> > varreturningtype to a non-Var, which should never be possible, but
> > seems worth checking. (non-Var expressions should only occur if we've
> > flattened a UNION ALL query, so shouldn't apply to the target relation
> > of a data-modifying query with RETURNING.)
> >
>
> New version attached, updating an earlier comment in
> adjust_appendrel_attrs_mutator() that I had missed.
>


hi.
I have some minor questions, but overall it just works.

@@ -4884,6 +5167,18 @@ ExecEvalSysVar(ExprState *state, ExprEva
 {
  Datum d;

+ /* if OLD/NEW row doesn't exist, OLD/NEW system attribute is NULL */
+ if ((op->d.var.varreturningtype == VAR_RETURNING_OLD &&
+ state->flags & EEO_FLAG_OLD_IS_NULL) ||
+ (op->d.var.varreturningtype == VAR_RETURNING_NEW &&
+ state->flags & EEO_FLAG_NEW_IS_NULL))
+ {
+ *op->resvalue = (Datum) 0;
+ *op->resnull = true;
+
+ return;
+ }
+
in ExecEvalSysVar, we can add Asserts
Assert(state->flags & EEO_FLAG_HAS_OLD || state->flags & EEO_FLAG_HAS_NEW);
if I understand it correctly.


in make_modifytable,
contain_vars_returning_old_or_new((Node *) root->parse->returningList))
this don't need to go through the loop
```
foreach(lc, resultRelations)
```


+ * In addition, the caller must provide result_relation, the index of the
+ * target relation for an INSERT/UPDATE/DELETE/MERGE.  This is needed to
+ * handle any OLD/NEW RETURNING list Vars referencing target_varno.  When such
+ * Vars are expanded, varreturningtype is copied onto any replacement Vars
+ * that reference result_relation.  In addition, if the replacement expression
+ * from the targetlist is not simply a Var referencing result_relation, we
+ * wrap it in a ReturningExpr node, to force it to be NULL if the OLD/NEW row
+ * doesn't exist.
+ *
  * outer_hasSubLinks works the same as for replace_rte_variables().
  */
@@ -1657,6 +1736,7 @@ typedef struct
 {
  RangeTblEntry *target_rte;
  List   *targetlist;
+ int result_relation;
  ReplaceVarsNoMatchOption nomatch_option;
  int nomatch_varno;
 } ReplaceVarsFromTargetList_context;

"to force it to be NULL if the OLD/NEW row doesn't exist."
I am slightly confused.
i think you mean: "to force it to be NULL if the OLD/NEW row will be
resulting null."
For INSERT,  the old row is all null, for DELETE, the new row is all null.



in sql-update.html
"An unqualified column name or * causes new values to be returned. The
same applies to columns qualified using the target table name or
alias. "
"The same", I think, refers "causes new values to be returned", but I
i am not so sure.
(apply to sql-insert.sql-delete, sql-merge).



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Fri, 19 Jul 2024 at 01:11, jian he <jian.universality@gmail.com> wrote:
>
> hi.
> I have some minor questions, but overall it just works.

Thanks for the review!

> in ExecEvalSysVar, we can add Asserts
> Assert(state->flags & EEO_FLAG_HAS_OLD || state->flags & EEO_FLAG_HAS_NEW);
> if I understand it correctly.

OK. I think it's probably worth coding defensively here, so I have
added more specific Asserts, based on the actual varreturningtype (and
I didn't really like that old "if" condition anyway, so I've rewritten
it as a switch).

> in make_modifytable,
> contain_vars_returning_old_or_new((Node *) root->parse->returningList))
> this don't need to go through the loop
> ```
> foreach(lc, resultRelations)
> ```

Good point. I agree, it's worth ensuring that we don't call
contain_vars_returning_old_or_new() multiple times (or at all, if we
don't need to).

> + * In addition, the caller must provide result_relation, the index of the
> + * target relation for an INSERT/UPDATE/DELETE/MERGE.  This is needed to
> + * handle any OLD/NEW RETURNING list Vars referencing target_varno.  When such
> + * Vars are expanded, varreturningtype is copied onto any replacement Vars
> + * that reference result_relation.  In addition, if the replacement expression
> + * from the targetlist is not simply a Var referencing result_relation, we
> + * wrap it in a ReturningExpr node, to force it to be NULL if the OLD/NEW row
> + * doesn't exist.
> + *
> I am slightly confused.
> i think you mean: "to force it to be NULL if the OLD/NEW row will be
> resulting null."
> For INSERT,  the old row is all null, for DELETE, the new row is all null.

No, I think it's slightly more accurate to say that the old row
doesn't exist for INSERT and the new row doesn't exist for DELETE. The
end result is that all the values will be NULL, so in that sense it's
the same as the old/new row being NULL, or being an all-NULL tuple.

> in sql-update.html
> "An unqualified column name or * causes new values to be returned. The
> same applies to columns qualified using the target table name or
> alias. "
> "The same", I think, refers "causes new values to be returned", but I
> i am not so sure.
> (apply to sql-insert.sql-delete, sql-merge).

OK, I have rewritten and expanded upon that a bit to try to make it
clearer. I also decided that this discussion really belongs in the
output_expression description, rather than under output_alias.

Thanks again for the review. Updated patch attached.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Fri, 19 Jul 2024 at 12:55, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Thanks again for the review. Updated patch attached.
>

Trivial rebase, following c7301c3b6f.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 29 Jul 2024 at 11:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Trivial rebase, following c7301c3b6f.
>

Rebased version, forced by a7f107df2b. Evaluating the input parameters
of correlated SubPlans in the referencing ExprState simplifies this
patch in a couple of places, since it no longer has to worry about
copying ExprState flags to a new ExprState.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Thu, Aug 1, 2024 at 7:33 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Mon, 29 Jul 2024 at 11:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> >
> > Trivial rebase, following c7301c3b6f.
> >
>
> Rebased version, forced by a7f107df2b. Evaluating the input parameters
> of correlated SubPlans in the referencing ExprState simplifies this
> patch in a couple of places, since it no longer has to worry about
> copying ExprState flags to a new ExprState.
>

hi. some minor issues.

    saveOld = changingPart && resultRelInfo->ri_projectReturning &&
        resultRelInfo->ri_projectReturning->pi_state.flags & EEO_FLAG_HAS_OLD;
    if (resultRelInfo->ri_projectReturning && (processReturning || saveOld))
    {
    }

"saveOld" imply "resultRelInfo->ri_projectReturning"
we can simplified it as

    if (processReturning || saveOld))
    {
    }



for projectReturning->pi_state.flags,
we don't use EEO_FLAG_OLD_IS_NULL, EEO_FLAG_NEW_IS_NULL
in ExecProcessReturning, we can do the following way.


    /* Make old/new tuples available to ExecProject, if required */
    if (oldSlot)
        econtext->ecxt_oldtuple = oldSlot;
    else if (projectReturning->pi_state.flags & EEO_FLAG_HAS_OLD)
        econtext->ecxt_oldtuple = ExecGetAllNullSlot(estate, resultRelInfo);
    else
        econtext->ecxt_oldtuple = NULL; /* No references to OLD columns */

    if (newSlot)
        econtext->ecxt_newtuple = newSlot;
    else if (projectReturning->pi_state.flags & EEO_FLAG_HAS_NEW)
        econtext->ecxt_newtuple = ExecGetAllNullSlot(estate, resultRelInfo);
    else
        econtext->ecxt_newtuple = NULL; /* No references to NEW columns */

    /*
     * Tell ExecProject whether or not the OLD/NEW rows exist (needed for any
     * ReturningExpr nodes).
     */
    if (oldSlot == NULL)
        projectReturning->pi_state.flags |= EEO_FLAG_OLD_IS_NULL;
    else
        projectReturning->pi_state.flags &= ~EEO_FLAG_OLD_IS_NULL;

    if (newSlot == NULL)
        projectReturning->pi_state.flags |= EEO_FLAG_NEW_IS_NULL;
    else
        projectReturning->pi_state.flags &= ~EEO_FLAG_NEW_IS_NULL;


@@ -2620,6 +2620,13 @@ transformWholeRowRef(ParseState *pstate,
  * point, there seems no harm in expanding it now rather than during
  * planning.
  *
+ * Note that if the nsitem is an OLD/NEW alias for the target RTE (as can
+ * appear in a RETURNING list), its alias won't match the target RTE's
+ * alias, but we still want to make a whole-row Var here rather than a
+ * RowExpr, for consistency with direct references to the target RTE, and
+ * so that any dropped columns are handled correctly.  Thus we also check
+ * p_returning_type here.
+ *
makeWholeRowVar and subroutines only related to pg_type, but dropped
column info is in pg_attribute.
I don't understand "so that any dropped columns are handled correctly".


ExecEvalSysVar, slot_getsysattr we have "Assert(attnum < 0);"
but
ExecEvalSysVar, while rowIsNull is true, we didn't do "Assert(attnum < 0);"



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Fri, 2 Aug 2024 at 08:25, jian he <jian.universality@gmail.com> wrote:
>
>     if (resultRelInfo->ri_projectReturning && (processReturning || saveOld))
>     {
>     }
>
> "saveOld" imply "resultRelInfo->ri_projectReturning"
> we can simplified it as
>
>     if (processReturning || saveOld))
>     {
>     }
>

No, because processReturning can be true when
resultRelInfo->ri_projectReturning is NULL (no RETURNING list). So we
do still need to check that resultRelInfo->ri_projectReturning is
non-NULL.

> for projectReturning->pi_state.flags,
> we don't use EEO_FLAG_OLD_IS_NULL, EEO_FLAG_NEW_IS_NULL
> in ExecProcessReturning, we can do the following way.
>
>     /* Make old/new tuples available to ExecProject, if required */
>     if (oldSlot)
>         econtext->ecxt_oldtuple = oldSlot;
>     else if (projectReturning->pi_state.flags & EEO_FLAG_HAS_OLD)
>         econtext->ecxt_oldtuple = ExecGetAllNullSlot(estate, resultRelInfo);
>     else
>         econtext->ecxt_oldtuple = NULL; /* No references to OLD columns */
>
>     if (newSlot)
>         econtext->ecxt_newtuple = newSlot;
>     else if (projectReturning->pi_state.flags & EEO_FLAG_HAS_NEW)
>         econtext->ecxt_newtuple = ExecGetAllNullSlot(estate, resultRelInfo);
>     else
>         econtext->ecxt_newtuple = NULL; /* No references to NEW columns */
>
>     /*
>      * Tell ExecProject whether or not the OLD/NEW rows exist (needed for any
>      * ReturningExpr nodes).
>      */
>     if (oldSlot == NULL)
>         projectReturning->pi_state.flags |= EEO_FLAG_OLD_IS_NULL;
>     else
>         projectReturning->pi_state.flags &= ~EEO_FLAG_OLD_IS_NULL;
>
>     if (newSlot == NULL)
>         projectReturning->pi_state.flags |= EEO_FLAG_NEW_IS_NULL;
>     else
>         projectReturning->pi_state.flags &= ~EEO_FLAG_NEW_IS_NULL;
>

I'm not sure I understand your point. It's true that
EEO_FLAG_OLD_IS_NULL and EEO_FLAG_NEW_IS_NULL aren't used directly in
ExecProcessReturning(), but they are used in stuff called from
ExecProject().

If the point was just to swap those 2 code blocks round, then OK, I
guess maybe it reads a little better that way round, though it doesn't
really make any difference either way.

I did notice that that comment should mention that ExecEvalSysVar()
also uses these flags, so I've updated it to do so.

> @@ -2620,6 +2620,13 @@ transformWholeRowRef(ParseState *pstate,
>   * point, there seems no harm in expanding it now rather than during
>   * planning.
>   *
> + * Note that if the nsitem is an OLD/NEW alias for the target RTE (as can
> + * appear in a RETURNING list), its alias won't match the target RTE's
> + * alias, but we still want to make a whole-row Var here rather than a
> + * RowExpr, for consistency with direct references to the target RTE, and
> + * so that any dropped columns are handled correctly.  Thus we also check
> + * p_returning_type here.
> + *
> makeWholeRowVar and subroutines only related to pg_type, but dropped
> column info is in pg_attribute.
> I don't understand "so that any dropped columns are handled correctly".
>

The nsitem contains references to dropped columns, so if you expanded
it as a RowExpr, you'd end up with mismatched columns and it would
fail (somewhere under ParseFuncOrColumn(), from transformColumnRef(),
I think). There's a regression test case in returning.sql that covers
that.

> ExecEvalSysVar, slot_getsysattr we have "Assert(attnum < 0);"
> but
> ExecEvalSysVar, while rowIsNull is true, we didn't do "Assert(attnum < 0);"

I don't see much value in that, since we aren't going to evaluate the
attribute if the old/new row is null.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Fri, Aug 2, 2024 at 6:13 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Fri, 2 Aug 2024 at 08:25, jian he <jian.universality@gmail.com> wrote:
> >
> >     if (resultRelInfo->ri_projectReturning && (processReturning || saveOld))
> >     {
> >     }
> >
> > "saveOld" imply "resultRelInfo->ri_projectReturning"
> > we can simplified it as
> >
> >     if (processReturning || saveOld))
> >     {
> >     }
> >
>
> No, because processReturning can be true when
> resultRelInfo->ri_projectReturning is NULL (no RETURNING list). So we
> do still need to check that resultRelInfo->ri_projectReturning is
> non-NULL.
>
> > for projectReturning->pi_state.flags,
> > we don't use EEO_FLAG_OLD_IS_NULL, EEO_FLAG_NEW_IS_NULL
> > in ExecProcessReturning, we can do the following way.
> >
> >     /* Make old/new tuples available to ExecProject, if required */
> >     if (oldSlot)
> >         econtext->ecxt_oldtuple = oldSlot;
> >     else if (projectReturning->pi_state.flags & EEO_FLAG_HAS_OLD)
> >         econtext->ecxt_oldtuple = ExecGetAllNullSlot(estate, resultRelInfo);
> >     else
> >         econtext->ecxt_oldtuple = NULL; /* No references to OLD columns */
> >
> >     if (newSlot)
> >         econtext->ecxt_newtuple = newSlot;
> >     else if (projectReturning->pi_state.flags & EEO_FLAG_HAS_NEW)
> >         econtext->ecxt_newtuple = ExecGetAllNullSlot(estate, resultRelInfo);
> >     else
> >         econtext->ecxt_newtuple = NULL; /* No references to NEW columns */
> >
> >     /*
> >      * Tell ExecProject whether or not the OLD/NEW rows exist (needed for any
> >      * ReturningExpr nodes).
> >      */
> >     if (oldSlot == NULL)
> >         projectReturning->pi_state.flags |= EEO_FLAG_OLD_IS_NULL;
> >     else
> >         projectReturning->pi_state.flags &= ~EEO_FLAG_OLD_IS_NULL;
> >
> >     if (newSlot == NULL)
> >         projectReturning->pi_state.flags |= EEO_FLAG_NEW_IS_NULL;
> >     else
> >         projectReturning->pi_state.flags &= ~EEO_FLAG_NEW_IS_NULL;
> >
>
> I'm not sure I understand your point. It's true that
> EEO_FLAG_OLD_IS_NULL and EEO_FLAG_NEW_IS_NULL aren't used directly in
> ExecProcessReturning(), but they are used in stuff called from
> ExecProject().
>
> If the point was just to swap those 2 code blocks round, then OK, I
> guess maybe it reads a little better that way round, though it doesn't
> really make any difference either way.

sorry for confusion. I mean "swap those 2 code blocks round".
I think it will make it more readable, because you first check
projectReturning->pi_state.flags
with EEO_FLAG_HAS_NEW, EEO_FLAG_HAS_OLD
then change it.


> I did notice that that comment should mention that ExecEvalSysVar()
> also uses these flags, so I've updated it to do so.
>
> > @@ -2620,6 +2620,13 @@ transformWholeRowRef(ParseState *pstate,
> >   * point, there seems no harm in expanding it now rather than during
> >   * planning.
> >   *
> > + * Note that if the nsitem is an OLD/NEW alias for the target RTE (as can
> > + * appear in a RETURNING list), its alias won't match the target RTE's
> > + * alias, but we still want to make a whole-row Var here rather than a
> > + * RowExpr, for consistency with direct references to the target RTE, and
> > + * so that any dropped columns are handled correctly.  Thus we also check
> > + * p_returning_type here.
> > + *
> > makeWholeRowVar and subroutines only related to pg_type, but dropped
> > column info is in pg_attribute.
> > I don't understand "so that any dropped columns are handled correctly".
> >
>
> The nsitem contains references to dropped columns, so if you expanded
> it as a RowExpr, you'd end up with mismatched columns and it would
> fail (somewhere under ParseFuncOrColumn(), from transformColumnRef(),
> I think). There's a regression test case in returning.sql that covers
> that.
play around with it, get it.

if (nsitem->p_names == nsitem->p_rte->eref ||
        nsitem->p_returning_type != VAR_RETURNING_DEFAULT)
else
{
        expandRTE(nsitem->p_rte, nsitem->p_rtindex, sublevels_up,
                  nsitem->p_returning_type, location, false, NULL, &fields);
}
The ELSE  branch expandRTE include_dropped argument is false.
that makes the ELSE  branch unable to deal with dropped columns.



took me a while to understand the changes in rewriteHandler.c, rewriteManip.c
rule over updateable view still works, but I didn't check closely with
rewriteRuleAction.
i think I understand rewriteTargetView and subroutines.

 * In addition, the caller must provide result_relation, the index of the
 * target relation for an INSERT/UPDATE/DELETE/MERGE.  This is needed to
 * handle any OLD/NEW RETURNING list Vars referencing target_varno.  When such
 * Vars are expanded, varreturningtype is copied onto any replacement Vars
 * that reference result_relation.  In addition, if the replacement expression
 * from the targetlist is not simply a Var referencing result_relation, we
 * wrap it in a ReturningExpr node, to force it to be NULL if the OLD/NEW row
 * doesn't exist.

"the index of the target relation for an INSERT/UPDATE/DELETE/MERGE",
here, "target relation" I think people may be confused whether it
refers to view relation or the base relation.
I think here the target relation is the base relation (rtekind == RTE_RELATION)


" to force it to be NULL if the OLD/NEW row doesn't exist."
i think this happen in execExpr.c?
maybe
" to force it to be NULL if the OLD/NEW row doesn't exist, see execExpr.c"

overall, looks good to me.



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 5 Aug 2024 at 12:46, jian he <jian.universality@gmail.com> wrote:
>
> took me a while to understand the changes in rewriteHandler.c, rewriteManip.c
> rule over updateable view still works, but I didn't check closely with
> rewriteRuleAction.
> i think I understand rewriteTargetView and subroutines.
>
>  * In addition, the caller must provide result_relation, the index of the
>  * target relation for an INSERT/UPDATE/DELETE/MERGE.  This is needed to
>  * handle any OLD/NEW RETURNING list Vars referencing target_varno.  When such
>  * Vars are expanded, varreturningtype is copied onto any replacement Vars
>  * that reference result_relation.  In addition, if the replacement expression
>  * from the targetlist is not simply a Var referencing result_relation, we
>  * wrap it in a ReturningExpr node, to force it to be NULL if the OLD/NEW row
>  * doesn't exist.
>
> "the index of the target relation for an INSERT/UPDATE/DELETE/MERGE",
> here, "target relation" I think people may be confused whether it
> refers to view relation or the base relation.
> I think here the target relation is the base relation (rtekind == RTE_RELATION)

Yes, it's the result relation in the rewritten query. I've updated
that comment to try to make that clearer.

Basically, if a replacement Var refers to the new result relation in
the rewritten query, then its varreturningtype needs to be set
correctly. Otherwise, if it refers to some other relation, its
varreturningtype shouldn't be changed, but it does need to be wrapped
in a ReturningExpr node, if the original Var had a non-default
varreturningtype, so that it evaluates as NULL if the old/new row
doesn't exist.

> " to force it to be NULL if the OLD/NEW row doesn't exist."
> i think this happen in execExpr.c?
> maybe
> " to force it to be NULL if the OLD/NEW row doesn't exist, see execExpr.c"

OK, I've updated it to just say that this causes the executor to
return NULL if the old/new row doesn't exist. There are multiple
places in the executor that actually make that happen, so it doesn't
make sense to just refer to one place.

> overall, looks good to me.

Thanks for reviewing.

I'm pretty happy with the patch now, but I was just thinking about the
wholerow case a little more, and I think it's worth changing the way
that's handled.

Previously, if you wrote something like "RETURNING old", and the old
row didn't exist, it would return an all-NULL record (displayed as
something like '(,,,,)'), but I don't think that's really right. I
think it should actually return NULL. I think that's more consistent
with the way "non-existent" is generally handled, for example in a
query like "SELECT t1, t2 FROM t1 OUTER JOIN t2 ON ...".

It's pretty trivial, but it does involve changing code in 2 places
(the first for regular tables, and the second for updatable views):

1. ExecEvalWholeRowVar() now checks EEO_FLAG_OLD_IS_NULL and
EEO_FLAG_NEW_IS_NULL. This makes it more consistent with
ExecEvalSysVar(), so I added the same Asserts.

2. ReplaceVarsFromTargetList() now wraps the RowExpr node created in
the wholerow case in a ReturningExpr. That's consistent with the
function's comment: "if the replacement expression from the targetlist
is not simply a Var referencing result_relation, it is wrapped in a
ReturningExpr node".

Both those changes seem quite natural and consistent, and I think the
resulting test output looks much nicer.

Regards,
Dean

Attachment

Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
hi.

took me a while to understand how the returning clause Var nodes
correctly reference the relations RT index.
mainly in set_plan_references, set_plan_refs and
set_returning_clause_references.

do you think we need do something in
set_returning_clause_references->build_tlist_index_other_vars
to make sure that
if the topplan->targetlist associated Var's varreturningtype is not default,
then the var->varno must equal to resultRelation.
because set_plan_references is almost at the end of standard_planner,
before that things may change.



    /*
     * Tell ExecProject whether or not the OLD/NEW rows exist (needed for any
     * ReturningExpr nodes and ExecEvalSysVar).
     */
    if (oldSlot == NULL)
        projectReturning->pi_state.flags |= EEO_FLAG_OLD_IS_NULL;
    else
        projectReturning->pi_state.flags &= ~EEO_FLAG_OLD_IS_NULL;
    if (newSlot == NULL)
        projectReturning->pi_state.flags |= EEO_FLAG_NEW_IS_NULL;
    else
        projectReturning->pi_state.flags &= ~EEO_FLAG_NEW_IS_NULL;

ExecEvalWholeRowVar also uses this information, comment needs to be
slightly adjusted?



simialr to
https://git.postgresql.org/cgit/postgresql.git/commit/?id=2bb969f3998489e5dc4fe9f2a61185b43581975d
do you think it's necessary to
                        errmsg("%s cannot be specified multiple times", "NEW"),
                        errmsg("%s cannot be specified multiple times", "OLD"),



+ /*
+ * Scan RETURNING WITH(...) options for OLD/NEW alias names.  Complain if
+ * there is any conflict with existing relations.
+ */
+ foreach_node(ReturningOption, option, returningClause->options)
+ {
+ if (refnameNamespaceItem(pstate, NULL, option->name, -1, NULL))
+ ereport(ERROR,
+ errcode(ERRCODE_DUPLICATE_ALIAS),
+ errmsg("table name \"%s\" specified more than once",
+   option->name),
+ parser_errposition(pstate, option->location));
+
+ if (option->isNew)
+ {
+ if (qry->returningNew != NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("NEW cannot be specified multiple times"),
+ parser_errposition(pstate, option->location));
+ qry->returningNew = option->name;
+ }
+ else
+ {
+ if (qry->returningOld != NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("OLD cannot be specified multiple times"),
+ parser_errposition(pstate, option->location));
+ qry->returningOld = option->name;
+ }
+ }
+
+ /*
+ * If no OLD/NEW aliases specified, use "old"/"new" unless masked by
+ * existing relations.
+ */
+ if (qry->returningOld == NULL &&
+ refnameNamespaceItem(pstate, NULL, "old", -1, NULL) == NULL)
+ qry->returningOld = "old";
+ if (qry->returningNew == NULL &&
+ refnameNamespaceItem(pstate, NULL, "new", -1, NULL) == NULL)
+ qry->returningNew = "new";
+
+ /*
+ * Add the OLD and NEW aliases to the query namespace, for use in
+ * expressions in the RETURNING list.
+ */
+ save_nslen = list_length(pstate->p_namespace);
+ if (qry->returningOld)
+ addNSItemForReturning(pstate, qry->returningOld, VAR_RETURNING_OLD);
+ if (qry->returningNew)
+ addNSItemForReturning(pstate, qry->returningNew, VAR_RETURNING_NEW);


the only case we don't do addNSItemForReturning is when there is
really a RTE called "new" or "old".
Even if the returning list doesn't specify "new" or "old", like
"returning 1", we still do addNSItemForReturning.
Do you think it's necessary in ReturningClause add two booleans
"hasold", "hasnew".
so if becomes
+ if (qry->returningOld && hasold)
+ addNSItemForReturning(pstate, qry->returningOld, VAR_RETURNING_OLD);
+ if (qry->returningNew && hasnew)
+ addNSItemForReturning(pstate, qry->returningNew, VAR_RETURNING_NEW);

that means in gram.y
returning_clause:
            RETURNING returning_with_clause target_list
                {
                    ReturningClause *n = makeNode(ReturningClause);

                    n->options = $2;
                    n->exprs = $3;
                    $$ = n;
                }

n->exprs will have 3 branches: NEW.expr, OLD.expr, expr.
I guess overall we can save some cycles?



Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Fri, Aug 16, 2024 at 6:39 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>

in Var comments:

 * varlevelsup is greater than zero in Vars that represent outer references.
 * Note that it affects the meaning of all of varno, varnullingrels, and
 * varnosyn, all of which refer to the range table of that query level.

Does this need to change accordingly?

i found there is no privilege test in src/test/regress/sql/updatable_views.sql?
Do we need to add some tests?

Other than that, I didn't find any issue.



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Wed, 21 Aug 2024 at 10:07, jian he <jian.universality@gmail.com> wrote:
>
> in Var comments:
>
>  * varlevelsup is greater than zero in Vars that represent outer references.
>  * Note that it affects the meaning of all of varno, varnullingrels, and
>  * varnosyn, all of which refer to the range table of that query level.
>
> Does this need to change accordingly?
>

No, I don't think so. varlevelsup doesn't directly change the meaning
of varreturningtype, any more than it changes the meaning of, say,
varattno. The point of that comment is that the fields varno,
varnullingrels, and varnosyn are (or contain) the range table indexes
of relations, which by themselves are insufficient to identify the
relations -- varlevelsup must be used in combination with those fields
to find the relations they refer to.

> i found there is no privilege test in src/test/regress/sql/updatable_views.sql?
> Do we need to add some tests?
>

I don't think so, because varreturningtype doesn't affect any
permissions checks.

> Other than that, I didn't find any issue.

Thanks for reviewing.

If there are no other issues, I think this is probably ready for commit.

Regards,
Dean



Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
typedef struct ReturningOption
{
    NodeTag        type;
    bool        isNew;
    char       *name;
    int            location;
} ReturningOption;
location should be type ParseLoc?


in process_sublinks_mutator

    else if (IsA(node, ReturningExpr))
    {
        if (((ReturningExpr *) node)->retlevelsup > 0)
            return node;
    }
this part doesn't have a coverage test?
the following is the minimum tests i come up with:

create table s (a int, b int);
create view sv as select * from s;
explain insert into sv values(1,2) returning (select new from
(values((select new))));


explain insert into sv values(1,2) returning (select new from (((select new))));
won't touch the changes we did.
but these two explain output plans are the same.



Re: Adding OLD/NEW support to RETURNING

From
jian he
Date:
On Mon, Oct 14, 2024 at 7:03 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> > > typedef struct ReturningOption
> > > {
> > >     NodeTag        type;
> > >     bool        isNew;
> > >     char       *name;
> > >     int            location;
> > > } ReturningOption;
>
> Thinking about that struct some more, I think "isNew" is better done
> as an enum, since this is meant to be a generic option. So even though
> it might never have more than 2 possible values, I think it's neater
> done that way.
>

typedef struct ReturningOption
{
    NodeTag        type;
    ReturningOptionKind option; /* specified option */
    char       *value;            /* option's value */
    ParseLoc    location;        /* token location, or -1 if unknown */
} ReturningOption;



@@ -4304,6 +4332,16 @@ raw_expression_tree_walker_impl(Node *no
  return true;
  }
  break;
+ case T_ReturningClause:
+ {
+ ReturningClause *returning = (ReturningClause *) node;
+
+ if (WALK(returning->options))
+ return true;
+ if (WALK(returning->exprs))
+ return true;
+ }
+ break;


+ if (WALK(returning->options))
+ return true;
T_ReturningOption is primitive, so we only need to
"if (WALK(returning->exprs))"?



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Mon, 14 Oct 2024 at 16:34, jian he <jian.universality@gmail.com> wrote:
>
> typedef struct ReturningOption
> {
>     NodeTag        type;
>     ReturningOptionKind option; /* specified option */
>     char       *value;            /* option's value */
>     ParseLoc    location;        /* token location, or -1 if unknown */
> } ReturningOption;
>
> @@ -4304,6 +4332,16 @@ raw_expression_tree_walker_impl(Node *no
>   return true;
>   }
>   break;
> + case T_ReturningClause:
> + {
> + ReturningClause *returning = (ReturningClause *) node;
> +
> + if (WALK(returning->options))
> + return true;
> + if (WALK(returning->exprs))
> + return true;
> + }
> + break;
>
> + if (WALK(returning->options))
> + return true;
> T_ReturningOption is primitive, so we only need to
> "if (WALK(returning->exprs))"?

No, it still needs to walk the options so that it will call the
callback for each option. The fact that T_ReturningOption is primitive
doesn't change that, it just means that there is no more structure
*below* a ReturningOption that needs to be traversed. The
ReturningOption itself still needs to be traversed. For example,
imagine you wanted to use raw_expression_tree_walker() to print out
the whole structure of a raw parse tree. You'd want your printing
callback to be called for every node, including the ReturningOption
nodes.

Regards,
Dean



Re: Adding OLD/NEW support to RETURNING

From
Dean Rasheed
Date:
On Tue, 29 Oct 2024 at 13:05, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Rebased version attached. No other changes.
>

In the light of 7f798aca1d5df290aafad41180baea0ae311b4ee, I guess I
should remove various (void *) casts that this patch adds, copied from
old code. I'll wait a while though, just in case the buildfarm objects
to that other patch.

Regards,
Dean