Thread: proposal and patch : support INSERT INTO...RETURNING with partitioned table using rule

-----------------------------------
Problem I'm trying to solve:
    For partitioned tables,  make it possible to use RETURNING clause on INSERT INTO
           together with DO INSTEAD rule

[  Note  -  wherever I say INSERT I also mean UPDATE and DELETE ]

-----------------------------------
Current behaviour :

    An INSERT which has a RETURNING clause and which is to be rewritten based on
    a rule will be accepted if the rule is an "unconditional DO INSTEAD".
    In general I believe "unconditional" means "no WHERE clause", but in practice
    if the rule is of the form
       CREATE RULE insert_part_history as ON INSERT to history \
         DO INSTEAD SELECT history_insert_partitioned(NEW) returning NEW.id
    this is treated as conditional and the query is rejected.   

    Testcase:
    A table T is partitioned and has two or more columns,  one of which
    is an id column declared as
         id bigint DEFAULT nextval('history_id_seq'::regclass) NOT NULL
    and the application issues
      "INSERT into history  (column-list which excludes id)  values (....) RETURNING id"

    I can get the re-direction of the INSERT *without* RETURNING to work using
    either trigger or rule, in which the trigger/rule invokes a procedure, but
    whichever way I do it, I could not get this RETURNING clause to work.

    For a trigger,
      the INSERT ... RETURNING was accepted but returned no rows, (as I would
      expect), and for the RULE, the INSERT ... RETURNING was rejected with :
 
      ERROR:  cannot perform INSERT RETURNING on relation "history"
      HINT:  You need an unconditional ON INSERT DO INSTEAD rule with a RETURNING clause.

      but this hint was not much help,  since :

   For a rule,
     CREATE RULE insert_part_history as ON INSERT to history \
         DO INSTEAD SELECT history_insert_partitioned(NEW) returning NEW.id
    ERROR:  syntax error at or near "returning"
    LINE 1: ...DO INSTEAD SELECT history_insert_partitioned(NEW) returning ...

    Here the function history_insert_partitioned is something like
        CREATE FUNCTION history_insert_partitioned( NEW public.history) RETURNS BIGINT AS $$
            DECLARE
               ...
            BEGIN
               ...
                 < acccess NEW fields e.g. timestamp>
                 <  construct partitioned table name>
                  < EXECUTE 'INSERT INTO ' partitioned table
               ...
                RETURN history_id;
            END;
            $$
            LANGUAGE plpgsql

-----------------------------------
Some references to discussion of this requirement :
  .  http://wiki.postgresql.org/wiki/Todo
     item "Make it possible to use RETURNING together with conditional DO INSTEAD rules,
           such as for partitioning setups"
  .  http://archives.postgresql.org/pgsql-general/2012-06/msg00377.php
  .  http://archives.postgresql.org/pgsql-general/2010-12/msg00542.php
  .  http://acodapella.blogspot.it/2011/06/hibernate-postgresql-table-partitioning.html

-----------------------------------
Proposal:
  .  I propose to extend the rule system to recognize cases where the INSERT query specifies
     RETURNING and the rule promises to return a row,  and to then permit this query to run
     and return the expected row.   In effect,  to widen the definition of "unconditional"
     to handle cases such as my testcase.
  .  One comment is that all the infrastructure for returning one row from the re-written query
     is already present in the code,  and the non-trivial question is how to ensure the
     new design is safe in preventing any rewrite that actually would not return a row.
  .  In this patch,   I have chosen to make use of the LIMIT clause  -
     I add a side-effect implication to a LIMIT clause when it occurs in a rewrite of an INSERT
     to mean "this rule will return a row".
     So,  with my patch, same testcase,  same function history_insert_partitioned and new rule
        CREATE RULE insert_part_history as ON INSERT to history \
               DO INSTEAD SELECT history_insert_partitioned(NEW) LIMIT 1
     the INSERT is accepted and returns the id.
     This use of LIMIT clause is probably contentious but I wished to avoid introducing new
     SQL syntax,  and the LIMIT clause does have a connotation of returning rows.

-----------------------------------
I attach patch based on clone of postgresql.git as of yesterday (120619-145751 EST)
I have tested the patch with INSERT and UPDATE  (not tested with DELETE but should work).
The patch is not expected to be final but just to show how I did it.

John

Attachment
John Lumby <johnlumby@hotmail.com> wrote:
> I attach patch based on clone of postgresql.git as of yesterday 
Please read about the CommitFest process:
http://wiki.postgresql.org/wiki/CommitFest
and add your patch to the open CF:
https://commitfest.postgresql.org/action/commitfest_view/open
This will ensure that the patch doesn't get lost before the next review
cycle starts.
-Kevin


On Wed, Jun 20, 2012 at 12:24 PM, John Lumby <johnlumby@hotmail.com> wrote:
>     An INSERT which has a RETURNING clause and which is to be rewritten based on
>     a rule will be accepted if the rule is an "unconditional DO INSTEAD".
>     In general I believe "unconditional" means "no WHERE clause", but in practice
>     if the rule is of the form
>        CREATE RULE insert_part_history as ON INSERT to history \
>          DO INSTEAD SELECT history_insert_partitioned(NEW) returning NEW.id
>     this is treated as conditional and the query is rejected.

This isn't rejected because the query is treated as condition; it's
rejected because it's not valid syntax.  A SELECT query can't have a
RETURNING clause, because the target list (i.e. the part that
immediately follows the SELECT) already serves that purpose.  The fact
that it's in a CREATE RULE statement is irrelevant:

rhaas=# select 4 returning 3;
ERROR:  syntax error at or near "returning"
LINE 1: select 4 returning 3;                ^

>   .  I propose to extend the rule system to recognize cases where the INSERT query specifies
>      RETURNING and the rule promises to return a row,  and to then permit this query to run
>      and return the expected row.   In effect,  to widen the definition of "unconditional"
>      to handle cases such as my testcase.

That already (kind of) works:

rhaas=# create table history (id bigserial, name text);NOTICE:  CREATE
TABLE will create implicit sequence "history_id_seq" for serial column
"history.id"
CREATE TABLE
rhaas=# create table history1 () inherits (history);
CREATE TABLE
rhaas=# create rule history_insert as on insert to history do instead
insert into history1 (id, name) values (NEW.id, NEW.name || ' is
awesome!') returning 17::bigint, 'cheeze whiz'::text;
CREATE RULE
rhaas=# insert into history (name) values ('Linus') returning id,
name; id |    name
----+-------------17 | cheeze whiz
(1 row)

INSERT 0 1
rhaas=# select * from history;id |       name
----+------------------- 1 | Linus is awesome!
(1 row)

I do notice that the RETURNING clause of the INSERT can't reference
NEW, which seems like a restriction that we probably ought to lift,
but it doesn't seem to have much to do with your patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


First,  apologies for taking so long to reply to your post.

On Fri, 22 Jun 2012 09:55:13, Robert Haas wrote:

> On Wed, Jun 20, 2012 at 12:24 PM, John Lumby <johnlumby(at)hotmail(dot)com> wrote:
> >     An INSERT which has a RETURNING clause and which is to be rewritten based on
> >     a rule will be accepted if the rule is an "unconditional DO INSTEAD".
> >     In general I believe "unconditional" means "no WHERE clause", but in practice
> >     if the rule is of the form
> >        CREATE RULE insert_part_history as ON INSERT to history \
> >          DO INSTEAD SELECT history_insert_partitioned(NEW) returning NEW.id
> >     this is treated as conditional and the query is rejected.
>
> This isn't rejected because the query is treated as condition; it's
> rejected because it's not valid syntax.  A SELECT query can't have a
> RETURNING clause, because the target list (i.e. the part that
> immediately follows the SELECT) already serves that purpose. The fact
> that it's in a CREATE RULE statement is irrelevant.

Thanks for correcting me.   At the time,  it wasn't clear to me whether the RETURNING clause
in a CREATE RULE statement belonged to the CREATE RULE statement or the rule being created,
but now I see it's the latter.

> >   .  I propose to extend the rule system to recognize cases where the INSERT query specifies
> >      RETURNING and the rule promises to return a row,  and to then permit this query to run
> >      and return the expected row.   In effect,  to widen the definition of "unconditional"
> >      to handle cases such as my testcase.
>
> That already (kind of) works:
>
>  [...]
>
> I do notice that the RETURNING clause of the INSERT can't reference
> NEW, which seems like a restriction that we probably ought to lift,
> but it doesn't seem to have much to do with your patch.
>

The main use of my proposal is to be able to return the value of the sequence assigned
to the NEW.id column,  so yes   that is a serious restriction.

However,   even if that restriction is lifted,  it will not help with the case where
the rule is an invocation of a function,   which is the case I need.    For example,
in my testcase,  the function is building the SQL statement to be executed including thetable name of the partitioned
childtable,   based on the timestamp in the NEW record. 

Your comments have helped me realize that my original subject line did not accurately state
my requirement,  which should read
     "support INSERT INTO...RETURNING with partitioned table using rule which invokes a function"

And for this requirement as re-stated,  as far as I can tell,   current postgresql has no solution:
  .  only way to invoke a function in a rewrite rule is with SELECT function()
  .  SELECT does not permit RETURNING clause
  .  my proposal provides a way of relaxing this restriction for the case of SELECT in a rewrite rule.

I hope this describes the proposal a bit better.

John

John Lumby <johnlumby@hotmail.com> writes:
> On Fri, 22 Jun 2012 09:55:13, Robert Haas wrote:
>> I do notice that the RETURNING clause of the INSERT can't reference
>> NEW, which seems like a restriction that we probably ought to lift,
>> but it doesn't seem to have much to do with your patch.

> The main use of my proposal is to be able to return the value of the
> sequence assigned to the NEW.id column, so yes that is a serious
> restriction.

I think both of you are confused.  What the RETURNING clause can see is
the inserted row's actual values.  You can certainly get the assigned
sequence ID out of that.  I would argue that being able to see the NEW.*
expressions is at best secondary, because that data doesn't necessarily
have anything to do with what went into the table (consider the
possibility that a BEFORE trigger changed it).

> However, even if that restriction is lifted, it will not help with the
> case where the rule is an invocation of a function, which is the case
> I need.

What you're requesting seems pretty much nonsensical to me.  The point
of being able to write a RETURNING clause in a rule is to emulate what
would happen with RETURNING on a regular table.  As an example, suppose
that I have
create table t (id serial, data1 text, data2 text);

and for whatever reason I write
insert into t(data1, data2) values('foo', 'bar') returning id, data2;

I should get back the generated sequence value and the data2 value, but
*not* the data1 value.  Anything else is just wrong.  Now, if t has a
rule "ON INSERT DO INSTEAD SELECT somefunction()", how is that going to
happen?  The function doesn't know what the RETURNING clause looks like.
If we had a notional inserted-row-value then the executor could do the
RETURNING computation based on that, but there's no way to make a
connection between whatever the function does internally and the data
for RETURNING to chew on.

The whole concept of ON INSERT DO [INSTEAD/ALSO] SELECT seems pretty
shaky to me, as it *necessarily* involves a command substitution that
causes an INSERT to act in a strange fashion that the client application
will need special code to cope with.  I won't argue to take the feature
out, because people do use it in custom applications --- but it doesn't
play nice with RETURNING, and I don't think it can be made to.  It's
pretty much a legacy method of doing business IMO.

It seems to me that instead of lobbying to throw another kluge on top
of that pile, you'd be better off looking for alternative solutions.
Have you tried implementing this as an INSTEAD OF trigger, and not using
rules at all?  That mechanism works just fine with RETURNING, and it
seems to me that it would let you do whatever you could do inside a
custom function.  It would certainly be enough for the
dynamic-partition-redirection problem.
        regards, tom lane



On 09/20/12 16:34, Tom Lane wrote:
> John Lumby <johnlumby@hotmail.com> writes:
>> On Fri, 22 Jun 2012 09:55:13, Robert Haas wrote:
>>> I do notice that the RETURNING clause of the INSERT can't reference
>>> NEW, which seems like a restriction that we probably ought to lift,
>>> but it doesn't seem to have much to do with your patch.
>> The main use of my proposal is to be able to return the value of the
>> sequence assigned to the NEW.id column, so yes that is a serious
>> restriction.
> I think both of you are confused.  What the RETURNING clause can see is
> the inserted row's actual values.  You can certainly get the assigned
> sequence ID out of that.  I would argue that being able to see the NEW.*
> expressions is at best secondary, because that data doesn't necessarily
> have anything to do with what went into the table (consider the
> possibility that a BEFORE trigger changed it).

I think this part of the discussion was a bit of a
(probably confused) red herring going off on a tangent.

>> However, even if that restriction is lifted, it will not help with the
>> case where the rule is an invocation of a function, which is the case
>> I need.
> What you're requesting seems pretty much nonsensical to me.  The point
> of being able to write a RETURNING clause in a rule is to emulate what
> would happen with RETURNING on a regular table.  As an example, suppose
> that I have
>
>     create table t (id serial, data1 text, data2 text);
>
> and for whatever reason I write
>
>     insert into t(data1, data2) values('foo', 'bar') returning id, data2;
>
> I should get back the generated sequence value and the data2 value, but
> *not* the data1 value.  Anything else is just wrong.  Now, if t has a
> rule "ON INSERT DO INSTEAD SELECT somefunction()", how is that going to
> happen?  The function doesn't know what the RETURNING clause looks like.
> If we had a notional inserted-row-value then the executor could do the
> RETURNING computation based on that, but there's no way to make a
> connection between whatever the function does internally and the data
> for RETURNING to chew on.

Well since you raise the question  --  surely the function could return
a tuple of the correct row type and the executor could then pick out
whatever the actual statement requested.     This actually seems to
make my proposal more general and useful.   And answers the point
you make about "doesn't play nice with RETURNING" in your next para.

>
> The whole concept of ON INSERT DO [INSTEAD/ALSO] SELECT seems pretty
> shaky to me, as it *necessarily* involves a command substitution that
> causes an INSERT to act in a strange fashion that the client application
> will need special code to cope with.  I won't argue to take the feature
> out, because people do use it in custom applications --- but it doesn't
> play nice with RETURNING, and I don't think it can be made to.  It's
> pretty much a legacy method of doing business IMO.
>
> It seems to me that instead of lobbying to throw another kluge on top
> of that pile, you'd be better off looking for alternative solutions.
> Have you tried implementing this as an INSTEAD OF trigger, and not using
> rules at all?  That mechanism works just fine with RETURNING, and it
> seems to me that it would let you do whatever you could do inside a
> custom function.  It would certainly be enough for the
> dynamic-partition-redirection problem.

It took me a little while to realize your implicit suggestion that
I should rename my inheritance-parent (true name 'history') as 'something_else' and then   CREATE VIEW history as
select* from something_else
 
amd then create the instead trigger on the view.
(This *is* what you are suggesting,  right?)
I tried t and yes indeed it does exactly what I want  -
for the INSERT.    Now I also have to define instead triggers
for update and delete.      And are there any other considerations
for changing the table into a view?    I mean,  any other ways in which
SQL or client interfaces could perceive some difference?

Anyhow,  yes,   this does indeed serve as a solution to the problem
without needing any kluges or hacks,  so thank you.
But it gives me (and anyone else who tries it) more work than
one simple RULE on the table without needing to add the view.
By the way  -  what is the reason for the restiction
that INSTEAD OF triggers cannot be defined on "real" tables,
only on views?      Could this be lifted?

John Lumby

>
>             regards, tom lane
>
>




johnlumby escribió:
> On 09/20/12 16:34, Tom Lane wrote:

> >It seems to me that instead of lobbying to throw another kluge on top
> >of that pile, you'd be better off looking for alternative solutions.
> >Have you tried implementing this as an INSTEAD OF trigger, and not using
> >rules at all?  That mechanism works just fine with RETURNING, and it
> >seems to me that it would let you do whatever you could do inside a
> >custom function.  It would certainly be enough for the
> >dynamic-partition-redirection problem.

> Anyhow,  yes,   this does indeed serve as a solution to the problem
> without needing any kluges or hacks,  so thank you.
> But it gives me (and anyone else who tries it) more work than
> one simple RULE on the table without needing to add the view.
> By the way  -  what is the reason for the restiction
> that INSTEAD OF triggers cannot be defined on "real" tables,
> only on views?      Could this be lifted?

I read this as meaning that this patch is not necessary.  You seem to be
saying that you'd like some other things patched in some other ways, but
they are unrelated to the original patch.  If you are able to figure out
what you want to change and how, please submit another proposal and patch.

I'm marking this one "rejected" in the current commitfest.  Thanks.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services