Thread: What about improving the rules system we have, was Re: Rules going away

What about improving the rules system we have, was Re: Rules going away

From
Chris Travers
Date:
On Wed, Sep 28, 2011 at 7:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I think the true state of affairs is this: rules have a lot of
> surprising behaviors, and if we could think of something that works more
> straightforwardly, we'd love to replace them.  But I think we'd have to
> have the "something" in place before we consider deprecating rules.
> At the moment we don't even have a glimmer of a design, so David's
> statement is many years premature.
>
>
I guess my question is what it would take to eliminate some of the
very surprising behaviors.  It seems to me that the cases of
surprising behaviors are actually pretty limited (moreso than table
inheritance for example).  The only surprises I can think of come with
DO ALSO rules which involve expressions which should not be
re-executed (nextval being a good example).  DO INSTEAD rules work
extremely well, IMO, and I can't think of any cases where there
surprises which are particularly easy to get bitten by there.  It
seems to me that a lot of problems with DO ALSO rules are issues which
happen when lexical ambiguity hits implementation (do you mean DO ALSO
nextval() again or take the value from the previous nextval() and feed
it into this rule?).

Instead of considering deprecating and replacing rules with something
else, I guess I am wondering if the issues re fundamental or issues of
implementation.  It seems to me we might get more mileage out of
pondering a Rules 2.0 approach, discussing the specifications, etc.
that we would in looking for an alternative.

I figure if we start this discussion then folks may have a basis for
considering alternatives down the line.  If we never discuss it
however, then folks who want something more predictable and
straightforward have nothing to consider.

So I guess I'd start by asking the community a couple of questions:
1)  Are there surprising behaviors in DO INSTEAD rules?
2)  Should we assume that there are cases dependent on existing behaviors?

If it were up to me I would change the rules spec in a couple of
clearly defined ways and then look at how to tweak what we have to
implement those changes.  In particular I would:

1)  Redefine NEW so that it is post-initial-expression evaluation and
therefore remains constant at the input of the rule's query plan.
2)  Add a new IN tuple which is pre-initial-expression evaluation and
therefore does not stay constant as a guarantee.

So for example, suppose I have the following tables which are used to
stage data coming into an accounting system.

CREATE TABLE my_transactions1 (id int not null unique default
nextval('my_transactions1_id_seq'),
post_on date not null,
credit_account int not null references credit_accounts(id),
source_id text not null unique,
total_amount numeric);

CREATE TABLE my_transactions2 (id int not null unique default
nextval('my_transactions1_id_seq'),
post_on date not null,
credit_account int not null references credit_accounts(id),
source_id text not null unique,
total_amount numeric);

If I:
CREATE RULE insert_deduction AS ON INSERT TO my_transactions1
DO ALSO INSERT INTO my_transactions2 (id, post_on, credit_account,
source_id, total_amount)
VALUES (IN.id, get_deduction_account(NEW.credit_account),
get_deduction_source(NEW.source), NEW.total_amount * -1);

Then I would expect nextval() to be executed twice, while NEW would
behave as it does in triggers.

I don't know how feasible it is to implement such a thing, and
certainly it would break backwards compatibility for at least some
users.  But I don't think it would be any worse than outright
replacing the rules system.

Best Wishes,
Chris Travers

Re: What about improving the rules system we have, was Re: Rules going away

From
Tom Lane
Date:
Chris Travers <chris.travers@gmail.com> writes:
> On Wed, Sep 28, 2011 at 7:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think the true state of affairs is this: rules have a lot of
>> surprising behaviors, and if we could think of something that works more
>> straightforwardly, we'd love to replace them.

> I guess my question is what it would take to eliminate some of the
> very surprising behaviors.

You're certainly right that unexpected multiple evaluations of volatile
expressions is the first thing that bites people.  (I don't believe
that's restricted to DO ALSO vs INSTEAD though.)  I think there are
also some fairly serious performance issues for large tables, stemming
from the rule system's tendency to create large joins under-the-hood for
any rewritten UPDATE or DELETE.  Not sure what else.

> Instead of considering deprecating and replacing rules with something
> else, I guess I am wondering if the issues re fundamental or issues of
> implementation.  It seems to me we might get more mileage out of
> pondering a Rules 2.0 approach, discussing the specifications, etc.
> that we would in looking for an alternative.

Maybe.  The advantage of something-thats-not-a-rule is that it would
dodge all issues of backwards compatibility, since we could just leave
the rule system behaving as-is until we were ready to remove it
altogether.  If we try to tweak the semantics of rules then we're likely
to break things for people who are using them now.  But having said
that, it's not unreasonable to think about it and try to scope out
exactly what we might tweak.

> [ proposal involving a "constant NEW tuple" ]

Not sure this specific proposal makes any sense at all.  IMO the only
real advantage that rules have over triggers is that they work on a
set-operation basis not a tuple-by-tuple basis.  Don't see how to
preserve that characteristic while redefining NEW as a static tuple.
(IOW, the example you gave is the least interesting possible case.
Think about how an ON INSERT rule would rewrite an INSERT ... SELECT,
or in general how you'd rewrite cases that process many tuples.)

            regards, tom lane

Re: What about improving the rules system we have, was Re: Rules going away

From
Harald Fuchs
Date:
In article <4116.1317226367@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Not sure this specific proposal makes any sense at all.  IMO the only
> real advantage that rules have over triggers is that they work on a
> set-operation basis not a tuple-by-tuple basis.

Isn't that what statement-level triggers are for, at least in other DB
systems?  How about telling PostgreSQL's statement-level triggers
something about the set of rows they affect?

Re: What about improving the rules system we have, was Re: Rules going away

From
Merlin Moncure
Date:
On Wed, Sep 28, 2011 at 11:46 AM, Harald Fuchs <hari.fuchs@gmail.com> wrote:
> In article <4116.1317226367@sss.pgh.pa.us>,
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>> Not sure this specific proposal makes any sense at all.  IMO the only
>> real advantage that rules have over triggers is that they work on a
>> set-operation basis not a tuple-by-tuple basis.
>
> Isn't that what statement-level triggers are for, at least in other DB
> systems?  How about telling PostgreSQL's statement-level triggers
> something about the set of rows they affect?

in theory that would be nice, but they just don't work that way.  you
don't even have access to the SQL statement firing the trigger IIRC.
that said, with some thought you could work an 'after' trigger into a
set level operation, say, by rigging something around now().

now that we have view triggers (not that rules ever really worked for
updating views anyways), even notwithstanding the minor limitations of
triggers of rules vs triggers, I personally find the RULE feature to
be useless and dangerous.  I'd vote for immediately deprecating it
without hesitation.

merlin

Re: What about improving the rules system we have, was Re: Rules going away

From
Chris Travers
Date:
First, thanks for your thoughtful reply.

On Wed, Sep 28, 2011 at 9:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
> You're certainly right that unexpected multiple evaluations of volatile
> expressions is the first thing that bites people.  (I don't believe
> that's restricted to DO ALSO vs INSTEAD though.)

I am having trouble thinking of practical uses where this would be a
problem.  I may simply lack imagination though.

>I think there are
> also some fairly serious performance issues for large tables, stemming
> from the rule system's tendency to create large joins under-the-hood for
> any rewritten UPDATE or DELETE.  Not sure what else.

I have run into amusing order or operations conditions on
delete/update rules, but those were all DO INSTEAD instead of DO ALSO.
>
> Maybe.  The advantage of something-thats-not-a-rule is that it would
> dodge all issues of backwards compatibility, since we could just leave
> the rule system behaving as-is until we were ready to remove it
> altogether.  If we try to tweak the semantics of rules then we're likely
> to break things for people who are using them now.  But having said
> that, it's not unreasonable to think about it and try to scope out
> exactly what we might tweak.

One option for a replacement would be to allow triggers on views,
provided that no actual insert or update occurs to the underlying
pseudotable.  Then rules could be deprecated for writing data.
>
>> [ proposal involving a "constant NEW tuple" ]
>
> Not sure this specific proposal makes any sense at all.  IMO the only
> real advantage that rules have over triggers is that they work on a
> set-operation basis not a tuple-by-tuple basis.  Don't see how to
> preserve that characteristic while redefining NEW as a static tuple.
> (IOW, the example you gave is the least interesting possible case.
> Think about how an ON INSERT rule would rewrite an INSERT ... SELECT,
> or in general how you'd rewrite cases that process many tuples.)

It's more of a question of order of operations.  I guess I was trying
to start with a simple example.  In a more complex example, like
INSERT.... SELECT (or better yet, writable common table expressions)
you'd basically have three stages logically (none of which is
necessarily guaranteed to be there if it is not applicable to simpler
cases)....

1)  Initial selection
2)  Insert tuple calculation, if applicable
3)  Actual tuple insert plan

So, suppose we have a totally unrealistic explanatory example:

CREATE TABLE a (id int);
CREATE TABLE b (id int);
CREATE TABLE c (a_id int, b_id int);
CREATE RULE a1 AS ON INSERT TO a DO ALSO INSERT INTO b values (in.id);
CREATE RULE a2 AS ON INSERT TO b DO ALSO INSERT INTO c values (new.id, in.id);
INSERT INTO a(id) select round(random() * 1000) from generate_series(1, 10);

Now, currently, something kinda funny happens with using NEW in this
example instead of IN:
in every row in c, a_id = b_id, but these values do not match a or b
tables.  In other words, for every insert, random() gets executed
three times and three different values get inserted into four columns.
  In other words we get random values which are local to each record
of each subquery, but not local to each value.

So currently I think this rewrites to something like:
INSERT INTO a (id) select round(random() * 1000) as randnum from
generate_series(1, 10);
INSERT INTO b (id) select round(random() * 1000) as randnum from
generate_series(1, 10);
INSERT INTO c (a_id, b_id) SELECT randnum, randnum from (select
round(random() * 1000) as randnum from generate_series(1, 10));

In Pseudocode (back to differentiating in vs new here), I am
suggesting something like:

WITH new (randnum) AS (select round(random() * 1000) as randnum from
generate_series(1, 10))
DO
(INSERT INTO a(id) select randnum from new;
INSERT INTO b(id) select round(random() * 1000) from new;
INSERT INTO c(a_id, b_id) select  randnum, round(random() * 1000) from new;
);

Does this make sense?

Best Wishes,
Chris Travers