Thread: The rule question before, request official documentation on the problem

The rule question before, request official documentation on the problem

From
Chris Travers
Date:
Hi all;

Sorry to follow up so soon on my own post.  I did find that there was
some discussion about this problem back in 2003.  However, the big issue
is that sequences are wrapped by serial data types and so this is likely
to be a common issue (using an insert rule on a serial, causing the
sequence to be incremented for the target table too).  In the
application I am working on, this was reviewed by a number of people who
are pretty strong PostgreSQL experts and nobody spotted this as an issue.

My request at this point is to officially and clearly document this as a
substantial limitation of rules.  It is not obvious that this is how
rules are supposed to behave in this case, and even assuming that the
current behavior is desired, it would be nice to let us know this :-)
Sometimes the easiest way to fix a bug is to document it as a feature....

Best Wishes,
Chris Travers

Attachment

Re: The rule question before, request official documentation on the problem

From
"Stuart Cooper"
Date:
> My request at this point is to officially and clearly document this as a
> substantial limitation of rules.  It is not obvious that this is how
> rules are supposed to behave in this case, and even assuming that the
> current behavior is desired, it would be nice to let us know this :-)

It's documented.

Section 35.3.1 of Postgresql 8.2 PDF docmentation, 2nd last paragraph:

***
For any reference to NEW, the target list of the original query is searched
for a corresponding entry. If found, that entry's expression replaces the
reference.
***

"expression" is the key term here. NEW.id is an expression, *not* a value.

Cheers,
Stuart.

Re: The rule question before, request official documentation on the problem

From
Chris Travers
Date:
Hmm.....

I just came up with a far more problematic case too and wonder if
documentation is enough.  Maybe we should warn about potential problems
more loudly.

Imagine the following case:  insert into test_table (test) values
(random()) where an insert rule propagates the changes faithfully to the
next table.  In short, all we are doing is inserting random numbers into
different tables and generating them on each insert.  In short, rules
provide no guarantee of predictable behavior because queries can always
mess with them.

Let me put that a different way:  rules can *only* be used where data
integrity is not at stake.  My own thinking is that it might be time to
make an official recommendation that they are only safe for views.

Best Wishes,
Chris Travers

Stuart Cooper wrote:
>> My request at this point is to officially and clearly document this as a
>> substantial limitation of rules.  It is not obvious that this is how
>> rules are supposed to behave in this case, and even assuming that the
>> current behavior is desired, it would be nice to let us know this :-)
>
> It's documented.
>
> Section 35.3.1 of Postgresql 8.2 PDF docmentation, 2nd last paragraph:
>
> ***
> For any reference to NEW, the target list of the original query is
> searched
> for a corresponding entry. If found, that entry's expression replaces the
> reference.
> ***
>
> "expression" is the key term here. NEW.id is an expression, *not* a
> value.
>
> Cheers,
> Stuart.
>
>


Attachment

Re: The rule question before, request official documentation on the problem

From
"Stuart Cooper"
Date:
> I just came up with a far more problematic case too and wonder if
> documentation is enough.  Maybe we should warn about potential problems
> more loudly.

> Imagine the following case:  insert into test_table (test) values
> (random()) where an insert rule propagates the changes faithfully to the
> next table.  In short, all we are doing is inserting random numbers into
> different tables and generating them on each insert. In short, rules
> provide no guarantee of predictable behavior because queries can always
> mess with them.

Rules mess with queries. For data copying/archiving kinds of tasks,
triggers are a better bet, like you suggested in your original post.

> Let me put that a different way:  rules can *only* be used where data
> integrity is not at stake.  My own thinking is that it might be time to
> make an official recommendation that they are only safe for views.

NEW and OLD mean different things in a PL/pgSQL context and a Rules context.
In PL/pgSQL NEW and OLD are values, in Rules (which specifically mess with
queries) they are expressions.

The fact that the same words mean different things in different contexts
is a bit unfortunate but not as messy as say using "NEWEXPR" in the
Rules context would be.

Once you appreciate the difference, there's no confusion.

Cheers,
Stuart.

> Rules mess with queries. For data copying/archiving kinds of tasks,
> triggers are a better bet, like you suggested in your original post.
>
>> Let me put that a different way:  rules can *only* be used where data
>> integrity is not at stake.  My own thinking is that it might be time to
>> make an official recommendation that they are only safe for views.
>
> NEW and OLD mean different things in a PL/pgSQL context and a Rules
> context.
> In PL/pgSQL NEW and OLD are values, in Rules (which specifically mess
> with
> queries) they are expressions.
>
> The fact that the same words mean different things in different contexts
> is a bit unfortunate but not as messy as say using "NEWEXPR" in the
> Rules context would be.


    Since we now have UPDATE/INSERT/DELETE RETURNING, one could imagine the
rules using these to access the actual rows and not the expressions...

    But there is a perfectly valid argument against that :

    - There already is a mechanism designed specifically for this purpose
(triggers).
    - It works perfectly.
    - Rules are supposed to rewrite queries to do stuff like views.

    It should be mentioned in the docs, though : someone with an account on
the PG site should copypaste this mail exchange in the comments field...




Re: The rule question before, request official documentation on the problem

From
Chris Travers
Date:
Listmail wrote:
<snip>
>
>
>     Since we now have UPDATE/INSERT/DELETE RETURNING, one could
> imagine the rules using these to access the actual rows and not the
> expressions...
>
>     But there is a perfectly valid argument against that :
>
>     - There already is a mechanism designed specifically for this
> purpose (triggers).
>     - It works perfectly.
>     - Rules are supposed to rewrite queries to do stuff like views.
Agreed.

I have narrowed the problem cases down to a subset I think should be
mentioned in the docs.

DO ALSO rules involving NEW are fundamentally dangerous to the integrity
of data because NEW is not guaranteed to be internally consistent.  DO
INSTEAD rules are fine (there is only one NEW), as are any DO ALSO rules
involving OLD.

We already protect against programmers using unsafe and non-standard
quote escapes.  I have sent in my cases to a number of other people,
some of which are deeply involved in PostgreSQL development, and the
initial behavior was not properly predicted by any of them.  This is why
I say that if this is the defined behavior of rules, that a clear and
obvious warning needs to be placed in the docs that this is dangerous
and in every case I can think of, not something you want to use a rule for.

Imagine, for example, that we have an application that is built.  Uses
DO ALSO rules with NEW to replicate user-supplied data from one table to
an audit trail or the like,  Everything works fine until someone decides
to load up a database with random data.  The programmer did not foresee
this and put his trust in PostgreSQL's features for data integrity.
Given the comments I found in the docs, I suspect that people *are*
using DO ALSO rules frequently when these are dangerous.  Since this can
cause problems based on user-supplied input, this is a problem.

One of the things that causes me to favor PostgreSQL for all my projects
is the strong emphasis on data integrity by the community, perhaps
better than any other RDBMS out there.  Being unwilling to warn clearly
and loudly about unsafe features does undermine that commitment.
>
>     It should be mentioned in the docs, though : someone with an
> account on the PG site should copypaste this mail exchange in the
> comments field...
>
For 90% of what I do, I use the local copy of the docs.  My concern is
that (at least in 8.1) there is no obvious warning about DO ALSO rules
using NEW to be inherently nondeterministic.   I checked the online 8.2
docs and while there was the bit about the expression substitution,
there still was not a warning about this behavior being fundamentally
nondeterministic.  I would like to see a note in the section comparing
triggers to rules explaining that this subset of rules is not deterministic.

Best Wishes,
Chris Travers

Attachment

Re: The rule question before, request official documentation on the problem

From
Martijn van Oosterhout
Date:
On Wed, Apr 11, 2007 at 09:21:46AM -0700, Chris Travers wrote:
> DO ALSO rules involving NEW are fundamentally dangerous to the integrity
> of data because NEW is not guaranteed to be internally consistent.  DO
> INSTEAD rules are fine (there is only one NEW), as are any DO ALSO rules
> involving OLD.

Huh? The entered data is exactly what was asked. The whole system is
totally deterministic and works as designed. Really, I'd prefer a warning
stating that people shouldn't use rules unless they absolutly know what
they're doing.

> One of the things that causes me to favor PostgreSQL for all my projects
> is the strong emphasis on data integrity by the community, perhaps
> better than any other RDBMS out there.  Being unwilling to warn clearly
> and loudly about unsafe features does undermine that commitment.

The problem is that the are some things that really need rules.
However, I think you can safely say:

Unless what you want can only be done using rules, use triggers. They
are far more obivous.

Not to mention that using a rule for auditing is silly, since it won't
record what actually went into the table.

> For 90% of what I do, I use the local copy of the docs.  My concern is
> that (at least in 8.1) there is no obvious warning about DO ALSO rules
> using NEW to be inherently nondeterministic.

Wrong word. It's not non-deterministic, nor is it undocumented, it's
just often misunderstood. Which brings you back to: if it doesn't have
to be a rule, make it a trigger. Forget you ever heard about rules.
Pretend they don't exist...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

ERROR: XLogFlush request 0/240169BC is not satisfied

From
"Nitin Verma"
Date:
java.sql.SQLException: ERROR:  XLogFlush: request
0/240169BC is not satisfied --- flushed only to 0/23FFC01C

This error I was in the logs of a java process that was failing to get some
data. But using psql I was able to connect and query all the data. Thus this
may indicate some connection / statement isolated problem but I am not sure
about that.


I would like to understand; When/why do we get this error? And can we avoid
or detect this?

Re: The rule question before, request official documentation on the problem

From
ptjm@interlog.com (Patrick TJ McPhee)
Date:
In article <461D0B1A.6030407@metatrontech.com>,
Chris Travers <chris@verkiel.metatrontech.com> wrote:

% DO ALSO rules involving NEW are fundamentally dangerous to the integrity
% of data because NEW is not guaranteed to be internally consistent.  DO
% INSTEAD rules are fine (there is only one NEW), as are any DO ALSO rules
% involving OLD.

It seems to me that this sort of dogmatism is fundamentally dangerous.

CREATE TABLE x (a varchar(20) PRIMARY KEY, b INT NOT NULL);
CREATE TABLE y (a varchar(20) NOT NULL, b INT NOT NULL);
CREATE RULE y_ins AS ON INSERT TO y DO UPDATE x SET b=b+new.b WHERE a=new.a;
CREATE RULE y_del AS ON DELETE TO y DO UPDATE x SET b=b-old.b WHERE a=old.a;
INSERT INTO x VALUES ('a', 0);
INSERT INTO y VALUES ('a', 2);
INSERT INTO y VALUES ('a', 2);
SELECT * FROM x;
 a | b
---+---
 a | 4

DELETE FROM y;
SELECT * FROM x;
 a | b
---+---
 a | 2

The DO ALSO rules involving OLD didn't do so well here.

The section on rules v. triggers could do with a caveat or two, but
it's a bit much to call them "fundamentally dangerous".
--

Patrick TJ McPhee
North York  Canada
ptjm@interlog.com