Thread: RULE questions.

RULE questions.

From
"Neil Burrows"
Date:
Hi,

I have what I first thought would be a trivial problem, in that I require
the 2 VARCHAR columns in the following table to have the data stored in
upper case.

test_table
+-----------------------------+------------------------------+------+
|            Field            |             Type             |Length|
+-----------------------------+------------------------------+------+
| user_id                     | int4                         |    4 |
| name                        | varchar()                    |   10 |
| password                    | varchar()                    |   10 |
+-----------------------------+------------------------------+------+


I considered just using UPPER() in every SELECT statement, or creating a
view that SELECTed the 2 VARCHAR cols using UPPER but I assume that this
would have more performance issues than having the data converted to
uppercase only once during an insert or update.  (Please correct me if I am
wrong).

After looking at triggers and rules I came up with the following solution:

CREATE VIEW test AS SELECT * FROM test_table;

CREATE RULE insert_test AS
ON INSERT TO test DO INSTEAD
INSERT INTO test_table (user_id, name, password) VALUES(new.user_id,
UPPER(new.name), UPPER(new.password));

CREATE RULE update_test AS
ON UPDATE TO test DO INSTEAD
UPDATE test_table SET name = UPPER(new.name), password = UPPER(new.password)
WHERE user_id = new.user_id;

which means that any insert or update to the test view is stored in upper
case as required.

However, I still have two concerns about this.

1) What impact on performance does using a VIEW in this way have?
2) Users can still enter data straight into test_table in lower case
bypassing the "rules"

First off, is there an easier way to ensure that data is stored in uppercase
for certain columns (not the whole table).  And if not does anyone have
comments on performance issues, or ways of stopping users accidentally or
intentionally inserting lower case data straight into the table rather than
the view?

Many thanks in advance,

---[  Neil Burrows  ]-----------------------------------------------------
E-mail: neil.burrows@gssec.bt.co.uk             British Telecom Plc.
      : neil@pawprint.co.uk                     Glasgow Engineering Centre
Web   : http://www.remo.demon.co.uk/            Highburgh Rd.  Glasgow  UK
-----------< Any views expressed are not those of my employer >-----------


Re: [SQL] RULE questions.

From
jwieck@debis.com (Jan Wieck)
Date:
>
> Hi,
>
> I have what I first thought would be a trivial problem, in that I require
> the 2 VARCHAR columns in the following table to have the data stored in
> upper case.
>
> test_table
> +-----------------------------+------------------------------+------+
> |            Field            |             Type             |Length|
> +-----------------------------+------------------------------+------+
> | user_id                     | int4                         |    4 |
> | name                        | varchar()                    |   10 |
> | password                    | varchar()                    |   10 |
> +-----------------------------+------------------------------+------+
>
>
> I considered just using UPPER() in every SELECT statement, or creating a
> view that SELECTed the 2 VARCHAR cols using UPPER but I assume that this
> would have more performance issues than having the data converted to
> uppercase only once during an insert or update.  (Please correct me if I am
> wrong).

    It's right.

>
> After looking at triggers and rules I came up with the following solution:
>
> CREATE VIEW test AS SELECT * FROM test_table;
>
> CREATE RULE insert_test AS
> ON INSERT TO test DO INSTEAD
> INSERT INTO test_table (user_id, name, password) VALUES(new.user_id,
> UPPER(new.name), UPPER(new.password));
>
> CREATE RULE update_test AS
> ON UPDATE TO test DO INSTEAD
> UPDATE test_table SET name = UPPER(new.name), password = UPPER(new.password)
> WHERE user_id = new.user_id;

    1.  Make sure user_id is unique or extend the WHERE clause in
        the UPDATE rule.  To explain why:

            user_id | name
            --------+----------
                  1 | aaa
                  1 | bbb
                  2 | ccc

            UPDATE test SET name = 'ddd' WHERE name = 'aaa';

            user_id | name
            --------+----------
                  1 | ddd
                  1 | ddd
                  2 | ccc

        This is because the rule will find the user_id 1 for name
        'aaa' and then updates any row with user_id 1.

    2.  Change  the  WHERE  clause  in the UPDATE rule to compare
        against old.user_id and add "user_id  =  new.user_id"  to
        the  SET  clause.  Otherwise  it would not be possible to
        change the user_id because this thrown away by the  rule.

    3.  Don't  forget  the  ON  DELETE rule. Maybe you don't want
        once given user_id's to be changed or  deleted.  Then  2.
        and 3. aren't right.

>
> which means that any insert or update to the test view is stored in upper
> case as required.
>
> However, I still have two concerns about this.
>
> 1) What impact on performance does using a VIEW in this way have?

    Only  the  rewriting  overhead  per query. The rewrite system
    changes the querytree generated by the parser in such  a  way
    that  the planner/optimizer will get the same input as if the
    query really was the SELECT from test_table.  If you  have  a
    view

        CREATE VIEW test AS SELECT * FROM test_table;

    the two statements

        SELECT * FROM test;
        SELECT * FROM test_table;

    are  totally  equivalent from the planners/optimizers (and so
    from the executors) point of  view.  The  rewriting  overhead
    depends  on  how  complex the statements and rule definitions
    are. But not on the number of rows affected in the statement.
    Selecting  thousands of rows has the same speed than doing it
    from the real tables behind a view.  It's very small  because
    compared  against  parser/planner/optimizer it has to do very
    few system cache lookups and works mostly with the data  that
    is already in memory.

> 2) Users can still enter data straight into test_table in lower case
> bypassing the "rules"

    Not  necessarily.  Since  v6.4  rule  actions (in contrast to
    triggers up to now) inherit the  access  permissions  of  the
    owner of the relation they're fired on.

        CREATE TABLE test_table ...;
        CREATE VIEW test AS SELECT * FROM test_table;

        REVOKE ALL ON test_table FROM public;
        GRANT ALL ON test_table TO me;

        REVOKE ALL ON test FROM public;
        GRANT ALL ON test TO me;
        GRANT SELECT, INSERT, UPDATE, DELETE ON test TO public;

    Now  any  user  can access test, but nobody but me can access
    test_table.  Not even a SELECT does work. They  can  do  most
    things  on  test. But the rule actions are executed under the
    permissions of me, so they work silently.

    YOU  MUST  NOT  GRANT  ALL  TO  PUBLIC.  ALL  includes   RULE
    permission, so a user could change the rules on test, do some
    things (maybe on any of your other tables) and reinstall  the
    original state of rules!

    In  addition to that, consider the case you really don't want
    once given user_id's ever to change. Nor you like them to  be
    ever reused. But they should disappear on DELETE.

        CREATE TABLE test_table (user_id int,
                                 name varchar(10),
                                 pass varchar(10),
                                 alive bool);

        CREATE UNIQUE INDEX test_user_id ON test_table (user_id);

        CREATE VIEW test AS SELECT * FROM test_data
            WHERE alive;

        CREATE RULE ins_test AS ON INSERT TO test
            DO INSTEAD INSERT INTO test_table
            VALUES (new.user_id, UPPER(new.name), UPPER(new.pass), 't');

        CREATE RULE upd_test AS ON UPDATE TO test
            DO INSTEAD UPDATE test_table
            SET name = UPPER(new.name), pass = UPPER(new.pass)
            WHERE user_id = old.user_id AND alive;

        CREATE RULE del_test AS ON DELETE TO test
            DO INSTEAD UPDATE test_table
            SET alive = 'f'
            WHERE user_id = old.user_id AND alive;

    Plus  all  the REVOKE and GRANT. This setup denies changes to
    user_id, makes the row's disappear on DELETE but  throw's  an
    error 'cannot insert duplicate ...' if someone tries to reuse
    a user_id. Only the owner of the test_table can reincarnate a
    once deleted account.

>
> First off, is there an easier way to ensure that data is stored in uppercase
> for certain columns (not the whole table).  And if not does anyone have
> comments on performance issues, or ways of stopping users accidentally or
> intentionally inserting lower case data straight into the table rather than
> the view?

    The  Postgres rewrite rule system is the most powerful way to
    do that.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [SQL] RULE questions.

From
"D'Arcy" "J.M." Cain
Date:
Thus spake Neil Burrows
> First off, is there an easier way to ensure that data is stored in uppercase
> for certain columns (not the whole table).  And if not does anyone have
> comments on performance issues, or ways of stopping users accidentally or
> intentionally inserting lower case data straight into the table rather than
> the view?

This makes me think of two features missing in PostgreSQL that I would
love to see.  I know it's probably to late to think about it now for
6.5 but I wonder what others think about this.

First, as suggested above, how about an option to automatically convert
data to upper case on entry?  I realize that triggers can do the job but
it seems to be needed often enough that putting it into the definition
for the field seems useful.  I guess a lower option would make sense too.

Second, an option to CREATE INDEX to make the index case insensitive.
Other RDBMS systems do this and it is nice not to depend on users being
consistent when entering names.  Consider ("albert", "Daniel", "DENNIS")
which would sort exactly opposite.  Also, in a primary key field (or
unique index) it would be nice if "A" was rejected if "a" already was
in the database.

Thoughts?

Followups to hackers.

--
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.

RE: [SQL] RULE questions.

From
"Neil Burrows"
Date:
Hi,


> > I have what I first thought would be a trivial problem, in that
> I require
> > the 2 VARCHAR columns in the following table to have the data stored in
> > upper case.


>     1.  Make sure user_id is unique or extend the WHERE clause in
>         the UPDATE rule.  To explain why:

This is actually just a small test table, and the real one has quite a few
more columns, but I did mean to make user_id unique, just forgot.  :)


>     2.  Change  the  WHERE  clause  in the UPDATE rule to compare
>         against old.user_id and add "user_id  =  new.user_id"  to
>         the  SET  clause.  Otherwise  it would not be possible to
>         change the user_id because this thrown away by the  rule.

The thinking behind it was that user_id shouldn't be able changed but I
accidentally neglected to mention that.

> > 2) Users can still enter data straight into test_table in lower case
> > bypassing the "rules"

>     Not  necessarily.  Since  v6.4  rule  actions (in contrast to
>     triggers up to now) inherit the  access  permissions  of  the
>     owner of the relation they're fired on.

Ahh, I see.  I thought that the rule actions used the current users access
permissions, not the owners.  That's much handier, thanks.


>     In  addition to that, consider the case you really don't want
>     once given user_id's ever to change. Nor you like them to  be
>     ever reused. But they should disappear on DELETE.
>
>         CREATE TABLE test_table (user_id int,
>                                  name varchar(10),
>                                  pass varchar(10),
>                                  alive bool);
>

And that's a great way of doing what I was going to start looking at next.
:)

>     The  Postgres rewrite rule system is the most powerful way to
>     do that.

Thanks very much for your time and comments here.  It's certainly made
things clearer.

Thanks again,

---[  Neil Burrows  ]-----------------------------------------------------
E-mail: neil.burrows@gssec.bt.co.uk             British Telecom Plc.
      : neil@pawprint.co.uk                     Glasgow Engineering Centre
Web   : http://www.remo.demon.co.uk/            Highburgh Rd.  Glasgow  UK
-----------< Any views expressed are not those of my employer >-----------


Re: [HACKERS] Re: [SQL] RULE questions.

From
Tom Lane
Date:
"D'Arcy" "J.M." Cain <darcy@druid.net> writes:
> Second, an option to CREATE INDEX to make the index case insensitive.

That, at least, we can already do: build the index on lower(field) not
just field.  Or upper(field) if that seems more natural to you.

> Also, in a primary key field (or
> unique index) it would be nice if "A" was rejected if "a" already was
> in the database.

Making either of the above a UNIQUE index should accomplish that.

            regards, tom lane

Re: [HACKERS] Re: [SQL] RULE questions.

From
Hannu Krosing
Date:
D'Arcy J.M. Cain wrote:
>
> This makes me think of two features missing in PostgreSQL that I would
> love to see.  I know it's probably to late to think about it now for
> 6.5 but I wonder what others think about this.
>
> First, as suggested above, how about an option to automatically convert
> data to upper case on entry?  I realize that triggers can do the job but
> it seems to be needed often enough that putting it into the definition
> for the field seems useful.  I guess a lower option would make sense too.

These could probably be implemened more effectively using rules. Having
the
rules generated automatically for simple cases would of course be nice,
but a warning at least should be given to user about creating the rule,
like it's currently done with primary key.

Or maybe it would be better to support virtual fields, like this :

create table people(
  first_name                 varchar(25),
  last_name                  varchar(25),
  upper_first_name VIRTUAL upper(first_name),
  upper_last_name  VIRTUAL upper(last_name),
  full_name        VIRTUAL (upper_first_name || ' ' || upper_last_name)
primary key
);

and then untangle this in the backend and create required rules and
indexes automatically ?

> Second, an option to CREATE INDEX to make the index case insensitive.

If you have this option on idex, how do you plan to make sure that the
index is actually used ?

It may be better to do it explicitly -

1. create index on upper(field)

2. use where upper(field) = 'MYDATA'

---------------
Hannu

Re: [HACKERS] Re: [SQL] RULE questions.

From
jwieck@debis.com (Jan Wieck)
Date:
>
> D'Arcy J.M. Cain wrote:
> >
> > This makes me think of two features missing in PostgreSQL that I would
> > love to see.  I know it's probably to late to think about it now for
> > 6.5 but I wonder what others think about this.
> >
> > First, as suggested above, how about an option to automatically convert
> > data to upper case on entry?  I realize that triggers can do the job but
> > it seems to be needed often enough that putting it into the definition
> > for the field seems useful.  I guess a lower option would make sense too.
>
> These could probably be implemened more effectively using rules. Having
> the
> rules generated automatically for simple cases would of course be nice,
> but a warning at least should be given to user about creating the rule,
> like it's currently done with primary key.

    No it can't.

    Such a rule would look like

        CREATE RULE xxx AS ON INSERT TO this_table
            DO INSTEAD INSERT INTO this_table ...

    The   rule   system  will  be  triggerd  on  an  INSERT  INTO
    this_table, rewrite and generate another parsetree that is an
    INSERT  INTO this_table, which is recursively rewritten again
    applying rule xxx...

    That's an endless recursion. A rule can  never  do  the  same
    operation to a table it is fired for.

    The  old  pre-Postgres95 university version (Postgres release
    4.2) had the possibility to define  rules  that  UPDATE  NEW.
    They  where buggy and didn't worked sometimes at all. Instead
    of fixing them, this functionality got removed when  Postgres
    became 95 :-(


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] RULE (and ALTER TABLE) questions

From
Hannu Krosing
Date:
Jan Wieck wrote:
>
> > These could probably be implemened more effectively using rules. Having
> > the
> > rules generated automatically for simple cases would of course be nice,
> > but a warning at least should be given to user about creating the rule,
> > like it's currently done with primary key.
>
>     No it can't.
>
>     Such a rule would look like
>
>         CREATE RULE xxx AS ON INSERT TO this_table
>             DO INSTEAD INSERT INTO this_table ...
>
>     The   rule   system  will  be  triggerd  on  an  INSERT  INTO
>     this_table, rewrite and generate another parsetree that is an
>     INSERT  INTO this_table, which is recursively rewritten again
>     applying rule xxx...
>
>     That's an endless recursion. A rule can  never  do  the  same
>     operation to a table it is fired for.

But when doing that at the table creation time, then the table can
actually
be defined as a view on storage table and rules for insert update and
delete
be defined for this view that do the actual data manipulation on the
storage table.

Or is the rule system currently not capable for this ?

When some field is changed to UPPER-ONLY status using alter table, the
table
could be renamed to staorage table and all the rules be created ?


And the other question - what is the status of ALTER TABLE commands -
can we add/remove/disable constraints without recreating the table ?

Is constraint and index disabling supported at all ?

-------------------
Hannu

Re: [HACKERS] RULE (and ALTER TABLE) questions

From
jwieck@debis.com (Jan Wieck)
Date:
> But when doing that at the table creation time, then the table can
> actually
> be defined as a view on storage table and rules for insert update and
> delete
> be defined for this view that do the actual data manipulation on the
> storage table.

    That's  IMHO  a too specific case to do it generally with the
    rule system.  Should be some kind of  constraint  handled  by
    the  parser  in  putting  an  UPPER()  func  node  around the
    targetlist expression.

    There could be more general support implemented,  in  that  a
    user can allways tell that a custom function should be called
    with the result of the TLE-expr before the value  is  dropped
    into the tuple on INSERT/UPDATE.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #