Thread: CONSTRAINTS...

CONSTRAINTS...

From
"Jackson, DeJuan"
Date:
So, could someone send me the SQL92 constraints syntax as well as the
definition of what a deferrable constraint is supposed to be?
ADVthanksANCE
        -DEJ


Re: [HACKERS] CONSTRAINTS...

From
Hal Snyder
Date:
"Jackson, DeJuan" <djackson@cpsgroup.com> writes:

> So, could someone send me the SQL92 constraints syntax as well as the
> definition of what a deferrable constraint is supposed to be?

Maybe someone emailed you a good answer already...

There's a whole chapter on constraints in "A Guide to the SQL
Standard", 4th ed., by Date and Darwen. Constraint syntax shows up in
CREATE DOMAIN, ALTER DOMAIN, DROP DOMAIN, CREATE ASSERTION, DROP
ASSERTION, CREATE TABLE, ALTER TABLE, FOREIGN KEY, CHECK, SET
CONSTRAINTS, and others.

A deferrable constraint is one that may be deferred, either in its
definition (using INITIALLY DEFERRED or DEFERRABLE) or later, using SET
CONSTRAINTS ... DEFERRED.

A constraint is presumably deferred until a) the end of the applicable
transaction, or b) the next COMMIT, or c) the next SET CONSTRAINTS
... IMMEDIATE, whichever comes first. A possible use of deferrable
constraints is to avoid Catch-22 when setting up tables which have
cyclic dependency among foreign keys. [SQL lawyers, check me on this.]
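
For example (my own illustration, not from the standard; the table and
constraint names are made up), a foreign key declared as

    CREATE TABLE customers (
        cust_id integer PRIMARY KEY
    );

    CREATE TABLE orders (
        cust_id integer,
        CONSTRAINT orders_cust_fk FOREIGN KEY (cust_id)
            REFERENCES customers (cust_id)
            DEFERRABLE INITIALLY IMMEDIATE
    );

can be switched to deferred checking inside a transaction with

    SET CONSTRAINTS orders_cust_fk DEFERRED;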

BTW, AltaVista Web search on "sql92 ~ constraint" yielded 8 hits -
7 of which were PostgreSQL items.




Re: [HACKERS] CONSTRAINTS...

From
"Jose' Soares"
Date:
>From "A Guide to The SQL standard" C.J.DATE:

FOREIGN KEY Syntax:

* base-table-constraint-def ::=
      [ CONSTRAINT constraint ] foreign-key-def [ deferrability ]

  foreign-key-def ::=
      FOREIGN KEY ( column-commalist ) references-def

  references-def ::=
      REFERENCES base-table [ ( column-commalist ) ]
          [ MATCH { FULL | PARTIAL } ]
          [ ON DELETE referential-action ]
          [ ON UPDATE referential-action ]

  referential-action ::=
      NO ACTION | CASCADE | SET DEFAULT | SET NULL

  deferrability ::=
      INITIALLY { DEFERRED | IMMEDIATE } [ NOT ] DEFERRABLE


* column-constraint-def ::=
      references-def [ deferrability ]
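
(Not from the book - a concrete instance of the grammar above, with
invented table and column names:)

    CREATE TABLE dept (
        deptno  integer PRIMARY KEY
    );

    CREATE TABLE emp (
        empno   integer PRIMARY KEY,
        deptno  integer,
        CONSTRAINT emp_dept_fk FOREIGN KEY (deptno)
            REFERENCES dept (deptno)
            MATCH FULL
            ON DELETE SET NULL
            ON UPDATE CASCADE
            INITIALLY IMMEDIATE NOT DEFERRABLE
    );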


14.6 DEFERRED CONSTRAINT CHECKING 

Up to this point we have been assuming that all integrity constraints
are checked "immediately," i.e., as the final step in executing any SQL
statement - and, if any constraint is found to be violated, the
offending SQL statement is simply rejected, so that its overall effect
on the database is nil. Sometimes, however, it is necessary that certain
constraints not be checked until some later time, on the grounds that if
they were to be checked "immediately" they would always fail. Here is an
example (involving a referential cycle):

* Suppose we have two base tables, T1 and T2, each of which includes a
foreign key that references some candidate key of the other, and suppose
we start with both tables empty. Then, if all foreign key checking is
done immediately, there is no way to get started: Any attempt to insert
a row into either table will fail, because there is no target row in the
other table that it can possibly reference.
The facilities described immediately following are intended to address
such situations. 
1. At any given time, with respect to any given transaction, any given
constraint must be in one of two "modes," immediate or deferred.
Immediate means the constraint is checked "immediately" (as explained
above); deferred means it is not.
2. Any given constraint definition can optionally include either or both
of the following: 

INITIALLY { DEFERRED | IMMEDIATE }
[ NOT ] DEFERRABLE

These specifications appear as the final syntactic component of the
constraint definition. They can appear in either order.
- INITIALLY DEFERRED and NOT DEFERRABLE are mutually exclusive. If
neither INITIALLY DEFERRED nor INITIALLY IMMEDIATE is specified,
INITIALLY IMMEDIATE is implied. If INITIALLY IMMEDIATE is specified or
implied, then if neither DEFERRABLE nor NOT DEFERRABLE is specified, NOT
DEFERRABLE is implied. If INITIALLY DEFERRED is specified, then (as
already explained) NOT DEFERRABLE must not be specified; DEFERRABLE can
be specified, but is implied anyway.
- INITIALLY DEFERRED and INITIALLY IMMEDIATE specify the "initial" mode
of the constraint, i.e., its mode immediately after it is defined and at
the start of every transaction, as deferred or immediate, respectively.
- DEFERRABLE and NOT DEFERRABLE specify whether or not this constraint
can ever be in deferred mode. DEFERRABLE means it can; NOT DEFERRABLE
means it cannot.

3. The SET CONSTRAINTS statement is used to set the mode for specified
constraints with respect to the current transaction and current session
(or the next transaction to be initiated in the current session, if the
SQL-agent has no transaction currently executing). The syntax is:

SET CONSTRAINTS { constraint-commalist | ALL } { DEFERRED | IMMEDIATE }

Each "constraint" mentioned by name must he DEFERRABLE; ALL is short-
hand for "all DEFERRABLE constraints." If DEFERRED is specified, the
mode of all indicated constraints is set to deferred. If 1MMED1ATE is
specified, the mode of all indicated constraints is set to immediate,
and those constraints are then checked; if any check fails, the SET
CONSTRAINTS fails, and the mode of all indicated constraints remains
unchanged. Note that because of paragraph 4 below, the checks should not
fail if the SET CONSTRAINTS statement is executed while the SQL-agent
has no current transaction. 

4. COMMIT implies SET CONSTRAINTS ALL IMMEDIATE (for every active
SQL-session for the applicable SQL-transaction). If some implied
integrity check then fails, the COMMIT fails, and the transaction fails
also (i.e., is rolled back). 
To revert to the example mentioned at the beginning of this section (the
referential cycle involving two tables): We could deal with the problem
using the foregoing facilities as indicated by the following pseudocode.
Data definitions:
CREATE TABLE T1 CONSTRAINT T1FK FOREIGN KEY ... REFERENCES T2 
INITIALLY DEFERRED 

CREATE TABLE T2 CONSTRAINT T2FK FOREIGN KEY ... REFERENCES T1 
INITIALLY DEFERRED 

SQL-transaction: 

INSERT INTO T1 ( ... ) VALUES ( ... ) 
INSERT INTO T2 ( ... ) VALUES ( ... ) 
SET CONSTRAINTS T1FK, T2FK IMMEDIATE 
IF SQLSTATE = code meaning "SET CONSTRAINTS failed" 
THEN ROLLBACK  --cancel the INSERTs 
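
(A concrete rendering of that pseudocode - not from the book; the
columns and the ALTER TABLE route for adding the cyclic constraints are
my own - might look like this:)

    CREATE TABLE t1 (a integer PRIMARY KEY, b integer);
    CREATE TABLE t2 (c integer PRIMARY KEY, d integer);

    ALTER TABLE t1 ADD CONSTRAINT t1fk
        FOREIGN KEY (b) REFERENCES t2 (c) INITIALLY DEFERRED;
    ALTER TABLE t2 ADD CONSTRAINT t2fk
        FOREIGN KEY (d) REFERENCES t1 (a) INITIALLY DEFERRED;

    BEGIN;
    INSERT INTO t1 (a, b) VALUES (1, 1);    -- references t2(c) = 1, not there yet
    INSERT INTO t2 (c, d) VALUES (1, 1);    -- now both rows exist
    SET CONSTRAINTS t1fk, t2fk IMMEDIATE;   -- both checks pass here
    COMMIT;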




Jackson, DeJuan wrote:
> 
> So, could someone send me the SQL92 constraints syntax as well as the
> definition of what a deferrable constraint is supposed to be?
> ADVthanksANCE
>         -DEJ

-Jose'-


Re: [HACKERS] CONSTRAINTS...

From
jwieck@debis.com (Jan Wieck)
Date:
>
> >From "A Guide to The SQL standard" C.J.DATE:
>
> FOREIGN KEY Syntax:
>
> [Good description of foreign key constraints - tnx]
>
>
> Jackson, DeJuan wrote:
> >
> > So, could someone send me the SQL92 constraints syntax as well as the
> > definition of what a deferrable constraint is supposed to be?
> > ADVthanksANCE
> >         -DEJ
>
> -Jose'-

    This reminds me of one of my personal TODOs, because it shows
    me that an implementation of constraints using triggers or the
    like wouldn't be such a good idea. Especially the deferred
    constraint checks would require a lot of buffering to do the
    checks at the end.

    My idea on constraints was to use the rewrite rule system for
    them.   I  wanted  first  to  implement  another  optimizable
    statement - RAISE.  RAISE is mostly the same as a SELECT, but
    the result will not be sent to the frontend. Instead it  will
    produce some formatted elog message(s?).

    The syntax of RAISE I have in mind is:

        RAISE [ALL | FIRST [n]] expr [, expr ...] FROM ...

    Anything after FROM is exactly the same as for a SELECT.

    If the first result attribute of RAISE is a (var|bp)char or
    text field, single occurrences of % in it will be substituted
    by the following attributes. Otherwise all the attrs are
    simply concatenated with a padding blank to form the error
    message.

    ALL or FIRST n means that not only the first error should be
    shown. A bit tricky to implement, but I think a bunch of
    NOTICEs and a final "ERROR: 5 errors counted" would be
    possible.

    Having this, a foreign key constraint rule  could  look  like
    this:

        CREATE RULE _CIconstraint_name AS ON INSERT TO mytab DO
            RAISE 'Key "%" not in keytab', new.myatt FROM keytab
            WHERE NOT EXISTS (SELECT * FROM keytab WHERE keyatt = new.myatt);

    Similar rules for UPDATE are simple, and an ON DELETE CASCADE
    rule isn't that hard either.
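
    (A cascading delete might, for instance, be sketched like this -
    just an illustration using the same invented tables as above:)

        CREATE RULE _CDconstraint_name AS ON DELETE TO keytab DO
            DELETE FROM mytab WHERE mytab.myatt = old.keyatt;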



    For the deferred constraints we now need some more
    information on the rules themselves. Currently all queries
    thrown in by the rule system are executed prior to the
    original query. If we add some syntax to CREATE RULE so we
    can say

        CREATE [DEFERRABLE] [INITIALLY DEFERRED] RULE ...

    the rule system would be able to collect those queries (they
    all would be RAISE statements) into a global querytree list if
    they should be deferred. This global list is drained out
    (all queries run) when either the transaction commits or
    SET ... IMMEDIATE is executed.

    Well, the information to remember isn't a small  amount.  Per
    constraint  that  is  to  be  deferred,  there  will  be  one
    querytree. And that for  every  single  INSERT/UPDATE/DELETE.
    And  if  a  table  has 5 constraints, it will be 5 remembered
    querytrees per operation. But  the  information  to  remember
    doesn't  depend  on  the  amount  of  data  affected  in  the
    statement (like it would be in a trigger implementation).  So
    it will work in a situation like

        BEGIN TRANSACTION;
        SET CONSTRAINTS ALL DEFERRED;
        UPDATE tab1 SET ref1 = ref1 + 1900;
        UPDATE tab2 SET key1 = key1 + 1900;
        COMMIT TRANSACTION;

    even if there are millions of rows in the tables.



    As Bruce said when I once mentioned using the rule system for
    constraints: "It's a tempting solution". And I'm glad the
    work was delayed until now, because now the DEFERRED problem
    has surfaced and can be taken into account too.

    Comments? (sure :-)


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] CONSTRAINTS...

From
Bruce Momjian
Date:
>     For  the  deferred  constraints  we  now   need   some   more
>     informations  on  the  rules  themself. Currently all queries
>     thrown in by the  rule  system  are  executed  prior  to  the
>     original  query.  If  we add some syntax to CREATE RULE so we
>     can tell
> 
>         CREATE [DEFERRABLE] [INITIALLY DEFERRED] RULE ...
> 
>     the rule system would be able to collect those queries  (they
>     all  would be RAISE statements) to a global querytree list if
>     they should be deferred.  This global  list  is  drained  out
>     (all  queries run) when either the transaction commits or the
>     SET ... IMMEDIATE is executed.
> 
>     Well, the information to remember isn't a small  amount.  Per
>     constraint  that  is  to  be  deferred,  there  will  be  one
>     querytree. And that for  every  single  INSERT/UPDATE/DELETE.
>     And  if  a  table  has 5 constraints, it will be 5 remembered
>     querytrees per operation. But  the  information  to  remember
>     doesn't  depend  on  the  amount  of  data  affected  in  the
>     statement (like it would be in a trigger implementation).  So
>     it will work in a situation like

Let's look at it another way.  If we didn't use the query rewrite
system, what method could we use for foreign keys/constraints that would
function better than this?

As far as I remember, triggers are C functions?  We can't generate these
on the fly inside the backend.  (Though compiling C code from the
backend and dynamically linking it into the engine is way too cool.)

Could we generate generic triggers that would handle most/all
situations?  I don't know.  Even if we can, would they be much faster
than the needed queries themselves?  Seems triggers work on single
tables.  How do we span tables?  If it is going to launch queries from
the trigger, we should use the rewrite system.  It is better suited to
this, with predigested queries and queries that flow through the
executor in step with the user queries!

Maybe let's go with the rewrite system, because it works, and is
flexible and strangely designed for this type of problem.  Similar to
how we use the rewrite system for views.

I am basically asking for a reason _not_ to use the rewrite system for
this.  I can't think of one myself.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] CONSTRAINTS...

From
jwieck@debis.com (Jan Wieck)
Date:
> Let's look at it another way.  If we didn't use the query rewrite
> system, what method could we use for foreign key/contraints that would
> function better than this?
>
> As far as I remember, triggers are C functions?  We can't generate these
> on the fly inside the backend.  (Though compiling C code from the
> backend and dynamically linking it into the engine is way too cool.)
>
> Could we generate generic triggers that would handle most/all
> situations?  I don't know.  Even if we can, would they be much faster
> than the needed queries themselves?  Seems triggers work on single
> tables.  How do we span tables?  If it is going to launch queries from
> the trigger, we should use the rewrite system.  It is better suited to
> this, with predigested queries and queries that flow through the
> executor in step with the user queries!
>

    Generic triggers in C that are argument-driven would be
    possible. But the drawback is that those triggers have to be
    very smart to use saved SPI plans (one for every different
    argument set). And they must be row-level triggers, so for an
    update to a 2-million-row table they will be fired 2 million
    times and run their queries inside - that will take some time.

    More painful in the 2-million-row situation is that trigger
    invocation has to be delayed until COMMIT if the constraint
    is deferred. I think we cannot remember 2 million OLD plus 2
    million NEW tuples if one tuple can have up to 8K (that would be
    32GB to remember, plus overhead), so we need to remember at
    least the CTIDs of OLD and NEW and refetch them for the
    trigger invocation. OUCH - the OLD ones are at the head and
    all the NEW ones are at the end of the table's file!

> Maybe let's go with the rewrite system, because it works, and is
> flexible and strangely designed for this type of problem.  Similar to
> how we use the rewrite system for views.

    And  the  other  changes  I've planned for the rewrite system
    will improve this much more.

    1.  Change pg_rewrite.ev_attr into an int28.  This would be
        useful for ON UPDATE rules so the rewrite system can
        easily check if a rule has to be applied or not. If none
        of the named attributes gets something different assigned
        than its own OLD value, they aren't updated, so the rule
        could never result in an action and can be omitted
        completely.

    2.  Create a cross-reference catalog that lists all relations
        used in a rule (rangetable).  If we have a DELETE CASCADE
        constraint, the rule is triggered on the key table and
        the action is a DELETE from the referencing table. If now
        the referencing table is dropped, the rule gets
        corrupted because the resulting querytree isn't plannable
        any longer (the relations in the rule's rangetable are
        identified by the OID in pg_class, not by relname). You
        can see the effect if you create a view and drop one of
        the base tables (see the sketch after this list).

        Well, we need to define what to do if a table is dropped
        that occurs in the crossref. First of all, the rules have
        to be dropped too, but in the case of a view rule, maybe
        the whole view too?

        And what about the case where a key table to which another
        one has a CHECK reference is dropped? The rule action will
        always abort, so it isn't useful any more. But I
        wouldn't like to silently drop it, because someone might
        want to drop and recreate the key table, and this would
        silently result in all the constraints being lost.

        Maybe we should change the rule system altogether so that
        the rangetable entries in the rule actions etc. are updated
        with a lookup from pg_class at rewrite time. This must be
        done carefully, though, because someone might drop a table
        and recreate it with a different schema, corrupting the
        parsetree of the rule actions.

    3.  Allow an unlimited number of rules on a relation.
        Currently there is a hard-coded limit on the number of
        rules the relation can hold in its slots.
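
    (The view effect mentioned in point 2 can be reproduced with
    something like the following - invented names, just to
    illustrate:)

        CREATE TABLE base (id int4, val text);
        CREATE VIEW v AS SELECT * FROM base;
        DROP TABLE base;
        SELECT * FROM v;
        -- fails: the view rule still references the dropped
        -- relation by OID, so the rewritten query can no longer
        -- be planned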

>
> I am basically asking for a reason _not_ to use the rewrite system for
> this.  I can't think of one myself.

    It might interfere with the new MVCC code. The rule actions
    must see exactly the OLD tuples that were used in the
    original statements. Not only those in the updated table
    itself; think of an INSERT...SELECT or an UPDATE where the
    TLE or qual expressions are values from other tables.

    Not a real reason, just something to keep in mind - and maybe
    a reason to switch silently to another MVCC isolation level if
    constraint rules get applied, so that all tables read from then
    on get a read lock applied and cannot get updated
    concurrently until COMMIT.

    And it's a problem I came across just while writing this note:
    MVCC could already have broken rewrite rule system
    semantics.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] CONSTRAINTS...

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
>     Generic  triggers  in  C  that  are  argument driven would be
>     possible.  But the drawback is that those triggers have to be
>     very  smart  to  use saved SPI plans (one for every different
>     argument set). And it must be row level triggers, so  for  an
>     update  to  a  2  meg  row table they will be fired 2 million
>     times and run their queries inside - will take some time.
> 
>     More painful in the 2  meg  row  situation  is  that  trigger
>     invocation  has  to be delayed until COMMIT if the constraint
>     is deferred. I think we cannot remember 2 million OLD plus  2
>     million  NEW  tuples  if one tuple can have up to 8K (will be
>     32GB to remember plus overhead), so we need  to  remember  at
>     least  the  CTID's  of  OLD  and NEW and refetch them for the
>     trigger invocation. OUTCH - the OLD ones are at the head  and
>     all the NEW ones are at the end of the tables file!

(Note that now, in the case of UPDATE, the t_ctid of OLD tuples
points to the TID of the NEW tuples.)

> > I am basically asking for a reason _not_ to use the rewrite system for
> > this.  I can't think of one myself.
> 
>     It  might  interfere with the new MVCC code. The rule actions
>     must see exactly the  OLD  tuples  that  where  used  in  the
>     original  statements.  Not  only  those  in the updated table
>     itself, think of an INSERT...SELECT or an  UPDATE  where  the
>     TLE or qual expressions are values from other tables.

Two things define data visibility: SnapShot & CommandId.
We would have to save them for deferred rules and restore them
before running the rule actions. But there is one issue: for which
scans should the old visibility be used? There are scans from the
user query and there are scans added by the rule action. Ok,
let's assume that for added scans the current visibility will be used
- this is what we need for RI rules (actually, something more -
see below).

So, the first task is to enable different scans in a (rewritten) query
to use different visibilities (a SnapShot/CommandId pair -
a "snapshot", in short). We have to add new stuff to the Executor
and heap scan code, and so I also propose a new feature addition:

1. Add a

       SET SNAPSHOT snapshot_name;

   statement to let users define a snapshot.

2. Extend the query syntax to let users specify what snapshot must be
   used when a query table is scanned:

       SELECT ... FROM t1 AT SNAPSHOT s1, t2 AT SNAPSHOT s2 ... etc.


So, the new requirement due to MVCC is taking into account
not only the CommandId (as already noted in the last posting I got
from Jan), but the SnapShot too.

>     Not  a  real reason, just something to have in mind and maybe
>     switching  silently  to  another  MVCC  isolation  level   if
>     constraint  rules get applied, so all tables read from now on
>     will  get  a  read  lock  applied  and  cannot  get   updated
>     concurrently until COMMIT.

There is no isolevel in MVCC where locking would be used
implicitly. We could use LOCK IN SHARE or SELECT FOR UPDATE
(FOR SHARE LOCK?) - using rules for RI is like implementing
RI at the application level (!) - but this is bad.

Fortunately, there is a way without per-row/table locking,
if the scans added by an RI rule could see uncommitted changes
made by concurrent xactions. The unique btree code already uses a
special SnapshotDirty to see uncommitted changes and
avoid long-term row/page locking. With this Snapshot,
HeapTupleSatisfies returns true if

    t_xmin is committed and (t_xmax is invalid OR in-progress)
    OR
    t_xmin is in-progress and t_xmax is invalid

- so the caller can wait (just like writers of the same row do -
by locking the in-progress xaction ID in the transaction pseudo-table)
for the in-progress t_xmin/t_xmax xaction and decide what to do after
the concurrent xaction COMMITs/ABORTs.

But before continuing with this approach I need an answer to
one question. Let's consider this case:

Xaction T1 with isolevel SERIALIZABLE inserts some row
into a child table with deferred checking of primary key existence.
There was no primary key P for the row inserted by T1 at the moment
when T1 began, but before T1 begins constraint checking another
concurrent xaction T2 inserts P and commits.
After that T1 performs the checking and - what? Will the constraint
be satisfied? I ask this because all subsequent selects in T1
will not see P, but will see the foreign key inserted - so, from the
application's point of view, the child --> parent relationship will be broken...

Comments?

Could someone run test below in Oracle?

1. In session 1:

CREATE TABLE p (x integer PRIMARY KEY);
CREATE TABLE c (y integer REFERENCES p);
INSERT INTO p VALUES (1);

2. In session 2:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM p;
-- empty
INSERT INTO c VALUES (1);
-- what? waits or rejects insertion?

3. In session 1:

COMMIT;

-- what in 1 if it was waiting?

4. In session 2:

INSERT INTO c VALUES (1);
-- Ok or rejected?

SELECT * FROM p;
SELECT * FROM c;
COMMIT;

TIA !!!

>     And  it's  a  problem I've came across just writing this note
>     where MVCC already could  have  broken  rewrite  rule  system
>     semantics.

How?

Vadim


Re: [HACKERS] CONSTRAINTS...

From
jwieck@debis.com (Jan Wieck)
Date:
Vadim wrote:

>
> Jan Wieck wrote:
> >
> (Note that now in the case of UPDATE t_ctid of OLD tuples
> points to TID of NEW tuples.)
>
> Two things define data visibility: SnapShot & CommandId.
> We would have to save them for deffered rules and restore them
> before run rule actions. But there is one issue: for what
> scans old visibility should be used? There are scans from
> user query and there are scans added by rule action. Ok,
> let's assume that for added scans current visibility will be used
> - this is what we need for RI rules (actually, something more -
> see below).

    I also addressed that problem (different visibility required for
    scans in one command) in my other mail.

    Anyway, I just checked what happens in the following case:

        T1:  begin;
        T1:  select ...

        T2:  update ...

        T1:  select ... (gets the same (old) values)

    That's the result as long as T1 doesn't run in READ COMMITTED
    mode.   And  that's  fine,  because  it doesn't have to worry
    about concurrent transactions of others.

    So the only problem left is the different visibility. I think
    it is possible to change the visibility code not to check
    against the global command counter. Instead it might look at
    a command counter value in the range table entry related to
    the scan node.  So the rewrite system and tcop could place
    the correct values there during query rewrite/processing.

    The range table of a rule's rewritten parsetree is a
    combination of the range tables from the original user query,
    the applied view rules and the rule itself. For deferred
    rules, only those coming with the rule action itself must
    have the command counter at COMMIT. All others must get the
    command counter value that is there when the query that fired
    this rule gets executed.

    The deferred querytrees can first be held in a new list of
    the rewritten querytree for the original user statement.  The
    rewrite system puts USE_CURRENT_CMDID or USE_COMMIT_CMDID
    into the rangetable entries depending on where they
    are coming from.

    Before tcop calls the executor, a new function in the rewrite
    system is called to set the actual values for the command
    counter to use into the rangetable entries for one query and
    its deferred ones.  Then it adds all the deferred queries to
    the global deferred list and runs the query itself.

    At commit time, when all the deferred queries have to get
    run, those RTEs in them having USE_COMMIT_CMDID are set to
    the command counter at commit before running the plans.
    Voila.

> >     And  it's  a  problem I've came across just writing this note
> >     where MVCC already could  have  broken  rewrite  rule  system
> >     semantics.
>
> How?

    Yes it did!

    If a transaction runs in READ COMMITTED mode, the scan for
    the rule's actions (performed first) could have different
    results than the scan for the original query (performed last).

    For now I see only one solution. READ COMMITTED is forbidden
    for anything that invokes non-view rules. This check must be
    done in tcop and SPI, because saved SPI plans can be run
    without invoking the rewrite system at any time. So the plan
    must remember somewhere whether READ COMMITTED is allowed for it
    or not.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] CONSTRAINTS...

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> > Two things define data visibility: SnapShot & CommandId.
> > We would have to save them for deffered rules and restore them
> > before run rule actions. But there is one issue: for what
> > scans old visibility should be used? There are scans from
> > user query and there are scans added by rule action. Ok,
> > let's assume that for added scans current visibility will be used
> > - this is what we need for RI rules (actually, something more -
> > see below).
> 
>     I  addressed  that problem (different visibility required for
>     scans in one command) also in my other mail.
> 
>     Anyway, I just checked what happens in the following case:
> 
>         T1:  begin;
>         T1:  select ...
> 
>         T2:  update ...
> 
>         T1:  select ... (gets the same (old) values)
> 
>     That's the result as long as T1 doesn't run in READ COMMITTED
>     mode.   And  that's  fine,  because  it doesn't have to worry
>     about concurrent transactions of others.
> 
>     So the only problem left is the different visability. I think
>     it  is  possible  to  change the visibility code not to check
>     against the global command counter. Instead it might look  at
>     a  command  counter value in the range table entry related to
>     the scan node.  So the rewrite system and  tcop  could  place
>     the correct values there during query rewrite/processing.

Why do you talk about the CommandId only? What about the SnapShot data?
The difference between scans in the SERIALIZABLE/READ COMMITTED
isolevels is that in SERIALIZABLE mode all queries use
the same SnapShot data (array of running xactions),
while in READ COMMITTED mode new SnapShot data is created
for each query.

CommandId defines visibility of self-changes.
SnapShot defines visibility of concurrent changes.

> 
>     The   range  table  of  a  rules  rewritten  parsetree  is  a
>     combination of the range tables from the original user query,
>     applied  view  rules  and  the  the rule itself. For deferred
>     rules, only the those coming with the rule action itself must
>     have  the  command counter at COMMIT. All others must get the

Actually, not at COMMIT but when SET CONSTRAINTS ... IMMEDIATE
is called. COMMIT just implicitly switches into immediate mode.

>     command counter value that is there when the query that fired
>     this rule get's executed.
> 
>     The  deferred  querytrees  can first be held in a new list of
>     the rewritten querytree for the original user statement.  The
>     rewrite    system    puts   into   the   rangetable   entries
>     USE_CURRENT_CMDID or USE_COMMIT_CMDID depending on where they
>     are coming from.
> 
>     Before tcop calls the executor, a new function in the rewrite
>     system is called to set the actual  values  for  the  command
>     counter  to use into the rangetable entries for one query and
>     it's deferred ones.  Then it adds all the deferred queries to
>     the global deferred list and runs the query itself.
> 
>     At  commit  time,  when  all the deferred queries have to get
>     run, those RTE's in them having USE_COMMIT_CMDID are  set  to
>     the  command  counter  at  commit  before  running the plans.
>     Voila.
> 
> > >     And  it's  a  problem I've came across just writing this note
> > >     where MVCC already could  have  broken  rewrite  rule  system
> > >     semantics.
> >
> > How?
> 
>     Yes it did!
> 
>     If a transaction runs in READ COMMITTED mode,  the  scan  for
>     the  rules  actions  (performed  first)  could have different
>     results than that for the original query (performed last).
> 
>     For now I see only one solution. READ COMMITTED is  forbidden
>     for  anything that invokes non-view rules. This check must be
>     done in the tcop and SPI, because saved SPI plans can be  run
>     without  invoking the rewrite system at any time. So the plan
>     must remember somewhere if READ COMMITTED is allowed  for  it
>     or not.

READ COMMITTED will be the default mode once writers in this
mode are supported...

The solution is to use the same SnapShot data for both
the action's and the original scans.

But there are other possibilities for inconsistencies in READ COMMITTED
mode:

create table t (x int);
create table tlog (xold int, xnew int);
insert into t values (1);
create rule r as on update to t do 
insert into tlog values (old.x, new.x);

Now consider two concurrent updates:

update t set x = 2 where x = 1;
update t set x = 3 where x = 1;

Two rows will be inserted into tlog - (1,2) and (1,3) -
by the rule actions run BEFORE the original queries, but actually
only one update will succeed - the other one will not see
1 in t.x after the first update commits, and so the row will not
be updated by the second update...

One approach is to force FOR UPDATE OF the "OLD" table
in the action's INSERT INTO ... SELECT; another one is
to run the rule action AFTER the original query (wouldn't
it be possible to have the ability to directly set
scan visibilities?).
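
(For the tlog example above, the first approach would roughly mean
that the rewritten rule action for the first update ends up as
something like this - only a sketch of the idea:)

insert into tlog (xold, xnew)
    select t.x, 2 from t where t.x = 1 for update of t;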

Actually, for a non-locking RI implementation (using dirty reads),
constraint checking must occur AFTER the data changes are made
(i.e. deferred until query end): this will allow PK deleters
to notice that there are concurrent FK inserters, UK inserters to see
that someone else is trying to insert the same key, etc., wait for
the concurrent COMMIT/ABORT and do the appropriate thing after that.

More letters will follow, but maybe only after the weekend...

Vadim