Thread: RE: [HACKERS] CONSTRAINTS...

RE: [HACKERS] CONSTRAINTS...

From: "Jackson, DeJuan"
> -----Original Message-----
>     Generic  triggers  in  C  that  are  argument driven would be
>     possible.  But the drawback is that those triggers have to be
>     very  smart  to  use saved SPI plans (one for every different
>     argument set). And it must be row level triggers, so  for  an
>     update  to  a  2  meg  row table they will be fired 2 million
>     times and run their queries inside - will take some time.
> 
>     More painful in the 2  meg  row  situation  is  that  trigger
>     invocation  has  to be delayed until COMMIT if the constraint
>     is deferred. I think we cannot remember 2 million OLD plus  2
>     million  NEW  tuples  if one tuple can have up to 8K (will be
>     32GB to remember plus overhead), so we need  to  remember  at
>     least  the  CTID's  of  OLD  and NEW and refetch them for the
>     trigger invocation. OUTCH - the OLD ones are at the head  and
>     all the NEW ones are at the end of the tables file!
Agreed, scrap the triggers idea.

> > Maybe let's go with the rewrite system, because it works, and is
> > flexible and strangely designed for this type of problem.  Similar to
> > how we use the rewrite system for views.
> 
>     And  the  other  changes  I've planned for the rewrite system
>     will improve this much more.
> 
>     1.  Change pg_rewrite.ev_attr into an int28.  This  would  be
>         useful  for  ON  UPDATE  rules  so the rewrite system can
>         easily check if a rule has to be applied or not. If  none
>         of the named attributes gets assigned anything  different
>         from its own OLD value, they aren't updated, so the  rule
>         could  never  result  in  an  action  and  can be omitted
>         completely.
Would we add a system table for constraints?  I believe that the
DEFERRABLE option means we need special information about all
constraints (i.e. is_deferrable boolean, is_deferred boolean).
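
For reference, the SQL92 deferrable syntax we'd eventually have to
support looks something like this (a sketch only; the table and column
names are made up):

    CREATE TABLE t2 (
        ref1 integer REFERENCES t1 (key1)
             DEFERRABLE INITIALLY IMMEDIATE
    );

    BEGIN;
    SET CONSTRAINTS ALL DEFERRED;
    -- violations are now collected instead of raised immediately
    UPDATE t2 SET ref1 = 5;
    INSERT INTO t1 VALUES (5);
    COMMIT;   -- the reference is checked here and succeeds

So is_deferrable would record the DEFERRABLE clause, and is_deferred
whether the constraint is currently deferred.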

>     2.  Create cross reference catalog that lists  all  relations
>         used in a rule (rangetable).  If we have a DELETE CASCADE
>         constraint, the rule is triggered on the  key  table  and
>         the action is a DELETE from the referencing table. If now
>         the  referencing  table  is  dropped,  the  rule  becomes
>         corrupted because the resulting querytree isn't plannable
>         any longer (the relations in  the  rules  rangetable  are
>         identified  by  the OID in pg_class, not by relname). You
>         can see the effect if you create a view and drop  one  of
>         the base tables.
*Bleck*  We can't place cascading or restriction information into the
table definition.  I don't know if you had thought about it, but you
could only create them in reverse order (child <- parent), which is very
messy when you have many 'child' tables to cascade.
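
The view effect Jan mentions is easy to reproduce, by the way (a
sketch; names are made up):

    CREATE TABLE base (a integer);
    CREATE VIEW v AS SELECT a FROM base;
    DROP TABLE base;
    SELECT * FROM v;   -- fails: the view rule's rangetable still
                       -- points at the dropped table's OID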

>         Well,  we need to define what to do if a table is dropped
>         that occurs in the crossref. First of all, the rules have
>         to  be dropped too, but in the case of a view rule, maybe
>         the whole view too?
I agree that we should drop rules defined on the dropped table, but not
those that simply access the dropped table, views included.  If I need
to drop a table which has a view defined on it simply to rearrange
column order, I'd like not to have to recreate the view and/or any
RI constraints that reference that table as well.

>         And in the case where a key table to  which  another  one
>         has  a  CHECK  reference is dropped? The rule action will
>         always  abort,  so  it  isn't  useful  any  more.  But  I
>         wouldn't  like to silently drop it, because someone might
>         want to drop and recreate the key table  and  this  would
>         silently result in all the constraints having  been  lost.
This is what I mean about views and RI-constraints.

>         Maybe we should change the rule system  altogether,  so
>         that the rangetable entries in the rule actions etc. are
>         updated with a lookup from pg_class at rewrite time. Must
>         be done carefully, because someone might drop a table and
>         recreate it with a different schema, corrupting the parse-
>         tree of the rule actions.
Could we allow a 'recompile' of the rules?  It would update the rule
actions and report which rules have been corrupted.  And possibly a way
to simply drop all invalid rules.  We could simply mark the rules that
need to be recompiled on a DROP or ALTER TABLE (to allow for dropping a
column, or changing a column type) from the crossref.

>     3.  Allow  an  unlimited  number  of  rules  on  a  relation.
>         Currently there is a hard-coded limit on  the  number  of
>         rules the relation can hold in its slots.
I agree with this one without reservation, especially if we actually
implement RI/constraints using rewrite.

> > I am basically asking for a reason _not_ to use the rewrite system
> > for this.  I can't think of one myself.
> 
>     It  might  interfere with the new MVCC code. The rule actions
>     must see exactly the  OLD  tuples  that  were  used  in  the
>     original  statements.  Not  only  those  in the updated table
>     itself - think of an INSERT...SELECT or an  UPDATE  where  the
>     TLE or qual expressions are values from other tables.
Shouldn't be too hard to accomplish.  We simply need to be sure that the
values are either cached or re-cacheable/re-producible (if there are
memory concerns) until the end of the transaction.  So, either:
 1. cache all the values
 2. store the part of the parsetree needed to recreate the values and
have MVCC mark any needed rows off limits to vacuum
I know that there are many other subtleties that I'm not addressing such
as indexes, but the above will get you the answer.
FWIW, RI and CONSTRAINTS would be affected by this.

>     Not  a  real reason, just something to keep in mind.  Maybe
>     we  should  silently  switch to another MVCC isolation level
>     if constraint rules get applied, so that all tables read from
>     now on will get a read lock applied and  cannot  be  updated
>     concurrently until COMMIT.
Not needed, see the above.  We just need to mark the rows with the tid,
or some-such, because dropped/updated rows will hang around until a
vacuum.

>     And it's a problem I came across just while  writing  this
>     note where MVCC could already have broken the rewrite  rule
>     system semantics.
Maybe, but I know you'll straighten it all out.  I have faith in you,
Jan.

> Jan-DEJ

P.S. I love the discussion that my simple little request has sparked.


Re: [HACKERS] CONSTRAINTS...

From: jwieck@debis.com (Jan Wieck)
Pardon.

    First  of  all  I  strongly  recommend that someone reads the
    section about the rewrite  rule  system  in  the  programmers
    manual.  From some statements in this whole discussion I'm in
    doubt whether anyone really knows in  detail  how  the  rule
    system works and what it actually does with a query.

    And  please  -  there  were  substantial changes in the rule
    system for v6.4.  So anyone who THINKS he knows, update  your
    knowledge.

    I really didn't want to step on anyone's toes,  but  detailed
    knowledge of the rule system is required here.

DeJuan Jackson wrote:

>
> Agreed scrap the triggers idea.
>
> > [...]
> >
> Would we add a system table for constraints?  I believe that the
> DEFERRABLE option means we need special information about all
> constraints (i.e. is_deferrable boolean, is_deferred boolean).

    No, we don't need another system catalog for that.  The  only
    thing that must be assured is that a rule which is deferrable
    is a NON-INSTEAD one with  exactly  one  RAISE  statement  as
    action.  This can be checked at rule creation time at the same
    place where the checks for view rules are currently done.

    I  thought  to  add  is_deferrable  and  is_initdeferred   to
    pg_rewrite  itself  to  hold the information about DEFERRABLE
    and INITIALLY DEFERRED.  Whether a constraint actually has to
    be deferred (added to the global deferred querytree list)  is
    per-transaction runtime information and  isn't  useful in the
    catalog.
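
    With those two columns in place, the per-rule settings  could
    be inspected with something like the  following  (hypothetical
    of course - the columns don't exist yet):

        SELECT rulename, is_deferrable, is_initdeferred
          FROM pg_rewrite
         WHERE is_deferrable;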

>
> >     2.  Create cross reference catalog that lists  all  relations
> >         used in a rule (rangetable).  If we have a DELETE CASCADE
> >         constraint, the rule is triggered on the  key  table  and
> >         the action is a DELETE from the referencing table. If now
> >         the  referencing  table  is  dropped,  the rule becomes
> >         corrupted because the resulting querytree isn't plannable
> >         any longer (the relations in  the  rules  rangetable  are
> >         identified  by  the OID in pg_class, not by relname). You
> >         can see the effect if you create a view and drop  one  of
> >         the base tables.
> *Bleck*  We can't place cascading or restriction information into the
> table definition.  I don't know if you had thought about it, but you
> could only create them in reverse order (child <- parent), which is
> very messy when you have many 'child' tables to cascade.

    I  don't  want  to put it into the table definition. Anything
    that will be required for checks  and  cascaded  deletes  is
    just rules going into pg_rewrite.

    OTOH, as I read Oracle's SQL Language Quick Reference, I  can
    do a

    ALTER TABLE tab1 MODIFY key1 integer
        CONSTRAINT ref_tab2 REFERENCES ... ON DELETE CASCADE;

    I think I can create the tables in any order and later define
    all the (maybe circular) constraints.
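
    For example, with two tables referencing each other, both con-
    straints could be added after both CREATEs (a  sketch  in  the
    Oracle-style syntax above; names are made up):

        CREATE TABLE tab1 (key1 integer, ref2 integer);
        CREATE TABLE tab2 (key2 integer, ref1 integer);

        ALTER TABLE tab1 MODIFY ref2 integer
            CONSTRAINT fk_tab2 REFERENCES tab2 (key2)
            ON DELETE CASCADE;
        ALTER TABLE tab2 MODIFY ref1 integer
            CONSTRAINT fk_tab1 REFERENCES tab1 (key1)
            ON DELETE CASCADE;

    Neither ALTER fails for lack of the other table, because  both
    tables already exist when the constraints are defined.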

>
> >         Well,  we need to define what to do if a table is dropped
> >         that occurs in the crossref. First of all, the rules have
> >         to  be dropped too, but in the case of a view rule, maybe
> >         the whole view too?
> I agree that we should drop rules defined on the dropped table, but not
> those that simply access the dropped table, views included.  If I need
> to drop a table which has a view defined on it simply to rearrange
> column order, I'd like not to have to recreate the view and/or any
> RI constraints that reference that table as well.
>
> [...]
>
> Could we allow a 'recompile' of the rules?  It would update the rule
> actions and report which rules have been corrupted.  And possibly a way
> to simply drop all invalid rules.  We could simply mark the rules that
> need to be recompiled on a DROP or ALTER TABLE (to allow for dropping a
> column, or changing a column type) from the crossref.

    Nice  idea.  It would require the crossref catalog, one  more
    catalog to store the rules'  definition  text  and  a compile
    status field in pg_rewrite.

    If  a  relation  used  in  a  rule is dropped or altered, the
    status is set to RECOMPILE. So the rule system can check  the
    rule  locks  and try to recompile those rules from the source
    before applying them.

    Yeah - I really like it! This way only those rules defined on
    the dropped table have to be dropped too (as it is now).  And
    as  long  as  another  table   used   in   the   rule   isn't
    recreated/altered  with  a  suitable  schema, the action that
    fires the  rule  cannot  be  performed  on  that  table  (the
    recompile would fail and abort the transaction).

    There   are   some  subtle  points  about  the  rule  caching
    performed. Actually the  relation  descriptor  contains  rule
    locks where in-memory parsetrees are placed at  heap_open(),
    but it must be possible.

> >     It  might  interfere with the new MVCC code. The rule actions
> >     must see exactly the  OLD  tuples  that  were   used  in  the
> >     original  statements.  Not  only  those  in the updated table
> >     itself, think of an INSERT...SELECT or an  UPDATE  where  the
> >     TLE or qual expressions are values from other tables.
> Shouldn't be too hard to accomplish.  We simply need to be sure that the
> values are either cached or re-cacheable/re-producible (if there are
> memory concerns) until the end of the transaction.  So, either:
>  1. cache all the values
>  2. store the part of the parsetree needed to recreate the values and
> have MVCC mark any needed rows off limits to vacuum
> I know that there are many other subtleties that I'm not addressing such
> as indexes, but the above will get you the answer.
> FWIW, RI and CONSTRAINTS would be affected by this.

    We cannot cache all the values without the danger of  running
    out of memory.

    Those  parts  of  the parsetree needed to recreate the values
    are actually incorporated into the  parsetrees  of  the  rule
    actions.  BUT  THESE  ARE PARSETREES!  They result in another
    execution plan and when they get executed they perform  their
    own table scans!

>
> >     Not  a  real reason, just something to have in mind and maybe
> >     switching  silently  to  another  MVCC  isolation  level   if
> >     constraint  rules get applied, so all tables read from now on
> >     will  get  a  read  lock  applied  and  cannot  get   updated
> >     concurrently until COMMIT.
> Not needed, see the above.  We just need to mark the rows with the tid,
> or some-such, because dropped/updated rows will hang around until a
> vacuum.

    As I understood MVCC, it is possible that two identical scans
    in one and the same  transaction  can  result  in  different
    rows.

    Maybe I misunderstood something. Isn't changing the isolation
    level exactly what can force MVCC to do  locking/marking,  so
    that the two identical scans will produce the same rows?
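
    In other words, something like the following (a sketch, if  I
    read the SQL92 syntax right):

        BEGIN;
        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
        SELECT * FROM t1;   -- both scans should now see the
        SELECT * FROM t1;   -- same snapshot of t1
        COMMIT;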

    Vadim - please clarify!

    But uh - oh! Now it's me who forgot how in  detail  the  rule
    system works :-)

    The  deferred constraint queries performed at transaction end
    will do their own scans.  So  tuples  removed/changed  during
    this  transaction  could  not  be  rescanned  as  they   were
    (the command counter is incremented!). And  the  incremented
    command  counter is required to see that finally all circular
    constraints are satisfied.

    This one in detail now. Let's have the following setup:

        CREATE TABLE t1 (key1 int4);
        CREATE TABLE t2 (ref1 int4);
        CREATE TABLE t3 (a int4, b int4);

        The check constraint for ref1->key1 is the rule

        CREATE DEFERRABLE RULE _CU_t2_ref1 AS ON UPDATE TO t2 DO
          RAISE 'key "%" not in t1', new.ref1
          WHERE NOT EXISTS (SELECT key1 FROM t1 WHERE key1 = new.ref1);

    Now we execute

        BEGIN;
        SET CONSTRAINT ALL DEFERRED;
        UPDATE t2 SET ref1 = t3.b WHERE ref1 = t3.a;
        DELETE FROM t3;
        INSERT INTO t1 VALUES (4);
        COMMIT;

    The rewritten parsetree for the rule action would  look  like
    one for the query

        RAISE 'key "%" not in t1', t3.b
          FROM t2, t3
          WHERE t2.ref1 = t3.a
          AND NOT EXISTS (SELECT key1 FROM t1 WHERE key1 = t3.b);

    This  parsetree is deferred until the end of the transaction;
    the UPDATE will already have set cmax in  the  tuples  of  t2
    that were updated. So this RAISE cannot work as expected!  To
    do so, the scans of t2  and  t3  must  be  done  against  the
    command counter of the UPDATE, but the scan of t1 against the
    final command counter at COMMIT.

    This mess is why I  changed  the  rule  system  for  v6.4  to
    perform  the  rule  actions BEFORE the original query itself.
    They cannot work AFTER (which is what  would  happen  for  de-
    ferred ones).


> P.S. I love the discussion that my simple little request has sparked.

    Me too!


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #