Re: [HACKERS] CONSTRAINTS... - Mailing list pgsql-hackers

From jwieck@debis.com (Jan Wieck)
Subject Re: [HACKERS] CONSTRAINTS...
Msg-id m100kAK-000EBQC@orion.SAPserv.Hamburg.dsh.de
In response to RE: [HACKERS] CONSTRAINTS...  ("Jackson, DeJuan" <djackson@cpsgroup.com>)
List pgsql-hackers
Pardon.

    First of all, I strongly recommend that someone read the
    section about the rewrite rule system in the programmer's
    manual. From some statements in this whole discussion I
    doubt that anyone really knows in detail how the rule system
    works and what it actually does with a query.

    And please - there were substantial changes in the rule
    system for v6.4. So anyone who THINKS he knows: update your
    knowledge.

    I really didn't want to step on anyone's toes, but detailed
    knowledge of the rule system is required here.

DeJuan Jackson wrote:

>
> Agreed, scrap the triggers idea.
>
> > [...]
> >
> Would we add a system table for constraints?  I believe that the
> DEFERRABLE option means we need special information about all
> constraints (i.e. is_deferrable boolean, is_deferred boolean).

    No, we don't need another system catalog for that. The only
    thing that must be assured is that a deferrable rule is a
    NON-INSTEAD one with exactly one RAISE statement as its
    action. This can be checked at rule create time, at the same
    place where the checks for view rules are currently done.

    I thought of adding is_deferrable and is_initdeferred to
    pg_rewrite itself to hold the information about DEFERRABLE
    and INITIALLY DEFERRED. Whether a constraint actually has to
    be deferred (added to the global deferred querytree list) or
    not is per-transaction runtime information and isn't useful
    in the catalog.
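
    Just to sketch what I mean (plain SQL only for illustration -
    the real change would of course go into the catalog headers,
    and the column names are merely my proposal):

        -- hypothetical new attributes of pg_rewrite
        ALTER TABLE pg_rewrite ADD COLUMN is_deferrable bool;
        ALTER TABLE pg_rewrite ADD COLUMN is_initdeferred bool;

    SET CONSTRAINT would then only flip the runtime state; the
    two flags just say what is allowed and what the default is.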

>
> >     2.  Create cross reference catalog that lists  all  relations
> >         used in a rule (rangetable).  If we have a DELETE CASCADE
> >         constraint, the rule is triggered on the  key  table  and
> >         the action is a DELETE from the referencing table. If now
> >         the  referencing  table  is  dropped,  the   rule   gets
> >         corrupted because the resulting querytree isn't plannable
> >         any longer (the relations in  the  rules  rangetable  are
> >         identified  by  the OID in pg_class, not by relname). You
> >         can see the effect if you create a view and drop  one  of
> >         the base tables.
> *Bleck*  We can't place cascading or restriction information into the
> table definition.  I don't know if you've thought about it, but you could
> only create them in reverse order (child<-parent), which is very messy when you
> have many 'child' tables to cascade.

    I don't want to put it into the table definition. Anything
    that is required for checks and cascaded deletes is just
    rules going into pg_rewrite.
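
    The rule corruption effect I described above is easy to
    reproduce today (throw-away names, just for illustration):

        CREATE TABLE base (a int4);
        CREATE VIEW v1 AS SELECT a FROM base;
        DROP TABLE base;
        SELECT * FROM v1;   -- fails: the view rule's rangetable
                            -- still holds the old OID of base

    Exactly the same would happen to a DELETE CASCADE rule when
    the referencing table is dropped.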

    OTOH, as I read Oracle's SQL Language Quick Reference, I
    can do a

    ALTER TABLE tab1 MODIFY key1 integer
        CONSTRAINT ref_tab2 REFERENCES ... ON DELETE CASCADE;

    I think I can create the tables in any order and later define
    all the (maybe circular) constraints.
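
    So something like this should work (a sketch in Oracle-ish
    syntax, all names made up):

        CREATE TABLE t_a (a integer PRIMARY KEY, b_ref integer);
        CREATE TABLE t_b (b integer PRIMARY KEY, a_ref integer);

        ALTER TABLE t_a ADD CONSTRAINT ref_t_b
            FOREIGN KEY (b_ref) REFERENCES t_b (b) ON DELETE CASCADE;
        ALTER TABLE t_b ADD CONSTRAINT ref_t_a
            FOREIGN KEY (a_ref) REFERENCES t_a (a) ON DELETE CASCADE;

    Neither CREATE TABLE has to know about the other table; the
    circle is closed afterwards.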

>
> >         Well,  we need to define what to do if a table is dropped
> >         that occurs in the crossref. First of all, the rules have
> >         to  be dropped too, but in the case of a view rule, maybe
> >         the whole view too?
> I agree that we should drop rules defined on the dropped table, but not
> those that simply access the dropped table, including views.  If I need
> to drop a table which has a view defined on it simply to rearrange
> column order, I'd like not to have to recreate the view and/or
> any RI-constraints that reference that table as well.
>
> [...]
>
> Could we allow a 'recompile' of the rules, which would update the rule
> actions and report which rules have been corrupted?  And possibly a way
> to simply drop all invalid rules.  We could simply mark the rules that
> need to be recompiled on a DROP or ALTER TABLE (to allow for dropping a
> column, or changing a column type) from the crossref.

    Nice idea. It would require the crossref catalog, one more
    catalog to store the rule definition text, and a compile
    status field in pg_rewrite.
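
    A rough sketch of the two catalogs (all names invented, just
    to illustrate the idea):

        -- crossref: which relations appear in a rule's rangetable
        CREATE TABLE pg_rulerelation (
            rulerule oid,   -- OID of the rule in pg_rewrite
            rulerel  oid    -- OID of one relation the rule uses
        );

        -- the definition text, kept around for recompilation
        CREATE TABLE pg_rulesource (
            srcrule oid,    -- OID of the rule in pg_rewrite
            srctext text    -- the original CREATE RULE statement
        );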

    If a relation used in a rule is dropped or altered, the
    status is set to RECOMPILE, so the rule system can check the
    rule locks and try to recompile those rules from their
    source before applying them.
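
    Conceptually a DROP TABLE would then do (rulestatus being
    the compile status field, names again invented):

        UPDATE pg_rewrite SET rulestatus = 'RECOMPILE'
         WHERE oid IN (SELECT rulerule FROM pg_rulerelation
                        WHERE rulerel = <OID of the dropped table>);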

    Yeah - I really like it! This way only those rules defined on
    the dropped table have to be dropped too (as it is now).  And
    as  long  as  another  table   used   in   the   rule   isn't
    recreated/altered  with  a  suitable  schema, the action that
    fires the  rule  cannot  be  performed  on  that  table  (the
    recompile would fail and abort the transaction).

    There are some subtle points about the rule caching
    performed. Currently the relation descriptor contains the
    rule locks, into which the in-memory parsetrees are placed
    at heap_open(), so the recompilation has to get past that
    cache - but it must be possible.

> >     It  might  interfere with the new MVCC code. The rule actions
> >     must see exactly the  OLD  tuples  that  were  used  in  the
> >     original  statements.  Not  only  those  in the updated table
> >     itself, think of an INSERT...SELECT or an  UPDATE  where  the
> >     TLE or qual expressions are values from other tables.
> Shouldn't be too hard to accomplish.  We simply need to be sure that the
> values are either cached or re-cacheable/re-producible (if there are
> memory concerns) until the end of the transaction.  So, either:
>  1. cache all the values
>  2. store the part of the parsetree needed to recreate the values and
> have MVCC mark any needed rows off limits to vacuum
> I know that there are many other subtleties that I'm not addressing such
> as indexes, but the above will get you the answer.
> FWIW, RI and CONSTRAINTS would be affected by this.

    We cannot cache all the values without the danger of
    running out of memory.

    Those  parts  of  the parsetree needed to recreate the values
    are actually incorporated into the  parsetrees  of  the  rule
    actions.  BUT THESE ARE PARSETREES!  They result in another
    execution plan, and when they get executed they perform
    their own table scans!
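
    To make that concrete: if a constraint rule exists on t2 and
    someone does (a throw-away example)

        INSERT INTO t2 SELECT b FROM t3 WHERE a > 5;

    then the rule action's parsetree gets the "FROM t3 WHERE
    a > 5" part incorporated. When the action is executed, it
    runs its own scan of t3 and sees whatever is in t3 at that
    moment - not what the INSERT saw.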

>
> >     Not  a  real reason, just something to have in mind and maybe
> >     switching  silently  to  another  MVCC  isolation  level   if
> >     constraint  rules get applied, so all tables read from now on
> >     will  get  a  read  lock  applied  and  cannot  get   updated
> >     concurrently until COMMIT.
> Not needed, see the above.  We just need to mark the rows with the tid,
> or some-such, because dropped/updated rows will hang around until a
> vacuum.

    As I understand MVCC, it is possible that two identical
    scans in one and the same transaction result in different
    rows.

    Maybe I misunderstood something. Isn't changing the
    isolation level exactly what can force MVCC to do the
    locking/marking, with the result that the two identical
    scans will produce the same rows?
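
    I.e. something like this (if I read the SQL92 isolation
    levels right - under READ COMMITTED the second scan could
    see concurrently committed changes, under SERIALIZABLE it
    could not):

        BEGIN;
        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
        SELECT * FROM t1;    -- scan 1
        -- ... another backend updates t1 and commits ...
        SELECT * FROM t1;    -- scan 2: same rows as scan 1
        COMMIT;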

    Vadim - please clarify!

    But uh - oh! Now it's me who forgot how the rule system
    works in detail :-)

    The deferred constraint queries performed at transaction
    end will do their own scans. So tuples removed/changed
    during this transaction could not be rescanned as they were
    (the command counter has been incremented!). And the
    incremented command counter is required to see that finally
    all circular constraints are satisfied.

    This one in detail now. Let's have the following setup:

        CREATE TABLE t1 (key1 int4);
        CREATE TABLE t2 (ref1 int4);
        CREATE TABLE t3 (a int4, b int4);

    The check constraint for ref1->key1 is the rule

        CREATE DEFERRABLE RULE _CU_t2_ref1 AS ON UPDATE TO t2 DO
          RAISE 'key "%" not in t1', new.ref1
          WHERE NOT EXISTS (SELECT key1 FROM t1 WHERE key1 = new.ref1);

    Now we execute

        BEGIN;
        SET CONSTRAINT ALL DEFERRED;
        UPDATE t2 SET ref1 = t3.b WHERE ref1 = t3.a;
        DELETE FROM t3;
        INSERT INTO t1 VALUES (4);
        COMMIT;

    The rewritten parsetree for the rule action would  look  like
    one for the query

        RAISE 'key "%" not in t1', t3.b
          FROM t2, t3
          WHERE t2.ref1 = t3.a
          AND NOT EXISTS (SELECT key1 FROM t1 WHERE key1 = t3.b);

    This parsetree is deferred until the end of the
    transaction, but the UPDATE will already have set cmax in
    the tuples of t2 that were updated. So this RAISE cannot
    work as expected! To do so, the scans of t2 and t3 would
    have to be done against the command counter of the UPDATE,
    but the scan of t1 against the final command counter at
    COMMIT.
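
    Spelled out against the transaction above (counter values
    made up for illustration):

        BEGIN;                       -- command counter 0
        SET CONSTRAINT ALL DEFERRED;
        UPDATE t2 ...;               -- counter 1: sets cmax=1 in
                                     --   the old t2 tuples
        DELETE FROM t3;              -- counter 2: t3 rows deleted
        INSERT INTO t1 VALUES (4);   -- counter 3: new t1 row
        COMMIT;                      -- deferred RAISE would run at
                                     --   counter 4: the old t2/t3
                                     --   rows are invisible to it,
                                     --   but the new t1 row must
                                     --   be visible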

    This mess is why I changed the rule system for v6.4 to
    perform the rule actions BEFORE the original query itself.
    They cannot work AFTER it (which is what would happen for
    deferred ones).


> P.S. I love the discussion that my simple little request has sparked.

    Me too!


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #
