Re: [HACKERS] CONSTRAINTS... - Mailing list pgsql-hackers

From jwieck@debis.com (Jan Wieck)
Subject Re: [HACKERS] CONSTRAINTS...
Date
Msg-id m100R6N-000EBPC@orion.SAPserv.Hamburg.dsh.de
Whole thread Raw
In response to Re: [HACKERS] CONSTRAINTS...  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
> Let's look at it another way.  If we didn't use the query rewrite
> system, what method could we use for foreign key/contraints that would
> function better than this?
>
> As far as I remember, triggers are C functions?  We can't generate these
> on the fly inside the backend.  (Though compiling C code from the
> backend and dynamically linking it into the engine is way too cool.)
>
> Could we generate generic triggers that would handle most/all
> situations?  I don't know.  Even if we can, would they be much faster
> than the needed queries themselves?  Seems triggers work on single
> tables.  How do we span tables?  If it is going to launch queries from
> the trigger, we should use the rewrite system.  It is better suited to
> this, with predigested queries and queries that flow through the
> executor in step with the user queries!
>

    Generic  triggers  in  C  that  are  argument driven would be
    possible.  But the drawback is that those triggers have to be
    very  smart  to  use saved SPI plans (one for every different
    argument set). And it must be row level triggers, so  for  an
    update  to  a  2  meg  row table they will be fired 2 million
    times and run their queries inside - will take some time.

    More painful in the 2  meg  row  situation  is  that  trigger
    invocation  has  to be delayed until COMMIT if the constraint
    is deferred. I think we cannot remember 2 million OLD plus  2
    million  NEW  tuples  if one tuple can have up to 8K (will be
    32GB to remember plus overhead), so we need  to  remember  at
    least  the  CTID's  of  OLD  and NEW and refetch them for the
    trigger invocation. OUTCH - the OLD ones are at the head  and
    all the NEW ones are at the end of the tables file!

> Maybe let's go with the rewrite system, because it works, and is
> flexible and strangely designed for this type of problem.  Similar to
> how we use the rewrite system for views.

    And  the  other  changes  I've planned for the rewrite system
    will improve this much more.

    1.  Change pg_rewrite.ev_attr into an int28.  This  would  be
        useful  for  ON  UPDATE  rules  so the rewrite system can
        easily check if a rule has to be applied or not. If  none
        of  named  attributes  gets  something different assigned
        than it's own OLD value, they aren't updated so the  rule
        could  never  result  in  an  action  and  can be omitted
        completely.

    2.  Create cross reference catalog that lists  all  relations
        used in a rule (rangetable).  If we have a DELETE CASCADE
        constraint, the rule is triggered on the  key  table  and
        the action is a DELETE from the referencing table. If now
        the  referencing  table  is  dropped,  the   rule   get's
        corrupted because the resulting querytree isn't plannable
        any longer (the relations in  the  rules  rangetable  are
        identified  by  the OID in pg_class, not by relname). You
        can see the effect if you create a view and drop  one  of
        the base tables.

        Well,  we need to define what to do if a table is dropped
        that occurs in the crossref. First of all, the rules have
        to  be dropped too, but in the case of a view rule, maybe
        the whole view too?

        And in the case where a key table to  which  another  one
        has  a  CHECK  reference is dropped? The rule action will
        allways abort,  so  it  isn't  useful  any  more.  But  I
        wouldn't  like to silently drop it, because someone might
        want to drop and recreate the key table  and  this  would
        silently  result  in  that  all the constraints have been
        lost.

        Maybe we should change the rulesystem at all so that  the
        rangetable  entries  in the rule actions etc. are updated
        with a lookup from pg_class at rewrite time. Must be done
        carefully because someone might drop a table and recreate
        it with a different schema corrupting  the  parsetree  of
        the rule actions though.

    3.  Allow  an  unlimited  number  of  rules  on  a  relation.
        Currently there is a hard coded limit on  the  number  of
        rules the relation can hold in it's slots.

>
> I am basically asking for a reason _not_ to use the rewrite system for
> this.  I can't think of one myself.

    It  might  interfere with the new MVCC code. The rule actions
    must see exactly the  OLD  tuples  that  where  used  in  the
    original  statements.  Not  only  those  in the updated table
    itself, think of an INSERT...SELECT or an  UPDATE  where  the
    TLE or qual expressions are values from other tables.

    Not  a  real reason, just something to have in mind and maybe
    switching  silently  to  another  MVCC  isolation  level   if
    constraint  rules get applied, so all tables read from now on
    will  get  a  read  lock  applied  and  cannot  get   updated
    concurrently until COMMIT.

    And  it's  a  problem I've came across just writing this note
    where MVCC already could  have  broken  rewrite  rule  system
    semantics.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

pgsql-hackers by date:

Previous
From: Tom Ivar Helbekkmo
Date:
Subject: Re: [HACKERS] postgres and year 2000
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] postgres and year 2000