Thread: Re: Referential Integrity In PostgreSQL
> > Hi , Jan > > my name is Max . Hi Max, > > I have contributed to SPI interface , > that with external Trigger try to make > a referential integrity. > > If I can Help , in something , > I'm here . > You're welcome. I've CC'd the hackers list because we might get some ideas from there too (and to surface once in a while - Bruce already missed me). Currently I'm very busy for serious work so I don't find enough spare time to start on such a big change to PostgreSQL. But I'd like to give you an overview of what I have in mind so far so you can decide if you're able to help. Referential integrity (RI) is based on constraints defined in the schema of a database. There are some different types of constraints: 1. Uniqueness constraints. 2. Foreign key constraints that ensure that a key value used in an attribute exists in another relation. One constraint must ensure you're unable to INSERT/UPDATE to a value that doesn't exist, another one must prevent DELETE on a referenced key item or that it is changed during UPDATE. 3. Cascading deletes that let rows referring to a key follow on DELETE silently. Even if not defined in the standard (AFAIK) there could be others like letting references automatically follow on UPDATE to a key value. All constraints can be enabled and/or default to be deferred. That means, that the RI checks aren't performed when they are triggerd. Instead, they're checked at transaction end or if explicitly invoked by some special statement. This is really important because someone must be able to setup cyclic RI checks that could never be satisfied if the checks would be performed immediately. The major problem on this is the amount of data affected until the checks must be performed. The number of statements executed, that trigger such deferred constraints, shouldn't be limited. And one single INSERT/UPDATE/DELETE could affect thousands of rows. Due to these problems I thought, it might not be such a good idea to remember CTID's or the like to get back OLD/NEW rows at the time the constraints are checked. Instead I planned to misuse the rule system for it. Unfortunately, the rule system has damned tricky problems itself when it comes to having-, distinct and other clauses and extremely on aggregates and subselects. These problems would have to get fixed first. So it's a solution that cannot be implemented right now. Fallback to CTID remembering though. There are problems too :-(. Let's enhance the trigger mechanism with a deferred feature. First this requires two additional bool attributes in the pg_trigger relation that tell if this trigger is deferrable and if it is deferred by default. While at it we should add another bool that tells if the trigger is enabled (ALTER TRIGGER {ENABLE|DISABLE} trigger). Second we need an internal list of triggers, that are currently DEFINED AS DEFERRED. Either because they default to it, or the user explicitly asked to deferr it. Third we need an internal list of triggers that must be invoked later because at the time an event occured where they should have been triggered, they appeared in the other list and their execution is delayed until transaction end or explicit execution. This list must remember the OID of the trigger to invoke (to identify the procedure and the arguments), the relation that caused the trigger and the CTID's of the OLD and NEW row. That last list could grow extremely! Think of a trigger that's executing commands over SPI which in turn activate deferred triggers. Since the order of trigger execution is very important for RI, I can't see any chance to simplify/condense this information. Thus it is 16 bytes at least per deferred trigger call (2 OID's plus 2 CTID's). I think one or more temp files would fit best for this. A last tricky point is if one of a bunch of deferred triggers is explicitly called for execution. At this time, the entries for it in the temp file(s) must get processed and marked executed (maybe by overwriting the triggers OID with the invalid OID) while other trigger events still have to get recorded. Needless to say that reading thousands of those entries just to find a few isn't good on performance. But better have this special case slow that dealing with hundreds of temp files or other overhead slowing down the usual case where ALL deferred triggers get called at transaction end. Trigger invocation is simple now - fetch the OLD and NEW rows by CTID and execute the trigger as done by the trigger manager. Oh - well - vacuum shouldn't touch relations where deferred triggers are outstanding. Might require some special lock entry - Vadim? Did I miss something? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote: > > Third we need an internal list of triggers that must be > invoked later because at the time an event occured where they > should have been triggered, they appeared in the other list > and their execution is delayed until transaction end or > explicit execution. This list must remember the OID of the > trigger to invoke (to identify the procedure and the > arguments), the relation that caused the trigger and the > CTID's of the OLD and NEW row. > > That last list could grow extremely! Think of a trigger > that's executing commands over SPI which in turn activate > deferred triggers. Since the order of trigger execution is > very important for RI, I can't see any chance to > simplify/condense this information. Thus it is 16 bytes at > least per deferred trigger call (2 OID's plus 2 CTID's). I > think one or more temp files would fit best for this. > > A last tricky point is if one of a bunch of deferred triggers > is explicitly called for execution. At this time, the entries > for it in the temp file(s) must get processed and marked > executed (maybe by overwriting the triggers OID with the > invalid OID) while other trigger events still have to get > recorded. I believe that things are much simpler. For each deferable constraint (trigger) we have to remember the LastCommandIdProccessedByConstraint. When the mode of a constraint changes from defered to immediate (SET CONSTRAINT MODE), modified tuple will be fetched from WAL from down to up until tuple modified by LastCommandIdProccessedByConstraint is fetched and this is show stopper. Now we remember CommandId of SET CONSTRAINT MODE as new LastCommandIdProccessedByConstraint. When LastCommandIdProccessedByConstraint is changed by SET CONSTRAINT MODE DEFERRED we remeber this in flag to update LastCommandIdProccessedByConstraint later with higher CommandId of first modification of triggered table (to reduce amount of data to read from WAL). ? Vadim
Vadim wrote: > I believe that things are much simpler. > For each deferable constraint (trigger) we have to remember > the LastCommandIdProccessedByConstraint. When the mode of > a constraint changes from defered to immediate (SET CONSTRAINT MODE), > modified tuple will be fetched from WAL from down to up until > tuple modified by LastCommandIdProccessedByConstraint is fetched > and this is show stopper. Now we remember CommandId of > SET CONSTRAINT MODE as new LastCommandIdProccessedByConstraint. > When LastCommandIdProccessedByConstraint is changed by > SET CONSTRAINT MODE DEFERRED we remeber this in flag to > update LastCommandIdProccessedByConstraint later with higher > CommandId of first modification of triggered table (to reduce > amount of data to read from WAL). Hmmm, I'm not sure what side effects it could have if the triggers at the time of SET CONSTRAINTS c1, c2 IMMEDIATE arent fired in the same order they have been recorded - must think about that for a while. In that case I must be able to scan WAL from one command ID until another regardless of the resultrelation. Is that possible? Another issue is this: isn't it possible to run a database (or maybe an entire installation) without WAL? Does it make the code better maintainable to have WAL and RI coupled that strong? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote: > > I'm not sure what side effects it could have if the triggers > at the time of > > SET CONSTRAINTS c1, c2 IMMEDIATE > > arent fired in the same order they have been recorded - must ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Did you mean - in the same order as tables were modified? > think about that for a while. In that case I must be able to > scan WAL from one command ID until another regardless of the > resultrelation. Is that possible? WAL records are in the same order as tuples (regardless of result relation) was modified. Example: UPDATE of T1 fires (immediate) after row trigger inserting tuple into T2. WAL records: --> up {old_T1_tuple version ID, new_T1_tuple version ID and values} {new_T2_tuple ID and values} ... T1 update record T2 insert record ... --> down But records will be fetched from WAL in reverse order, from down to up. Does it matter? Order of modifications made by UPDATE/DELETE is undefined. Though, order has some sence for INSERT ... SELECT ORDER BY -:) Nevertheless, I don't see in standard anything about order of constraint checks. BTW, I found what standard means by "immediate": --- The checking of a constraint depends on its constraint mode within the current SQL-transaction. If the constraint modeis immediate, | then the constraint is effectively checked at the end of each ^^^^^^^^^^^^^^^^^^ | ___________________________________________________________________ | ANSI Only-SQL3 | ___________________________________________________________________ | SQL-statement S, unless S is executed because it is a <triggered ^^^^^^^^^^^^^^^ | SQL statement>, in which case, the constraint is effectively | checked at the end of the SQL-statement that is the root cause | of S. --- And now about triggers (regardless of ROW or STATEMENT level!): --- 4.22.2 Execution of triggered actions The execution of triggered actions depends on the cursor mode of the current SQL-transaction. If the cursor mode isset to cascade off, then the execution of the <triggered SQL statement>s is effec- tively deferred until enacted implicitlybe execution of a <commit statement> or a <close statement>. Otherwise, the <triggered SQL statement>s are effectivelyexecuted either before or after the ^^^^^^^^^^^^^^^^^^^ executionof each SQL-statement, as determined by the specified ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ <trigger action time>. --- Hm. > Another issue is this: isn't it possible to run a database > (or maybe an entire installation) without WAL? Does it make Do you worry about disk space? -:) With archive mode off only log segments (currently, 64M each) required by active transactions (which made some changes) will present on disk. > the code better maintainable to have WAL and RI coupled that > strong? This doesn't add any complexity to WAL manager. Vadim
Vadim wrote: > But records will be fetched from WAL in reverse order, from > down to up. > > Does it matter? Might require to teach the WAL-manager to do it top-down too. And even then it might be better on performance to scan my constraint-log for events to the same tuple. It has records of a fixed, very small size and fetching tuples by CTID from the heap (direct block access) is required anyway because for delayed trigger invocation I neen OLD values too - and that's not in WAL if I read it right. But as I said I'd like to leave that coupling for later. > BTW, I found what standard means by "immediate": > --- > The checking of a constraint depends on its constraint mode within > the current SQL-transaction. If the constraint mode is immediate, > | then the constraint is effectively checked at the end of each > ^^^^^^^^^^^^^^^^^^ > | ___________________________________________________________________ > | ANSI Only-SQL3 > | ___________________________________________________________________ > | SQL-statement S, unless S is executed because it is a <triggered > ^^^^^^^^^^^^^^^ > | SQL statement>, in which case, the constraint is effectively > | checked at the end of the SQL-statement that is the root cause > | of S. > --- Ah - so ALL constraint-triggers must be AFTER <event> and deferred at least until the end of the USER-query. > > And now about triggers (regardless of ROW or STATEMENT level!): > --- > 4.22.2 Execution of triggered actions > > The execution of triggered actions depends on the cursor mode of > the current SQL-transaction. If the cursor mode is set to cascade > off, then the execution of the <triggered SQL statement>s is effec- > tively deferred until enacted implicitly be execution of a <commit > statement> or a <close statement>. Otherwise, the <triggered SQL > statement>s are effectively executed either before or after the > ^^^^^^^^^^^^^^^^^^^ > execution of each SQL-statement, as determined by the specified > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > <trigger action time>. > --- We do not have FOR UPDATE cursors. So (even if to be kept in mind) there is no CURSOR mode to care for right now. Changing BEFORE triggers to behave exactly like that would require to do the execution of the plan twice, one time to fire triggers, another time to perform the action itself. I don't think that the perfomance cost is worth this little amount of accuracy. Such a little difference should be mentioned in the product notes and period. AFTER triggers could simply be treated half like IMMEDIATE constraints - deferred until the end of a single statement (not user-query). So there are four times where the deferred trigger queue is run (maybe partially). At the end of a statement, end of a user-query, at a syncpoint (not sure if we have them up to now) and end of transaction. Things are getting much clearer - Tnx. > > Do you worry about disk space? -:) > With archive mode off only log segments (currently, 64M each) > required by active transactions (which made some changes) > will present on disk. Never - my motto is "don't force it - use a bigger hammer". But the above seems to be exactly like the Oracle behaviour where online-redolog's aren't affected by archive mode. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #