Thread: Guarenteeing complex referencial integrity through custom triggers

Guarenteeing complex referencial integrity through custom triggers

From
"Joris Dobbelsteen"
Date:
At this time Postgresql is unable to guarentee that complex* referencial
integrity (RI) constraints can be enforced using standard (read non-C)
triggers or SQL. This is an unfortunate side-effect of the MVCC model,
as it allows multiple transactions to modify a snapshot of the data at a
certain point in time. Later these modifications are 'merged' together,
resuling in a violation of a constraint that was not in the individual
snapshots. (You might notice that the same happens with most version
constrol software).

Some previous discussion was in
<http://archives.postgresql.org/pgsql-general/2007-02/msg01234.php>.

As a real-world example where the constraint cannot be enforced in
postgresql.
"For every tuple t in cartridge_change, there must exists a tuple t' in
printers with t.id = t'.id, and a tuple t'' in cartridge_types with
t.color = t''.color and t'.printertype = t''.printertype"

Under serializable isolation its impossible to enforce this constraint
in all cases (without triggers written in C). For read committed its
probably possible. The failure lies in scenarios where the MVCC rules
hide newly inserted rows (by a transaction that starts later). This is
also a concern in the referencial integrity (RI) constraints. Here the
problem is solved by cross-checking against the 'latest' snapshot.
However this functionality is not exposed to the user and can only be
used from functions written in C.

My intention is to expose the functionality to the outside world for
general use. This provides means to ensure custom complex constraints
can be enforced properly. I hope to push it into 8.3 if possible.


Now lets get down to the details:

The problem boils down to:
* Newly inserted child (dependent on existance of others) tuples are not
detected, while we should be aware of their presence.

This happens under the following situations:
* Checking the existance of child tuples.
* Deleting the child tuples.
* Updating the child tuples.

We can exclude these situations:
* Inserting a child, since the parent must exist for the current
transaction.
* Doing something to a parent, it's the 'propagation' that is a possible
problem.

One of the commercial competition, Oracle, which also has an
implementation of the MVCC model, allows the "FOR UPDATE" clause to
function as a way to ensure referencial integrity. See
<http://download-uk.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns
_sqlproc.htm#sthref294> (and beyond) for details. It seems Oracle treats
SELECT FOR UPDATE as something special.
Unfortunally the manual is not too in-depth and slight confusing on the
details, so I could be wrong.

Oracle has choosen to allow constraint enforcement by locking on the
parent tuple. In contrast postgres has chosen (historically, see RI
triggers) to fail on detecting conflicting newly inserted rows (the
cross-check).
I can't debate which is better, since the situations where the problem
can happen are restricted and transactions are not normally in
serializable isolation, its doesn't seem to be a problem in practice.
Nevertheless continuing as currently is probably the way to go, as the
infrastructure is available and working properly.

A few things remain:
* Language extension? "for referencial" seems not too intrusive.
* SPI_execute_snapshot interoperation, should give a failure.
* Doing parallel tests?

I think about hacking it in... Any opinions?

- Joris Dobbelsteen



Re: Guarenteeing complex referencial integrity through custom triggers

From
Tom Lane
Date:
"Joris Dobbelsteen" <Joris@familiedobbelsteen.nl> writes:
> My intention is to expose the functionality to the outside world for
> general use. This provides means to ensure custom complex constraints
> can be enforced properly. I hope to push it into 8.3 if possible.

You are at least a month too late for 8.3, even if you had a solid
design now, which you clearly don't.  Nor am I convinced that we really
want/need to support what you are talking about at the SQL level.  To me,
the crosscheck stuff in the RI support is an extremely dirty hack that
might or might not be 100% correct.  Exposing it to the SQL level,
and thereby committing to support it forever, seems the height of folly.
        regards, tom lane


Re: Guarenteeing complex referencial integrity through custom triggers

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> As a real-world example where the constraint cannot be enforced in
> postgresql.
>
> "For every tuple t in cartridge_change, there must exists a tuple t' in
> printers with t.id = t'.id, and a tuple t'' in cartridge_types with
> t.color = t''.color and t'.printertype = t''.printertype"

While there may be unenforceable constraints, unless I am misreading your 
example, I don't see a problem with enforcing this one. To restate 
your problem, a company has a finite number of printers, and tracks when 
a printer cartridge is changed. Each printer is of a certain type, and 
each type has one or more types of cartridges that can go with it. Thus:

- -- Generic type of printer
CREATE TABLE printer ( id    SERIAL NOT NULL PRIMARY KEY, brand TEXT   NOT NULL, model TEXT   NOT NULL
);
INSERT INTO printer (brand,model) VALUES ('epson','1200');
INSERT INTO printer (brand,model) VALUES ('hp','laserjet99');

- -- A printer can hold one or more cartridges, distinguished by 'color'
CREATE TABLE cartridge_types ( ptype INT  NOT NULL,   CONSTRAINT cc_ptype FOREIGN KEY (ptype) REFERENCES printer(id) ON
DELETERESTRICT, color TEXT NOT NULL
 
);
CREATE UNIQUE INDEX ctype ON cartridge_types(ptype,color);
INSERT INTO cartridge_types VALUES (1,'black');
INSERT INTO cartridge_types VALUES (2,'black');
INSERT INTO cartridge_types VALUES (2,'color');

- -- Specific printers in the company
CREATE TABLE printers ( id        SERIAL  NOT NULL PRIMARY KEY, ptype     INTEGER NOT NULL,   CONSTRAINT ptype FOREIGN
KEY(ptype) REFERENCES printer(id), location  TEXT    NOT NULL
 
);
INSERT INTO printers(ptype,location) VALUES (1,'Room 234');
INSERT INTO printers(ptype,location) VALUES (2,'Break room #2');
INSERT INTO printers(ptype,location) VALUES (2,'NE corner of warehouse');

- -- Printers require lots of pesky hardware updates
CREATE TABLE cartridge_change ( printer_id INT  NOT NULL,   CONSTRAINT change_printer FOREIGN KEY (printer_id)
REFERENCESprinters(id), color      TEXT NOT NULL, whenchanged TIMESTAMPTZ NOT NULL DEFAULT now()
 
);
ALTER TABLE cartridge_change ADD CONSTRAINT cc FOREIGN KEY (printer_id, color) REFERENCES cartridge_types
(ptype,color);

INSERT INTO cartridge_change (printer_id, color) VALUES (1,'black');
INSERT INTO cartridge_change (printer_id, color) VALUES (1,'blue');

- -- Session 1:
- -- BEGIN;
- -- INSERT INTO cartridge_change (printer_id, color) VALUES (2,'color');

- -- Session 2:
- -- BEGIN;
- -- DELETE FROM cartridge_types WHERE ptype=2 AND color = 'color';
- -- <blocks>

- -- Session 1:
- -- COMMIT;

- -- Session 2:
- -- ERROR


- -- Session 1:
- -- BEGIN;
- -- DELETE FROM cartridge_types WHERE ptype=2 AND color = 'color';

- -- Session 2:
- -- BEGIN;
- -- INSERT INTO cartridge_change (printer_id, color) VALUES (2,'color');
- -- <blocks>

- -- Session 1:
- -- COMMIT;

- -- Session 2:
- -- ERROR


- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200703261429
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFGCBJ/vJuQZxSWSsgRA16BAJ4hkfcY4ui+yLUGWNerHZf0FvRbPACg++X5
e4tmrrJ1BFcxjM3PCXyKP6Y=
=CDAM
-----END PGP SIGNATURE-----




Re: Guarenteeing complex referencial integrity through custom triggers

From
"Joris Dobbelsteen"
Date:
>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: maandag 26 maart 2007 19:52
>To: Joris Dobbelsteen
>Cc: pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Guarenteeing complex referencial
>integrity through custom triggers
>
>"Joris Dobbelsteen" <Joris@familiedobbelsteen.nl> writes:
>> My intention is to expose the functionality to the outside world for
>> general use. This provides means to ensure custom complex
>constraints
>> can be enforced properly. I hope to push it into 8.3 if possible.
>
>You are at least a month too late for 8.3, even if you had a
>solid design now, which you clearly don't.

Than its not possible, next try later on. I was messing up different
dates it seemed.

>Nor am I convinced
>that we really want/need to support what you are talking about
>at the SQL level.  To me, the crosscheck stuff in the RI
>support is an extremely dirty hack that might or might not be
>100% correct.  Exposing it to the SQL level, and thereby
>committing to support it forever, seems the height of folly.

Debatable...

Yet I see several options:
1) Extend the approach taken for the current RI triggers (i.e.
'cross-check hack').
2) Build some general framework for constraint enforcement.
3) Invent something new.
[Few more that aren't really proposable]

At this point:
1) At least Tom's not in favor and there is little commerical motivation
to do it right.
2) This is extremely huge project and needs to build on a primitive,
with the current only a 'dirty hack' available. Probably it extends the
CHECK syntax currently supported, and this is extremely involved.
3) Falling short of the innovative/sparkling idea.

The case is that at this point consistency within a single modified
snapshot of the database, does not imply all possible views (snapshots)
are consistent too. So we need to ensure both are consistent. Yet there
is no single _supported_ way to make that work. Its falling short on its
commercial competitors (and my view of an 'enterprise dbms'
unfortunally).

I'm fully open to other suggestions...

- Joris



Re: Guarenteeing complex referencial integrity through custom triggers

From
"Joris Dobbelsteen"
Date:
[Resent: mailing list only]
Tom, you mail server won't accept:
The e-mail system was unable to deliver the message, but did not report
a specific reason.  Check the address and try again.  If it still fails,
contact your system administrator.
< orange.nl #5.0.0 X-SMTP-Server; host sss.pgh.pa.us[66.207.139.130]
said: 550    5.0.0 Go away, spammer (in reply to MAIL FROM command)>
[//]

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: maandag 26 maart 2007 19:52
>To: Joris Dobbelsteen
>Cc: pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Guarenteeing complex referencial
>integrity through custom triggers
>
>"Joris Dobbelsteen" <Joris@familiedobbelsteen.nl> writes:
>> My intention is to expose the functionality to the outside world for
>> general use. This provides means to ensure custom complex
>constraints
>> can be enforced properly. I hope to push it into 8.3 if possible.
>
>You are at least a month too late for 8.3, even if you had a
>solid design now, which you clearly don't.

Than its not possible, next try later on. I was messing up different
dates it seemed.

>Nor am I convinced
>that we really want/need to support what you are talking about
>at the SQL level.  To me, the crosscheck stuff in the RI
>support is an extremely dirty hack that might or might not be
>100% correct.  Exposing it to the SQL level, and thereby
>committing to support it forever, seems the height of folly.

Debatable...

Yet I see several options:
1) Extend the approach taken for the current RI triggers (i.e.
'cross-check hack').
2) Build some general framework for constraint enforcement.
3) Invent something new.
[Few more that aren't really proposable]

At this point:
1) At least Tom's not in favor and there is little commerical motivation
to do it right.
2) This is extremely huge project and needs to build on a primitive,
with the current only a 'dirty hack' available. Probably it extends the
CHECK syntax currently supported, and this is extremely involved.
3) Falling short of the innovative/sparkling idea.

The case is that at this point consistency within a single modified
snapshot of the database, does not imply all possible views (snapshots)
are consistent too. So we need to ensure both are consistent. Yet there
is no single _supported_ way to make that work. Its falling short on its
commercial competitors (and my view of an 'enterprise dbms'
unfortunally).

I'm fully open to other suggestions...

- Joris



Re: Guarenteeing complex referencial integrity through custom triggers

From
Hannu Krosing
Date:
Ühel kenal päeval, E, 2007-03-26 kell 16:05, kirjutas Joris Dobbelsteen:

> Oracle has choosen to allow constraint enforcement by locking on the
> parent tuple. In contrast postgres has chosen (historically, see RI
> triggers) to fail on detecting conflicting newly inserted rows (the
> cross-check).

Could you give an example, where postgresql fails to detect conflicting
newly inserted rows ?

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: Guarenteeing complex referencial integrity throughcustom triggers

From
"Joris Dobbelsteen"
Date:
>-----Original Message-----
>From: Hannu Krosing [mailto:hannu@skype.net]
>Sent: dinsdag 27 maart 2007 15:45
>To: Joris Dobbelsteen
>Cc: pgsql-hackers@postgreSQL.org
>Subject: Re: [HACKERS] Guarenteeing complex referencial
>integrity throughcustom triggers
>
>Ühel kenal päeval, E, 2007-03-26 kell 16:05, kirjutas Joris
>Dobbelsteen:
>
>> Oracle has choosen to allow constraint enforcement by locking on the
>> parent tuple. In contrast postgres has chosen (historically, see RI
>> triggers) to fail on detecting conflicting newly inserted rows (the
>> cross-check).
>
>Could you give an example, where postgresql fails to detect
>conflicting newly inserted rows ?

I'm re-using Greg Sabino Mullane example, so here goes the first part

- -- Generic type of printer
CREATE TABLE printer ( id    SERIAL NOT NULL PRIMARY KEY, brand TEXT   NOT NULL, model TEXT   NOT NULL
);
INSERT INTO printer (brand,model) VALUES ('epson','1200');
INSERT INTO printer (brand,model) VALUES ('hp','laserjet99');

- -- A printer can hold one or more cartridges, distinguished by 'color'
CREATE TABLE cartridge_types ( ptype INT  NOT NULL,   CONSTRAINT cc_ptype FOREIGN KEY (ptype) REFERENCES printer(id) ON
DELETERESTRICT, color TEXT NOT NULL 
);
CREATE UNIQUE INDEX ctype ON cartridge_types(ptype,color);
INSERT INTO cartridge_types VALUES (1,'black');
INSERT INTO cartridge_types VALUES (2,'black');
INSERT INTO cartridge_types VALUES (2,'color');

- -- Specific printers in the company
CREATE TABLE printers ( id        SERIAL  NOT NULL PRIMARY KEY, ptype     INTEGER NOT NULL,   CONSTRAINT ptype FOREIGN
KEY(ptype) REFERENCES printer(id), location  TEXT    NOT NULL 
);
INSERT INTO printers(ptype,location) VALUES (1,'Room 234');
INSERT INTO printers(ptype,location) VALUES (2,'Break room #2');
INSERT INTO printers(ptype,location) VALUES (2,'NE corner of warehouse');

- -- Printers require lots of pesky hardware updates
CREATE TABLE cartridge_change ( printer_id INT  NOT NULL,   CONSTRAINT change_printer FOREIGN KEY (printer_id)
REFERENCESprinters(id), color      TEXT NOT NULL, whenchanged TIMESTAMPTZ NOT NULL DEFAULT now() ); 
ALTER TABLE cartridge_change ADD CONSTRAINT cc FOREIGN KEY (printer_id, color) REFERENCES cartridge_types
(ptype,color);

INSERT INTO cartridge_change (printer_id, color) VALUES (1,'black');

------------------------

As a slight note: I'm not going to write out all the triggers that should normally be needed on the cartidge_change,
printers,printer and cartridge_type tables to normally enforce the following constraint: 

FOR ALL t IN cartridge_change
THERE MUST EXIST (SELECT 1                 FROM printers                 INNER JOIN cartridge_types ON
cartridge_types.ptype= printers.ptype                 WHERE printers.id = cartridge_change.printer_id        AND
cartridge_types.color= cartridge_change.color                 ) 

Obviously we follow good practices:
Before inserting a row into cartridge_change, we should check the cartridge_types table and lock the row.
After deleting a row in cartridge_types, we should check the cartridge_change table for constraint violation.
We will first lock the parent and then the childs. This gives consistency and reduces the chance of deadlocks.

As you should note, this cannot be done with a referencial constraint (unless you are really willing to materialize a
quitelarge view "SELECT p.id, c.color FROM printers p INNER JOIN cartridge_types c ON p.ptype = c.ptype"). In these
casesI'm even not sure that the constraint can be enforced (need to think this over carefully). 

------------------------

For the the actions:
We will happen to do the following at the same time: Change the color toner on printer #2 (break room), and remove the
definitionof the "color" toner from the "hp laserjet99". Trigger actions are in implicit... 
The T1 and T2 prefixes will denote the executing transactions. Statements might wait for locks, in which case you
shouldcontinue with the next ones (unless its for the same transaction, in which case I did it wrong). 

T1: BEGIN ISOLATION LEVEL SERIALIZABLE;
T2: BEGIN ISOLATION LEVEL SERIALIZABLE;

-- T1 will insert into cartridge_change, but first the trigger
-- it will lock the parent (it should return exactly a single row).
T1: SELECT 1 FROM printers p INNER JOIN cartridge_change c ON p.ptype = c.ptype WHERE p.printer_id = 2 AND c.color =
'color'FOR SHARE; 
-- Now the actual function
T1: INSERT INTO cartridge_change (printer_id, color) VALUES (2,'color');

-- T2 will now delete the 'color' cartridge definition for hp laserjet99.
-- Trigger doing a check, that dependencies do not exists...
-- First delete statement will wait for the lock of T1...
T2: DELETE FROM cartridge_change WHERE ptype=2 AND color='color';

-- T1 commits in the meanwhile
T1: COMMIT

-- T2 is now out of the lock, so execute trigger
T2: SELECT 1   FROM cartridge_change c   INNER JOIN printers p ON p.id = c.printer_id   WHERE c.color = 'color'   AND
p.ptype= 2; -- Could also be delete... 

-- Please notice we didn't find any rows.
-- It works under read commited nevertheless...

T2: COMMIT;

-----------------------------

At this point my constraint is violated...

Another go would be to instead of the SELECT FOR SHARE we had executed an UPDATE, but this is really ugly, hidious,
counter-intuitiveand concurrency reducing. 

- Joris






Re: Guarenteeing complex referencial integrity through custom triggers

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> Included is a more eloborate example, which has some webby thing at
> http://www.familiedobbelsteen.nl/printer-test/ (should work).

Much too elaborate - I'm sorry, but I don't think anyone here is willing 
to wade through nearly 900 lines of code. Can you break it down to a 
simpler test case? Or try and break the schema I provided in my 
previous message perhaps?

> You cannot enforce the above constraint in the database. The keyword is
> serializable isolation level.
>
> Create new printer + cartidge_defs and such.
> Now in T1 delete a cartridge_def
> Now in T2 insert a cartridge replacement.

Works as expected in my schema when I tried it: T2 blocked at the insert, 
waiting on T1. Once T1 committed, T2 threw an error, as the insert was no 
longer valid. Using serializable or not, same result.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200703272044
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFGCbqSvJuQZxSWSsgRA6SXAJ0Wehl9d5PGEsnyRk+FP8fNy21GzwCg06l4
DVog8s9FEXKQ5R6U2DDe/Pw=
=n4Vl
-----END PGP SIGNATURE-----




Re: Guarenteeing complex referencial integrity through custom triggers

From
"Joris Dobbelsteen"
Date:
>-----Original Message-----
>From: pgsql-hackers-owner@postgresql.org
>[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Greg
>Sabino Mullane
>Sent: woensdag 28 maart 2007 2:50
>To: pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Guarenteeing complex referencial
>integrity through custom triggers
>
[snip]
>
>Much too elaborate - I'm sorry, but I don't think anyone here
>is willing to wade through nearly 900 lines of code. Can you
>break it down to a simpler test case? Or try and break the
>schema I provided in my previous message perhaps?

Was the only readily available example I had available at the moment in
very short notice.

-- Generic type of printer
CREATE TABLE printer ( id    SERIAL NOT NULL PRIMARY KEY, brand TEXT   NOT NULL, model TEXT   NOT NULL
);
INSERT INTO printer (brand,model) VALUES ('epson','1200');
INSERT INTO printer (brand,model) VALUES ('hp','laserjet99');

-- A printer can hold one or more cartridges, distinguished by 'color'
CREATE TABLE cartridge_types ( ptype INT  NOT NULL, CONSTRAINT cc_ptype FOREIGN KEY (ptype) REFERENCES printer(id) ON
DELETE RESTRICT, color TEXT NOT NULL, PRIMARY KEY (ptype, color)
);
INSERT INTO cartridge_types VALUES (1,'black');
INSERT INTO cartridge_types VALUES (2,'black');
INSERT INTO cartridge_types VALUES (2,'color');

-- Specific printers in the company
CREATE TABLE printers ( id        SERIAL  NOT NULL PRIMARY KEY, ptype     INTEGER NOT NULL,   CONSTRAINT ptype FOREIGN
KEY(ptype) REFERENCES printer(id), location  TEXT    NOT NULL 
);
INSERT INTO printers(ptype,location) VALUES (1,'Room 234');
INSERT INTO printers(ptype,location) VALUES (2,'Break room #2');
INSERT INTO printers(ptype,location) VALUES (2,'NE corner of
warehouse');

-- Printers require lots of pesky hardware updates
CREATE TABLE cartridge_change ( printer_id INT  NOT NULL,   CONSTRAINT change_printer FOREIGN KEY (printer_id)
REFERENCES
printers(id), color      TEXT NOT NULL, whenchanged TIMESTAMPTZ NOT NULL DEFAULT now()
);

--
-- !!!!!!!!!!!!!!!!!!!
-- NOTICE
--
-- This constraint is invalid, printer_id should reference printers, not
printer...
-- IF this constraint where valid, you could never change a cartridge on
printer #3...
--ALTER TABLE cartridge_change ADD CONSTRAINT cc FOREIGN KEY
(printer_id, color) REFERENCES cartridge_types (ptype,color);

INSERT INTO cartridge_change (printer_id, color) VALUES (1,'black');
-- dropped your last insert, was a constraint violation...


------------------------

>> You cannot enforce the above constraint in the database. The keyword
>> is serializable isolation level.
>>
>> Create new printer + cartidge_defs and such.
>> Now in T1 delete a cartridge_def
>> Now in T2 insert a cartridge replacement.
>
>Works as expected in my schema when I tried it: T2 blocked at
>the insert, waiting on T1. Once T1 committed, T2 threw an
>error, as the insert was no longer valid. Using serializable
>or not, same result.

As noted above, you constraint does not enforce the constraint I
mentioned (though with loose grammer). It cannot use the primitives that
are in the postgresql database, but rather need triggers to have them
enforced.

FOR ALL t IN cartridge_change
THERE MUST EXIST (SELECT 1                 FROM printers                 INNER JOIN cartridge_types ON
cartridge_types.ptype= 
printers.ptype                 WHERE printers.id = cartridge_change.printer_id        AND cartridge_types.color =
cartridge_change.color                 )

If we replace a catridge, the cartridge_type should be defined.

Obviously we follow good practices:
Before inserting a row into cartridge_change, we should check the
cartridge_types table and lock the row.
After deleting a row in cartridge_types, we should check the
cartridge_change table for constraint violation.
We will first lock the parent and then the childs. This gives
consistency (normally) and reduces the chance of deadlocks.

This means two constraints must be defined (assume you have plpgsql
available).

CREATE FUNCTION "TR_cartridgeRI_cartridge_change_insupd"() RETURNS
trigger AS
$BODY$BEGIN-- This must be a BEFORE trigger: we lock the parent first...PERFORM 1FROM printers p INNER JOIN
cartridge_typesc ON p.ptype = 
c.ptypeWHERE p.id = NEW.printer_idAND c.color = NEW.colorFOR SHARE;
IF NOT FOUND THEN    RAISE EXCEPTION 'Cartridge_type not defined';END IF;
RETURN NEW;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE TRIGGER "TR_cartridgeRI"
BEFORE INSERT OR UPDATE ON cartridge_change FOR EACH ROW
EXECUTE PROCEDURE "TR_RI_cartridgeRI_cartridge_change_insupd"();

CREATE FUNCTION "TR_cartridgeRI_cartridge_types_upddel"() RETURNS
trigger AS
$BODY$BEGIN-- This must be a AFTER trigger: we lock the parent first...PERFORM 1FROM printers p INNER JOIN
cartridge_changec ON p.id = 
c.printer_idWHERE p.ptype = OLD.ptypeAND c.color = OLD.colorFOR SHARE;
IF FOUND THEN    RAISE EXCEPTION 'Cartridge_changes found';END IF;
RETURN NEW;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE TRIGGER "TR_cartridgeRI"
AFTER UPDATE OR DELETE ON cartridge_types FOR EACH ROW
EXECUTE PROCEDURE "TR_cartridgeRI_cartridge_types_upddel"();

--------------------------------------

Lets test it:

INSERT INTO cartridge_change (printer_id, color) VALUES (1,'purple');
ERROR: Cartridge_type not defined

DELETE FROM cartridge_types;
ERROR: Cartridge_changes found

So these seem to work perfectly well...

---------------------------------------

Now to violate the constraints

T1: BEGIN ISOLATION LEVEL SERIALIZABLE;
T2: BEGIN ISOLATION LEVEL SERIALIZABLE;

-- T1 will insert into cartridge_change, but first the trigger
-- it will lock the parent (it should return exactly a single row).
-- Now the actual function
T1: INSERT INTO cartridge_change (printer_id, color) VALUES (2,'color');

-- T2 will now delete the 'color' cartridge definition for hp
laserjet99.
-- It will block, but not fail with an error (as it should)
T2: DELETE FROM cartridge_types WHERE ptype=2 AND color='color';

-- T1 commits in the meanwhile
T1: COMMIT
-- T2 is not waiting any more
T2: COMMIT;

At this point there is something in cartridge_change that shouldn't be
there. A color cartridge was changed, even though there wasn't any color
cartridge defined any more.

---------------------------------------------

This will not happen with read commited isolation, though. However I
have no way of enforcing this.

The ugly workarround for the problem is to do the following: (please
reload the data at this point)

BEGIN;
DELETE FROM cartridge_change;
DELETE FROM printers;
DELETE FROM cartridge_types;
DELETE FROM printer;
INSERT INTO printer (id,brand,model) VALUES (1,'epson','1200');
INSERT INTO printer (id,brand,model) VALUES (2,'hp','laserjet99');
INSERT INTO cartridge_types VALUES (1,'black');
INSERT INTO cartridge_types VALUES (2,'black');
INSERT INTO cartridge_types VALUES (2,'color');
INSERT INTO printers(id,ptype,location) VALUES (1,1,'Room 234');
INSERT INTO printers(id,ptype,location) VALUES (2,2,'Break room #2');
INSERT INTO printers(id,ptype,location) VALUES (3,2,'NE corner of
warehouse');
INSERT INTO cartridge_change (printer_id, color) VALUES (1,'black');
COMMIT;

Now patch the first trigger to do an update operation instead...

CREATE OR REPLACE FUNCTION "TR_cartridgeRI_cartridge_change_insupd"()
RETURNS trigger AS
$BODY$BEGIN-- This must be a BEFORE trigger: we lock the parent first...PERFORM 1FROM printers p INNER JOIN
cartridge_typesc ON p.ptype = 
c.ptypeWHERE p.id = NEW.printer_idAND c.color = NEW.colorFOR SHARE;
IF NOT FOUND THEN    RAISE EXCEPTION 'Cartridge_type not defined';END IF;
RETURN NEW;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE;

If you now rerun on the
At this point the T2: DELETE ...
Will fail with "ERROR: could not serialize access due to concurrent
update".
This is what is desired, but not at the costs presented here.

This things has a few undesired side-effects:
* Physical access for delete/insert on WAL and table files.
* Takes an exclusive lock, so blocks other transactions trying to
insert/modify.
* Two simultanously active transactions doing inserts, with the second
serializable, will produce a "cannot serialize" failure, though its not
needed in any way.

In this case we rely (explicitly) on locking of the parent tuple. This
is the same as done in Oracle (but here there are a few optimizations
there, I think). In contrast, the cross-checking over different
snapshots has the nice effect of significantly reducing lock contention,
since the exclusive lock is a shared lock instead. Foremost it allows
for concurrent inserts into the child table (for the same parent). In
fact, it truly allows transactions that are active at the same time to
do inserts for the same parent tuples.

This is the reason for the feature to be desirable:
* Guarentee database consistency, even for complex constraints.
* Provide the best possible concurrency.

-------------------------------------

Now for a little motivation of the subject. My believes are that ACID
properties for a database are really good things (hence my not so good
opinion about MySQL for the upcoming future). In fact, even several
years ago I developed a database that provided decent consistency over
its predecessor. Unfortunally performance suffered. However the project
was very successful as it finally didn't require frequent network
administrator interference to 'fix' the data.

Now for complex projects its good if there is a solid system that you
can rely on to enforce you constraints on the data. It should support
you doing things correctly, and punish for mistakes made in
significantly more complex code.

The absense of a feature here might have impact on operations running in
serializable isolation. Though usage of this will be uncommon, postgres
has commited support for it. In this respect one can ask:
* Can materialized views run out-of-sync?
* Could the constraint be enforced through a materialized view?
In fact, if we make assuptions on the correct operation of the triggers:
* DoS attacks?
* Intentional 'corruption' by users?

-------------------------------------

Quite likely I should have done this earlier and not cut the corners as
much as possible. I'm sorry for that and should have known earlier.
Nevertheless, hopefully this clears it a bit up and provides you with a
better understanding of what I intended to say earlier.

Thanks for the effort,

- Joris