Guarenteeing complex referencial integrity through custom triggers - Mailing list pgsql-hackers
From | Joris Dobbelsteen |
---|---|
Subject | Guarenteeing complex referencial integrity through custom triggers |
Date | |
Msg-id | 73427AD314CC364C8DF0FFF9C4D693FF037A3A@nehemiah.joris2k.local Whole thread Raw |
Responses |
Re: Guarenteeing complex referencial integrity through custom triggers
Re: Guarenteeing complex referencial integrity through custom triggers Re: Guarenteeing complex referencial integrity through custom triggers |
List | pgsql-hackers |
At this time Postgresql is unable to guarentee that complex* referencial integrity (RI) constraints can be enforced using standard (read non-C) triggers or SQL. This is an unfortunate side-effect of the MVCC model, as it allows multiple transactions to modify a snapshot of the data at a certain point in time. Later these modifications are 'merged' together, resuling in a violation of a constraint that was not in the individual snapshots. (You might notice that the same happens with most version constrol software). Some previous discussion was in <http://archives.postgresql.org/pgsql-general/2007-02/msg01234.php>. As a real-world example where the constraint cannot be enforced in postgresql. "For every tuple t in cartridge_change, there must exists a tuple t' in printers with t.id = t'.id, and a tuple t'' in cartridge_types with t.color = t''.color and t'.printertype = t''.printertype" Under serializable isolation its impossible to enforce this constraint in all cases (without triggers written in C). For read committed its probably possible. The failure lies in scenarios where the MVCC rules hide newly inserted rows (by a transaction that starts later). This is also a concern in the referencial integrity (RI) constraints. Here the problem is solved by cross-checking against the 'latest' snapshot. However this functionality is not exposed to the user and can only be used from functions written in C. My intention is to expose the functionality to the outside world for general use. This provides means to ensure custom complex constraints can be enforced properly. I hope to push it into 8.3 if possible. Now lets get down to the details: The problem boils down to: * Newly inserted child (dependent on existance of others) tuples are not detected, while we should be aware of their presence. This happens under the following situations: * Checking the existance of child tuples. * Deleting the child tuples. * Updating the child tuples. We can exclude these situations: * Inserting a child, since the parent must exist for the current transaction. * Doing something to a parent, it's the 'propagation' that is a possible problem. One of the commercial competition, Oracle, which also has an implementation of the MVCC model, allows the "FOR UPDATE" clause to function as a way to ensure referencial integrity. See <http://download-uk.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns _sqlproc.htm#sthref294> (and beyond) for details. It seems Oracle treats SELECT FOR UPDATE as something special. Unfortunally the manual is not too in-depth and slight confusing on the details, so I could be wrong. Oracle has choosen to allow constraint enforcement by locking on the parent tuple. In contrast postgres has chosen (historically, see RI triggers) to fail on detecting conflicting newly inserted rows (the cross-check). I can't debate which is better, since the situations where the problem can happen are restricted and transactions are not normally in serializable isolation, its doesn't seem to be a problem in practice. Nevertheless continuing as currently is probably the way to go, as the infrastructure is available and working properly. A few things remain: * Language extension? "for referencial" seems not too intrusive. * SPI_execute_snapshot interoperation, should give a failure. * Doing parallel tests? I think about hacking it in... Any opinions? - Joris Dobbelsteen
pgsql-hackers by date: