Re: Implementing SQL ASSERTION - Mailing list pgsql-hackers
From | Joe Wildish |
---|---|
Subject | Re: Implementing SQL ASSERTION |
Date | |
Msg-id | 985632EC-3E39-4C51-B47A-ED0ABF63D64F@elusive.cx Whole thread Raw |
In response to | Re: Implementing SQL ASSERTION (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: Implementing SQL ASSERTION
|
List | pgsql-hackers |
Hackers, Attached is a WIP patch for SQL assertion. I am posting it for anyone who might be interested in seeing it, for comments/feedback,and to see if others are keen to collaborate on taking it further. It is not near production-ready (seethoughts on that below). The patch builds on the work posted by Peter back in 2013. I've taken his code and updated it to conform to some generalchanges made to the codebase since then. The bulk of the new work I have done is around when an assertion needs tobe checked. Essentially it is an implementation of the algorithm described by Ceri & Widom in "Deriving Production Rulesfor Constraint Maintenance” — http://infolab.stanford.edu/pub/papers/constraint-maintenance.ps The general idea is to traverse the expression tree and derive the set of potentially invalidating operations. These operationsare used to determine when the constraint trigger fires and causes a re-check. The detail is in the paper but someexamples are: * insertion into the subject of an exists cannot be invalidating; * deletion from the subject of a not exists cannot be invalidating; * update of columns in the target list of an exists cannot be invalidating; * certain combinations of aggregates with comparison operations cannot be invalidating. As an example of the last point, the expression "CHECK (10 > (SELECT COUNT(*) FROM t))" cannot be invalidated by a deleteor an update but can be invalidated by an insert. I have implemented most of the optimisations mentioned in the paper. There are one or two that I am unsure about, specificallyhow to deal with set-operations that are the subject of an exists. According to the paper, these are optimisablewhen they're the subject of an exists, but I think it is only applicable for union and not intersect or except,so I have skipped that particular optimisation for the time being. The algorithm works under the assumption that when a recheck occurs the previous check result was true (the research reportby Ceri & Widom does acknolwedge this assumption). However, unfortunately the SQL specification requires that bothtrue and unknown be valid results for an assertion's check expression. This doesn't play too well with the algorithmso for the time being I have disallowed null. I think the solution here may be that when a null result for a checkoccurs, the assertion is changed to trigger on all operations against the involved tables; once it returns to true,the triggers can be returned to fire only on the derived invalidating operations. More thought required though. (note:having just written this paragraph, I've realised I can't right now think of a concrete example illustrating the point,so it may be that I'm wrong on this). The paper does mention a set of optimisations that I have not yet attempted to implement. These are essentially the techniqueof evaluating the expression against the deltas of a change rather than the full tables. Clearly there is a largeoverlap with incremental maintainence of views and actually the two authors of the paper have a similiarly named papercalled "Deriving Production Rules for Incremental View Maintanence". Although I have yet to finish reviewing all theliterature on the subject, I suspect that realistically for this to make it into production, we'd need some implementationof these techniques to make the performance palatable. Cheers, -Joe
Attachment
pgsql-hackers by date: