Thread: Order of enforcement of CHECK constraints?

Order of enforcement of CHECK constraints?

From

Tom Lane

Date:

20 March 2015, 19:15:14

My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported.  This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.

There's at least one regression test case where this can happen, so we've
been lucky so far that this hasn't caused buildfarm noise.

We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.

In principle the same problem could occur for domain CHECK constraints,
though the odds of multiple CHECKs failing are probably a lot lower.

Do people think this is worth fixing?
        regards, tom lane

Re: Order of enforcement of CHECK constraints?

From

Peter Geoghegan

Date:

20 March 2015, 19:19:43

On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We could fix it by, say, having CheckConstraintFetch() sort the
> constraints by name after loading them.


What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?

-- 
Peter Geoghegan

Re: Order of enforcement of CHECK constraints?

From

Stephen Frost

Date:

20 March 2015, 19:21:45

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> My Salesforce colleagues noticed some tests flapping as a result of table
> CHECK constraints not always being enforced in the same order; ie, if a
> tuple insertion/update violates more than one CHECK constraint, it's not
> deterministic which one is reported.  This is evidently because
> relcache.c's CheckConstraintFetch() just happily loads up the constraints
> in whatever order it happens to find them in pg_constraint.
>
> There's at least one regression test case where this can happen, so we've
> been lucky so far that this hasn't caused buildfarm noise.
>
> We could fix it by, say, having CheckConstraintFetch() sort the
> constraints by name after loading them.
>
> In principle the same problem could occur for domain CHECK constraints,
> though the odds of multiple CHECKs failing are probably a lot lower.
>
> Do people think this is worth fixing?

Yes...  I had thought they were sorted and enforced in alphabetical
order similar to how triggers are fired.  Having non-deterministic
check constraint firing seems bad to me.
Thanks,
    Stephen

Re: Order of enforcement of CHECK constraints?

From

Fabrízio de Royes Mello

Date:

20 March 2015, 19:24:06

<div dir="ltr"><br />On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <<a
href="mailto:pg@heroku.com">pg@heroku.com</a>>wrote:<br />><br />> On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane
<<ahref="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br />> > We could fix it by, say, having
CheckConstraintFetch()sort the<br />> > constraints by name after loading them.<br />><br />><br />>
Whatnot by OID, as with indexes? Are you suggesting that this would<br />> become documented behavior?<br />><br
/> <br/>I think they should be executed in alphabetical order like triggers.<br /><br />Regards,<br /><br />--<br
/>Fabríziode Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a
href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a
href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a
href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a
href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a
href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div>

Re: Order of enforcement of CHECK constraints?

From

Tom Lane

Date:

20 March 2015, 19:37:37

Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
> On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We could fix it by, say, having CheckConstraintFetch() sort the
>>> constraints by name after loading them.

>> What not by OID, as with indexes? Are you suggesting that this would
>> become documented behavior?

> I think they should be executed in alphabetical order like triggers.

Yeah.  We already have a comparable, and documented, behavior for
triggers, so if we're going to do anything about this I'd vote for
sorting by name (or more specifically, by strcmp()).
        regards, tom lane

Re: Order of enforcement of CHECK constraints?

From

Gavin Flower

Date:

20 March 2015, 19:39:10

On 21/03/15 08:15, Tom Lane wrote:
> My Salesforce colleagues noticed some tests flapping as a result of table
> CHECK constraints not always being enforced in the same order; ie, if a
> tuple insertion/update violates more than one CHECK constraint, it's not
> deterministic which one is reported.  This is evidently because
> relcache.c's CheckConstraintFetch() just happily loads up the constraints
> in whatever order it happens to find them in pg_constraint.
>
> There's at least one regression test case where this can happen, so we've
> been lucky so far that this hasn't caused buildfarm noise.
>
> We could fix it by, say, having CheckConstraintFetch() sort the
> constraints by name after loading them.
>
> In principle the same problem could occur for domain CHECK constraints,
> though the odds of multiple CHECKs failing are probably a lot lower.
>
> Do people think this is worth fixing?
>
>             regards, tom lane
>
>
I think that this is a good idea, I would have implicitly assumed that 
it was deterministic.

Additionally, people could then name CHECK constraints in a way to get 
whatever order they wanted.

The documentation of CREATE TRIGGER says (reading Fabrizio's post 
inspired me to look it up):
"If multiple triggers of the same kind are defined for the same event, 
they will be fired in alphabetical order by name."
So I would have implicitly assumed the same for CHECK constraints, had I 
recently read that.

So I think the current situation is a violation of POLA.

Cheers,
Gavin

Re: Order of enforcement of CHECK constraints?

From

David Steele

Date:

22 March 2015, 14:30:45

On 3/20/15 3:37 PM, Tom Lane wrote:
> Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
>> On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> We could fix it by, say, having CheckConstraintFetch() sort the
>>>> constraints by name after loading them.
>
>>> What not by OID, as with indexes? Are you suggesting that this would
>>> become documented behavior?
>
>> I think they should be executed in alphabetical order like triggers.
>
> Yeah.  We already have a comparable, and documented, behavior for
> triggers, so if we're going to do anything about this I'd vote for
> sorting by name (or more specifically, by strcmp()).
>
>             regards, tom lane

+1 for strcmp() ordering.  Not only is this logical and consistent with
the way triggers are fired, names can be manipulated by the user to
force order when desired.  Not sure if that's as important for check
constraints as it is for triggers but it might be useful, even if only
for things like unit tests.

--
- David Steele
david@pgmasters.net

Re: Order of enforcement of CHECK constraints?

From

"David G. Johnston"

Date:

22 March 2015, 15:11:29

On Sunday, March 22, 2015, David Steele <david@pgmasters.net> wrote:

On 3/20/15 3:37 PM, Tom Lane wrote:
> Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
>> On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> We could fix it by, say, having CheckConstraintFetch() sort the
>>>> constraints by name after loading them.
>
>>> What not by OID, as with indexes? Are you suggesting that this would
>>> become documented behavior?
>
>> I think they should be executed in alphabetical order like triggers.
>
> Yeah. We already have a comparable, and documented, behavior for
> triggers, so if we're going to do anything about this I'd vote for
> sorting by name (or more specifically, by strcmp()).
>
> regards, tom lane

+1 for strcmp() ordering. Not only is this logical and consistent with
the way triggers are fired, names can be manipulated by the user to
force order when desired. Not sure if that's as important for check
constraints as it is for triggers but it might be useful, even if only
for things like unit tests.

--
- David Steele
david@pgmasters.net

It would be nice to know that, at scale, the added comparisons are negligible since it almost all cases all checks are run and pass and the order is irrelevant. Would it be possible for all check constraints to always be run and the resultant error outputs stored in an array which could then be sorted before it is printed? You get better and stable output for errors while minimally impacting good inserts.

David J.

Re: Order of enforcement of CHECK constraints?

From

Ashutosh Bapat

Date:

23 March 2015, 05:23:28

I might be only one objecting here but ...

On Sat, Mar 21, 2015 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.

Why is it important to report in deterministic manner? If it really matters, we should probably report all the failing constraints. A comparable example would be compiler throwing errors.

There's at least one regression test case where this can happen, so we've
been lucky so far that this hasn't caused buildfarm noise.

We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.

In principle the same problem could occur for domain CHECK constraints,
though the odds of multiple CHECKs failing are probably a lot lower.

Do people think this is worth fixing?

Downthread, parallels are being drawn with triggers. The order in which triggers are fired matters since the triggers can affect each other's results but constraint don't do that. Do they? So, why should we waste some cycles in sorting the constraints?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Order of enforcement of CHECK constraints?

From

Tom Lane

Date:

23 March 2015, 14:16:52

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
> I might be only one objecting here but ...
> On Sat, Mar 21, 2015 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My Salesforce colleagues noticed some tests flapping as a result of table
>> CHECK constraints not always being enforced in the same order; ie, if a
>> tuple insertion/update violates more than one CHECK constraint, it's not
>> deterministic which one is reported.  This is evidently because
>> relcache.c's CheckConstraintFetch() just happily loads up the constraints
>> in whatever order it happens to find them in pg_constraint.

> Why is it important to report in deterministic manner?

If nothing else, so as not to have regression-test failures.

> If it really
> matters, we should probably report all the failing constraints.

That wouldn't in itself make the output deterministic (you'd still have to
sort); and in any case that's not going to happen because it would require
running each CHECK constraint in its own subtransaction.  Catching errors
that way is *expensive*.  And there's been zero field demand for such a
behavior, so I don't see us adding cycles for something no one's asked
for.  Sorting the check constraints during relcache load, on the other
hand, is a negligible burden compared to the cost of reading
pg_constraint.
        regards, tom lane

Re: Order of enforcement of CHECK constraints?

From

Fabrízio de Royes Mello

Date:

23 March 2015, 15:20:19

<div dir="ltr"><div class="gmail_extra"><br />On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>wrote:<br />><br />> Fabrízio de Royes Mello <<a
href="mailto:fabriziomello@gmail.com">fabriziomello@gmail.com</a>>writes:<br />> > On Fri, Mar 20, 2015 at
4:19PM, Peter Geoghegan <<a href="mailto:pg@heroku.com">pg@heroku.com</a>> wrote:<br />> >> On Fri, Mar
20,2015 at 12:15 PM, Tom Lane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br />>
>>>We could fix it by, say, having CheckConstraintFetch() sort the<br />> >>> constraints by name
afterloading them.<br />><br />> >> What not by OID, as with indexes? Are you suggesting that this would<br
/>>>> become documented behavior?<br />><br />> > I think they should be executed in alphabetical
orderlike triggers.<br />><br />> Yeah.  We already have a comparable, and documented, behavior for<br />>
triggers,so if we're going to do anything about this I'd vote for<br />> sorting by name (or more specifically, by
strcmp()).<br/>><br /><br /></div><div class="gmail_extra">Isn't better do this to read pg_constraint in name
order?<br/><br />-   conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true,<br />+   conscan =
systable_beginscan(conrel,ConstraintNameNspIndexId, true,<br /><br /></div><div class="gmail_extra">Regards,<br
/></div><divclass="gmail_extra"><br />--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br
/>>>Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br />>> Blog: <a
href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a
href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a
href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a
href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>

Re: Order of enforcement of CHECK constraints?

From

Tom Lane

Date:

23 March 2015, 15:33:27

Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
> On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> We could fix it by, say, having CheckConstraintFetch() sort the
>>>> constraints by name after loading them.

> Isn't better do this to read pg_constraint in name order?

> -   conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true,
> +   conscan = systable_beginscan(conrel, ConstraintNameNspIndexId, true,

Surely not.  That would end up having to read *all* of pg_constraint, not
only the rows applicable to the current relation.

We could get the index to do the work for us if we changed it from an
index on conrelid to one on conrelid, conname.  However, seeing that that
would bloat the index by a factor of sixteen, it hardly sounds like a
free fix either.

I really think that a quick application of qsort is the best-performing
way to do this.
        regards, tom lane

Re: Order of enforcement of CHECK constraints?

From

Ashutosh Bapat

Date:

24 March 2015, 04:51:34

On Mon, Mar 23, 2015 at 7:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
> I might be only one objecting here but ...
> On Sat, Mar 21, 2015 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My Salesforce colleagues noticed some tests flapping as a result of table
>> CHECK constraints not always being enforced in the same order; ie, if a
>> tuple insertion/update violates more than one CHECK constraint, it's not
>> deterministic which one is reported. This is evidently because
>> relcache.c's CheckConstraintFetch() just happily loads up the constraints
>> in whatever order it happens to find them in pg_constraint.

> Why is it important to report in deterministic manner?

If nothing else, so as not to have regression-test failures.

May be then we are writing tests which are not doing the intended thing. Unless the test is explicitly testing the behaviour in case of more than one failing constraint, it looks like the test is not doing the right thing. If it's testing that case explicitly, by imposing an order, we are still testing a single constraint failure case.

In case of million inserts or bulk load with constraints on, these few cycles spent in ordering the constraints might be problematic, unless the ordering happens only once for a series of inserts.

> If it really
> matters, we should probably report all the failing constraints.

That wouldn't in itself make the output deterministic (you'd still have to
sort); and in any case that's not going to happen because it would require
running each CHECK constraint in its own subtransaction. Catching errors
that way is *expensive*. And there's been zero field demand for such a
behavior, so I don't see us adding cycles for something no one's asked
for. Sorting the check constraints during relcache load, on the other
hand, is a negligible burden compared to the cost of reading
pg_constraint.

regards, tom lane

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Order of enforcement of CHECK constraints?

From

David Rowley

Date:

24 March 2015, 07:16:13

On 24 March 2015 at 17:51, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

In case of million inserts or bulk load with constraints on, these few cycles spent in ordering the constraints might be problematic, unless the ordering happens only once for a series of inserts.

As far as I can see this sort only occurs when a new relation is loaded into the relcache. Relations are cached in the relcache when they're accessed for the first time in each backend, and the cache will be invalidated when the current backend or another backend perform DDL on that relation.

Without relcache we'd need to lookup the system catalogs for just about every SQL statement... This would be very slow indeed!

Regards

David Rowley

Re: Order of enforcement of CHECK constraints?

From

Fabrízio de Royes Mello

Date:

24 March 2015, 19:28:55

On Mon, Mar 23, 2015 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
> > On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>> We could fix it by, say, having CheckConstraintFetch() sort the
> >>>> constraints by name after loading them.
>
> > Isn't better do this to read pg_constraint in name order?
>
> > - conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true,
> > + conscan = systable_beginscan(conrel, ConstraintNameNspIndexId, true,
>
> Surely not. That would end up having to read *all* of pg_constraint, not
> only the rows applicable to the current relation.
>

Yeah... you're correct... we need the oid in the index.

> We could get the index to do the work for us if we changed it from an
> index on conrelid to one on conrelid, conname. However, seeing that that
> would bloat the index by a factor of sixteen, it hardly sounds like a
> free fix either.
>

But in this way we can save some cicles as Ashutosh complains... or am I missing something?

> I really think that a quick application of qsort is the best-performing
> way to do this.
>

Something like the attached?

With current master:

fabrizio=# create table foo(a integer, b integer);
CREATE TABLE
fabrizio=# alter table foo add constraint aa check(a>0);
ALTER TABLE
fabrizio=# alter table foo add constraint bb check(b>0);
ALTER TABLE
fabrizio=# insert into foo values (0,0);
ERROR: new row for relation "foo" violates check constraint "bb"
DETAIL: Failing row contains (0, 0).

With the attached patch:

fabrizio=# create table foo(a integer, b integer);
CREATE TABLE
fabrizio=# alter table foo add constraint aa check(a>0);
ALTER TABLE
fabrizio=# alter table foo add constraint bb check(b>0);
ALTER TABLE
fabrizio=# insert into foo values (0,0);
ERROR: new row for relation "foo" violates check constraint "aa"
DETAIL: Failing row contains (0, 0).

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

Attachment

order-of-check-constraints-v1.patch

Re: Order of enforcement of CHECK constraints?

From

Fabrízio de Royes Mello

Date:

24 March 2015, 20:05:36

On Tue, Mar 24, 2015 at 4:28 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
>
>
>
> On Mon, Mar 23, 2015 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Fabrízio de Royes Mello <fabriziomello@gmail.com> writes:
> > > On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>>> We could fix it by, say, having CheckConstraintFetch() sort the
> > >>>> constraints by name after loading them.
> >
> > > Isn't better do this to read pg_constraint in name order?
> >
> > > - conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true,
> > > + conscan = systable_beginscan(conrel, ConstraintNameNspIndexId, true,
> >
> > Surely not. That would end up having to read *all* of pg_constraint, not
> > only the rows applicable to the current relation.
> >
>
> Yeah... you're correct... we need the oid in the index.
>
>
> > We could get the index to do the work for us if we changed it from an
> > index on conrelid to one on conrelid, conname. However, seeing that that
> > would bloat the index by a factor of sixteen, it hardly sounds like a
> > free fix either.
> >
>
> But in this way we can save some cicles as Ashutosh complains... or am I missing something?
>
>
> > I really think that a quick application of qsort is the best-performing
> > way to do this.
> >
>
> Something like the attached?
>
> With current master:
>
> fabrizio=# create table foo(a integer, b integer);
> CREATE TABLE
> fabrizio=# alter table foo add constraint aa check(a>0);
> ALTER TABLE
> fabrizio=# alter table foo add constraint bb check(b>0);
> ALTER TABLE
> fabrizio=# insert into foo values (0,0);
> ERROR: new row for relation "foo" violates check constraint "bb"
> DETAIL: Failing row contains (0, 0).
>
>
> With the attached patch:
>
> fabrizio=# create table foo(a integer, b integer);
> CREATE TABLE
> fabrizio=# alter table foo add constraint aa check(a>0);
> ALTER TABLE
> fabrizio=# alter table foo add constraint bb check(b>0);
> ALTER TABLE
> fabrizio=# insert into foo values (0,0);
> ERROR: new row for relation "foo" violates check constraint "aa"
> DETAIL: Failing row contains (0, 0).
>

Forgot this patch... you've already pushed to the master the qsort for check constraints [1]. Really nice!!

Regards,

[1] http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e5f455f59fed0632371cddacddd79895b148dc07

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello