Thread: Checking number of entries

Checking number of entries

From

oberpwd@anubis.network.com (Wade D. Oberpriller)

Date:

28 September 2000, 18:31:18

Hello,

Is it possible to restrict the number of entries in a table? I have attempted
to write a check like so:

    CREATE TABLE mytable (
        id integer NOT NULL,
        CHECK (COUNT(id) <= 10)
    );

This is not allowed, I get an error saying:

    ERROR:  ExecEvalAggref: no aggregates in this expression context

What is the proper way of doing this? I want to be able to force a minimum and
maximum number of entries.

Wade Oberpriller
StorageTek
oberpwd@network.com

Re: Checking number of entries

From

Alfred Perlstein

Date:

28 September 2000, 19:57:37

* Wade D. Oberpriller <oberpwd@anubis.network.com> [000928 15:34] wrote:
> Hello,
>
> Is it possible to restrict the number of entries in a table? I have attempted
> to write a check like so:
>
>     CREATE TABLE mytable (
>         id integer NOT NULL,
>         CHECK (COUNT(id) <= 10)
>     );
>
> This is not allowed, I get an error saying:
>
>     ERROR:  ExecEvalAggref: no aggregates in this expression context
>
> What is the proper way of doing this? I want to be able to force a minimum and
> maximum number of entries.

Use a trigger on INSERT that does that.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."

Re: Checking number of entries

From

Philip Warner

Date:

28 September 2000, 20:18:36

At 16:57 28/09/00 -0700, Alfred Perlstein wrote:
>* Wade D. Oberpriller <oberpwd@anubis.network.com> [000928 15:34] wrote:
>> Hello,
>>
>> Is it possible to restrict the number of entries in a table? I have
attempted
>> to write a check like so:
>>
>>     CREATE TABLE mytable (
>>         id integer NOT NULL,
>>         CHECK (COUNT(id) <= 10)
>>     );
>>
>> This is not allowed, I get an error saying:
>>
>>     ERROR:  ExecEvalAggref: no aggregates in this expression context

I just tried:

    CREATE TABLE mytable (
        id integer NOT NULL
    );

    Alter  TABLE mytable Add
        CHECK ( (Select COUNT(distinct id) From mytable) <= 10)
    );


Under 7.1 (and with the constraint in the table definition in 7.0.2) and it
lets me define the table & constraint, but when I insert into the table, I
get:

    ERROR:  ExecEvalExpr: unknown expression type 108

Seems to me it should either disallow the creation, or work properly.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Philip Warner

Date:

28 September 2000, 22:33:20

At 11:19 29/09/00 +1000, Philip Warner wrote:
>At 16:57 28/09/00 -0700, Alfred Perlstein wrote:
>
>Under 7.1 (and with the constraint in the table definition in 7.0.2) and it
>lets me define the table & constraint, but when I insert into the table, I
>get:
>
>    ERROR:  ExecEvalExpr: unknown expression type 108
>

Just searched my mail archives, and this has been reported various times
before; AFAICT it's a known issue.

A previously suggested solution was to define a function that does the
subselect.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Philip Warner

Date:

29 September 2000, 00:07:20

At 00:04 29/09/00 -0400, Tom Lane wrote:
>
>Picky, picky ...
>
>I've added the appropriate checks to AddRelationRawConstraints():
>

I thought you'd just add the feature. My mistake ;-).


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Tom Lane

Date:

29 September 2000, 00:08:45

Philip Warner <pjw@rhyme.com.au> writes:
> Under 7.1 (and with the constraint in the table definition in 7.0.2) and it
> lets me define the table & constraint, but when I insert into the table, I
> get:
>     ERROR:  ExecEvalExpr: unknown expression type 108
> Seems to me it should either disallow the creation, or work properly.

Picky, picky ...

I've added the appropriate checks to AddRelationRawConstraints():

regression=# Alter  TABLE mytable Add
regression-# CHECK ( (Select COUNT(distinct id) From mytable) <= 10);
ERROR:  Cannot use subselect in CHECK clause
regression=# Alter  TABLE mytable Add
regression-# CHECK (  COUNT( id) <= 10);
ERROR:  Cannot use aggregate in CHECK clause

Coming soon to a CVS server near you.

            regards, tom lane

Re: Checking number of entries

From

Tom Lane

Date:

29 September 2000, 00:22:42

Philip Warner <pjw@rhyme.com.au> writes:
>> I've added the appropriate checks to AddRelationRawConstraints():

> I thought you'd just add the feature. My mistake ;-).

Unfortunately, that's not a four-line change :-(.

Quite aside from implementation shortcomings, there are some big
definitional issues with subselects in constraints --- just exactly
what are they constraining?  See past discussions in the archives.

            regards, tom lane

Re: Checking number of entries

From

Philip Warner

Date:

29 September 2000, 01:19:27

At 00:22 29/09/00 -0400, Tom Lane wrote:
>
>Quite aside from implementation shortcomings, there are some big
>definitional issues with subselects in constraints --- just exactly
>what are they constraining?  See past discussions in the archives.
>

I've done a search for 'select constraint', with no obvious matches. My
naieve approach would be that a CHECK constraint should evaluate 'true'
assuming the data change was commited/applied (depending on evaluation
time). The CHECK is evaluated like an AFTER <action> trigger.

I presume I am missing something...


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Tom Lane

Date:

29 September 2000, 01:57:50

Philip Warner <pjw@rhyme.com.au> writes:
> At 00:22 29/09/00 -0400, Tom Lane wrote:
>> Quite aside from implementation shortcomings, there are some big
>> definitional issues with subselects in constraints --- just exactly
>> what are they constraining?  See past discussions in the archives.

> I presume I am missing something...

The canonical example is a check like
    CHECK (mycol > (SELECT max(othertab.othercol) FROM othertab))
declared as a constraint on mytab.  Here, an insertion into
othertab could mean that the constraint on mytab no longer holds
(because you inserted a value larger than some existing mytab row).
Should this mean that the constraint on mytab can prevent insertions
into othertab?  Quite aside from implementation problems ---
a simplistic approach would mean re-evaluation of every constraint
in the whole DB against every row in the whole DB for every update ---
there are also interesting security issues.  At first sight one would
think that the owner of mytab only needs read access on othertab to
define the above constraint, but in fact this "read access" is
sufficient to deny updates on othertab.  Not good.

You can devise related paradoxes within a single table if constraints
involving aggregates are permitted.  The basic problem is that it's
not clear which data values are the constrainers and which are the
constrainees.

            regards, tom lane

Re: Checking number of entries

From

Philip Warner

Date:

29 September 2000, 02:17:13

At 01:53 29/09/00 -0400, Tom Lane wrote:
>
>The canonical example is a check like
>    CHECK (mycol > (SELECT max(othertab.othercol) FROM othertab))
>declared as a constraint on mytab.

I would argue that a CHECK constraint only applies to the table on which it
is defined - basically a contract to ensure that certain conditions apply
to changes on that table.

Once you allow functions in constraints, you give up all hope of
cross-compatibility, eg.

Create Table tab1 (f1 integer);
Create Function tot_of_tab1() returns integer
     as 'select cast(sum(f1) as int4) from tab1' language 'sql';
Alter Table tab1 add check(tot_of_tab1() > 0);

zzz=# insert into tab1 values(1);
INSERT 142380 1
zzz=# insert into tab1 values(-10);
INSERT 142381 1
zzz=# select tot_of_tab1();
 tot_of_tab1
-------------
          -9
(1 row)

zzz=# insert into tab1 values(-12);
ERROR:  ExecAppend: rejected due to CHECK constraint $1

This demonstrates the problem, and, I think, highlights a bug in the
constraint checking code: the constraint should have failed on the second
insert. Maybe the constraint is evaluate before the insert?

>Should this mean that the constraint on mytab can prevent insertions
>into othertab?

I'd say not. Consider:

   weird_function() returns boolean as 'select current_time > ''16:00:00''';

This *could* be used to prevent updates occurring before 4pm - it says
nothing about the validity of the data, it is just a rule about how updates
can be performed.

Does DB theory say what sort of contract CHECK constraints imply?

>You can devise related paradoxes within a single table if constraints
>involving aggregates are permitted.  The basic problem is that it's
>not clear which data values are the constrainers and which are the
>constrainees.

What's wrong with CHECK constraints only affecting the table on which they
are defined? It seems better than (a) allowing back-door functions, and (b)
preventing more sensible subselects.

Sorry if this ground has already been covered...

----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Stephan Szabo

Date:

29 September 2000, 14:46:37

On Fri, 29 Sep 2000, Philip Warner wrote:

> At 01:53 29/09/00 -0400, Tom Lane wrote:
> >
> >The canonical example is a check like
> >    CHECK (mycol > (SELECT max(othertab.othercol) FROM othertab))
> >declared as a constraint on mytab.
>
> I would argue that a CHECK constraint only applies to the table on which it
> is defined - basically a contract to ensure that certain conditions apply
> to changes on that table.

The problem is that the spec seems to say that constraints
are checked either at end of transaction or statement
and doesn't seem to say anything about limiting which statements
(in fact, it says, "each SQL statement" for the immediate case),
and that if a constraint is not satisified an exception is raised.
It basically appears to be implying that all immediate constraints
must be satisfied at end of statement and all deferred ones at
end of transaction, regardless of what the statement or transaction
was.  I could be misreading the "Checking constraints" section
as well, though:
     4.10.1  Checking of constraints

      Every constraint is either deferrable or non-deferrable. Within
      a transaction, every constraint has a constraint mode; if a con-
      straint is non-deferrable, then its constraint mode is always im-
      mediate, otherwise it is either or immediate or deferred. Every
      constraint has an initial constraint mode that specifies the
      constraint mode for that constraint at the start of each SQL-
      transaction and immediately after definition of that constraint.
      If a constraint is deferrable, then its constraint mode may be
      changed (from immediate to deferred, or from deferred to immediate)
      by execution of a <set constraints mode statement>.

      The checking of a constraint depends on its constraint mode within
      the current SQL-transaction. If the constraint mode is immedi-
      ate, then the constraint is effectively checked at the end of
      each SQL-statement. If the constraint mode is deferred, then the
      constraint is effectively checked when the constraint mode is
      changed to immediate either explicitly by execution of a <set con-
      straints mode statement>, or implicitly at the end of the current
      SQL-transaction.

      When a constraint is checked other than at the end of an SQL-
      transaction, if it is not satisfied, then an exception condition
      is raised and the SQL-statement that caused the constraint to be
      checked has no effect other than entering the exception information
      into the diagnostics area. When a <commit statement> is executed,
      all constraints are effectively checked and, if any constraint
      is not satisfied, then an exception condition is raised and the
      transaction is terminated by an implicit <rollback statement>.


> Once you allow functions in constraints, you give up all hope of
> cross-compatibility, eg.

> Create Table tab1 (f1 integer);
> Create Function tot_of_tab1() returns integer
>      as 'select cast(sum(f1) as int4) from tab1' language 'sql';
> Alter Table tab1 add check(tot_of_tab1() > 0);
>
> zzz=# insert into tab1 values(1);
> INSERT 142380 1
> zzz=# insert into tab1 values(-10);
> INSERT 142381 1
> zzz=# select tot_of_tab1();
>  tot_of_tab1
> -------------
>           -9
> (1 row)
>
> zzz=# insert into tab1 values(-12);
> ERROR:  ExecAppend: rejected due to CHECK constraint $1
>
> This demonstrates the problem, and, I think, highlights a bug in the
> constraint checking code: the constraint should have failed on the second
> insert. Maybe the constraint is evaluate before the insert?

Yeah, my guess is that it checks the new value of the row before the
insert and so the select doesn't see the new row.  And yes, as far as I
can tell, this would be incorrect behavior.

And functions are a problem because they shouldn't give different
behavior.  One solution would be to take check constraints calling
functions that may check sql data (i.e. any functions without a certain
attribute) and treat them as a check after each statement, but that sounds
like it would suck.

The other thing is that if you really want a insert/update check only, you
can do that by using a trigger.

Re: Checking number of entries

From

Philip Warner

Date:

29 September 2000, 20:53:09

At 11:45 29/09/00 -0700, Stephan Szabo wrote:
>On Fri, 29 Sep 2000, Philip Warner wrote:
>
>> At 01:53 29/09/00 -0400, Tom Lane wrote:
>> >
>> >The canonical example is a check like
>> >    CHECK (mycol > (SELECT max(othertab.othercol) FROM othertab))
>> >declared as a constraint on mytab.
>>
>> I would argue that a CHECK constraint only applies to the table on which it
>> is defined - basically a contract to ensure that certain conditions apply
>> to changes on that table.
>
>The problem is that the spec seems to say that constraints
>are checked either at end of transaction or statement
>and doesn't seem to say anything about limiting which statements

You're quite right; my (weak) argument was that since we are already
breaking the spec by allowing arbitrary functions, we should either
strengthen our conformance to the spec (by disallowing arbitrary
functions), or allow statements inside CHECK constraints (since they are
there already via functions), and ignore the consequences in other tables
(as we currently do with functions).


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Stephan Szabo

Date:

29 September 2000, 21:10:20

On Sat, 30 Sep 2000, Philip Warner wrote:

> At 11:45 29/09/00 -0700, Stephan Szabo wrote:
> >On Fri, 29 Sep 2000, Philip Warner wrote:
> >
> >> At 01:53 29/09/00 -0400, Tom Lane wrote:
> >> >
> >> >The canonical example is a check like
> >> >    CHECK (mycol > (SELECT max(othertab.othercol) FROM othertab))
> >> >declared as a constraint on mytab.
> >>
> >> I would argue that a CHECK constraint only applies to the table on which it
> >> is defined - basically a contract to ensure that certain conditions apply
> >> to changes on that table.
> >
> >The problem is that the spec seems to say that constraints
> >are checked either at end of transaction or statement
> >and doesn't seem to say anything about limiting which statements
>
> You're quite right; my (weak) argument was that since we are already
> breaking the spec by allowing arbitrary functions, we should either
> strengthen our conformance to the spec (by disallowing arbitrary
> functions), or allow statements inside CHECK constraints (since they are
> there already via functions), and ignore the consequences in other tables
> (as we currently do with functions).

I'd be worried about doing the latter only because it will definately
cause problems for people who are actually expecting the behavior
to follow the spec behavior, plus it would make it much harder for
us to move to the correct behavior later.
And, the former can still be basically done via triggers whether or not we
were to limit them in check constraints that might be a safer route,
except for the fact it breaks compatibility with existing versions.
Finally, there is the possibility of having the class of constraints
checked after any sql statement which is going to suck for performance
but actually probably be "correct" in a strict spec sense.

How do other systems (Oracle, etc...) handle the subselect and arbitrary
function cases?

And I haven't given much thought to it really, but has anyone thought
about deferred check constraints?

Re: Checking number of entries

From

Philip Warner

Date:

30 September 2000, 08:51:31

At 18:09 29/09/00 -0700, Stephan Szabo wrote:
>
>How do other systems (Oracle, etc...) handle the subselect and arbitrary
>function cases?
>

Dec RDB seems to go the distance. With the following incredibly silly
constraint:

alter table zzz add constraint
     check( (select sum(zzzz.f1*zzz1.f1_1) from zzzz,zzz1
                where zzzz.f2=zzz1.f1_1) + f1 > 0);

Any updates to the referenced tables cause it to run a query that *seems*
to be:

     select * from ZZZ where not ( <constraint-condition> )

This is my guess based on the optimizer output, but it seems reasonable.

An update to the main table causes the same query to be executed but only
for the updated row.

Adding the main table to the check query produces a similar result.

This may be a useful general model? Is there a problem with the approach?
It couls be further improved by using 'limit 1', but otherwise seems good...

----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Stephan Szabo

Date:

30 September 2000, 14:25:56

On Sat, 30 Sep 2000, Philip Warner wrote:

> At 18:09 29/09/00 -0700, Stephan Szabo wrote:
> >
> >How do other systems (Oracle, etc...) handle the subselect and arbitrary
> >function cases?
> >
>
> Dec RDB seems to go the distance. With the following incredibly silly
> constraint:
>
> alter table zzz add constraint
>      check( (select sum(zzzz.f1*zzz1.f1_1) from zzzz,zzz1
>                 where zzzz.f2=zzz1.f1_1) + f1 > 0);
>
> Any updates to the referenced tables cause it to run a query that *seems*
> to be:
>
>      select * from ZZZ where not ( <constraint-condition> )
>
> This is my guess based on the optimizer output, but it seems reasonable.
>
> An update to the main table causes the same query to be executed but only
> for the updated row.
>
> Adding the main table to the check query produces a similar result.
>
>
> This may be a useful general model? Is there a problem with the approach?
> It couls be further improved by using 'limit 1', but otherwise seems good...

Well, I'm not sure how easy/hard it is to get all referenced tables right
now, since you'd have to decend into subqueries.   But, yeah, that seems
like it would probably work since you probably couldn't currently put
anything in the check constraint that would work there but not in a
subquery context (although i could be wrong about that).

Can you do arbitrary user functions in Dec RDB that access tables and put
those in constraints? I'm not sure what to do about the fact that we can't
actually get the referenced tables inside functions for some cases.

Re: Checking number of entries

From

Philip Warner

Date:

30 September 2000, 19:44:23

At 11:25 30/09/00 -0700, Stephan Szabo wrote:
>
>On Sat, 30 Sep 2000, Philip Warner wrote:
>
>>
>> alter table zzz add constraint
>>      check( (select sum(zzzz.f1*zzz1.f1_1) from zzzz,zzz1
>>                 where zzzz.f2=zzz1.f1_1) + f1 > 0);
>>
>> Any updates to the referenced tables cause it to run a query that *seems*
>> to be:
>>
>>      select * from ZZZ where not ( <constraint-condition> )
>>
>
>Well, I'm not sure how easy/hard it is to get all referenced tables right
>now, since you'd have to decend into subqueries.   But, yeah, that seems
>like it would probably work since you probably couldn't currently put
>anything in the check constraint that would work there but not in a
>subquery context (although i could be wrong about that).

Yes; we'd need to generate a plan for the constraint, and find all the
tables it references. Is that a hard thing to do?


>Can you do arbitrary user functions in Dec RDB that access tables and put
>those in constraints? I'm not sure what to do about the fact that we can't
>actually get the referenced tables inside functions for some cases.

RDB has two kinds of functions: external & SQL. External functions can't
make data changes, or  even easily read the database, and SQL functions are
just pieces of (complex multi-line) SQL, that can be parsed like anything
else. As a result, when you call a function in a constraint, it plans the
function, and gets the list of tables.

Also, RDB does allow non-deterministic constraints, but if someone is silly
enough to use one, then so be it...


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Stephan Szabo

Date:

01 October 2000, 14:10:26

On Sun, 1 Oct 2000, Philip Warner wrote:

> At 11:25 30/09/00 -0700, Stephan Szabo wrote:
> >
> >On Sat, 30 Sep 2000, Philip Warner wrote:
> >
> >>
> >> alter table zzz add constraint
> >>      check( (select sum(zzzz.f1*zzz1.f1_1) from zzzz,zzz1
> >>                 where zzzz.f2=zzz1.f1_1) + f1 > 0);
> >>
> >> Any updates to the referenced tables cause it to run a query that *seems*
> >> to be:
> >>
> >>      select * from ZZZ where not ( <constraint-condition> )
> >>
> >
> >Well, I'm not sure how easy/hard it is to get all referenced tables right
> >now, since you'd have to decend into subqueries.   But, yeah, that seems
> >like it would probably work since you probably couldn't currently put
> >anything in the check constraint that would work there but not in a
> >subquery context (although i could be wrong about that).
>
> Yes; we'd need to generate a plan for the constraint, and find all the
> tables it references. Is that a hard thing to do?

Probably not, although I've been wrong about that before... :(
Well, if I do end up doing the stuff for holding what objects reference
what other objects, I'm going to have to do this anyway since the
constraint references all of those tables and should either be removed
or restrict the removal of those tables (I think there are wierd special
cases involved, but in general...)

The other part could probably be done by creating after
insert/update/delete triggers on those tables with the oid of
the constraint row as data.  I'm not sure of the best way to do
the actual check... it'd be easy to do in spi, but that has its
own problems. Doing a manual scan looking for rows that fail is also
easy but rather slow if there are alot rows where very few fail.

> >Can you do arbitrary user functions in Dec RDB that access tables and put
> >those in constraints? I'm not sure what to do about the fact that we can't
> >actually get the referenced tables inside functions for some cases.
>
> RDB has two kinds of functions: external & SQL. External functions can't
> make data changes, or  even easily read the database, and SQL functions are
> just pieces of (complex multi-line) SQL, that can be parsed like anything
> else. As a result, when you call a function in a constraint, it plans the
> function, and gets the list of tables.

Unfortunately, we can have cases where the plan depends on other data
outside of stuff that's known at creation time, like data in random
tables.  I'm really not sure how to handle those cases except either
disallowing them or handling them incorrectly.

Re: Checking number of entries

From

Philip Warner

Date:

04 October 2000, 06:17:09

At 11:09 1/10/00 -0700, Stephan Szabo wrote:
>On Sun, 1 Oct 2000, Philip Warner wrote:
>>
>> Yes; we'd need to generate a plan for the constraint, and find all the
>> tables it references. Is that a hard thing to do?
>
>Probably not, although I've been wrong about that before... :(
>Well, if I do end up doing the stuff for holding what objects reference
>what other objects, I'm going to have to do this anyway since the
>constraint references all of those tables and should either be removed
>or restrict the removal of those tables (I think there are wierd special
>cases involved, but in general...)

This sounds great! As you know, there's a whole lot of places that will
profit from this.


>The other part could probably be done by creating after
>insert/update/delete triggers on those tables with the oid of
>the constraint row as data.  I'm not sure of the best way to do
>the actual check... it'd be easy to do in spi, but that has its
>own problems. Doing a manual scan looking for rows that fail is also
>easy but rather slow if there are alot rows where very few fail.

I'd have thought sending it to something that lets the optimizer deal with
it; manual row by row would be a disaster, since in 99% of cases is a well
designed application, no rows would match (ie. no failures).

I know people (Tom?) have complained about SPI in the backend before, I
think, but it seems like the way to go - unless there is a lower level
query representation that can be generated when the constraint is defined
then passed to the optimiser at runtime...


>>
>> RDB has two kinds of functions: external & SQL. External functions can't
>> make data changes, or  even easily read the database, and SQL functions are
>> just pieces of (complex multi-line) SQL, that can be parsed like anything
>> else. As a result, when you call a function in a constraint, it plans the
>> function, and gets the list of tables.
>
>Unfortunately, we can have cases where the plan depends on other data
>outside of stuff that's known at creation time, like data in random
>tables.  I'm really not sure how to handle those cases except either
>disallowing them or handling them incorrectly.

If someone defines a constraint based on random or varying data (eg.
CURRENT_TIMESTAMP), then they either (a) know what they are doing, and
would not expect reverse validation, or (b) haven't got a clue what they
are doing and probably don't expect reverse validation. How does that
sound? ;-}



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Checking number of entries

From

Stephan Szabo

Date:

04 October 2000, 13:05:35

On Wed, 4 Oct 2000, Philip Warner wrote:

> >The other part could probably be done by creating after
> >insert/update/delete triggers on those tables with the oid of
> >the constraint row as data.  I'm not sure of the best way to do
> >the actual check... it'd be easy to do in spi, but that has its
> >own problems. Doing a manual scan looking for rows that fail is also
> >easy but rather slow if there are alot rows where very few fail.
>
> I'd have thought sending it to something that lets the optimizer deal with
> it; manual row by row would be a disaster, since in 99% of cases is a well
> designed application, no rows would match (ie. no failures).
>
> I know people (Tom?) have complained about SPI in the backend before, I
> think, but it seems like the way to go - unless there is a lower level
> query representation that can be generated when the constraint is defined
> then passed to the optimiser at runtime...

Yeah, if there is a good way to do that, I should probably change the
check constraint stuff for alter table so it does the not (<expr>)
rather than a scan and evaluate.

> >Unfortunately, we can have cases where the plan depends on other data
> >outside of stuff that's known at creation time, like data in random
> >tables.  I'm really not sure how to handle those cases except either
> >disallowing them or handling them incorrectly.
>
> If someone defines a constraint based on random or varying data (eg.
> CURRENT_TIMESTAMP), then they either (a) know what they are doing, and
> would not expect reverse validation, or (b) haven't got a clue what they
> are doing and probably don't expect reverse validation. How does that
> sound? ;-}

Heh.  Probably a good assumption at least for a first pass.  And it would
at least get us closer to complience.  I guess CURRENT_TIMESTAMP,
CURRENT_USER (or whatever its real name is) and functions that might
provide inconsistant results or access other tables are danger signs.

RE: Checking number of entries

From

"Hiroshi Inoue"

Date:

05 October 2000, 14:14:08

I don't understand this thread well,sorry.
(I've received no pgsql-general mails for a few days).

IIRC,SQL92 recommends to handle this case
as ASSERTIONS rather than as CONSTRAINTS.
The implementation would be quite different from
that of (current) column constraints.
I've been suspicious if it could be implemented
properly and effectively.
For example,are there any discussions about
locking needed to guarantee the consistency ?

Regards.

Hiroshi Inoue

> -----Original Message-----
> From: Stephan Szabo
>
> On Wed, 4 Oct 2000, Philip Warner wrote:
>
> > >The other part could probably be done by creating after
> > >insert/update/delete triggers on those tables with the oid of
> > >the constraint row as data.  I'm not sure of the best way to do
> > >the actual check... it'd be easy to do in spi, but that has its
> > >own problems. Doing a manual scan looking for rows that fail is also
> > >easy but rather slow if there are alot rows where very few fail.
> >
> > I'd have thought sending it to something that lets the
> optimizer deal with
> > it; manual row by row would be a disaster, since in 99% of
> cases is a well
> > designed application, no rows would match (ie. no failures).
> >
> > I know people (Tom?) have complained about SPI in the backend before, I
> > think, but it seems like the way to go - unless there is a lower level
> > query representation that can be generated when the constraint
> is defined
> > then passed to the optimiser at runtime...
>
> Yeah, if there is a good way to do that, I should probably change the
> check constraint stuff for alter table so it does the not (<expr>)
> rather than a scan and evaluate.
>
> > >Unfortunately, we can have cases where the plan depends on other data
> > >outside of stuff that's known at creation time, like data in random
> > >tables.  I'm really not sure how to handle those cases except either
> > >disallowing them or handling them incorrectly.
> >
> > If someone defines a constraint based on random or varying data (eg.
> > CURRENT_TIMESTAMP), then they either (a) know what they are doing, and
> > would not expect reverse validation, or (b) haven't got a clue what they
> > are doing and probably don't expect reverse validation. How does that
> > sound? ;-}
>
> Heh.  Probably a good assumption at least for a first pass.  And it would
> at least get us closer to complience.  I guess CURRENT_TIMESTAMP,
> CURRENT_USER (or whatever its real name is) and functions that might
> provide inconsistant results or access other tables are danger signs.
>
>
>

Re: Checking number of entries

From

Hiroshi Inoue

Date:

09 October 2000, 21:40:54

Stephan Szabo wrote:

>
> >The implementation would be quite different from
> >that of (current) column constraints.
> >I've been suspicious if it could be implemented
> >properly and effectively.
> >For example,are there any discussions about
> >locking needed to guarantee the consistency ?
> Not yet, and I'm not sure the backend even provides
> the necessary hooks right now (I haven't looked).
> For example, what rows get locked when you are
> checking a count(*) constraint (apart from grabbing
> a lock on all the tables involved).

Though it's not preferable, we should grab an exclusive
lock on all the tables for this case.

>  How do you
> prevent someone from inserting a new row that matches
> a criteria until your check is finished (locking
> non-existant rows).

AFAIK,we have no standard way to lock non-existent rows
now. I don't know such cases other than unique constraints.
I'm not sure that the technique is applicable to all cases.
The current simplest and practical way is to grab an exclusive
lock on the target tables. However it's not preferable of cource.

Regards.

Hiroshi Inoue

Re: Checking number of entries

From

Stephan Szabo

Date:

09 October 2000, 23:04:27

>IIRC,SQL92 recommends to handle this case
>as ASSERTIONS rather than as CONSTRAINTS.
Yeah, that was my first thought was to build
them and try to make the constraint look like
an assertion.  That doesn't get around locking
though.

>The implementation would be quite different from
>that of (current) column constraints.
>I've been suspicious if it could be implemented
>properly and effectively.
>For example,are there any discussions about
>locking needed to guarantee the consistency ?
Not yet, and I'm not sure the backend even provides
the necessary hooks right now (I haven't looked).
For example, what rows get locked when you are
checking a count(*) constraint (apart from grabbing
a lock on all the tables involved).  How do you
prevent someone from inserting a new row that matches
a criteria until your check is finished (locking
non-existant rows).  It's much easier in the
alter table case since I already have a table lock
there. :(

I want to do the table/columns that a constraint
references anyway, so that we can move towards
having drops that cascade/restrict.