Re: [Bizgres-general] A Guide to Constraint Exclusion - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [Bizgres-general] A Guide to Constraint Exclusion
Date
Msg-id 1121372974.3970.442.camel@localhost.localdomain
Whole thread Raw
In response to Re: [Bizgres-general] A Guide to Constraint Exclusion  (Hannu Krosing <hannu@tm.ee>)
Responses Re: [Bizgres-general] A Guide to Constraint Exclusion (  ("Luke Lonergan" <llonergan@greenplum.com>)
Re: [Bizgres-general] A Guide to Constraint Exclusion  (Hannu Krosing <hannu@skype.net>)
List pgsql-hackers
On Thu, 2005-07-14 at 00:13 +0300, Hannu Krosing wrote:
> On K, 2005-07-13 at 11:53 +0100, Simon Riggs wrote:

> > We aren't able to exclude the parent table from the above query because
> > no Constraint was defined upon it. Since, in our example, the parent is
> > empty there will be little effect on the query performance. It would be
> > a mistake to attempt to get around this by placing a Constraint on the
> > parent, since that would then automatically be created on all child
> > tables also. So we can never exclude the parent without automatically
> > excluding *all* of the children also.
> 
> At least in 8.0 you can drop the inherited constraint from child table:

Yes, you can. But in discussion on -hackers on 20 May there was
agreement (OK, me agreeing with Tom) that being able to drop inherited
constraints was a bug (or at least a deprecated feature...)

> > Currently, there is no restriction that all constraints *must* be
> > mutually exclusive, nor even that the constraints may be similar on each
> > table. This can be useful for some designs where the inheritance
> > hierarchy is not "disjoint", as UML would term this situation.
> 
> actually this is GOOD, as this way I can have a constraint on both 
> insert_timestamp and primary_key fields, which are mostly but not
> absolutely in the same order. And also to add extra IN (X,Y,Z)
> constraints for some other fields. 

Yes, understood. My description was not very good. I did not mean that
constraints on any one table should be mutually exclusive. I meant that
a set of constraints across a set of child tables should be able to be
defined mutually exclusive (e.g. just as Oracle partitions are).

> > CE does not prevent direct access to one of the child tables in an
> > inheritance hierarchy. In this case, no exclusion test would be
> > performed. Exclusion tests are performed *only* when the parent table in
> > an inheritance hierarchy is accessed. Exclusion tests are performed even
> > if the inheritance hierarchy is many levels deep (e.g. parent-child-
> > grandchild). CE also supports multiple inheritance.
> 
> I'd like to see an option to ALWAYS do CE, inheritance or union (all) or
> even simple queries.

I've argued against that, but I guess if there is enough opinion I can
be persuaded.

My argument is that most constraints look something likeCHECK ( salary > 0 and salary < 1000000)
If you ask a query likeselect count(*) from emp where salary < -10000
then CE would work great. But you don't find many people asking such
nonsensical questions, or at least very often. Those queries do get
asked, but they seem either naive or "data profiling" type queries. So
my conclusion is that for most constraints and most queries, CE is
literally just a waste of planning time.

When a designer deliberately creates a scenario where constraints have
meaning, as in a mutually exclusive inheritance hierarchy or classic
partitioned table design, then you have a high chance of CE being
effective. 

The indicator for the value of CE, IMHO, is the "and/or" aspect of
multiple related tables.

Thats my argument, but I'm willing to hear counter arguments. I just
wish to avoid overselling CE and slowing most people's queries down for
no good reason.

> > CURRENT RESTRICTIONS
> > 
> > It is not yet possible to specify that Constraints on child tables will
> > be mutually exclusive of each other. Currently, it would be up to the
> > designer to ensure that, if desired.
> > 
> > It is not yet possible to specify that an inheritance parent has no
> > rows, and, if so, should always be excluded from the query.
> 
> I think that a simple "CHECK(false)" constraint should be enough for
> this.

That works, but as I said....

> > If a parent table has a Constraint defined upon it, then this will be
> > automatically copied to all child tables. 
> 
> But they can be removed later if desired.

But should not be...

> > Currently, there is no way to
> > tell which Constraints have been inherited from the parent, so exclusion
> > tests will be re-executed against all child tables. This will cause
> > additional optimization time.
> 
> Have you done any performance testing, i.e. what is the actual impact of
> CE on planning time ?

I think "bad" would be my assessment. But not "very bad". But this must
be offset by the extraordinary time saving in execution time.

Planning time depends upon the complexity of the query, the number and
complexity of the constraints and the number of child tables.

I think I would guess currently at child tables of around 1 GB, with
sizes as small as 100 MB being reasonable. Trying to keep table size
relatively evenly distributed would be useful also. With those
recommendations it will always be worth 0.1 sec investment in trying to
avoid 100+ sec sequential scan times. But again, you need to consider
your workload.

> > Currently, all child tables will be considered. It may be possible in
> > the future to pre-sort the list of child tables, so that optimization
> > time can be reduced for parent tables with large numbers of partitions.
> >
> > Currently, there is no index on the pg_inherits system table. As a
> > result, parents with more than 1000 child tables are likely to
> > experience longer than desirable planning times for their queries.
> 
> Am I right that this is a general postgresql issue and has nothing to do
> with CE ?

Yep.

> > CE checks will not currently recognise STABLE functions within a query.
> > So WHERE clauses such as
> >     DateKey > CURRENT DATE
> > will not cause exclusion because CURRENT DATE is a STABLE function.
> > 
> > CE checks are not made when the parent table is involved in a join.
> 
> Is this also the case where parent table is inside subquery and that
> subquery is involved in a join?

My comment was too terse. What I meant was that you can't do dynamic
exclusion based upon the results of a join. i.e. PPUC2

> Or do I have to make sure that it is not lifted out of that subquery
> using something like pl/pgsql function ?
> 
> > Other existing restrictions on Inherited tables continue to apply.
> 
> WHat happens for multiple inheritance ? 

CE works.

> Is it detected and then also excluded from CE ?

No

> Or is that just a "don't do it" item ?

Thats a religious debate.... I'll leave that one.

Best Regards, Simon Riggs



pgsql-hackers by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Simplifying identification of temporary tables
Next
From: Hannu Krosing
Date:
Subject: Re: Autovacuum loose ends