Re: Constraint Exclusion (Partitioning) - Initial Review - Mailing list pgsql-patches
From | Simon Riggs |
---|---|
Subject | Re: Constraint Exclusion (Partitioning) - Initial Review |
Date | |
Msg-id | 1120431087.3940.103.camel@localhost.localdomain Whole thread Raw |
In response to | Re: Constraint Exclusion (Partitioning) - Initial Review requested (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses |
Re: Constraint Exclusion (Partitioning) - Initial Review
|
List | pgsql-patches |
On Sat, 2005-07-02 at 15:56 -0400, Bruce Momjian wrote: > Seems you have managed to combine inheritance, check constraints, and > partial index into table partitioning. It is nice it requires no new > syntax. Here is an example from your tests: > > DROP TABLE pm cascade; > CREATE TABLE pm > ( dkey INT NOT NULL > ); > > CREATE TABLE p1 ( CHECK (dkey BETWEEN 10000 AND 19999)) INHERITS (pm); > CREATE TABLE p2 ( CHECK (dkey BETWEEN 20000 AND 29999)) INHERITS (pm); > CREATE TABLE p3 ( CHECK (dkey BETWEEN 30000 AND 39999)) INHERITS (pm); > > So, in this case, a SELECT from pm would pull from the three base > tables, and those tables can be located on different tablespaces, and > the backend will only look in child tables that might contain rows basd > on the check constraints. It is that last phrase that is the new > functionality here. Yes, dead on. Thank you for this elegant summary. The main idea was originally Hannu Krosing's, I believe, with suggestion from Tom to enhance the partial index machinery to this end. So a query such as select * from pm where dkey = 25000 will have an EXPLAIN that completely ignores p1 and p3, since these can be provably excluded from the plan without effecting the result. I see the "no syntax" version as the first step towards additional functionality that would require additional syntax. > Oh, why would someone want to set enable_constraint_exclusion to false? The included functionality performs the exclusion at plan time. If a query was prepared for later execution, it *could* return the wrong answer when the plan was executed at a later time since plans are not invalidated when constraints change. So, in general, this should be set to false except for circumstances where the user can guarantee no such mistake would be made. > You had a few questions: > > > Main questions: > > 1. How should we handle the case where *all* inherited relations are > > excluded? (This is not currently covered in the code). > > I assume this means we don't return any rows. Why it is an issue? A code question only. No issue, just how should the code look? > > 2. Should this feature be available for all queries or just inherited > > relations? > > I don't see why other queries should not use this. Our TODO already > has: > > * Use CHECK constraints to influence optimizer decisions > > CHECK constraints contain information about the distribution of values > within the table. This is also useful for implementing subtables where > a tables content is distributed across several subtables. > > and this looks like what you are doing. However, again, I see the > constraint as just informing whether there might be any rows in the > table. Am I missing something? Are you thinking views with UNION could > benefit from this? In general, it seems you might want this. In normal use check constraints tend to be on minor columns, not key columns. Queries that would be provably able to exclude tables based upon this would be strange queries. i.e. select count(distinct item_pk) from warehouse where quantity < 0 is not a very common query. So we would pay the overhead of checking for exclusion for all queries when only a few wierd ones would ever take advantage of it. Sounds like a poor trade-off to me. IMHO, the only time you might expect to see benefit is when you have many similar tables that are partitioned by design into pieces that lend themselves to exclusion. If you specifically designed a set of tables and used UNION to bring them together, then I can see that you would want it then also.... but is there any benefit in supporting two different ways of achieving the same basic design: partitioned table. > > 3. And should we extend RelOptInfo to include constraint information? > > Is the problem that you have to do multiple lookups without it? > > 4. Do we want to integrate the test suite also? > > No, once this thing works, I don't see it getting broken frequently. We > usually don't test the optimizer, but we could add a single test for one > row in each table. > > > 5. Presumably a section under Performance tips would be appropriate to > > document this feature? (As well as section in run-time parameters). > > Yep. > > I am surprised no one else has commented on it, which I think means your > code is ready for the queue. Do you want to adjust it based on this > feedback or should I apply and you can adjust it later. I think that it means I did not explain myself well enough. :-) Best Regards, Simon Riggs
pgsql-patches by date: