Thread: Table Partitioning, Part 1
A more in-depth consideration of the major design options and trade-offs available to us... this is an internals related discussion. Comments? 1. Embellish inheritance or separate infrastructure? Inheritance is a somewhat strange PostgreSQL feature, with a few gotchas. The "big one" is the lack of a multi-relation index that can be used against all of the inherited relations also. The full complexity of multiple inheritance, possible multi-level inheritance doesn't seem to solve any problems for us. The main argument in favour of improving inheritance as a route to partitioning is that many people have already developed applications using it to provide some of the desirable aspects of partitioning. The amount of code used to provide some of these features is actually quite low, and the code itself is already specialised. It seems prudent to avoid building on that foundation, even though we may decide to use some similar approaches. 2. Individual Relations explicitly in the plan or MultiRelation plan nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many nodes?) Currently, Inheritance produces a plan for a SeqScan that looks like this... Append SeqScan SeqScan SeqScan ISTM fairly straightforward to produce a similar "static" plan along the same lines, using Result nodes to implement Partition Elimination. Append Result SeqScan Result SeqScan Result SeqScan Some may think that looks a little clunky, especially since the plan length would increase as the number of Partitions increased. An alternative, would be to have a single Scan node that performs a Multi- Partition Scan, which looks like this... PartitionSeqScan There'd always one node, no matter how many partitions. Again, easy to implement, since we just add a new type of Scan access method, or alter the existing ones iff we access partitioned tables. We'd be able to use physical tlists, and all the complexity is hidden. Both solve the basic case of Direct restriction of Partitioning Key (PPUC1) The multi-scan node stops us from doing a mixed Scan plan: Append Result IndexScan Result SeqScan Result SeqScan Why? The node acts at execution time, though the planner is the usual place to decide between scan types. But why do we need that? Why not just make all partitions the same size, give them all the same Statistics etc.. Well, there are Use Cases that strongly suggest that partition size could vary considerably (PPUC1) and also that there is some incentive to have tables that have indexes on the most recent partitions yet none on the older partitions (PPUC5.2) My current thinking is to implement explicitly partition plans. But what about join plans? There are Use Cases that depend almost entirely upon joins for Partition Elimination rather than direct restriction of the PPK. (PPUC2) How could a join plan be performed with the explicit plan shown above? The only option would seem to be a Sort/Merge, which could be a disaster if we only wanted a few rows returned from a very large table. If all partitions in the query had identical indexes on them, then we have another option. In that case, each index could be thought to form part of a larger index ordered initially on the Partitioning Key (PPK). If the first column was actually the PPK, then the set of indexes would be exactly equivalent to a Global Index. We can call this a Pseudo Global Index. The Pseudo Global Index could also be used to enforce uniqueness. If all of the composite indexes were defined unique and the index contained the PPK as one of its columns, this would work. The index enforces uniqueness within each partition and the PPK enforces uniqueness across partitions because the same PPK value cannot be in two partitions. We can also use the Pseudo Global Index as the object of a multi-index scan node, since the PGI would be ordered on the PPK. The plan node would use the partition boundary definitions to locate the starting partition, then seek within the partition to the correct part of the index then scan sideways, skipping to the next partition as required. This sounds a great idea, but it is exactly the technique I argued against only 4 paragraphs ago for use with SeqScans. My current thinking is that we need a mix of the two approaches, though I raise all of this here to gain everybody's input... 3. Partitions are relations? Or sub-relations? If we ever want to implement explicit plans, as discussed above, ISTM that we should consider partitions to be full relations. If partitions become sub-relations then we complicate many other parts of the machinery in the planner. This would also complicate many other things, since each rel currently has a single relfilenode for almost every aspect. We might think about doing this as a way of making "Global Indexes". If its all one relation, we already have them! But that would only work for indexes defined with the PPK as the leading column because otherwise we would have no way of adding or dropping new partitions without ripping the guts out of each index (and failing PPUC4.2 as a result). It turns out that we can have roughly the same kind of Pseudo Global Index whether we consider partitions to be relations or sub-relations. It seems much simpler to have partitions as relations... 4. Should all partitions have the same indexes? It also seems that we have somewhat opposing Use Cases: PPUC2 requires joins to achieve Partition Elimination, PPUC 5.2 requires us to have no indexes on older partitions (since they reside on tape). Some people will say that having all indexes defined the same eases maintenance. Others will say it gives us flexibility. I would argue in favour of flexibility now and automation later. 5. Constraints or specific Partitioning syntax? Partition Elimination relies upon being able to prove at execution time that two clauses in a query restriction are always false when taken together. Taken generically, extending the concept of Partitioning to the max, we might imagine that we could take any number of constraints, then AND them together with an arbitrarily complex restriction clause, solve it and then eliminate partitions. That seems too much work to solve that generically and provably. I aim for an easier subset, with a more bounded solution algorithm. If we know which constraints are to be used for Partition Elimination, and we make some restrictions on the types of restricion clauses that can take part in Partition Elimination, then the task gets easier. So, I think this requires explicit syntax, even though I am very attracted to minimal syntax implementations. Anyway, that was a bit of a ramble, so thanks for reading this far. Best regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > 1. Embellish inheritance or separate infrastructure? > It seems prudent to avoid building on that foundation, even though we > may decide to use some similar approaches. I disagree. The code is there, it could use work, and what you are basically proposing is to duplicate both the existing work and much of the improvement it needs. > 2. Individual Relations explicitly in the plan or MultiRelation plan > nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many > nodes?) This one is so obvious it hardly requires any discussion. You cannot have intelligent planning if you fold everything into a single plan node. I just finished getting rid of a similarly bad decision in the context of indexscan planning (ie, a hardwired approach to OR logic) --- let's not make that mistake again. > 5. Constraints or specific Partitioning syntax? > Partition Elimination relies upon being able to prove at execution time You mean plan time. > that two clauses in a query restriction are always false when taken > together. We already have a reasonably decent implementation of such proving for partial-index predicate handling. I see no reason not to use that. So I don't agree with the idea of special-purpose syntax or logic. regards, tom lane
On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > 1. Embellish inheritance or separate infrastructure? > > > It seems prudent to avoid building on that foundation, even though we > > may decide to use some similar approaches. > > I disagree. The code is there, it could use work, and what you are > basically proposing is to duplicate both the existing work and much > of the improvement it needs. Minefields need clearing someday, I suppose. Multiple inheritance isn't something I'll be spending time on though. > > 2. Individual Relations explicitly in the plan or MultiRelation plan > > nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many > > nodes?) > > This one is so obvious it hardly requires any discussion. You cannot > have intelligent planning if you fold everything into a single plan > node. That was my first thought, and my proposed approach. But many people have asked for the discussion to be aired, so thats why its mentioned. Could allow me to complete something useful though basic for 8.1 > I just finished getting rid of a similarly bad decision in the > context of indexscan planning (ie, a hardwired approach to OR logic) > --- let's not make that mistake again. Well, yes, I saw. ...but my problem remains: how do we handle the Partition Elimination via joins. What do you think of my later discussion on the possibility of nested loops joins via Multi-indexscans. I don't like that answer, but it does give us something...can you see a different way? > > 5. Constraints or specific Partitioning syntax? > > > Partition Elimination relies upon being able to prove at execution time > > You mean plan time. No, definitely execution time. Estimated only at plan time. We need to support Bound plans, with constant supplied as parameter. We also need to support Partition Elimination via restrictions using time- functions, most of which are Stable functions, which again are not evaluated until execution time. Yeh, we might be able to prove it at Plan time, but an implementation that offered Plan-time-only PE would be too restrictive. > > that two clauses in a query restriction are always false when taken > > together. > > We already have a reasonably decent implementation of such proving for > partial-index predicate handling. I see no reason not to use that. > So I don't agree with the idea of special-purpose syntax or logic. Oh? Thanks. I wasn't looking forward to that bit... I'd seen the partial CNF stuff for OR handling and the range clause identification for clause selectivity. Many thanks, Best Regards, Simon Riggs
Tom Lane <tgl@sss.pgh.pa.us> writes: > > Partition Elimination relies upon being able to prove at execution time > > You mean plan time. Fwiw, both are possible. In oracle there are (at least) three different cases: 1. For queries like "select * from tab" the plan shows a multiple partition scan. 2. For queries like "select * from tab where partition_key = 1" the plan shows the partition that the planner proved would suffice. 3. For queries like select * from tab where partition_key = ?" the plan shows a single partition scan of a partition to be determined at run-time. This last case can be very important for joins as well. For queries like select * from tab,tab2 where tab1.partition_key = tab2.fk" the planner can prove that each nested loop iteration will only require a single btree lookup. This is important since if it can't prove this the nested loop has to do multiple index lookups making it look very poor compared to a merge join or hash join. There are also cases where Oracle knows that the plan node will involve reading multiple partitions but not all. -- greg
On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote: > A more in-depth consideration of the major design options and trade-offs > available to us... this is an internals related discussion. > > Comments? > > 1. Embellish inheritance or separate infrastructure? > > Inheritance is a somewhat strange PostgreSQL feature, with a few > gotchas. The "big one" is the lack of a multi-relation index that can be > used against all of the inherited relations also. I think this can be solevd in two ways - 1. create a "merged index" which is similar to a partitioned table, where inserts go to the index corresponding to underlying table but selects (including uniqueness checks) are done by merging data from several indexes. This also solves the problem of fast adding and deleting the partitions to/from the global index. 2. second is variation of my old suggestion of using the fact that we have 1G storage tables for implementing "partitioned tables" - just make a mapping from each 1G (128kpage) range to subtable/offset and then build the index as usual. This has problems with max partitioned table size (same as our current max for ordinary tables) and also slow adding of new partitions - deleting can be done fast by mapping the allocated range to 'removed'. If 1. can be implemented reasonably efficiently for large number of merged indexes then there is no point in investigating 2. . > The full complexity of multiple inheritance, possible multi-level > inheritance doesn't seem to solve any problems for us. The main argument > in favour of improving inheritance as a route to partitioning is that > many people have already developed applications using it to provide some > of the desirable aspects of partitioning. The amount of code used to > provide some of these features is actually quite low, and the code > itself is already specialised. > > It seems prudent to avoid building on that foundation, even though we > may decide to use some similar approaches. I would not mind if we give up multiple inheritance. OTOH multi-level inheritance (and thus multi-level partitioning) can be quite useful sometimes. For example I could have old data partitioned by YEAR, but newer also sub-partitioned by month or even day. Then I could manually PE all previous year by simply doing my query against sales_2005 and let the automatic PE worry for doing the right thing within this year. Also, the multi-level nature of inheritance should be hidden from planner/optimiser by moving the sub-sub-sub tables up the hierarchy into a single flat append-like plan. At least until/if we start applying check constraints to table and its all child tables, in which case some of the PE could be done on intermediate nodes before expansion. > 2. Individual Relations explicitly in the plan or MultiRelation plan > nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many > nodes?) > > Currently, Inheritance produces a plan for a SeqScan that looks like > this... > > Append > SeqScan > SeqScan > SeqScan Seems reasonable for a seqscan :) I have an inheritance-paritioned table of about 50 tables and when I restrict it on some column that has an index then the plan does index- scan on bigger tables and seqscan with some old tables which have only a few thousands of rows, > ISTM fairly straightforward to produce a similar "static" plan along the > same lines, using Result nodes to implement Partition Elimination. > > Append > Result > SeqScan > Result > SeqScan > Result > SeqScan So you mean another node that looks inside seqscans restrictions and then determines if there is any chance that seqscan will return any rows ? Why can't this be done at planning time ? And if it can't be done at planning time, then how do you determine which plan is cheapest ? > Some may think that looks a little clunky, especially since the plan > length would increase as the number of Partitions increased. An > alternative, would be to have a single Scan node that performs a Multi- > Partition Scan, which looks like this... > > PartitionSeqScan > > There'd always one node, no matter how many partitions. Again, easy to > implement, since we just add a new type of Scan access method, or alter > the existing ones iff we access partitioned tables. We'd be able to use > physical tlists, and all the complexity is hidden. Pehaps we could have just one scan type which looks like this Scan And hide all the complexity within it ? ;p Greater visibility of internals is good, both for optimiser and for user trying to understand what is going on (explain analyse). I don't think many nodes are a problem per se, only when they cause some non-linear growth in some operations. > ... > > My current thinking is to implement explicitly partition plans. Agreed. <OT: philosopical musing> As a general guideline I think it is (almost) never a good idea to trade CPU cycles for disk accesses. The speed difference between CPU and disk seems destined to grow in the forseeable future and unless some cheap RAM-like storage suddenly emerges, this does not change. </OT: philosopical musing> > But what about join plans? There are Use Cases that depend almost > entirely upon joins for Partition Elimination rather than direct > restriction of the PPK. (PPUC2) > > How could a join plan be performed with the explicit plan shown above? > The only option would seem to be a Sort/Merge, which could be a disaster > if we only wanted a few rows returned from a very large table. I think the new bitmap scan code in 8.1 solves a similar problem, only at a smaller scale (eliminating pages/tuples instead of partitions). Perhaps something similar can be used here for PE. > If all partitions in the query had identical indexes on them, then we > have another option. In that case, each index could be thought to form > part of a larger index ordered initially on the Partitioning Key (PPK). > If the first column was actually the PPK, then the set of indexes would > be exactly equivalent to a Global Index. We can call this a Pseudo > Global Index. > > The Pseudo Global Index could also be used to enforce uniqueness. If all > of the composite indexes were defined unique and the index contained the > PPK as one of its columns, this would work. > > The index enforces > uniqueness within each partition and the PPK enforces uniqueness across > partitions because the same PPK value cannot be in two partitions. But only uniqueness of PPK, not any other columns. > We can also use the Pseudo Global Index as the object of a multi-index > scan node, since the PGI would be ordered on the PPK. The plan node > would use the partition boundary definitions to locate the starting > partition, then seek within the partition to the correct part of the > index then scan sideways, skipping to the next partition as required. > > This sounds a great idea, but it is exactly the technique I argued > against only 4 paragraphs ago for use with SeqScans. Also this works only for PPK and not other columns. > My current thinking is that we need a mix of the two approaches, though > I raise all of this here to gain everybody's input... I'm afraid that hiding the decisions inside the access methods will make things harder for the optimiser . Still there may be cases where smarter access methods make sense as an additional feture, though I cant come up with an example right now. > 3. Partitions are relations? Or sub-relations? > > If we ever want to implement explicit plans, as discussed above, ISTM > that we should consider partitions to be full relations. > ... > It seems much simpler to have partitions as relations... Agreed > 4. Should all partitions have the same indexes? > > It also seems that we have somewhat opposing Use Cases: PPUC2 requires > joins to achieve Partition Elimination, PPUC 5.2 requires us to have no > indexes on older partitions (since they reside on tape). > > Some people will say that having all indexes defined the same eases > maintenance. Others will say it gives us flexibility. > > I would argue in favour of flexibility now and automation later. Yes. > 5. Constraints or specific Partitioning syntax? > > Partition Elimination relies upon being able to prove at execution time > that two clauses in a query restriction are always false when taken > together. As much as possible of PE should be done at planning time, as only then can our cost-based optimiser make use of it. Execution-time PE (smart access methods) is mostly a black box for optimiser, and thus may often produce pessimal plans. > Taken generically, extending the concept of Partitioning to the max, we > might imagine that we could take any number of constraints, then AND > them together with an arbitrarily complex restriction clause, solve it > and then eliminate partitions. That seems too much work to solve that > generically and provably. I aim for an easier subset, with a more > bounded solution algorithm. This could be done by converting the "arbitrarily complex restriction clause" and convert it into a list of "ANDed simpler clauses" (I guess we already have most code needed for that) and then run over "simple enough" clauses. If we can prove that any 2 of them contradict then the partition can be liminated. There are 2 possibly expensive steps: 1. the conversion to "AND'ed list of simple clauses" (unknown complexity) 2. matching each of "simple" clauses in the and list with all others (should be N+(N-1)+(N-2)+..+(1) ~= 2N) complexity) fortunately 1. needs to be only once, as this part stays constant for the query and all constraints bfrom the subtable/partition (CHECK's and other) are appended to AND list. The appended list does also not need to be checked among themselves as they are by definition non-contradictory (unless the partition is empty ;) further decreasing the complexity. This approach can be made equal to your proposal if the check for simplicity of clauses returns true only for range and equality comparisons on PPK. Then later when the system is made to deal with more complicated pairs of clauses , the "is simple clause" check will be expanded correspondingly. Of course the checks themselves can bail out with "don't know" response any time, so the "is simple clause" is just an optimisation. For a query with a where clause expanding to 4 part ANDed list and a subtable with 3 check's we need to do only 4 x 3 = 12 PE comparisons per partition if we compare each restrictio from where clause with each from checks. If we only have 1 "simple enough" clause in WHERE and 1 in CHECK then we need to do 4 "simple enough" check per plan + 3 per partiotion + one actual PE check per partition. > If we know which constraints are to be used for Partition Elimination, > and we make some restrictions on the types of restricion clauses that > can take part in Partition Elimination, then the task gets easier. > > So, I think this requires explicit syntax, even though I am very > attracted to minimal syntax implementations. I don't like explicit syntax. We have objected adding optimiser hints so far, why should we start now ? Maybe we could just consider all range and equality checks (IN expands to a list of OR's and does not qualify here, NOT IN does, as this is a list of ANDs) for the start and move on later if needed. > Anyway, that was a bit of a ramble, so thanks for reading this far. So were my comments :) Thanks for everybody who has read this far ! -- Hannu Krosing <hannu@skype.net>
On T, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote: > On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote: > There are 2 possibly expensive steps: > > 1. the conversion to "AND'ed list of simple clauses" (unknown > complexity) > > 2. matching each of "simple" clauses in the and list with all others > (should be N+(N-1)+(N-2)+..+(1) ~= 2N) complexity) actually not 2N but (N * ((N-1)/2) , thus 3 clauses need 2+1=3 checks and 11 clasues need (10+9+..+1) = 55 checks. -- Hannu Krosing <hannu@skype.net>
On Tue, May 10, 2005 at 12:16:17AM +0100, Simon Riggs wrote: > On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote: > > Simon Riggs <simon@2ndquadrant.com> writes: > > > 1. Embellish inheritance or separate infrastructure? > > > > > It seems prudent to avoid building on that foundation, even though we > > > may decide to use some similar approaches. > > > > I disagree. The code is there, it could use work, and what you are > > basically proposing is to duplicate both the existing work and much > > of the improvement it needs. > > Minefields need clearing someday, I suppose. > > Multiple inheritance isn't something I'll be spending time on though. I'm also not sure that inheritance would support all cases. For example, in some situations PPUC3 leads to doing individual value partitioning, where a partition is guaranteed to have only one value for part of the PPK, meaning that there's no reason to store that part of the key in the partition itself. Currently this is possible with partitions built out of views but not out of inherited tables. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
"Jim C. Nasby" <decibel@decibel.org> writes: > On Tue, May 10, 2005 at 12:16:17AM +0100, Simon Riggs wrote: >> On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote: >>> I disagree. The code is there, it could use work, and what you are >>> basically proposing is to duplicate both the existing work and much >>> of the improvement it needs. >> >> Minefields need clearing someday, I suppose. >> >> Multiple inheritance isn't something I'll be spending time on though. > I'm also not sure that inheritance would support all cases. My point seems to have been widely misunderstood ;-) I was not suggesting that partitioning must be built on top of inheritance, nor vice versa, nor that they need to support exactly the same feature sets. What I am saying is that if you adopt an NIH attitude to the existing code, you are going to end up with a lot of duplication. There is a substantial amount of potentially common infrastructure, as well as common problems that you might as well solve for both cases at once. (Remember the inventor's paradox: the more general problem is often easier to solve.) In particular, the planning problems look essentially the same to me, as does the indexing problem. regards, tom lane
On Tue, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote: > On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote: > > ISTM fairly straightforward to produce a similar "static" plan along the > > same lines, using Result nodes to implement Partition Elimination. > > > > Append > > Result > > SeqScan > > Result > > SeqScan > > Result > > SeqScan > > So you mean another node that looks inside seqscans restrictions and > then determines if there is any chance that seqscan will return any > rows ? > > Why can't this be done at planning time ? And if it can't be done at > planning time, then how do you determine which plan is cheapest ? It can if you have constants. If you have you have bind variables or Stable functions then it can only be done for certain at execution time, though can be estimated prior at planning time. This is exactly what is done currently in the planner for constant eval and simplification. > > If all partitions in the query had identical indexes on them, then we > > have another option. In that case, each index could be thought to form > > part of a larger index ordered initially on the Partitioning Key (PPK). > > If the first column was actually the PPK, then the set of indexes would > > be exactly equivalent to a Global Index. We can call this a Pseudo > > Global Index. > > > > The Pseudo Global Index could also be used to enforce uniqueness. If all > > of the composite indexes were defined unique and the index contained the > > PPK as one of its columns, this would work. > > > > The index enforces > > uniqueness within each partition and the PPK enforces uniqueness across > > partitions because the same PPK value cannot be in two partitions. > > But only uniqueness of PPK, not any other columns. No, it would work for *any* set of columns that included the PPK. > > We can also use the Pseudo Global Index as the object of a multi-index > > scan node, since the PGI would be ordered on the PPK. The plan node > > would use the partition boundary definitions to locate the starting > > partition, then seek within the partition to the correct part of the > > index then scan sideways, skipping to the next partition as required. > > > > This sounds a great idea, but it is exactly the technique I argued > > against only 4 paragraphs ago for use with SeqScans. > > Also this works only for PPK and not other columns. This would work for any set of columns, as long as the PPK were the leftmost column in the set. > > My current thinking is that we need a mix of the two approaches, though > > I raise all of this here to gain everybody's input... > > I'm afraid that hiding the decisions inside the access methods will make > things harder for the optimiser . Me too. > Still there may be cases where smarter access methods make sense as an > additional feture, though I cant come up with an example right now. Look at PPUC 2 Join partition elimination, which is the classic Fact to TimeDimension join. Hannu, It's time to follow your thinking for the first implementation: just implement Constraints as possible gating tests on each inherited relation. No syntax changes, just some light but complex internals. Once that works, we can discuss improving it to solve other required use cases. Until we're at that stage, we should leave further discussion to one side, now that we have the basic architecture agreed: inheritance. Best Regards, Simon Riggs
On Tue, 2005-05-10 at 16:44 +0300, Hannu Krosing wrote: > On T, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote: > > On E, 2005-05-09 at 23:30 +0100, Simon Riggs wrote: > > > There are 2 possibly expensive steps: > > > > 1. the conversion to "AND'ed list of simple clauses" (unknown > > complexity) > > > > 2. matching each of "simple" clauses in the and list with all others > > (should be N+(N-1)+(N-2)+..+(1) ~= 2N) complexity) > > actually not 2N but (N * ((N-1)/2) , thus 3 clauses need 2+1=3 checks > and 11 clasues need (10+9+..+1) = 55 checks. Well, it doesn't need to be quite that bad. We can just check each of the table constraints against each of the Restrict clauses. We don't need to test the constraints against themselves, since the table would be empty if any ever turned out to be mutually exclusive, so we wouldn't be saving time doing it. Later, we can enforce constraints to be mutually exclusive at the time they are created. We don't need to test the Restrict clauses against themselves either, since we can assume that the user is smart enough to submit SQL that returns some rows. Eval will pick up the stupid case if there is one. Anyway, the existing code has an even simpler heuristic for avoiding the full check... Best Regards, Simon Riggs
On Tue, 2005-05-10 at 15:01 -0400, Tom Lane wrote: > "Jim C. Nasby" <decibel@decibel.org> writes: > > On Tue, May 10, 2005 at 12:16:17AM +0100, Simon Riggs wrote: > >> On Mon, 2005-05-09 at 18:53 -0400, Tom Lane wrote: > >>> I disagree. The code is there, it could use work, and what you are > >>> basically proposing is to duplicate both the existing work and much > >>> of the improvement it needs. > >> > >> Minefields need clearing someday, I suppose. > >> > >> Multiple inheritance isn't something I'll be spending time on though. > > > I'm also not sure that inheritance would support all cases. > > My point seems to have been widely misunderstood ;-) > > I was not suggesting that partitioning must be built on top of > inheritance, nor vice versa, nor that they need to support exactly > the same feature sets. What I am saying is that if you adopt an > NIH attitude to the existing code, you are going to end up with a > lot of duplication. There is a substantial amount of potentially > common infrastructure, as well as common problems that you might as > well solve for both cases at once. (Remember the inventor's paradox: > the more general problem is often easier to solve.) In particular, > the planning problems look essentially the same to me, as does the > indexing problem. Agreed - no worries. I was already seeing that I can't solve all of the Bizgres use cases at once, so its time for me to pick one and make that work. If I can do at least one in a way that solves a general case problem, then I'm happy to do it that way. We may learn something during the journey about the best approach for the other use cases. ...and at least we would have something to put into 8.1 Best Regards, Simon Riggs
On T, 2005-05-10 at 23:09 +0100, Simon Riggs wrote: > On Tue, 2005-05-10 at 16:31 +0300, Hannu Krosing wrote: > > > If all partitions in the query had identical indexes on them, then we > > > have another option. In that case, each index could be thought to form > > > part of a larger index ordered initially on the Partitioning Key (PPK). > > > If the first column was actually the PPK, then the set of indexes would > > > be exactly equivalent to a Global Index. We can call this a Pseudo > > > Global Index. > > > > > > The Pseudo Global Index could also be used to enforce uniqueness. If all > > > of the composite indexes were defined unique and the index contained the > > > PPK as one of its columns, this would work. > > > > > > The index enforces > > > uniqueness within each partition and the PPK enforces uniqueness across > > > partitions because the same PPK value cannot be in two partitions. > > > > But only uniqueness of PPK, not any other columns. > > No, it would work for *any* set of columns that included the PPK. What I meant was that it would quarantee the uniqueness of the whole set only, not any other columns except the PPK. and in case PPK was itself uniqe the other columns don't matter at all. > > Still there may be cases where smarter access methods make sense as an > > additional feture, though I cant come up with an example right now. > > Look at PPUC 2 Join partition elimination, which is the classic Fact to > TimeDimension join. Maybe using a global index (+ bitmap scan if number of matching rows is large and not clustered) here is enough for stage 1 implementation. As PPUC2 needs PE elimination step for each and every value, using global index for "kind of PE" could be the most efficient way to do it for quite large class of PPUC2 queries. -- Hannu Krosing <hannu@skype.net>