Thread: Re: Auto Partitioning Patch - WIP version 1
Hi,
Am planning to do the above (auto generation of rules from the checks specified for the partitions) by using the check constraint specified for each partition. This constraint's raw_expr field ends up becoming the whereClause for the rule specific to that partition.
One question is whether we should allow auto creation of UPDATE rules, given that updates can end up spanning multiple partitions if the column on which partitioning is specified gets updated?
Also, if we decide to auto-add rules for UPDATE, the raw_expr will need to be modified to refer to "OLD."col, which can be quite a headache. As far as I could see, we do not have parsetree walker/mutator functions in the code.
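For illustration, an INSERT rule of the kind this auto-creation might emit, sketched against a hypothetical quarter-partitioned table (the names and check expression are made up; the rule's WHERE clause is just the partition's check constraint with column references qualified by NEW):

```sql
-- Hypothetical parent and one partition, as set up by hand today:
CREATE TABLE measurements (id int, logdate date);
CREATE TABLE measurements_2007q1 (
    CHECK (logdate >= '2007-01-01' AND logdate < '2007-04-01')
) INHERITS (measurements);

-- The rule auto-creation would generate, reusing the check's raw_expr
-- as the whereClause:
CREATE RULE measurements_ins_2007q1 AS
    ON INSERT TO measurements
    WHERE NEW.logdate >= '2007-01-01' AND NEW.logdate < '2007-04-01'
    DO INSTEAD
        INSERT INTO measurements_2007q1 VALUES (NEW.*);
```

The UPDATE case is the hard one noted above: there the qualification would have to reference OLD, and a row whose partition key changes would need to move between partitions.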
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
Hi,

NikhilS wrote:
>> The following things are TODOs:
>>
>> iv) Auto generate rules using the checks mentioned for the partitions, to
>> handle INSERTs/DELETEs/UPDATEs to navigate them to the appropriate child.
>> Note that checks specified directly on the master table will get inherited
>> automatically.
>
> Am planning to do the above by using the check constraint specified for each
> partition. This constraint's raw_expr field ends up becoming the whereClause
> for the rule specific to that partition.

I appreciate your efforts, but I'm not sure if this has been discussed enough. There seem to be two ideas floating around:

 - you are heading for automating the current kludge, which involves
   creating partitions and constraints by hand. AFAICT, you want to
   support list and range partitioning.

 - Simon Riggs has proposed partitioning functions, which could easily
   handle any type of partitioning (hash, list, range and any mix of
   those).

Both proposals do not have much to do with the missing multi-table indices. It's clear to me that we have to implement those someday, anyway.

AFAICT, the first proposal does not ease the task of writing correct constraints, so that we are sure that each row ends up in exactly one partition. The second would. But the second proposal makes it hard for the planner to choose the right partitions, i.e. if you request a range of ids, the planner would have to query the partitioning function for every possible value. The first variant could use constraint exclusion for that.

Neither of the two has gone as far as thinking about switching from one partitioning rule set to another. That gets especially hard if you consider database restarts during re-partitioning.

Here are some thoughts I have come up with recently. This is all about how to partition, and not about how to implement multi-table indices. Sorry if this got somewhat longish.

And no, this is certainly not for 8.3 ;-)

I don't like partitioning rules which leave open questions, i.e. when there are values for which the system does not have an answer (and would have to fall back to a default), or even worse, where it could give multiple correct answers. Given that premise, I see only two basic partitioning types:

 - splits: these can be used for what's commonly known as list and range
   partitioning. If you want customers A-M to end up on partition 1 and
   customers N-Z on partition 2, you would split between M and N. (That
   way, the system would still know what to do with a customer name
   beginning with an @ sign, for example. The only requirement for a
   split is that the underlying data type supports comparison operators.)

 - modulo: I think this is commonly known as hash partitioning. It
   requires an integer input, possibly by hashing, and calculates the
   remainder of a division by n. That should give an equal distribution
   among n partitions.

Besides the expression to work on, a split always needs one argument, the split point, and divides into two buckets. A modulo splits into two or more buckets and needs the divisor as an argument.

Of course, these two types can be combined. I like to think of these combinations as trees. Let me give you a simple example:

          table customers
                |
                |
        split @ name >= 'N'
              /   \
             /     \
         part1     part2

A combination of the two would look like:

           table invoices
                 |
                 |
         split @ id >= 50000
               /   \
              /     \
  hash(id) modulo 3   part4
      /    |    \
     /     |     \
  part1  part2  part3

Knowledge of these trees would allow the planner to choose more wisely, i.e. given a comparative condition (WHERE id > 100000) it could check the splits in the partitioning tree and only scan the partitions necessary. Likewise with an equality condition (WHERE id = 1234). As it's a better definition of the partitioning rules, the planner would not have to check the constraints of all partitions, as the current constraint exclusion feature does.

It might even be likely that querying this partitioning tree and then scanning the single-table index will be faster than an index scan on a multi-table index. At least, I cannot see why it should be any slower.

Such partitioning rule sets would allow us to re-partition by adding a split node on top of the tree. The split point would have to increment together with the progress of moving rows around among the partitions, so that the database would always be in a consistent state regarding partitioning. Additionally, it's easy to figure out when no or only little moving around is necessary, i.e. when adding a split @ id >= 1000 to a table which only has ids < 1000.

I believe that this is a well defined partitioning rule set, which holds more information for the planner than a partitioning function ever could. And it is less of a foot-gun than hand-written constraints, because it does not allow the user to specify illegal partitioning rules (i.e. it's always guaranteed that every row ends up in exactly one partition).

Of course, it's far more work than either of the above proposals, but maybe we can go there step by step? Maybe NikhilS's proposal is more like a step towards such a beast?

Feedback of any form is very welcome.

Regards

Markus
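To connect this to the current mechanism: the invoices tree above could be written out by hand as inheritance plus check constraints, roughly like this (a sketch, names hypothetical; note that nothing stops a user from getting these constraints wrong, which is exactly the foot-gun the rule-set proposal avoids):

```sql
CREATE TABLE invoices (id int, amount numeric);

-- split @ id >= 50000, right bucket:
CREATE TABLE invoices_part4 (CHECK (id >= 50000)) INHERITS (invoices);

-- left bucket, subdivided by id modulo 3:
CREATE TABLE invoices_part1
    (CHECK (id < 50000 AND id % 3 = 0)) INHERITS (invoices);
CREATE TABLE invoices_part2
    (CHECK (id < 50000 AND id % 3 = 1)) INHERITS (invoices);
CREATE TABLE invoices_part3
    (CHECK (id < 50000 AND id % 3 = 2)) INHERITS (invoices);
```

Constraint exclusion can use the split's range conditions directly, but can prune the modulo buckets only when a query's quals match the `id % 3` expressions, which illustrates the planner asymmetry discussed above.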
Hi,
Thanks Markus.
When I submitted the proposal, AFAIR there was no objection to going with the first proposal. Yes, there was a lot of forward-looking discussion, but since what I had proposed (at least syntax-wise) was similar/closer to MySQL and Oracle, I did not see anyone objecting to it. I think SQL Server provides partitioning functions similar to Simon's proposal. And all along, I had maintained that I wanted to automate, as far as possible, the existing mechanism for partitioning. I do not remember anyone objecting to this either.
Our current partitioning solution is based on inheritance. With that in mind, for 8.3 I thought an implementation based on auto rules creation would be the way to go.
Having said that, obviously I would want to go with the consensus on this list as to what we think is the *best* way to go forward with partitioning.
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
On Wed, 2007-04-04 at 14:20 +0200, Markus Schiltknecht wrote:
> Both proposals do not have much to do with the missing multi-table
> indices. It's clear to me that we have to implement those someday,
> anyway.

I agree with much of your post, though this particular point caught my eye. If you'll forgive me for jumping on an isolated point in your post:

Multi-table indexes sound like a good solution until you consider how big they would be. The reason we "need" a multi-table index is because we are using partitioning, which we wouldn't be doing unless the data was fairly large. So the index is going to be (num partitions * fairly-large) in size, which means it's absolutely enormous. Adding and dropping partitions also becomes a management nightmare, so overall multi-table indexes look unusable to me. Multi-table indexes also remove the possibility of loading data quickly, then building an index on the data, then adding the table as a partition - both the COPY and the CREATE INDEX would be slower with a pre-existing multi-table index.

My hope is to have a mechanism to partition indexes, or recognise that they are partitioned, so that a set of provably-distinct unique indexes can provide the exact same functionality as a single large unique index, just without the management nightmare.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes:
> On Wed, 2007-04-04 at 14:20 +0200, Markus Schiltknecht wrote:
>> Both proposals do not have much to do with the missing multi-table
>> indices. It's clear to me that we have to implement those someday,
>> anyway.
>
> I agree with much of your post, though this particular point caught my
> eye. If you'll forgive me for jumping on an isolated point in your post:
>
> Multi-table indexes sound like a good solution until you consider how
> big they would be.

Put another way, multi-table indexes defeat the whole purpose of having partitioned the table in the first place. If you could have managed a single massive index then you wouldn't have bothered partitioning.

However, there is a use case that can be handled by a kind of compromise index. Indexes that have leading columns which restrict all subtrees under that point to a single partition can be handled by a kind of meta-index. So you have one index which just points you to the right partition and corresponding index.

That lets you enforce unique constraints as long as the partition key is part of the unique constraint.

In practice people are usually pretty comfortable not having the database enforce such a constraint, since it's easy to have the application enforce these types of constraints anyway.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote:
> My hope is to have a mechanism to partition indexes or recognise that
> they are partitioned, so that a set of provably-distinct unique indexes
> can provide the exact same functionality as a single large unique index,
> just without the management nightmare.

Will this address the fairly common data design problem where we need to ensure that a given value is unique across several tables (possibly siblings, possibly not)? If so, then full steam ahead.

cheers

andrew
Hi,

NikhilS wrote:
> Our current partitioning solution is based on inheritance. With that in
> mind, for 8.3 I thought an implementation based on auto rules creation
> would be the way to go.

That's completely reasonable. And as I've said, it's probably even a step towards what I've outlined (automation of creation of partitions).

Regards

Markus
Hi,

Simon Riggs wrote:
> I agree with much of your post, though this particular point caught my
> eye. If you'll forgive me for jumping on an isolated point in your post:

No problem.

> Multi-table indexes sound like a good solution until you consider how
> big they would be. The reason we "need" a multi-table index is because
> we are using partitioning, which we wouldn't be doing unless the data
> was fairly large. So the index is going to be (num partitions *
> fairly-large) in size, which means it's absolutely enormous. Adding and
> dropping partitions also becomes a management nightmare, so overall
> multi-table indexes look unusable to me. Multi-table indexes also remove
> the possibility of loading data quickly, then building an index on the
> data, then adding the table as a partition - both the COPY and the
> CREATE INDEX would be slower with a pre-existing multi-table index.

I agree. (And thanks to TOAST, we never have very wide tables with relatively few rows, right? I mean, something like pictures stored in bytea columns or some such.)

> My hope is to have a mechanism to partition indexes or recognise that
> they are partitioned, so that a set of provably-distinct unique indexes
> can provide the exact same functionality as a single large unique index,
> just without the management nightmare.

Uhm... I don't quite get what you mean by "provably-distinct unique indexes". As long as the first columns of an index are equal to all columns of the partitioning key, there is no problem. You could easily reduce to simple per-table indexes, using the partitioning rule set to decide which index to query. But how to create a (unique) index which is completely different from the partitioning key?

Regards

Markus
Hi,

Gregory Stark wrote:
> Put another way, multi-table indexes defeat the whole purpose of having
> partitioned the table in the first place. If you could have managed a single
> massive index then you wouldn't have bothered partitioning.

That depends very much on the implementation of the multi-table index, as you describe below. I think the major missing part is not *how* such a meta-index should work - it's easily understandable that one could use the per-table indices - but a programming interface, similar to the current index scan or sequential scan facility, which could return a table and tuple pointer, no?

> However there is a use case that can be handled by a kind of compromise index.
> Indexes that have leading columns which restrict all subtrees under that point
> to a single partition can be handled by a kind of meta-index. So you have one
> index which just points you to the right partition and corresponding index.

Yeah.

> That lets you enforce unique constraints as long as the partition key is part
> of the unique constraint.

Is that already sufficient? That would alter the ordering of the columns in the index, no? I mean:

CREATE INDEX x ON test(a, b, c);

isn't the same as

CREATE INDEX x ON test(c, b, a);

That's why I'd say the first columns of an index would have to be equal to all of the columns used in the partitioning key.

Regards

Markus
On Wed, 2007-04-04 at 16:31 +0200, Markus Schiltknecht wrote:
> But how to create an (unique) index which is completely different from
> the partitioning key?

Don't?

Most high volume tables are Fact tables with potentially more than 1 row per Object/Dimension, so the unique index isn't appropriate in those cases.

When partitioning a Major Entity it's much easier to regard the PK as the partitioning key + unique key, which is frequently possible, even if it does break the exhortation against intelligent keys.

I wouldn't stand in the way of someone trying to add that functionality, but I would describe the use case as fairly narrow.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Hi,

Simon Riggs wrote:
> Most high volume tables are Fact tables with potentially more than 1 row
> per Object/Dimension, so the unique index isn't appropriate in those
> cases.
>
> When partitioning a Major Entity its much easier to regard the PK as the
> partitioning key + unique key, which is frequently possible, even if it
> does break the exhortation against intelligent keys.

Okay, so you are saying that a general purpose multi-table index isn't needed, but instead something based on the partitioning rule set and the per-table indexes should be sufficient for the vast majority of cases?

Regards

Markus
On Wed, Apr 04, 2007 at 10:07:39AM -0400, Andrew Dunstan wrote:
> Simon Riggs wrote:
> > My hope is to have a mechanism to partition indexes or recognise
> > that they are partitioned, so that a set of provably-distinct
> > unique indexes can provide the exact same functionality as a single
> > large unique index, just without the management nightmare.
>
> Will this address the fairly common data design problem where we
> need to ensure that a given value is unique across several tables
> (possibly siblings, possibly not)?

That would be where the provably-distinct part comes in, so yes.

> If so, then full steam ahead.

Cheers,
D

-- 
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter
Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
"Markus Schiltknecht" <markus@bluegap.ch> writes:
> CREATE INDEX x ON test(a, b, c);
>
> That's why I'd say, the first column of an index would have to be equal to all
> of the columns used in the partitioning key.

That's certainly the simple case. It would be nice to be able to create an index like that and have the system automatically recognize that the leading column is identical to the partition key of (a), and therefore build indexes on each partition on (b,c).

However there are also cases such as where you have a=0..99 in one partition and a=100..199 in partition two, etc. It could still automatically build indexes on (a,b,c) on each partition and somehow note that the unique constraint is guaranteed across the whole partitioned table.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
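Gregory's second case, spelled out as a sketch (hypothetical names; this is what the system would ideally derive on its own from a requested unique index on (a, b, c)):

```sql
CREATE TABLE t (a int, b int, c int);
CREATE TABLE t_part1 (CHECK (a >= 0   AND a < 100)) INHERITS (t);
CREATE TABLE t_part2 (CHECK (a >= 100 AND a < 200)) INHERITS (t);

-- Because the ranges of a do not overlap, any (a, b, c) value can land
-- in only one partition, so these per-partition unique indexes together
-- enforce uniqueness of (a, b, c) across the whole set:
CREATE UNIQUE INDEX t_part1_abc ON t_part1 (a, b, c);
CREATE UNIQUE INDEX t_part2_abc ON t_part2 (a, b, c);
```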
Hi,

Gregory Stark wrote:
> However there are also cases such as where you have a=0..99 in one partition
> and a=100..199 in partition two, etc. It could still automatically build
> indexes on (a,b,c) on each partition and somehow note that the unique
> constraint is guaranteed across the whole partitioned table.

Uhm... yes, because 'a' is the partitioning key. According to my outline for partitioning rule sets, you would have a split @ a >= 100. Probably another one @ a >= 200, etc... but nonetheless, 'a' is the only column needed to decide which partition a row has to end up in, so 'a' is the only column in the partitioning key.

What I'm saying is that, given your example, it's not easily possible to have an index on (b,a), even if 'a' is also in the partitioning key. It's very well possible to emulate a multi-table index on (a,b), though.

Brainstorming about this somewhat more: how about having multiple columns in the partitioning key, i.e. 'a' and 'b', and the following rule set (which admittedly is somewhat special):

           table sample
                 |
                 |
         split @ a >= 100
               /   \
              /     \
   split @ b >= 100   part3
        /   \
       /     \
   part1     part2

An index on (a, b) could easily be 'emulated' by having such an index on all the partitions, but can we have an index on (b, a) like that? Probably not, because at the first split we would have to duplicate. I.e. for an index scan on 'b = 22', we would have to scan the index on part3 as well as part1.

Thus one can say that a multi-table index can only easily be 'emulated' if it has the same columns as the partitioning key, in the same order. For the above example, these ones would be possible: (a), (a,b), (a,b,...).

Yet another thought: the emulation of multi-table indexes, in this case, is like concatenating the indexes of the partitions in the right order.

Asking for an index scan for 'WHERE a >= 95 AND a <= 105' when having a split at a >= 100, you would have to start on the index in the left bucket (with a < 100) and return everything until the end of the index, then continue on the index in the right bucket (with a >= 100). So you also have to be able to determine an order, which is easily possible for splits, but not so simple for modulos (hash partitioning). For such a modulo node, the executor would have to start multiple index scans, i.e.:

           table sample
                 |
                 |
          'id' modulo 4
          /   |   |   \
         /    |   |    \
    part1 part2 part3 part4

When scanning for a range (i.e. 'WHERE id >= 5 AND id <= 17'), the planner would have to request an index scan on each of the partitions, joining the results in the right order.

So, why not completely emulate all multi-table index scans? The above restriction would disappear if we could teach the planner and executor how to join multiple index scan results, no?

Questioning the other way around: do we need any sort of multi-table indexes at all, or isn't it enough to teach the planner and executor how to intelligently scan through (possibly) multiple indexes to get what is requested?

Regards

Markus
On Wed, 2007-04-04 at 20:55 +0200, Markus Schiltknecht wrote:
> Questioning the other way around: do we need any sort of multi-table
> indexes at all, or isn't it enough to teach the planner and executor how
> to intelligently scan through (possibly) multiple indexes to get what is
> requested?

No, I don't think we need multi-table indexes at all.

The planner already uses the Append node to put together multiple plans. The great thing is it will put together IndexScans and SeqScans as applicable. No need for multi-scans as a special node type.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote:
> On Wed, 2007-04-04 at 20:55 +0200, Markus Schiltknecht wrote:
>
>> Questioning the other way around: do we need any sort of multi-table
>> indexes at all, or isn't it enough to teach the planner and executor how
>> to intelligently scan through (possibly) multiple indexes to get what is
>> requested?
>
> No, I don't think we need multi-table indexes at all.

If we don't have multi-table indexes, how do we enforce a primary key against a partitioned set? What about non primary keys that are just UNIQUE? What about check constraints that aren't a part of the exclusion?

Joshua D. Drake

> The planner already uses the Append node to put together multiple plans.
> The great thing is it will put together IndexScans and SeqScans as
> applicable. No need for multi-scans as a special node type.

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
David Fetter wrote:
> On Wed, Apr 04, 2007 at 10:07:39AM -0400, Andrew Dunstan wrote:
>
>> Simon Riggs wrote:
>>
>>> My hope is to have a mechanism to partition indexes or recognise
>>> that they are partitioned, so that a set of provably-distinct
>>> unique indexes can provide the exact same functionality as a single
>>> large unique index, just without the management nightmare.
>>
>> Will this address the fairly common data design problem where we
>> need to ensure that a given value is unique across several tables
>> (possibly siblings, possibly not)?
>
> That would be where the provably-distinct part comes in, so yes.

That assumes you can provide some provably distinct test. In the general case I have in mind, that isn't so.

cheers

andrew
Hi,

Joshua D. Drake wrote:
> If we don't have multi-table indexes how do we enforce a primary key
> against a partitioned set?

The executor would have to be clever enough to not do a single index scan, but possibly scan through multiple indexes when asking for uniqueness, depending on the partitioning rule set.

Regards

Markus
Simon Riggs wrote:
> The planner already uses the Append node to put together multiple plans.
> The great thing is it will put together IndexScans and SeqScans as
> applicable. No need for multi-scans as a special node type.

Yes... only that mixing 'concurrent' index scans in the right order would probably save us an extra sort step in some cases. Consider this with hash partitioning on (id):

SELECT * FROM test WHERE id > 1 AND id < 9999999 ORDER BY id;

Every partition should have an index on (id), so we already have pretty well sorted data, we just need to mix the results of the index scans in the correct order, no?

Regards

Markus
Andrew Dunstan wrote:
> David Fetter wrote:
>> That would be where the provably-distinct part comes in, so yes.
>
> That assumes you can provide some provably distinct test. In the general
> case I have in mind that isn't so.

Could you please give a somewhat more concrete example? I'm not following here.

Thanks

Markus
Markus Schiltknecht wrote:
> Andrew Dunstan wrote:
>> David Fetter wrote:
>>> That would be where the provably-distinct part comes in, so yes.
>>
>> That assumes you can provide some provably distinct test. In the
>> general case I have in mind that isn't so.
>
> Could you please give a somewhat more concrete example, I'm not
> following here.

What I'm asking about has nothing much to do with partitioning. Say I have two tables, each with a field FKed to a field in a third table. I'd like the values to be unique across the referring tables. Now, there are various tricks that can be played, either with custom triggers or redundant data, to do this, but there's no easy way. However, a multi-table unique index would do it for me quite nicely, if we could create such a thing. However, I don't know how to set up a test for provable distinctness in this general case.

I guess my point was really that multi-table indexes might have uses beyond partitioning.

cheers

andrew
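One of the "custom triggers" tricks Andrew mentions, sketched for the two-referring-tables case (hypothetical names; the naive check below is racy under concurrent inserts without extra locking):

```sql
CREATE TABLE master  (val int PRIMARY KEY);
CREATE TABLE child_a (val int REFERENCES master);
CREATE TABLE child_b (val int REFERENCES master);

CREATE FUNCTION check_val_unique() RETURNS trigger AS $$
BEGIN
    -- Reject the new value if it already appears in either referring table.
    IF EXISTS (SELECT 1 FROM child_a WHERE val = NEW.val)
       OR EXISTS (SELECT 1 FROM child_b WHERE val = NEW.val) THEN
        RAISE EXCEPTION 'value % already used', NEW.val;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER child_a_uniq BEFORE INSERT ON child_a
    FOR EACH ROW EXECUTE PROCEDURE check_val_unique();
CREATE TRIGGER child_b_uniq BEFORE INSERT ON child_b
    FOR EACH ROW EXECUTE PROCEDURE check_val_unique();
```

A multi-table unique index would replace all of this with a single declarative constraint.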
Hi,

Andrew Dunstan wrote:
> I guess my point was really that multi-table indexes might have uses
> beyond partitioning.

Aha, now I understand. Thanks for the clarification.

> Say I have two tables, each with a field FKed to a field in a third
> table. I'd like to create the values to be unique across the referring
> tables. Now, there are various tricks that can be played either with
> custom triggers or redundant data to do this, but there's no easy way.
> However, a multi-table unique index would do it for me quite nicely, if
> we could create such a thing.

Maybe it's better to go in a similar direction and think of it as a multi-table uniqueness constraint, which internally uses multiple single-table indexes?

Regards

Markus
On Wed, 2007-04-04 at 12:10 -0700, Joshua D. Drake wrote:
> Simon Riggs wrote:
>> On Wed, 2007-04-04 at 20:55 +0200, Markus Schiltknecht wrote:
>>
>>> Questioning the other way around: do we need any sort of multi-table
>>> indexes at all, or isn't it enough to teach the planner and executor how
>>> to intelligently scan through (possibly) multiple indexes to get what is
>>> requested?
>>
>> No, I don't think we need multi-table indexes at all.
>
> If we don't have multi-table indexes how do we enforce a primary key
> against a partitioned set? What about non primary keys that are just
> UNIQUE? What about check constraints that aren't a part of the exclusion?

What I've been saying is that there is a way to do this that avoids the need for multi-table indexes (MTIs), see earlier discussion. That way avoids the massive performance overheads of MTIs, and also covers most use-cases I can personally imagine. I can come up with arbitrary examples that require them, but I've not seen one that makes sense in a real business app. Calling columns a, b and c disguises the validity of the example, IMHO.

I'm not against someone else writing them, and I'm sure it's a great intellectual challenge, but I doubt whether it is worth the trouble anytime soon, because the real range of uses for them is not that wide. Sure, Oracle has them, but in my view they are welcome to them too.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Wednesday 04 April 2007 09:19, NikhilS wrote:
> Our current partitioning solution is based on inheritance. With that in
> mind, for 8.3 I thought an implementation based on auto rules creation
> would be the way to go.

The only problem I have with this is that the shops I know with big partitioned tables favor triggers over rules, for both performance reasons and a cleaner implementation. Even with automated rule creation this isn't going to change AFAICS... not to mention we already create our rules & triggers automatically, so really this just isn't exciting to me (though it may make it easier for people getting in on the ground floor).

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat wrote:
> On Wednesday 04 April 2007 09:19, NikhilS wrote:
>> Our current partitioning solution is based on inheritance. With that in
>> mind, for 8.3 I thought an implementation based on auto rules creation
>> would be the way to go.
>
> The only problem I have with this is that the shops I know with big
> partitioned tables favor triggers over rules for both performance reasons and
> a cleaner implementation. Even with automated rule creation this isn't going
> to change AFAICS... not to mention we already create our rules & triggers
> automatically, so really this just isn't exciting to me (though it may make
> it easier for people getting in on the ground floor)

I second this. The trigger route is much more maintainable than the rule route. IMO what really needs to happen is something more low level, where there are no DBA-visible changes. Triggers also have overhead; it would be nice to get a little more bare metal with this.

Sincerely,

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
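For comparison with the auto-created rules, the trigger route those shops use typically routes rows in a BEFORE INSERT trigger on the parent, something like this (a sketch, hypothetical names):

```sql
CREATE TABLE measurements (id int, logdate date);
CREATE TABLE measurements_2007q1 (
    CHECK (logdate >= '2007-01-01' AND logdate < '2007-04-01')
) INHERITS (measurements);

CREATE FUNCTION measurements_route() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= '2007-01-01' AND NEW.logdate < '2007-04-01' THEN
        INSERT INTO measurements_2007q1 VALUES (NEW.id, NEW.logdate);
    ELSE
        RAISE EXCEPTION 'no partition for logdate %', NEW.logdate;
    END IF;
    RETURN NULL;  -- the row was routed; suppress the insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurements_ins BEFORE INSERT ON measurements
    FOR EACH ROW EXECUTE PROCEDURE measurements_route();
```

Unlike rules, triggers also fire for rows loaded via COPY, which rules do not intercept.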
"Simon Riggs" <simon@2ndquadrant.com> writes:
>> If we don't have multi-table indexes how do we enforce a primary key
>> against a partitioned set? What about non primary keys that are just
>> UNIQUE? What about check constraints that aren't apart of the exclusion?
>
> I can come up with arbitrary examples that require them, but I've not
> seen one that makes sense in a real business app. Calling columns a, b
> and c disguises the validity of the example, IMHO.

Usually it comes up in a situation where you want to do something like "partition invoices by invoice_date" while simultaneously using "invoice_num" as the primary key. Normally the invoice numbers will be incrementing chronologically, but there's no constraint or any mechanism to enforce that, or to enforce that an old invoice number from an old partition isn't reused.

In practice I think this isn't really a serious problem though. The old partitions are going to be read-only, so you can just check that the invoice number doesn't already exist without worrying about race conditions. And in most cases it's being sequence-generated or something equally reliable, so the constraints are really just there as a backstop; you're not depending on them for correctness.

At some level not having them is actually a nice thing for DBAs. It gives them an excuse for not having the constraint that will only cause them maintenance headaches down the road. But it's dangerous to go too far down that road.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
On Wed, Apr 04, 2007 at 09:34:03PM +0200, Markus Schiltknecht wrote:
> Joshua D. Drake wrote:
>> If we don't have multi-table indexes how do we enforce a primary key
>> against a partitioned set?
>
> The executor would have to be clever enough to not do a single index
> scan, but possibly scan through multiple indexes when asking for
> uniqueness, depending on the partitioning rule set.

But it's not the executor that checks uniqueness; it's built into the
btree code.

If someone manages to crack uniqueness for GiST indexes, we'll have our
answer, since it has exactly the same problem but on a different scale.
(Or vice versa: if someone gets uniqueness for multiple indexes, we can do
it for GiST also.)

Have a nice day,

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
Hi,
I had raised this issue about rules/triggers back then and the responses seemed to be evenly split as to which ones to use.
I think the broad question really is how well we want to support the current inheritance-based partitioning mechanism. If we are going to stick with it for a while (and we will, unless something concrete/better/"bare metal" comes along), IMHO we should try to make things simpler, by automating them where possible, to make it easier for people getting in.
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
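To make the comparison in this subthread concrete, here is a hedged sketch of the trigger-based routing that Robert and Joshua favor, written against the same layout the patch examples use later in the thread (test1 with one range partition child_1). The function and trigger names are illustrative, not from the patch:

```sql
-- Plain inheritance setup, as the patch would create it:
CREATE TABLE test1 (a int, b int CHECK (b > 0));
CREATE TABLE child_1 (CHECK (a < 10)) INHERITS (test1);

-- A BEFORE INSERT trigger routes rows instead of a rewrite rule:
CREATE OR REPLACE FUNCTION test1_insert_trigger() RETURNS trigger AS $$
BEGIN
    IF NEW.a < 10 THEN
        INSERT INTO child_1 VALUES (NEW.a, NEW.b);
    ELSE
        RAISE EXCEPTION 'no partition for a = %', NEW.a;
    END IF;
    RETURN NULL;  -- suppress the insert into the parent itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER test1_insert_redirect BEFORE INSERT ON test1
    FOR EACH ROW EXECUTE PROCEDURE test1_insert_trigger();
```

Unlike a rule, this fires once per row for any insertion path, which is what makes it the more robust choice in the discussion that follows.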
On Thu, 2007-04-05 at 13:59 +0530, NikhilS wrote:
> Hi,
>
>>> The only problem I have with this is that the shops I know with big
>>> partitioned tables favor triggers over rules, for both performance
>>> reasons and a cleaner implementation. Even with automated rule creation
>>> this isn't going to change AFAICS... not to mention we already create
>>> our rules & triggers automatically, so really this just isn't exciting
>>> to me (though it may make it easier for people getting in on the ground
>>> floor).
>>
>> I second this. The trigger route is much more maintainable than the rule
>> route. IMO what really needs to happen is something more low level where
>> there are no DBA-visible changes. Triggers also have overhead; it would
>> be nice to get a little more bare metal with this.
>
> I had raised this issue about rules/triggers back then and the
> responses seemed to be evenly split as to which ones to use.

Presumably your implementation already uses triggers for INSERTs though,
so why not use triggers for everything?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Hi,
No, I am using rules for all three cases. With some help, I am also done with the UPDATE case I was stuck on, so here is what the patch will do:
postgres=# create table test1 (a int unique , b int check (b > 0)) partition by range(a) (partition child_1 check (a < 10));
NOTICE: CREATE TABLE / UNIQUE will create implicit index "test1_a_key" for table "test1"
NOTICE: CREATE TABLE / UNIQUE will create implicit index "child_1_a_key" for table "child_1"
CREATE TABLE
A describe of the parent shows the rules added to it:
postgres=# \d test1
Table "public.test1"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
b | integer |
Indexes:
"test1_a_key" UNIQUE, btree (a)
Check constraints:
"test1_b_check" CHECK (b > 0)
Rules:
test1_child_1_delete AS
ON DELETE TO test1
WHERE old.a < 10 DO INSTEAD DELETE FROM child_1
WHERE child_1.a = old.a
test1_child_1_insert AS
ON INSERT TO test1
WHERE new.a < 10 DO INSTEAD INSERT INTO child_1 (a, b)
VALUES (new.a, new.b)
test1_child_1_update AS
ON UPDATE TO test1
WHERE old.a < 10 DO INSTEAD UPDATE child_1 SET a = new.a, b = new.b
WHERE child_1.a = old.a
Whereas a describe on the child shows the following:
postgres=# \d child_1
Table "public.child_1"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
b | integer |
Indexes:
"child_1_a_key" UNIQUE, btree (a)
Check constraints:
"child_1_a_check" CHECK (a < 10)
"test1_b_check" CHECK (b > 0)
Inherits: test1
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
On Thu, 2007-04-05 at 16:35 +0530, NikhilS wrote:
> Hi,
>
>>> I had raised this issue about rules/triggers back then and the
>>> responses seemed to be evenly split as to which ones to use.
>>
>> Presumably your implementation already uses triggers for INSERTs though,
>> so why not use triggers for everything?
>
> No, I am using rules for all three cases.

So we are unable to load any of the tables using COPY.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
> So we are unable to load any of the tables using COPY.
Aww, guess I should have stuck with triggers as a first choice. Mea culpa; I should have investigated some more before deciding on rules, or should have prodded you about it earlier. :)
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
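For readers following along: the reason COPY breaks here is that rewrite rules are applied by the query rewriter to INSERT/UPDATE/DELETE statements, while COPY takes a separate code path and inserts directly into its target. A minimal hypothetical session against the rule-based test1/child_1 setup shown earlier illustrates the failure mode:

```sql
-- The INSERT rule fires and the row lands in child_1:
INSERT INTO test1 VALUES (5, 1);

-- COPY bypasses the rewriter, so the row stays in the parent:
COPY test1 FROM stdin;
7	1
\.

-- The parent itself now holds the COPY'd row:
SELECT * FROM ONLY test1;

-- A BEFORE INSERT trigger, by contrast, fires for each COPY'd row
-- and would have routed it to child_1 as well.
```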
On Wednesday 04 April 2007 21:17, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>>> If we don't have multi-table indexes how do we enforce a primary key
>>> against a partitioned set? What about non primary keys that are just
>>> UNIQUE? What about check constraints that aren't a part of the
>>> exclusion?
>>
>> I can come up with arbitrary examples that require them, but I've not
>> seen one that makes sense in a real business app. Calling columns a, b
>> and c disguises the validity of the example, IMHO.
>
> Usually it comes up in a situation where you want to do something like
> "partition invoices by invoice_date" while simultaneously using
> invoice_num as the primary key.
>
> Normally the invoice numbers will be incrementing chronologically, but
> there's no constraint or any mechanism to enforce that, or to enforce
> that an old invoice number from an old partition isn't reused.
>
> In practice I don't think this is really a serious problem though. The
> old partitions are going to be read-only, so you can just check that the
> invoice number doesn't already exist without worrying about race
> conditions.

In practice many people need a PK on the table not just as a unique
identifier for the row, but to act as a parent in an FK relationship. If
you start your schema with one table and have to break it up into
partitions later, this will raise a number of red flags.

> And in most cases it's being sequence-generated or something equally
> reliable, so the constraints are really just there as a backstop; you're
> not depending on them for correctness.

With that argument, why have unique constraints at all?

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
>> That lets you enforce unique constraints as long as the partition key
>> is part of the unique constraint.
>
> Is that already sufficient?

Yes.

> That would alter the ordering of the columns in the index, no? I mean:

It produces ordered blocks of append nodes for range queries that span
multiple partitions, but one unique key can still only be in exactly one
of the partitions. E.g. if you range partition by b, only one partition is
applicable regardless of the position of b in the index. This is
sufficient for a working unique constraint with current pg versions.

> CREATE INDEX x ON test(a, b, c);
>
> isn't the same as
>
> CREATE INDEX x ON test(c, b, a);

That is only a problem if you also want to avoid a sort (e.g. for an
ORDER BY); it is not an issue for filtering rows. And in some cases the
sort could still be avoided with some range-proving logic, if you can
bring the append nodes of the partitions into an order that matches the
ORDER BY. (An example would be a query "where c=5 and b between 0 and 20"
with two partitions, one for 0 <= b < 10 and a second for 10 <= b.)

> That's why I'd say, the first columns of an index would have
> to be equal to all of the columns used in the partitioning key.

No. It may change performance in some situations, but it is not needed
for unique constraints.

Andreas
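A hedged illustration of Andreas' point, using hypothetical tables (not from the thread): because the CHECK constraints make the partitions disjoint on b, any given (a, b) pair can only ever fall into one partition, so per-partition unique indexes add up to a table-wide unique constraint.

```sql
-- Range partitioning on b, with b included in the unique key:
CREATE TABLE t (a int, b int);
CREATE TABLE t_low  (CHECK (b >= 0 AND b < 10)) INHERITS (t);
CREATE TABLE t_high (CHECK (b >= 10))           INHERITS (t);

-- One local index per partition; taken together they enforce
-- UNIQUE (a, b) across the whole partitioned set, since no
-- (a, b) pair can satisfy both CHECK constraints at once.
CREATE UNIQUE INDEX t_low_ab  ON t_low  (a, b);
CREATE UNIQUE INDEX t_high_ab ON t_high (a, b);
```

A unique index on a alone, by contrast, could not be enforced this way: the same a could appear in both partitions with different b values.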
Hi,

Zeugswetter Andreas ADI SD wrote:
>>> CREATE INDEX x ON test(a, b, c);
>>>
>>> isn't the same as
>>>
>>> CREATE INDEX x ON test(c, b, a);
>>
>> That is only a problem if you also want to avoid a sort (e.g. for an
>> order by),

..or if you want to use that index for 'WHERE a = 5'. The first one is
probably helping you; the second isn't.

> (An example would be a query "where c=5 and b between 0 and 20"
> with two partitions, one for 0 <= b < 10 and a second for 10 <= b.)

Hm, in that case an index on (a, b, c) wouldn't help. An index on
(c, b, a) would be just perfect, agreed? Now, for the partitioning: you
simply have to scan two partitions in that case, no matter how you
arrange your indexes. And this is where we need some sort of multi-table
index scan functionality. (I'm not saying a multi-table index; such a
thing would be too large on disk. That functionality should probably
better be realized by using the underlying per-table indexes.)

>> That's why I'd say, the first columns of an index would have
>> to be equal to all of the columns used in the partitioning key.

I correct my own statement somewhat here: only in that case can a single
table's index satisfy your request. For other cases, you'd have to query
more than one partition's indexes and mix them correctly to maintain the
right order, if required.

> No. It may change performance in some situations, but it is not needed
> for unique constraints.

Agreed, for unique constraints. But indexes are used for some more things
than just unique constraint checking. ;-)

Regards

Markus
Hi,

Martijn van Oosterhout wrote:
>> The executor would have to be clever enough to not do a single index
>> scan, but possibly scan through multiple indexes when asking for
>> uniqueness, depending on the partitioning rule set.
>
> But it's not the executor that checks uniqueness; it's built into the
> btree code.

Well, it's the executor calling into the btree code. Couldn't the
executor choose which (btree) indexes to query for uniqueness?

> If someone manages to crack uniqueness for GiST indexes, we'll have our
> answer, since it has exactly the same problem but on a different scale.
> (Or vice versa: if someone gets uniqueness for multiple indexes, we can
> do it for GiST also.)

Uh, can you elaborate on that? AFAICS, you would simply have to query
multiple btree indexes and make sure none of them is violated. How would
that help making unique GiST indexes possible? What's the problem there?

Regards

Markus
On Thu, Apr 05, 2007 at 10:00:37PM +0200, Markus Schiltknecht wrote:
>>> If someone manages to crack uniqueness for GiST indexes, we'll have our
>>> answer, since it has exactly the same problem but on a different scale.
>>> (Or vice versa: if someone gets uniqueness for multiple indexes, we can
>>> do it for GiST also.)
>
> Uh, can you elaborate on that? AFAICS, you would simply have to query
> multiple btree indexes and make sure none of them is violated. How would
> that help making unique GiST indexes possible? What's the problem there?

There's a race condition: what happens if someone else tries to insert
the same key at the same time? If you know it's going to be in the same
index, you can lock the page. Across multiple indexes you run into
deadlock issues.

I'm not saying it's hard, just that it's not as easy as checking each
index...

Have a nice day,

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Markus Schiltknecht <markus@bluegap.ch> writes:
> Uh, can you elaborate on that? AFAICS, you would simply have to query
> multiple btree indexes and make sure none of them is violated.

That only works for the partition-key indexes, ie, ones where you can be
sure a priori that there cannot be duplicate keys in two different
indexes. I believe the complaint here is that people would like to be
able to enforce uniqueness across the whole partitioned table on columns
that are not part of the partition key.

(But that sounds rather like pie in the sky, actually. Which other
databases can do that, and how do they do it?)

			regards, tom lane
On Fri, 2007-04-06 at 01:56 -0400, Tom Lane wrote:
> Markus Schiltknecht <markus@bluegap.ch> writes:
>> Uh, can you elaborate on that? AFAICS, you would simply have to query
>> multiple btree indexes and make sure none of them is violated.
>
> That only works for the partition-key indexes, ie, ones where you can be
> sure a priori that there cannot be duplicate keys in two different
> indexes. I believe the complaint here is that people would like to be
> able to enforce uniqueness across the whole partitioned table on columns
> that are not part of the partition key.

I see that as a logical modelling problem, not a physical one.

If you partition on invoice_date but have PK = invoice_id, then it seems
straightforward to change the model so that the PK is a compound key
(invoice_id, invoice_period). This works whether or not invoice_id is
unique on its own. And this is typically the way things are modelled in
the real world anyway, since such things existed from the time of paper
filing systems, where partitioning like that was required to quickly
locate a file in a paper archive/library.

If we partition on invoice_date only, there is an implication that people
will search for invoices on date range only too; otherwise why not just
partition on invoice_id. This still works with the compound key approach.

> (But that sounds rather like pie in the sky, actually. Which other
> databases can do that, and how do they do it?)

Oracle does it, by building a big index. Few people use it.

There are significant problems with this idea that I have already raised:
- how big would the index be?
- how would you add and remove partitions with any kind of performance?
  If we partitioned on date range, that will surely increase over time.
- the index could almost certainly never be REINDEXed because of space
  requirements and time considerations.
- if the indexed values were monotonically increasing, the RHS of the
  index would become a significant hotspot in load performance, assuming
  high-volume inserts into a large table.

My argument is that there are significant real-world disadvantages to
having this feature, yet there exists a reasonable workaround to avoid
ever needing it. Why would we spend time building and supporting it?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
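The compound-key workaround can be sketched as follows, with a hypothetical schema (not from the thread). Note that with inheritance-based partitioning each partition needs its own index, since indexes are not inherited from the parent:

```sql
-- Make the partition key part of the primary key, so per-partition
-- indexes suffice to enforce it:
CREATE TABLE invoices (
    invoice_id     bigint,
    invoice_period date,
    amount         numeric,
    PRIMARY KEY (invoice_id, invoice_period)
);

-- Each partition constrains invoice_period, so any given
-- (invoice_id, invoice_period) pair can live in exactly one partition;
-- the partition's own PK index then enforces global uniqueness.
CREATE TABLE invoices_2007q1 (
    CHECK (invoice_period >= '2007-01-01' AND invoice_period < '2007-04-01'),
    PRIMARY KEY (invoice_id, invoice_period)
) INHERITS (invoices);
```

What this cannot do, as Tom notes above, is guarantee that invoice_id alone is unique across all partitions.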
"Simon Riggs" <simon@2ndquadrant.com> writes:
> If we partition on invoice_date only, there is an implication that
> people will search for invoices on date range only too; otherwise why
> not just partition on invoice_id. This still works with the compound
> key approach.

Well, there are practical problems with partitioning on invoice_id. It's
convenient to have a predictable partition definition that can be
calculated in advance.

I suspect what people did with paper invoices is look at the last invoice
for a period and note the invoice_id down, to check all future
invoice_ids against. Essentially partitioning on two separate equivalent
partition keys.

We could do the same sort of thing. Since we're looking at constraints,
there's nothing stopping the partitions from having two separate but
effectively equivalent constraints on them. I'm not sure how to describe
"partition based on this rule for dates, but note the range of
invoice_ids covering a partition and generate a constraint for that as
well". But if we could find a way to represent that, it would make a lot
of common use cases much more convenient to use.

>> (But that sounds rather like pie in the sky, actually. Which other
>> databases can do that, and how do they do it?)
>
> Oracle does it, by building a big index. Few people use it.

The people that use it are the people stuck with dogmatic rules about
"every table must have a primary key" or "every logical constraint must
be protected by a database constraint". Ie, database shops run by the CYA
principle. But if a database feature is hurting you more than it's
helping you, then you're not doing yourself any favours by using it. The
database is a tool to make your life easier, not something to flog
yourself with to prove how good your database design skills are.

Oracle calls these "global" indexes, and IMHO they defeat the whole
purpose behind partitioning your data in the first place.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
>> (But that sounds rather like pie in the sky, actually. Which other
>> databases can do that, and how do they do it?)
>
> Oracle does it, by building a big index. Few people use it.

And others allow a different partitioning strategy for each index, but
that has the same problem of how to remove partitions without a huge
amount of index reorganization.

> There are significant problems with this idea that I have already
> raised:
> - how big would the index be?
> - how would you add and remove partitions with any kind of performance?
>   If we partitioned on date range, that will surely increase over time.
> - the index could almost certainly never be REINDEXed because of space
>   requirements and time considerations.
> - if the indexed values were monotonically increasing, the RHS of the
>   index would become a significant hotspot in load performance, assuming
>   high-volume inserts into a large table.

Yes.

> My argument is that there are significant real-world disadvantages to
> having this feature, yet there exists a reasonable workaround to avoid
> ever needing it.

I'd say a workaround can mostly be found, but not always. I agree, though,
that the downsides of one large global index are substantial enough to
not make this path attractive.

> Why would we spend time building and supporting it?

What I think we would like to have is putting the append nodes into an
order that allows removing the sort node whenever that can be done. And
maybe a merge node (replacing the append and sort nodes) that can merge
presorted partitions.

I have one real example where I currently need one large non-unique index
in Informix. It is a journal table that is partitioned by client
timestamp, but I need a "select first 1000" (of possibly many millions of
rows) ordered by server_timestamp, over a range that naturally sometimes
spans more than one partition, because client and server timestamps
diverge. Here the merge facility would allow me to not use the global
index and still avoid sorting millions of rows (which would not finish in
time). The problem with the global index is that I have to delete all
rows from the oldest partition before removing it, to avoid rebuilding
the global index.

Andreas
On Fri, 2007-04-06 at 12:47 +0200, Zeugswetter Andreas ADI SD wrote:
> What I think we would like to have is putting the append nodes into an
> order that allows removing the sort node whenever that can be done. And
> maybe a merge node (replacing the append and sort nodes) that can merge
> presorted partitions.

Yes, we definitely need a way to express the ordering of partitions. We
can use this in the way that you say, as well as being able to do faster
inclusion/exclusion: i.e. if we have partitions for each year (2001,
2002, 2003, 2004, 2005, 2006, 2007) AND we have already proved that 2005
is excluded when we have a WHERE clause saying year >= 2006, then we
should be able to use the ordering to prove that the partitions for 2004
and before are also automatically excluded.

I'll think some more about the Merge node, but not right now.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
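For reference, a sketch of the kind of setup this exclusion applies to, with hypothetical year partitions (not from the thread):

```sql
-- One partition per year, each carrying an explicit CHECK constraint:
CREATE TABLE events (year int, payload text);
CREATE TABLE events_2005 (CHECK (year = 2005)) INHERITS (events);
CREATE TABLE events_2006 (CHECK (year = 2006)) INHERITS (events);
CREATE TABLE events_2007 (CHECK (year = 2007)) INHERITS (events);

SET constraint_exclusion = on;

-- The plan should scan only events_2006, events_2007 and the parent;
-- today the planner proves each partition's constraint separately,
-- whereas the ordering idea above would let it stop once one
-- partition below the bound is excluded.
EXPLAIN SELECT * FROM events WHERE year >= 2006;
```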
Simon Riggs wrote:
> i.e. if we have partitions for each year (2001, 2002, 2003, 2004, 2005,
> 2006, 2007) AND we have already proved that 2005 is excluded when we
> have a WHERE clause saying year >= 2006, then we should be able to use
> the ordering to prove that partitions for 2004 and before are also
> automatically excluded.

Provided you've set up the right constraints, the current
constraint_exclusion feature does exactly that, no?

> I'll think some more about the Merge node, but not right now.

I've looked at nodeAppend.c and nodeMergeJoin.c. Probably we can use much
of nodeMergeJoin, just without the join? Instead returning the tuples as
they are, but in the correct order.

The nodeMergeJoin code can only handle two inputs (a left and a right
node), but it might be beneficial to structure multiple merge nodes into
a binary tree layout anyway. (I'm guessing that might reduce the number
of comparisons needed.)

What do you think?

Regards

Markus
Zeugswetter Andreas ADI SD wrote:
>>>> (But that sounds rather like pie in the sky, actually. Which other
>>>> databases can do that, and how do they do it?)
>>> Oracle does it, by building a big index. Few people use it.
>
> And others allow a different partitioning strategy for each index,
> but that has the same problem of how to remove partitions without
> a huge amount of index reorganization.

If you removed a partition, couldn't the index be cleaned up by VACUUM?

Joshua D. Drake
> But if we could find a way to represent that, it would make a lot of
> common use cases much more convenient to use.
>
>>>> (But that sounds rather like pie in the sky, actually. Which other
>>>> databases can do that, and how do they do it?)
>>> Oracle does it, by building a big index. Few people use it.
>
> The people that use it are the people stuck with dogmatic rules about
> "every table must have a primary key" or "every logical constraint must
> be protected by a database constraint". Ie, database shops run by the
> CYA principle.

Or ones that actually believe that every table, where possible, should
have a primary key. There are very, very few instances in good design
where a table does not have a primary key. It has nothing to do with CYA.

Joshua D. Drake
On Fri, Apr 06, 2007 at 09:22:55AM -0700, Joshua D. Drake wrote:
>> The people that use it are the people stuck with dogmatic rules about
>> "every table must have a primary key" or "every logical constraint
>> must be protected by a database constraint". Ie, database shops run
>> by the CYA principle.
>
> Or ones that actually believe that every table where possible should
> have a primary key.
>
> There are very, very few instances in good design where a table does
> not have a primary key.
>
> It has nothing to do with CYA.

That depends on what you mean by CYA. If you mean "taking a precaution
just so you can show it's not your fault when the manure hits the fan," I
agree. If you mean "taking a precaution that will actually prevent a
problem from occurring in the first place," it definitely does.

Cheers,
D

--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter
Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter wrote:
> That depends on what you mean by CYA. If you mean "taking a precaution
> just so you can show it's not your fault when the manure hits the fan,"
> I agree. If you mean "taking a precaution that will actually prevent a
> problem from occurring in the first place," it definitely does.

Heh, fair enough. When I think of CYA, I think of the former.

Joshua D. Drake
>> That depends on what you mean by CYA. If you mean "taking a precaution
>> just so you can show it's not your fault when the manure hits the fan,"
>> I agree. If you mean "taking a precaution that will actually prevent a
>> problem from occurring in the first place," it definitely does.
>
> Heh, fair enough. When I think of CYA, I think of the former.
>
> Joshua D. Drake

...I was thinking the point was more on "primary key" as in syntax, as
opposed to a table that has an attribute (or attributes) acknowledged by
DML coders as the appropriate way to use the stored data. That is, I may
very well _not_ want the overhead of an index of any kind, forced
uniqueness, etc., but might also well think of a given attribute as the
primary key. Use of constraints in lieu of "primary key" comes to mind...
'Course, maybe I missed the point! -smile-

'Nother thought: CYA _can_ have odious performance costs if
over-implemented. It's a matter of using actual use cases - or observed
behavior - to tailor the CYA solution to fit the need without undue
overhead.

Rgds,
Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
rtroy@ScienceTools.com, http://ScienceTools.com/
On Fri, 2007-04-06 at 16:08 +0200, Markus Schiltknecht wrote:
> Simon Riggs wrote:
>> i.e. if we have partitions for each year (2001, 2002, 2003, 2004, 2005,
>> 2006, 2007) AND we have already proved that 2005 is excluded when we
>> have a WHERE clause saying year >= 2006, then we should be able to use
>> the ordering to prove that partitions for 2004 and before are also
>> automatically excluded.
>
> Provided you've set up the right constraints, the current
> constraint_exclusion feature does exactly that, no?

The end result, yes; the mechanism, no.

> I'll think some more about the Merge node, but not right now.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com