Thread: RangeType internal use
Hi, I wonder why I cannot find a way to get a range type for a given (sub-) type. I would like to build a RangeType from Datum's of lower and upper bounds. Much like how construct_array() builds an ArrayType from a Datum array of elements given elements' type info. Is there some way I do not seem to know? If not, would it be worthwhile to make something like construct_range() that returns a RangeType given Datum's of lower and upper bounds and subtype info? Thanks, Amit
Hi, from nearby:)

> I wonder why I cannot find a way to get a range type for a given (sub-)
> type. I would like to build a RangeType from Datum's of lower and upper
> bounds. Much like how construct_array() builds an ArrayType from a Datum
> array of elements given elements' type info.
>
> Is there some way I do not seem to know? If not, would it be worthwhile
> to make something like construct_range() that returns a RangeType given
> Datum's of lower and upper bounds and subtype info?

make_range() needs the range type itself.

On the SQL interface, you can get the range type corresponding to a base type by looking up the pg_range catalog:

SELECT rngtypid::regtype, rngsubtype::regtype
FROM pg_range WHERE rngsubtype = 'int'::regtype;

 rngtypid  | rngsubtype
-----------+------------
 int4range | integer

But there's only one syscache for this catalog, and it takes the range type id, so the reverse resolution rngsubtype -> rngtypid does not seem to be available. TypeCache has only comparison function info as a surely available element related to range types, but this wouldn't help. I think scanning the entire catalog is not acceptable even if it were possible.

Perhaps what is needed is adding a RANGESUBTYPE syscache, but I don't know whether that is acceptable or not.

Thoughts?

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Horiguchi-san,

On 06-02-2015 PM 04:34, Kyotaro HORIGUCHI wrote:
> Hi, from nearby:)

Thank you!

>> I wonder why I cannot find a way to get a range type for a given (sub-)
>> type. I would like to build a RangeType from Datum's of lower and upper
>> bounds. Much like how construct_array() builds an ArrayType from a Datum
>> array of elements given elements' type info.
>>
>> Is there some way I do not seem to know? If not, would it be worthwhile
>> to make something like construct_range() that returns a RangeType given
>> Datum's of lower and upper bounds and subtype info?
>
> make_range() needs the range type itself.
>
> On the SQL interface, you can get the range type corresponding to a base
> type by looking up the pg_range catalog:
>
> SELECT rngtypid::regtype, rngsubtype::regtype
> FROM pg_range WHERE rngsubtype = 'int'::regtype;
>
>  rngtypid  | rngsubtype
> -----------+------------
>  int4range | integer
>
> But there's only one syscache for this catalog, and it takes the range
> type id, so the reverse resolution rngsubtype -> rngtypid does not seem
> to be available. TypeCache has only comparison function info as a surely
> available element related to range types, but this wouldn't help. I
> think scanning the entire catalog is not acceptable even if possible.
>
> Perhaps what is needed is adding a RANGESUBTYPE syscache, but I don't
> know whether that is acceptable or not.
>
> Thoughts?

Actually, I'm wondering: if there is a one-to-one mapping from range type to subtype (and vice versa?), then this should be OK. But if not (that is, if the designers of range types thought there is not necessarily such a mapping), then perhaps we could add, say, a rngtypeisdefault flag to pg_range.

Perhaps the following is not too pretty:

+
+/*
+ * get_rangetype_for_type
+ *
+ * returns a TypeCacheEntry for a range type of a given (sub-) type.
+ */
+TypeCacheEntry *
+get_rangetype_for_type(Oid subtypid)
+{
+	Relation	relation;
+	SysScanDesc	scan;
+	HeapTuple	rangeTuple;
+	Oid			rngsubtype;
+	Oid			rngtypid = InvalidOid;
+
+	relation = heap_open(RangeRelationId, AccessShareLock);
+
+	scan = systable_beginscan(relation, InvalidOid, false,
+							  NULL, 0, NULL);
+
+	while ((rangeTuple = systable_getnext(scan)) != NULL)
+	{
+		rngsubtype = ((Form_pg_range) GETSTRUCT(rangeTuple))->rngsubtype;
+
+		if (rngsubtype == subtypid)
+		{
+			rngtypid = ((Form_pg_range) GETSTRUCT(rangeTuple))->rngtypid;
+			break;
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(relation, AccessShareLock);
+
+	return (rngtypid != InvalidOid
+			? lookup_type_cache(rngtypid, TYPECACHE_RANGE_INFO) : NULL);
+}

Thanks,
Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > I wonder why I cannot find a way to get a range type for a given (sub-) > type. I would like to build a RangeType from Datum's of lower and upper > bounds. Much like how construct_array() builds an ArrayType from a Datum > array of elements given elements' type info. > Is there some way I do not seem to know? If not, would it be worthwhile > to make something like construct_range() that returns a RangeType given > Datum's of lower and upper bounds and subtype info? There is no good reason to assume that a range type exists at all, much less that it is unique for a subtype. (Read the CREATE TYPE documentation if you're unclear as to why not.) You have not said what you're trying to accomplish, but this seems like a dead end. regards, tom lane
On 07-02-2015 AM 12:10, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> I wonder why I cannot find a way to get a range type for a given (sub-)
>> type. I would like to build a RangeType from Datum's of lower and upper
>> bounds. Much like how construct_array() builds an ArrayType from a Datum
>> array of elements given elements' type info.
>>
>> Is there some way I do not seem to know? If not, would it be worthwhile
>> to make something like construct_range() that returns a RangeType given
>> Datum's of lower and upper bounds and subtype info?
>
> There is no good reason to assume that a range type exists at all, much
> less that it is unique for a subtype. (Read the CREATE TYPE documentation
> if you're unclear as to why not.) You have not said what you're trying to
> accomplish, but this seems like a dead end.

We do have a number of built-in range types, which we can safely assume to exist with whatever default behavior they are hard-coded with (which I guess is not 'ALTER TYPE'-able). Those existing range types cover a useful set of subtypes, which are probably also the potential candidates for range partitioning. Now, one may define a new range type for a given subtype with a different subopclass, subcollation, subcanonical or subdiff, which is perhaps the problem you are mentioning. Perhaps not a very appropriate solution, but how about a rngtypisdefault (for a subtype) flag?

Sorry, I didn't mention before the idea I am trying to implement with this: it is to serialize range partition bounds as a range type value per partition key column. The serialized representation of a range partition bound for a partition then effectively becomes an anyarray of anyrange:

+	rangebounds = construct_array(datum,
+								  partnatts,
+								  ANYRANGEOID,
+								  -1,
+								  false,
+								  'd');

Each element of the passed datum array is a range type value of the corresponding subtype.
Now, range types may not yet be ripe for internal use, but the above idea kind of works, modulo the absence of the infrastructure I was asking for in the OP. Am I still missing something?

Thanks,
Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > On 07-02-2015 AM 12:10, Tom Lane wrote: >> There is no good reason to assume that a range type exists at all, much >> less that it is unique for a subtype. (Read the CREATE TYPE documentation >> if you're unclear as to why not.) You have not said what you're trying to >> accomplish, but this seems like a dead end. > Sorry I didn't mention before about an idea I am trying to implement > with this - it is to serialize range partition bounds as a range type > value per partition key column. The serialized representation of a range > partition bound for a partition then effectively becomes an anyarray of > anyrange: Meh. I don't care for that much --- it sounds a lot like deciding that your problem is a nail because there is a hammer within reach. A random collection of ranges doesn't seem like a very appropriate representation to me; first because there is no simple way to enforce that it partitions the key space (no overlaps and no missing portions), and second because it provides little purchase for efficient tuple routing algorithms. The best you could possibly hope for is some log-N tree search mechanism, and that would require a fair amount of setup beforehand. I'd rather insist that range partitioning be done on the basis of an origin point and a bin width, which would allow trivial computation of which bin number any given key falls in. This will limit the set of types it can be done over, for sure, but not more than depending on built-in range types would. regards, tom lane
On 02/09/2015 03:21 AM, Tom Lane wrote: > Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: >> On 07-02-2015 AM 12:10, Tom Lane wrote: >>> There is no good reason to assume that a range type exists at all, much >>> less that it is unique for a subtype. (Read the CREATE TYPE documentation >>> if you're unclear as to why not.) You have not said what you're trying to >>> accomplish, but this seems like a dead end. > >> Sorry I didn't mention before about an idea I am trying to implement >> with this - it is to serialize range partition bounds as a range type >> value per partition key column. The serialized representation of a range >> partition bound for a partition then effectively becomes an anyarray of >> anyrange: > > Meh. I don't care for that much --- it sounds a lot like deciding that > your problem is a nail because there is a hammer within reach. A random > collection of ranges doesn't seem like a very appropriate representation > to me; first because there is no simple way to enforce that it partitions > the key space (no overlaps and no missing portions), and second because it > provides little purchase for efficient tuple routing algorithms. The best > you could possibly hope for is some log-N tree search mechanism, and that > would require a fair amount of setup beforehand. Building a tree or a sorted array of the min or max bounds of the ranges doesn't sound hard. log-N sounds fast enough. > I'd rather insist that range partitioning be done on the basis of an > origin point and a bin width, which would allow trivial computation of > which bin number any given key falls in. This will limit the set of types > it can be done over, for sure, but not more than depending on built-in > range types would. That sounds unnecessarily limiting. If there's a good reason to limit it to that, then fine, but I don't think it'd be any more difficult to make it flexible. Let's wait for the patch and see I guess. - Heikki
On 09-02-2015 AM 10:21, Tom Lane wrote:
> Meh. I don't care for that much --- it sounds a lot like deciding that
> your problem is a nail because there is a hammer within reach. A random
> collection of ranges doesn't seem like a very appropriate representation
> to me; first because there is no simple way to enforce that it partitions
> the key space (no overlaps and no missing portions), and second because it
> provides little purchase for efficient tuple routing algorithms. The best
> you could possibly hope for is some log-N tree search mechanism, and that
> would require a fair amount of setup beforehand.
>
> I'd rather insist that range partitioning be done on the basis of an
> origin point and a bin width, which would allow trivial computation of
> which bin number any given key falls in. This will limit the set of types
> it can be done over, for sure, but not more than depending on built-in
> range types would.

Okay, let me back up a little and think about your suggestion, which I do not seem to understand very well - it raises a few questions for me. Does this mean a partitioning criterion is associated with the parent (partitioned table) rather than with each individual partition? I would guess that the bin width is the partition interval, such that each bin number gives a partition number (of equal-sized, consecutively numbered partitions without gaps). But I don't quite understand what the origin point is. Is that a key literal value from which to begin counting bins, and if so, is it stored in the catalog as part of the partitioning rule? Does this affect the syntax to use when defining the partitioned table/partitions, like:

CREATE TABLE parent PARTITION BY RANGE ON (col1) EVERY '3 months'::interval;
CREATE TABLE child PARTITION 1 OF parent;

or some such? Or may I ask if it is just an internal representation geared towards optimizing range partitioning, whereas the on-disk representation is something more generalized, allowing for other partitioning strategies to be implemented as well?
Thanks, Amit
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > On 02/09/2015 03:21 AM, Tom Lane wrote: >> Meh. I don't care for that much --- it sounds a lot like deciding that >> your problem is a nail because there is a hammer within reach. A random >> collection of ranges doesn't seem like a very appropriate representation >> to me; first because there is no simple way to enforce that it partitions >> the key space (no overlaps and no missing portions), and second because it >> provides little purchase for efficient tuple routing algorithms. The best >> you could possibly hope for is some log-N tree search mechanism, and that >> would require a fair amount of setup beforehand. > Building a tree or a sorted array of the min or max bounds of the ranges > doesn't sound hard. log-N sounds fast enough. It's going to be complicated and probably buggy, and I think it is heading in the wrong direction altogether. If you want to partition in some arbitrary complicated fashion that the system can't reason about very effectively, we *already have that*. IMO the entire point of building a new partitioning infrastructure is to build something simple, reliable, and a whole lot faster than what you can get from inheritance relationships. And "faster" is going to come mainly from making the partitioning rules as simple as possible, not as complex as possible. Just to start with: one of the absolutely fundamental things we need out of partitioning is the ability to have uniqueness constraints that span a partitioned table set. That means the partitions have to be disjoint, and it has to be not possible to break that. The design proposed here won't enforce that without complicated (and again, possibly buggy) logic. In short, this design requires a whole lot of extra mechanism to get to places that we have to get to, and I don't believe that that extra complexity is going to buy anything useful at all. regards, tom lane
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > Okay, let me back up a little and think about your suggestion which I do > not seem to understand very well - it raises a few questions for me: > does this mean a partitioning criteria is associated with parent > (partitioned table) rather than each individual partition? Absolutely. Anything else is not scalable; it's just another flavor of the inheritance + CHECK constraint mechanism. The entire point of doing a new partitioning design IMO is to get away from that. It should be possible to determine which partition a row belongs to in O(1) time, not O(N). > I would guess > that bin width is partition interval such that each bin number gives > partition number (of equal-sized consecutively numbered partitions > without gaps). But I don't quite understand what origin point is? Is > that a key literal value from which to begin counting bins and if so, is > it stored in catalog as part of the partitioning rule? Yeah, I would think so. regards, tom lane
On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It's going to be complicated and probably buggy, and I think it is heading > in the wrong direction altogether. If you want to partition in some > arbitrary complicated fashion that the system can't reason about very > effectively, we *already have that*. IMO the entire point of building > a new partitioning infrastructure is to build something simple, reliable, > and a whole lot faster than what you can get from inheritance > relationships. And "faster" is going to come mainly from making the > partitioning rules as simple as possible, not as complex as possible. Yeah, but people expect to be able to partition on ranges that are not all of equal width. I think any proposal that we shouldn't support that is the kiss of death for a feature like this - it will be so restricted as to eliminate 75% of the use cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It's going to be complicated and probably buggy, and I think it is heading >> in the wrong direction altogether. If you want to partition in some >> arbitrary complicated fashion that the system can't reason about very >> effectively, we *already have that*. IMO the entire point of building >> a new partitioning infrastructure is to build something simple, reliable, >> and a whole lot faster than what you can get from inheritance >> relationships. And "faster" is going to come mainly from making the >> partitioning rules as simple as possible, not as complex as possible. > Yeah, but people expect to be able to partition on ranges that are not > all of equal width. I think any proposal that we shouldn't support > that is the kiss of death for a feature like this - it will be so > restricted as to eliminate 75% of the use cases. Well, that's debatable IMO (especially your claim that variable-size partitions would be needed by a majority of users). But in any case, partitioning behavior that is emergent from a bunch of independent pieces of information scattered among N tables seems absolutely untenable from where I sit. Whatever we support, the behavior needs to be described by *one* chunk of information --- a sorted list of bin bounding values, perhaps. regards, tom lane
On Mon, Feb 9, 2015 at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> It's going to be complicated and probably buggy, and I think it is heading >>> in the wrong direction altogether. If you want to partition in some >>> arbitrary complicated fashion that the system can't reason about very >>> effectively, we *already have that*. IMO the entire point of building >>> a new partitioning infrastructure is to build something simple, reliable, >>> and a whole lot faster than what you can get from inheritance >>> relationships. And "faster" is going to come mainly from making the >>> partitioning rules as simple as possible, not as complex as possible. > >> Yeah, but people expect to be able to partition on ranges that are not >> all of equal width. I think any proposal that we shouldn't support >> that is the kiss of death for a feature like this - it will be so >> restricted as to eliminate 75% of the use cases. > > Well, that's debatable IMO (especially your claim that variable-size > partitions would be needed by a majority of users). But in any case, > partitioning behavior that is emergent from a bunch of independent pieces > of information scattered among N tables seems absolutely untenable from > where I sit. Whatever we support, the behavior needs to be described by > *one* chunk of information --- a sorted list of bin bounding values, > perhaps. That's exactly the representation I had in mind. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10-02-2015 AM 02:37, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It's going to be complicated and probably buggy, and I think it is heading
>>> in the wrong direction altogether. If you want to partition in some
>>> arbitrary complicated fashion that the system can't reason about very
>>> effectively, we *already have that*. IMO the entire point of building
>>> a new partitioning infrastructure is to build something simple, reliable,
>>> and a whole lot faster than what you can get from inheritance
>>> relationships. And "faster" is going to come mainly from making the
>>> partitioning rules as simple as possible, not as complex as possible.
>>
>> Yeah, but people expect to be able to partition on ranges that are not
>> all of equal width. I think any proposal that we shouldn't support
>> that is the kiss of death for a feature like this - it will be so
>> restricted as to eliminate 75% of the use cases.
>
> Well, that's debatable IMO (especially your claim that variable-size
> partitions would be needed by a majority of users). But in any case,
> partitioning behavior that is emergent from a bunch of independent pieces
> of information scattered among N tables seems absolutely untenable from
> where I sit. Whatever we support, the behavior needs to be described by
> *one* chunk of information --- a sorted list of bin bounding values,
> perhaps.

I'm a bit confused here. I got the impression that the partitioning formula you suggest would consist of two pieces of information - an origin point and a bin width. Then routing a tuple consists of using exactly these two values to determine a bin number, and hence a partition, in O(1) time, assuming we've made all partitions exactly bin-width wide.
You mention here a sorted list of bin bounding values, which we could very well put together for a partitioned table in its relation descriptor, based on whatever information we stored in the catalog. That is, we can always have *one* chunk of partitioning information as the *internal* representation, irrespective of how generalized we make our on-disk representation. We could get O(log N), if not O(1), from that, I'd hope. In fact, that's what I had in mind here. Did I read it wrong?

Thanks,
Amit
On Mon, Feb 09, 2015 at 12:37:05PM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yeah, but people expect to be able to partition on ranges that are not
>> all of equal width. I think any proposal that we shouldn't support
>> that is the kiss of death for a feature like this - it will be so
>> restricted as to eliminate 75% of the use cases.
>
> Well, that's debatable IMO (especially your claim that variable-size
> partitions would be needed by a majority of users).

I don't know about user wishes directly, though I do suspect fixed partition stride would cover more than 25% of use cases. I do know that SQL Server, Oracle and MySQL have variable-stride range partitioning, and none of them have fixed-stride range partitioning. So, like Heikki and Robert, I would bet on variable-stride range partitioning.
On Mon, Feb 9, 2015 at 7:54 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Well, that's debatable IMO (especially your claim that variable-size >> partitions would be needed by a majority of users). But in any case, >> partitioning behavior that is emergent from a bunch of independent pieces >> of information scattered among N tables seems absolutely untenable from >> where I sit. Whatever we support, the behavior needs to be described by >> *one* chunk of information --- a sorted list of bin bounding values, >> perhaps. > > I'm a bit confused here. I got an impression that partitioning formula > as you suggest would consist of two pieces of information - an origin > point & a bin width. Then routing a tuple consists of using exactly > these two values to tell a bin number and hence a partition in O(1) time > assuming we've made all partitions be exactly bin-width wide. > > You mention here a sorted list of bin bounding values which we can very > well put together for a partitioned table in its relation descriptor > based on whatever information we stored in catalog. That is, we can > always have a *one* chunk of partitioning information as *internal* > representation irrespective of how generalized we make our on-disk > representation. We can get O(log N) if not O(1) from that I'd hope. In > fact, that's what I had in mind about this. Sure, we can always assemble data into a relation descriptor from across multiple catalog entries. I think the question is whether there is any good reason to split up the information across multiple relations or whether it might not be better, as I have suggested multiple times, to serialize it using nodeToString() and stuff it in a single column in pg_class. There may be such a reason, but if you said what it was, I missed that. This thread started as a discussion about using range types, and I think it's pretty clear that's a bad idea, because: 1. There's no guarantee that a range type for the datatype exists at all. 
2. If it does, there's no guarantee that it uses the same opclass that we want to use for partitioning, and I certainly think it would be strange if we refused to let the user pick the opclass she wants to use.

3. Even if there is a suitable range type available, it's a poor representational choice here, because it will be considerably more verbose than just storing a sorted list of partition bounds. In the common case where the ranges are adjacent, you'll end up storing two copies of every bound but the first and last for no discernible benefit.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Feb 09, 2015 at 12:37:05PM -0500, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> It's going to be complicated and probably buggy, and I think it is heading > >> in the wrong direction altogether. If you want to partition in some > >> arbitrary complicated fashion that the system can't reason about very > >> effectively, we *already have that*. IMO the entire point of building > >> a new partitioning infrastructure is to build something simple, reliable, > >> and a whole lot faster than what you can get from inheritance > >> relationships. And "faster" is going to come mainly from making the > >> partitioning rules as simple as possible, not as complex as possible. > > > Yeah, but people expect to be able to partition on ranges that are not > > all of equal width. I think any proposal that we shouldn't support > > that is the kiss of death for a feature like this - it will be so > > restricted as to eliminate 75% of the use cases. > > Well, that's debatable IMO (especially your claim that variable-size > partitions would be needed by a majority of users). It's ubiquitous. Time range partition sets almost always have some sets with finite range and at least one range with infinity in it: "current end" to infinity, and somewhat less frequently in my experience, -infinity to some arbitrary start. > But in any case, partitioning behavior that is emergent from a bunch > of independent pieces of information scattered among N tables seems > absolutely untenable from where I sit. Agreed. > Whatever we support, the behavior needs to be described by *one* > chunk of information --- a sorted list of bin bounding values, > perhaps. This would work for some interesting generalization of "sorted." Maybe going back to the mathematical definition of "partition" could bear more fruit. All that's needed for that is an equivalence relation, however it's denoted. 
In practical terms, you'd need to have a quick way to enumerate the equivalence classes and another to establish whether equivalence holds. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2/10/15 2:04 PM, David Fetter wrote:
>>> Yeah, but people expect to be able to partition on ranges that are not
>>> all of equal width. I think any proposal that we shouldn't support
>>> that is the kiss of death for a feature like this - it will be so
>>> restricted as to eliminate 75% of the use cases.
>>
>> Well, that's debatable IMO (especially your claim that variable-size
>> partitions would be needed by a majority of users).
>
> It's ubiquitous.
>
> Time range partition sets almost always have some sets with finite
> range and at least one range with infinity in it: "current end" to
> infinity, and somewhat less frequently in my experience, -infinity to
> some arbitrary start.

We could instead handle that with a generic "this doesn't fit in any other partition" capability. Presumably that would be easy if we're building this on top of inheritance features.

If we exclude the issue of needing one or two oddball partitions for +/- infinity, I expect that fixed sized partitions would actually cover 80-90% of cases.

-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On Fri, Feb 13, 2015 at 3:13 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> If we exclude the issue of needing one or two oddball partitions for
> +/- infinity, I expect that fixed sized partitions would actually
> cover 80-90% of cases.

That would not be true in our case. The data is not at all evenly distributed over the partitioning key. We would need something more like: values a, b, and c get their own partitions, and everything else goes in partition d.
On Fri, Feb 13, 2015 at 03:13:11PM -0600, Jim Nasby wrote:
> On 2/10/15 2:04 PM, David Fetter wrote:
>>>> Yeah, but people expect to be able to partition on ranges that are not
>>>> all of equal width. I think any proposal that we shouldn't support
>>>> that is the kiss of death for a feature like this - it will be so
>>>> restricted as to eliminate 75% of the use cases.
>>>
>>> Well, that's debatable IMO (especially your claim that variable-size
>>> partitions would be needed by a majority of users).
>>
>> It's ubiquitous.
>>
>> Time range partition sets almost always have some sets with finite
>> range and at least one range with infinity in it: "current end" to
>> infinity, and somewhat less frequently in my experience, -infinity
>> to some arbitrary start.
>
> We could instead handle that with a generic "this doesn't fit in any
> other partition" capability. Presumably that would be easy if we're
> building this on top of inheritance features.
>
> If we exclude the issue of needing one or two oddball partitions for
> +/- infinity, I expect that fixed sized partitions would actually
> cover 80-90% of cases.

Is "partition the domain" really that big an ask?

Cheers,
David.
-- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2/13/15 3:34 PM, David Fetter wrote:
> On Fri, Feb 13, 2015 at 03:13:11PM -0600, Jim Nasby wrote:
>> On 2/10/15 2:04 PM, David Fetter wrote:
>>>>> Yeah, but people expect to be able to partition on ranges that are not
>>>>> all of equal width. I think any proposal that we shouldn't support
>>>>> that is the kiss of death for a feature like this - it will be so
>>>>> restricted as to eliminate 75% of the use cases.
>>>>
>>>> Well, that's debatable IMO (especially your claim that variable-size
>>>> partitions would be needed by a majority of users).
>>>
>>> It's ubiquitous.
>>>
>>> Time range partition sets almost always have some sets with finite
>>> range and at least one range with infinity in it: "current end" to
>>> infinity, and somewhat less frequently in my experience, -infinity
>>> to some arbitrary start.
>>
>> We could instead handle that with a generic "this doesn't fit in any
>> other partition" capability. Presumably that would be easy if we're
>> building this on top of inheritance features.
>>
>> If we exclude the issue of needing one or two oddball partitions for
>> +/- infinity, I expect that fixed sized partitions would actually
>> cover 80-90% of cases.
>
> Is "partition the domain" really that big an ask?

Since this debate has been running for a few months, perhaps it is. I'd rather see limited partitioning get in sooner and come back to handle the less common cases (as long as we don't paint ourselves into a corner).

-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com