Thread: RangeType internal use
Hi, I wonder why I cannot find a way to get a range type for a given (sub-) type. I would like to build a RangeType from Datum's of lower and upper bounds. Much like how construct_array() builds an ArrayType from a Datum array of elements given elements' type info. Is there some way I do not seem to know? If not, would it be worthwhile to make something like construct_range() that returns a RangeType given Datum's of lower and upper bounds and subtype info? Thanks, Amit
Hi, from nearby:)

> I wonder why I cannot find a way to get a range type for a given (sub-)
> type. I would like to build a RangeType from Datum's of lower and upper
> bounds. Much like how construct_array() builds an ArrayType from a Datum
> array of elements given elements' type info.
>
> Is there some way I do not seem to know? If not, would it be worthwhile
> to make something like construct_range() that returns a RangeType given
> Datum's of lower and upper bounds and subtype info?

make_range() needs the range type itself.

On the SQL interface, you can get the range type corresponding to a base type by looking up the pg_range catalog:

SELECT rngtypid::regtype, rngsubtype::regtype
FROM pg_range WHERE rngsubtype = 'int'::regtype;

 rngtypid  | rngsubtype
-----------+------------
 int4range | integer

But there's only one syscache for this catalog, and it takes the range type id, so the reverse resolution rngsubtype -> rngtypid does not seem to be available. TypeCache has only comparison function info as a surely available element related to range types, but this wouldn't help. I think scanning the entire catalog is not acceptable even if it were possible.

Perhaps what is needed is adding a RANGESUBTYPE syscache, but I don't know whether that is acceptable or not.

Thoughts?

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Horiguchi-san,

On 06-02-2015 PM 04:34, Kyotaro HORIGUCHI wrote:
> Hi, from nearby:)

Thank you!

>> I wonder why I cannot find a way to get a range type for a given (sub-)
>> type. I would like to build a RangeType from Datum's of lower and upper
>> bounds. Much like how construct_array() builds an ArrayType from a Datum
>> array of elements given elements' type info.
>>
>> Is there some way I do not seem to know? If not, would it be worthwhile
>> to make something like construct_range() that returns a RangeType given
>> Datum's of lower and upper bounds and subtype info?
>
> make_range() needs the range type itself.
>
> On the SQL interface, you can get the range type corresponding to a base
> type by looking up the pg_range catalog:
>
> SELECT rngtypid::regtype, rngsubtype::regtype
> FROM pg_range WHERE rngsubtype = 'int'::regtype;
>
>  rngtypid  | rngsubtype
> -----------+------------
>  int4range | integer
>
> But there's only one syscache for this catalog, and it takes the range
> type id, so the reverse resolution rngsubtype -> rngtypid does not seem
> to be available. TypeCache has only comparison function info as a surely
> available element related to range types, but this wouldn't help. I
> think scanning the entire catalog is not acceptable even if possible.
>
> Perhaps what is needed is adding a RANGESUBTYPE syscache, but I don't
> know whether that is acceptable or not.
>
> Thoughts?

Actually, I'm wondering: if there is a one-to-one mapping from range type to subtype (and vice versa?), then this should be OK. But if not (that is, if the designers of range types thought there is not necessarily such a mapping), then perhaps we could add, say, a rngtypeisdefault flag to pg_range.

Perhaps the following is not too pretty:

+
+/*
+ * get_rangetype_for_type
+ *
+ * returns a TypeCacheEntry for a range type of a given (sub-) type.
+ */
+TypeCacheEntry *
+get_rangetype_for_type(Oid subtypid)
+{
+	Relation	relation;
+	SysScanDesc	scan;
+	HeapTuple	rangeTuple;
+	Oid			rngsubtype;
+	Oid			rngtypid = InvalidOid;
+
+	relation = heap_open(RangeRelationId, AccessShareLock);
+
+	scan = systable_beginscan(relation, InvalidOid, false,
+							  NULL, 0, NULL);
+
+	while ((rangeTuple = systable_getnext(scan)) != NULL)
+	{
+		rngsubtype = ((Form_pg_range) GETSTRUCT(rangeTuple))->rngsubtype;
+
+		if (rngsubtype == subtypid)
+		{
+			rngtypid = ((Form_pg_range) GETSTRUCT(rangeTuple))->rngtypid;
+			break;
+		}
+	}
+
+	systable_endscan(scan);
+	heap_close(relation, AccessShareLock);
+
+	return (rngtypid != InvalidOid
+			? lookup_type_cache(rngtypid, TYPECACHE_RANGE_INFO) : NULL);
+}

Thanks,
Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > I wonder why I cannot find a way to get a range type for a given (sub-) > type. I would like to build a RangeType from Datum's of lower and upper > bounds. Much like how construct_array() builds an ArrayType from a Datum > array of elements given elements' type info. > Is there some way I do not seem to know? If not, would it be worthwhile > to make something like construct_range() that returns a RangeType given > Datum's of lower and upper bounds and subtype info? There is no good reason to assume that a range type exists at all, much less that it is unique for a subtype. (Read the CREATE TYPE documentation if you're unclear as to why not.) You have not said what you're trying to accomplish, but this seems like a dead end. regards, tom lane
On 07-02-2015 AM 12:10, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> I wonder why I cannot find a way to get a range type for a given (sub-)
>> type. I would like to build a RangeType from Datum's of lower and upper
>> bounds. Much like how construct_array() builds an ArrayType from a Datum
>> array of elements given elements' type info.
>>
>> Is there some way I do not seem to know? If not, would it be worthwhile
>> to make something like construct_range() that returns a RangeType given
>> Datum's of lower and upper bounds and subtype info?
>
> There is no good reason to assume that a range type exists at all, much
> less that it is unique for a subtype. (Read the CREATE TYPE documentation
> if you're unclear as to why not.) You have not said what you're trying to
> accomplish, but this seems like a dead end.

We do have a number of built-in range types, which we can safely assume to exist with whatever default behavior they are hard-coded with (which I guess is not 'ALTER TYPE'-able). Those existing range types cover a useful set of subtypes, which are probably also the potential candidates for range partitioning. Now, one may define a new range type for a given subtype with a different subopclass, subcollation, subcanonical or subdiff, which is perhaps the problem you are mentioning. Perhaps not a very appropriate solution, but how about a rngtypisdefault (for a subtype) flag?

Sorry, I didn't mention before the idea I am trying to implement with this: it is to serialize range partition bounds as a range type value per partition key column. The serialized representation of a range partition bound for a partition then effectively becomes an anyarray of anyrange:

+	rangebounds = construct_array(datum,
+								  partnatts,
+								  ANYRANGEOID,
+								  -1,
+								  false,
+								  'd');

Each element of the passed datum array is a range type value of the corresponding subtype.
Now, range types may not yet be ripe for internal use, but the above idea kind of works, modulo the absence of the infrastructure I was asking for in the OP. Am I still missing something?

Thanks,
Amit
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > On 07-02-2015 AM 12:10, Tom Lane wrote: >> There is no good reason to assume that a range type exists at all, much >> less that it is unique for a subtype. (Read the CREATE TYPE documentation >> if you're unclear as to why not.) You have not said what you're trying to >> accomplish, but this seems like a dead end. > Sorry I didn't mention before about an idea I am trying to implement > with this - it is to serialize range partition bounds as a range type > value per partition key column. The serialized representation of a range > partition bound for a partition then effectively becomes an anyarray of > anyrange: Meh. I don't care for that much --- it sounds a lot like deciding that your problem is a nail because there is a hammer within reach. A random collection of ranges doesn't seem like a very appropriate representation to me; first because there is no simple way to enforce that it partitions the key space (no overlaps and no missing portions), and second because it provides little purchase for efficient tuple routing algorithms. The best you could possibly hope for is some log-N tree search mechanism, and that would require a fair amount of setup beforehand. I'd rather insist that range partitioning be done on the basis of an origin point and a bin width, which would allow trivial computation of which bin number any given key falls in. This will limit the set of types it can be done over, for sure, but not more than depending on built-in range types would. regards, tom lane
On 02/09/2015 03:21 AM, Tom Lane wrote: > Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: >> On 07-02-2015 AM 12:10, Tom Lane wrote: >>> There is no good reason to assume that a range type exists at all, much >>> less that it is unique for a subtype. (Read the CREATE TYPE documentation >>> if you're unclear as to why not.) You have not said what you're trying to >>> accomplish, but this seems like a dead end. > >> Sorry I didn't mention before about an idea I am trying to implement >> with this - it is to serialize range partition bounds as a range type >> value per partition key column. The serialized representation of a range >> partition bound for a partition then effectively becomes an anyarray of >> anyrange: > > Meh. I don't care for that much --- it sounds a lot like deciding that > your problem is a nail because there is a hammer within reach. A random > collection of ranges doesn't seem like a very appropriate representation > to me; first because there is no simple way to enforce that it partitions > the key space (no overlaps and no missing portions), and second because it > provides little purchase for efficient tuple routing algorithms. The best > you could possibly hope for is some log-N tree search mechanism, and that > would require a fair amount of setup beforehand. Building a tree or a sorted array of the min or max bounds of the ranges doesn't sound hard. log-N sounds fast enough. > I'd rather insist that range partitioning be done on the basis of an > origin point and a bin width, which would allow trivial computation of > which bin number any given key falls in. This will limit the set of types > it can be done over, for sure, but not more than depending on built-in > range types would. That sounds unnecessarily limiting. If there's a good reason to limit it to that, then fine, but I don't think it'd be any more difficult to make it flexible. Let's wait for the patch and see I guess. - Heikki
On 09-02-2015 AM 10:21, Tom Lane wrote:
> Meh. I don't care for that much --- it sounds a lot like deciding that
> your problem is a nail because there is a hammer within reach. A random
> collection of ranges doesn't seem like a very appropriate representation
> to me; first because there is no simple way to enforce that it partitions
> the key space (no overlaps and no missing portions), and second because it
> provides little purchase for efficient tuple routing algorithms. The best
> you could possibly hope for is some log-N tree search mechanism, and that
> would require a fair amount of setup beforehand.
>
> I'd rather insist that range partitioning be done on the basis of an
> origin point and a bin width, which would allow trivial computation of
> which bin number any given key falls in. This will limit the set of types
> it can be done over, for sure, but not more than depending on built-in
> range types would.

Okay, let me back up a little and think about your suggestion, which I do not seem to understand very well - it raises a few questions for me. Does this mean a partitioning criterion is associated with the parent (partitioned table) rather than with each individual partition? I would guess that the bin width is the partition interval, such that each bin number gives a partition number (of equal-sized, consecutively numbered partitions without gaps). But I don't quite understand what the origin point is. Is that a key literal value from which to begin counting bins, and if so, is it stored in the catalog as part of the partitioning rule? Does this affect the syntax to use when defining the partitioned table/partitions, like:

CREATE TABLE parent PARTITION BY RANGE ON (col1) EVERY '3 months'::interval;
CREATE TABLE child PARTITION 1 OF parent;

or some such? Or may I ask if it is just an internal representation geared towards optimizing range partitioning, whereas the on-disk representation is something more generalized, allowing for other partitioning strategies to be implemented as well?
Thanks, Amit
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > On 02/09/2015 03:21 AM, Tom Lane wrote: >> Meh. I don't care for that much --- it sounds a lot like deciding that >> your problem is a nail because there is a hammer within reach. A random >> collection of ranges doesn't seem like a very appropriate representation >> to me; first because there is no simple way to enforce that it partitions >> the key space (no overlaps and no missing portions), and second because it >> provides little purchase for efficient tuple routing algorithms. The best >> you could possibly hope for is some log-N tree search mechanism, and that >> would require a fair amount of setup beforehand. > Building a tree or a sorted array of the min or max bounds of the ranges > doesn't sound hard. log-N sounds fast enough. It's going to be complicated and probably buggy, and I think it is heading in the wrong direction altogether. If you want to partition in some arbitrary complicated fashion that the system can't reason about very effectively, we *already have that*. IMO the entire point of building a new partitioning infrastructure is to build something simple, reliable, and a whole lot faster than what you can get from inheritance relationships. And "faster" is going to come mainly from making the partitioning rules as simple as possible, not as complex as possible. Just to start with: one of the absolutely fundamental things we need out of partitioning is the ability to have uniqueness constraints that span a partitioned table set. That means the partitions have to be disjoint, and it has to be not possible to break that. The design proposed here won't enforce that without complicated (and again, possibly buggy) logic. In short, this design requires a whole lot of extra mechanism to get to places that we have to get to, and I don't believe that that extra complexity is going to buy anything useful at all. regards, tom lane
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: > Okay, let me back up a little and think about your suggestion which I do > not seem to understand very well - it raises a few questions for me: > does this mean a partitioning criteria is associated with parent > (partitioned table) rather than each individual partition? Absolutely. Anything else is not scalable; it's just another flavor of the inheritance + CHECK constraint mechanism. The entire point of doing a new partitioning design IMO is to get away from that. It should be possible to determine which partition a row belongs to in O(1) time, not O(N). > I would guess > that bin width is partition interval such that each bin number gives > partition number (of equal-sized consecutively numbered partitions > without gaps). But I don't quite understand what origin point is? Is > that a key literal value from which to begin counting bins and if so, is > it stored in catalog as part of the partitioning rule? Yeah, I would think so. regards, tom lane
On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It's going to be complicated and probably buggy, and I think it is heading > in the wrong direction altogether. If you want to partition in some > arbitrary complicated fashion that the system can't reason about very > effectively, we *already have that*. IMO the entire point of building > a new partitioning infrastructure is to build something simple, reliable, > and a whole lot faster than what you can get from inheritance > relationships. And "faster" is going to come mainly from making the > partitioning rules as simple as possible, not as complex as possible. Yeah, but people expect to be able to partition on ranges that are not all of equal width. I think any proposal that we shouldn't support that is the kiss of death for a feature like this - it will be so restricted as to eliminate 75% of the use cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It's going to be complicated and probably buggy, and I think it is heading >> in the wrong direction altogether. If you want to partition in some >> arbitrary complicated fashion that the system can't reason about very >> effectively, we *already have that*. IMO the entire point of building >> a new partitioning infrastructure is to build something simple, reliable, >> and a whole lot faster than what you can get from inheritance >> relationships. And "faster" is going to come mainly from making the >> partitioning rules as simple as possible, not as complex as possible. > Yeah, but people expect to be able to partition on ranges that are not > all of equal width. I think any proposal that we shouldn't support > that is the kiss of death for a feature like this - it will be so > restricted as to eliminate 75% of the use cases. Well, that's debatable IMO (especially your claim that variable-size partitions would be needed by a majority of users). But in any case, partitioning behavior that is emergent from a bunch of independent pieces of information scattered among N tables seems absolutely untenable from where I sit. Whatever we support, the behavior needs to be described by *one* chunk of information --- a sorted list of bin bounding values, perhaps. regards, tom lane
On Mon, Feb 9, 2015 at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> It's going to be complicated and probably buggy, and I think it is heading >>> in the wrong direction altogether. If you want to partition in some >>> arbitrary complicated fashion that the system can't reason about very >>> effectively, we *already have that*. IMO the entire point of building >>> a new partitioning infrastructure is to build something simple, reliable, >>> and a whole lot faster than what you can get from inheritance >>> relationships. And "faster" is going to come mainly from making the >>> partitioning rules as simple as possible, not as complex as possible. > >> Yeah, but people expect to be able to partition on ranges that are not >> all of equal width. I think any proposal that we shouldn't support >> that is the kiss of death for a feature like this - it will be so >> restricted as to eliminate 75% of the use cases. > > Well, that's debatable IMO (especially your claim that variable-size > partitions would be needed by a majority of users). But in any case, > partitioning behavior that is emergent from a bunch of independent pieces > of information scattered among N tables seems absolutely untenable from > where I sit. Whatever we support, the behavior needs to be described by > *one* chunk of information --- a sorted list of bin bounding values, > perhaps. That's exactly the representation I had in mind. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10-02-2015 AM 02:37, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It's going to be complicated and probably buggy, and I think it is heading
>>> in the wrong direction altogether. If you want to partition in some
>>> arbitrary complicated fashion that the system can't reason about very
>>> effectively, we *already have that*. IMO the entire point of building
>>> a new partitioning infrastructure is to build something simple, reliable,
>>> and a whole lot faster than what you can get from inheritance
>>> relationships. And "faster" is going to come mainly from making the
>>> partitioning rules as simple as possible, not as complex as possible.
>>
>> Yeah, but people expect to be able to partition on ranges that are not
>> all of equal width. I think any proposal that we shouldn't support
>> that is the kiss of death for a feature like this - it will be so
>> restricted as to eliminate 75% of the use cases.
>
> Well, that's debatable IMO (especially your claim that variable-size
> partitions would be needed by a majority of users). But in any case,
> partitioning behavior that is emergent from a bunch of independent pieces
> of information scattered among N tables seems absolutely untenable from
> where I sit. Whatever we support, the behavior needs to be described by
> *one* chunk of information --- a sorted list of bin bounding values,
> perhaps.

I'm a bit confused here. I got the impression that the partitioning formula you suggest would consist of two pieces of information - an origin point and a bin width. Then routing a tuple consists of using exactly these two values to determine a bin number, and hence a partition, in O(1) time, assuming we've made all partitions exactly bin-width wide.
You mention here a sorted list of bin bounding values, which we could very well put together for a partitioned table in its relation descriptor, based on whatever information we stored in the catalog. That is, we can always have *one* chunk of partitioning information as the *internal* representation, irrespective of how generalized we make our on-disk representation. We could get O(log N), if not O(1), from that, I'd hope. In fact, that's what I had in mind here. Did I read it wrong?

Thanks,
Amit
On Mon, Feb 09, 2015 at 12:37:05PM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yeah, but people expect to be able to partition on ranges that are not
>> all of equal width. I think any proposal that we shouldn't support
>> that is the kiss of death for a feature like this - it will be so
>> restricted as to eliminate 75% of the use cases.
>
> Well, that's debatable IMO (especially your claim that variable-size
> partitions would be needed by a majority of users).

I don't know about user wishes directly, though I do suspect fixed partition stride would cover more than 25% of use cases. I do know that SQL Server, Oracle and MySQL have variable-stride range partitioning, and none of them have fixed-stride range partitioning. So, like Heikki and Robert, I would bet on variable-stride range partitioning.
On Mon, Feb 9, 2015 at 7:54 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Well, that's debatable IMO (especially your claim that variable-size >> partitions would be needed by a majority of users). But in any case, >> partitioning behavior that is emergent from a bunch of independent pieces >> of information scattered among N tables seems absolutely untenable from >> where I sit. Whatever we support, the behavior needs to be described by >> *one* chunk of information --- a sorted list of bin bounding values, >> perhaps. > > I'm a bit confused here. I got an impression that partitioning formula > as you suggest would consist of two pieces of information - an origin > point & a bin width. Then routing a tuple consists of using exactly > these two values to tell a bin number and hence a partition in O(1) time > assuming we've made all partitions be exactly bin-width wide. > > You mention here a sorted list of bin bounding values which we can very > well put together for a partitioned table in its relation descriptor > based on whatever information we stored in catalog. That is, we can > always have a *one* chunk of partitioning information as *internal* > representation irrespective of how generalized we make our on-disk > representation. We can get O(log N) if not O(1) from that I'd hope. In > fact, that's what I had in mind about this. Sure, we can always assemble data into a relation descriptor from across multiple catalog entries. I think the question is whether there is any good reason to split up the information across multiple relations or whether it might not be better, as I have suggested multiple times, to serialize it using nodeToString() and stuff it in a single column in pg_class. There may be such a reason, but if you said what it was, I missed that. This thread started as a discussion about using range types, and I think it's pretty clear that's a bad idea, because: 1. There's no guarantee that a range type for the datatype exists at all. 
2. If it does, there's no guarantee that it uses the same opclass that we want to use for partitioning, and I certainly think it would be strange if we refused to let the user pick the opclass she wants to use.

3. Even if there is a suitable range type available, it's a poor representational choice here, because it will be considerably more verbose than just storing a sorted list of partition bounds. In the common case where the ranges are adjacent, you'll end up storing two copies of every bound but the first and last for no discernible benefit.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Feb 09, 2015 at 12:37:05PM -0500, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> It's going to be complicated and probably buggy, and I think it is heading > >> in the wrong direction altogether. If you want to partition in some > >> arbitrary complicated fashion that the system can't reason about very > >> effectively, we *already have that*. IMO the entire point of building > >> a new partitioning infrastructure is to build something simple, reliable, > >> and a whole lot faster than what you can get from inheritance > >> relationships. And "faster" is going to come mainly from making the > >> partitioning rules as simple as possible, not as complex as possible. > > > Yeah, but people expect to be able to partition on ranges that are not > > all of equal width. I think any proposal that we shouldn't support > > that is the kiss of death for a feature like this - it will be so > > restricted as to eliminate 75% of the use cases. > > Well, that's debatable IMO (especially your claim that variable-size > partitions would be needed by a majority of users). It's ubiquitous. Time range partition sets almost always have some sets with finite range and at least one range with infinity in it: "current end" to infinity, and somewhat less frequently in my experience, -infinity to some arbitrary start. > But in any case, partitioning behavior that is emergent from a bunch > of independent pieces of information scattered among N tables seems > absolutely untenable from where I sit. Agreed. > Whatever we support, the behavior needs to be described by *one* > chunk of information --- a sorted list of bin bounding values, > perhaps. This would work for some interesting generalization of "sorted." Maybe going back to the mathematical definition of "partition" could bear more fruit. All that's needed for that is an equivalence relation, however it's denoted. 
In practical terms, you'd need to have a quick way to enumerate the equivalence classes and another to establish whether equivalence holds. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2/10/15 2:04 PM, David Fetter wrote:
>>> Yeah, but people expect to be able to partition on ranges that are not
>>> all of equal width. I think any proposal that we shouldn't support
>>> that is the kiss of death for a feature like this - it will be so
>>> restricted as to eliminate 75% of the use cases.
>>
>> Well, that's debatable IMO (especially your claim that variable-size
>> partitions would be needed by a majority of users).
>
> It's ubiquitous.
>
> Time range partition sets almost always have some sets with finite
> range and at least one range with infinity in it: "current end" to
> infinity, and somewhat less frequently in my experience, -infinity to
> some arbitrary start.

We could instead handle that with a generic "this doesn't fit in any other partition" capability. Presumably that would be easy if we're building this on top of inheritance features.

If we exclude the issue of needing one or two oddball partitions for +/- infinity, I expect that fixed sized partitions would actually cover 80-90% of cases.

-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On Fri, Feb 13, 2015 at 3:13 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> If we exclude the issue of needing one or two oddball partitions for
> +/- infinity, I expect that fixed sized partitions would actually
> cover 80-90% of cases.

That would not be true in our case. The data is not at all evenly distributed over the partitioning key. We would need something more like: values a, b, and c get their own partitions, and everything else goes in partition d.
On Fri, Feb 13, 2015 at 03:13:11PM -0600, Jim Nasby wrote:
> On 2/10/15 2:04 PM, David Fetter wrote:
>>>> Yeah, but people expect to be able to partition on ranges that are not
>>>> all of equal width. I think any proposal that we shouldn't support
>>>> that is the kiss of death for a feature like this - it will be so
>>>> restricted as to eliminate 75% of the use cases.
>>>
>>> Well, that's debatable IMO (especially your claim that variable-size
>>> partitions would be needed by a majority of users).
>>
>> It's ubiquitous.
>>
>> Time range partition sets almost always have some sets with finite
>> range and at least one range with infinity in it: "current end" to
>> infinity, and somewhat less frequently in my experience, -infinity
>> to some arbitrary start.
>
> We could instead handle that with a generic "this doesn't fit in any
> other partition" capability. Presumably that would be easy if we're
> building this on top of inheritance features.
>
> If we exclude the issue of needing one or two oddball partitions for
> +/- infinity, I expect that fixed sized partitions would actually
> cover 80-90% of cases.

Is "partition the domain" really that big an ask?

Cheers,
David.
-- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2/13/15 3:34 PM, David Fetter wrote:
> On Fri, Feb 13, 2015 at 03:13:11PM -0600, Jim Nasby wrote:
>> On 2/10/15 2:04 PM, David Fetter wrote:
>>>>> Yeah, but people expect to be able to partition on ranges that are not
>>>>> all of equal width. I think any proposal that we shouldn't support
>>>>> that is the kiss of death for a feature like this - it will be so
>>>>> restricted as to eliminate 75% of the use cases.
>>>>
>>>> Well, that's debatable IMO (especially your claim that variable-size
>>>> partitions would be needed by a majority of users).
>>>
>>> It's ubiquitous.
>>>
>>> Time range partition sets almost always have some sets with finite
>>> range and at least one range with infinity in it: "current end" to
>>> infinity, and somewhat less frequently in my experience, -infinity
>>> to some arbitrary start.
>>
>> We could instead handle that with a generic "this doesn't fit in any
>> other partition" capability. Presumably that would be easy if we're
>> building this on top of inheritance features.
>>
>> If we exclude the issue of needing one or two oddball partitions for
>> +/- infinity, I expect that fixed sized partitions would actually
>> cover 80-90% of cases.
>
> Is "partition the domain" really that big an ask?

Since this debate has been running for a few months, perhaps it is. I'd rather see limited partitioning get in sooner and come back to handle the less common cases (as long as we don't paint ourselves into a corner).

-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com