Re: RangeType internal use - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: RangeType internal use
Date	February 10, 2015 20:22:44
Msg-id	CA+Tgmoahntvx112+FWVFdy0w0VHosA-EqcSN9WN_5DPEj8qbcA@mail.gmail.com
In response to	Re: RangeType internal use (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List	pgsql-hackers

On Mon, Feb 9, 2015 at 7:54 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Well, that's debatable IMO (especially your claim that variable-size
>> partitions would be needed by a majority of users).  But in any case,
>> partitioning behavior that is emergent from a bunch of independent pieces
>> of information scattered among N tables seems absolutely untenable from
>> where I sit.  Whatever we support, the behavior needs to be described by
>> *one* chunk of information --- a sorted list of bin bounding values,
>> perhaps.
>
> I'm a bit confused here. I got an impression that partitioning formula
> as you suggest would consist of two pieces of information - an origin
> point & a bin width. Then routing a tuple consists of using exactly
> these two values to tell a bin number and hence a partition in O(1) time
> assuming we've made all partitions be exactly bin-width wide.
>
> You mention here a sorted list of bin bounding values which we can very
> well put together for a partitioned table in its relation descriptor
> based on whatever information we stored in catalog. That is, we can
> always have a *one* chunk of partitioning information as *internal*
> representation irrespective of how generalized we make our on-disk
> representation. We can get O(log N) if not O(1) from that I'd hope. In
> fact, that's what I had in mind about this.

Sure, we can always assemble data into a relation descriptor from
across multiple catalog entries.  I think the question is whether
there is any good reason to split up the information across multiple
relations or whether it might not be better, as I have suggested
multiple times, to serialize it using nodeToString() and stuff it in a
single column in pg_class.  There may be such a reason, but if you
said what it was, I missed that.  This thread started as a discussion
about using range types, and I think it's pretty clear that's a bad
idea, because:

1. There's no guarantee that a range type for the datatype exists at all.
2. If it does, there's no guarantee that it uses the same opclass that
we want to use for partitioning, and I certainly think it would be
strange if we refused to let the user pick the opclass she wants to
use.
3. Even if there is a suitable range type available, it's a poor
representational choice here, because it will be considerably more
verbose than just storing a sorted list of partition bounds.  In the
common case where the ranges are adjacent, you'll end up storing two
copies of every bound but the first and last for no discernible
benefit.
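
To make point 3 and the routing discussion above concrete, here is a
minimal sketch in Python rather than PostgreSQL's C. All names in it
(route_by_bounds, route_by_origin_width, ranges_from_bounds) are
hypothetical and do not correspond to any PostgreSQL function or catalog
layout. It assumes half-open partitions [lower, upper) that are adjacent,
and it uses Python's built-in ordering where a real implementation would
use the comparison function of the chosen opclass.

```python
from bisect import bisect_right


def route_by_bounds(bounds, key):
    """Find the partition for key using a sorted list of bounds.

    bounds holds N+1 sorted values describing N adjacent partitions;
    partition i covers the half-open interval [bounds[i], bounds[i+1]).
    Returns the 0-based partition index, or None if key falls outside
    every partition. The binary search costs O(log N).
    """
    if key < bounds[0] or key >= bounds[-1]:
        return None
    return bisect_right(bounds, key) - 1


def route_by_origin_width(origin, width, key):
    """O(1) routing, valid only if every partition is exactly `width` wide."""
    return (key - origin) // width


def ranges_from_bounds(bounds):
    """Expand the bound list into explicit (lower, upper) pairs,
    which is roughly what one range value per partition would store."""
    return list(zip(bounds, bounds[1:]))


# Hypothetical example: three adjacent partitions of unequal width.
bounds = [0, 10, 25, 100]
ranges = ranges_from_bounds(bounds)

values_in_bounds = len(bounds)      # N + 1 stored values
values_in_ranges = 2 * len(ranges)  # 2N stored values
```

With these sample bounds, the sorted list stores four values while the
per-partition ranges store six: the interior bounds 10 and 25 each appear
twice, which is the duplication described in point 3.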

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
