Re: [HACKERS] Multi column range partition table - Mailing list pgsql-hackers
| From | Dean Rasheed |
|---|---|
| Subject | Re: [HACKERS] Multi column range partition table |
| Date | |
| Msg-id | CAEZATCU00n6=9dFiW57Hi1wX815Vrek-v9njC29XgVKfMw7q7g@mail.gmail.com Whole thread |
| In response to | Re: [HACKERS] Multi column range partition table (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
| Responses |
Re: [HACKERS] Multi column range partition table
|
| List | pgsql-hackers |
On 5 July 2017 at 10:43, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> So the more I think about this, the more I think that a cleaner design
>> would be as follows:
>>
>> 1). Don't allow UNBOUNDED, except in the first column, where it can
>> keep it's current meaning.
>>
>> 2). Allow the partition bounds to have fewer columns than the
>> partition definition, and have that mean the same as it would have
>> meant if you were partitioning by that many columns. So, for
>> example, if you were partitioning by (col1,col2), you'd be allowed
>> to define a partition like so:
>>
>> FROM (x) TO (y)
>>
>> and it would mean
>>
>> x <= col1 < y
>>
>> Or you'd be able to define a partition like
>>
>> FROM (x1,x2) TO (y)
>>
>> which would mean
>>
>> (col1 > x1) OR (col1 = x1 AND col2 >= x2) AND col1 < y
>>
>> 3). Don't allow any value after UNBOUNDED (i.e., only specify
>> UNBOUNDED once in a partition bound).
>
> I assume we don't need the ability of specifying ABOVE/BELOW in this design.
>
Yes that's right.
> In retrospect, that sounds like something that was implemented in the
> earlier versions of the patch, whereby there was no ability to specify
> UNBOUNDED on a per-column basis. So the syntax was:
>
> FROM { (x [, ...]) | UNBOUNDED } TO { (y [, ...]) | UNBOUNDED }
>
Yes, that's where I ended up too.
> But, it was pointed out to me [1] that that doesn't address the use case,
> for example, where part1 goes up to (10, 10) and part2 goes from (10, 10)
> up to (10, unbounded).
>
> The new design will limit the usage of unbounded range partitions at the
> tail ends.
>
True, but I don't think that's really a problem. When the first column
is a discrete type, an upper bound of (10, unbounded) can be rewritten
as (11) in the new design. When it's a continuous type, e.g. floating
point, it can no longer be represented, because (10.0, unbounded)
really means (col1 <= 10.0). But we've already decided not to support
anything other than inclusive lower bounds and exclusive upper bounds,
so allowing this upper bound goes against that design choice.
>> Of course, it's pretty late in the day to be proposing this kind of
>> redesign, but I fear that if we don't tackle it now, it will just be
>> harder to deal with in the future.
>>
>> Actually, a quick, simple hacky implementation might be to just fill
>> in any omitted values in a partition bound with negative infinity
>> internally, and when printing a bound, omit any values after an
>> infinite value. But really, I think we'd want to tidy up the
>> implementation, and I think a number of things would actually get much
>> simpler. For example, get_qual_for_range() could simply stop when it
>> reached the end of the list of values for the bound, and it wouldn't
>> need to worry about an unbounded value following a bounded one.
>>
>> Thoughts?
>
> I cooked up a patch for the "hacky" implementation for now, just as you
> described in the above paragraph. Will you be willing to give it a look?
> I will also think about the non-hacky way of implementing this.
>
OK, I'll take a look.
Meanwhile, I already had a go at the "non-hacky" implementation (WIP
patch attached). The more I worked on it, the simpler things got,
which I think is a good sign.
Part-way through, I realised that the PartitionRangeDatum Node type is
no longer needed, because each bound value is now necessarily finite,
so the lowerdatums and upperdatums lists in a PartitionBoundSpec can
now be made into lists of Const nodes, making them match the
listdatums field used for LIST partitioning, and then a whole lot of
related code gets simplified.
It needed a little bit more code in partition.c to track individual
bound sizes, but there were a number of other places that could be
simplified, so overall this represents a reduction in the code size
and complexity.
It's not complete (e.g., no doc updates yet), but it passes all the
tests, and so far seems to work as I would expect.
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: