Re: Multi-Column List Partitioning - Mailing list pgsql-hackers

From Nitin Jadhav
Subject Re: Multi-Column List Partitioning
Date
Msg-id CAMm1aWaA1dQpzwhf9AhpAqEtaGGnpM2hUwJ14ojQkk4J09zM_g@mail.gmail.com
Whole thread Raw
In response to Re: Multi-Column List Partitioning  (Zhihong Yu <zyu@yugabyte.com>)
List pgsql-hackers
Thanks for reviewing.

> +isListBoundDuplicated(List *list_bounds, List *new_bound)
>
> +           Const   *value1 = castNode(Const, list_nth(elem, i));
> +           Const   *value2 = castNode(Const, list_nth(new_bound, i));
>
> Should the upper bound for index i take into account the length of new_bound ?
> If the length of new_bound is always the same as that for elem, please add an assertion.

The length of 'elem' should be same as length of 'new_bound'. Added
assert statement for the same.


> For transformPartitionListBounds():
> +               deparse_expression((Node *) list_nth(partexprs, j),
> +                                  deparse_context_for(RelationGetRelationName(parent),
> +                                                      RelationGetRelid(parent)),
>
> Please consider calling RelationGetRelationName(parent) and RelationGetRelid(parent) (and assigning to local
variables)outside the loop.
 

I don't think this is an issue as 'RelationGetRelationName' and
'RelationGetRelid' are macros. Please let me know if your opinion is
different.


> +get_list_datum_count(PartitionBoundSpec **boundspecs, int nparts)
>
> get_list_datum_count -> get_list_datums_count

There was a function earlier with the name
'get_non_null_list_datum_count()'. So now this has changed to
'get_list_datum_count()' as we are not separating the non null datums
from the list. The new name is inline with the old function name which
was already accepted by the community. So I feel it is better to not
change.


> For partition_bounds_equal():
>
> +               if (b1->isnulls)
> +                   b1_isnull = b1->isnulls[i][j];
> +               if (b2->isnulls)
> +                   b2_isnull = b2->isnulls[i][j];
>
> Should the initialization of b1_isnull and b2_isnull be done inside the loop (so that they don't inherit value from
previousiteration) ?
 

Nice catch. Fixed.


> In get_qual_for_list(), it would be better if repetitive code can be extracted into a helper method:

I have removed the repetitive code and made a common function named
'get_qual_for_list_datums()'.


> For match_clause_to_partition_key():
>
> +       if (part_scheme->strategy != PARTITION_STRATEGY_LIST)
> +       {
> +           *clause_is_not_null = (nulltest->nulltesttype == IS_NOT_NULL);
> +           return PARTCLAUSE_MATCH_NULLNESS;
> +       }
> +       else
>
> Since the if block ends with return, the 'else' is not needed - else block can be indented to the left.

Fixed.


> get_min_and_max_off(): I think get_min_and_max_offset as method name would be more informative.

Fixed.


> +   Assert(0 == partition_lbound_datum_cmp(partsupfunc, partcollation,
> +                                          boundinfo->datums[off],
> +                                          boundinfo->isnulls[off],
> +                                          values, isnulls, nvalues));
>
> If the 'while (off >= 1)' loop exits without modifying off, is the above assertion always true (can
boundinfo->datums[off]be accessed without checking bound) ?
 

Yes. The assertion holds good even though the control doesn't enter
the loop. In that case the 'off' can be directly considered as minoff
or maxoff. Since we are considering it as valid, the assertion is
needed.

Thanks & Regards,
Nitin Jadhav

On Fri, Oct 22, 2021 at 9:30 PM Zhihong Yu <zyu@yugabyte.com> wrote:
>
>
>
> On Fri, Oct 22, 2021 at 3:50 AM Zhihong Yu <zyu@yugabyte.com> wrote:
>>
>>
>>
>> On Fri, Oct 22, 2021 at 2:48 AM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:
>>>
>>> > While testing further I got a crash with partition wise join enabled for multi-col list partitions. please find
testcase & stack-trace below.
 
>>>
>>> Thanks for sharing. I have fixed the issue in the attached patch.
>>>
>>> Thanks & Regards,
>>> Nitin Jadhav
>>>
>>>>>>>>
>>>>>>>>
>> Hi,
>>
>> +isListBoundDuplicated(List *list_bounds, List *new_bound)
>>
>> +           Const   *value1 = castNode(Const, list_nth(elem, i));
>> +           Const   *value2 = castNode(Const, list_nth(new_bound, i));
>>
>> Should the upper bound for index i take into account the length of new_bound ?
>> If the length of new_bound is always the same as that for elem, please add an assertion.
>>
>> For transformPartitionListBounds():
>> +               deparse_expression((Node *) list_nth(partexprs, j),
>> +                                  deparse_context_for(RelationGetRelationName(parent),
>> +                                                      RelationGetRelid(parent)),
>>
>> Please consider calling RelationGetRelationName(parent) and RelationGetRelid(parent) (and assigning to local
variables)outside the loop.
 
>>
>> +get_list_datum_count(PartitionBoundSpec **boundspecs, int nparts)
>>
>> get_list_datum_count -> get_list_datums_count
>>
>> For partition_bounds_equal():
>>
>> +               if (b1->isnulls)
>> +                   b1_isnull = b1->isnulls[i][j];
>> +               if (b2->isnulls)
>> +                   b2_isnull = b2->isnulls[i][j];
>>
>> Should the initialization of b1_isnull and b2_isnull be done inside the loop (so that they don't inherit value from
previousiteration) ?
 
>>
>> Cheers
>
>
> Hi,
> Continuing review.
>
> +            * For the multi-column case, we must make an BoolExpr that
>
> an BoolExpr -> a BoolExpr
>
> In get_qual_for_list(), it would be better if repetitive code can be extracted into a helper method:
>
> +               if (val->constisnull)
> +               {
> +                   NullTest   *nulltest = makeNode(NullTest);
> +
> +                   key_is_null[j] = true;
> +
> +                   nulltest->arg = keyCol[j];
> +                   nulltest->nulltesttype = IS_NULL;
> +                   nulltest->argisrow = false;
> +                   nulltest->location = -1;
> +
> +                   if (key->partnatts > 1)
> +                       and_args = lappend(and_args, nulltest);
> +                   else
> +                       is_null_test = (Expr *) nulltest;
> +               }
> +               else
> +               {
> +                   if (key->partnatts > 1)
> +                   {
> +                       Expr *opexpr =
> +                           make_partition_op_expr(key, j,
> +                                                  BTEqualStrategyNumber,
> +                                                  keyCol[j],
> +                                                  (Expr *) val);
> +                       and_args = lappend(and_args, opexpr);
> +                   }
> +                   else
> +                       datum_elem = (Expr *) val;
> +               }
>
> For match_clause_to_partition_key():
>
> +       if (part_scheme->strategy != PARTITION_STRATEGY_LIST)
> +       {
> +           *clause_is_not_null = (nulltest->nulltesttype == IS_NOT_NULL);
> +           return PARTCLAUSE_MATCH_NULLNESS;
> +       }
> +       else
>
> Since the if block ends with return, the 'else' is not needed - else block can be indented to the left.
>
> get_min_and_max_off(): I think get_min_and_max_offset as method name would be more informative.
>
> +   Assert(0 == partition_lbound_datum_cmp(partsupfunc, partcollation,
> +                                          boundinfo->datums[off],
> +                                          boundinfo->isnulls[off],
> +                                          values, isnulls, nvalues));
>
> If the 'while (off >= 1)' loop exits without modifying off, is the above assertion always true (can
boundinfo->datums[off]be accessed without checking bound) ?
 
>
> Cheers

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: removing global variable ThisTimeLineID
Next
From: Nitin Jadhav
Date:
Subject: Re: Multi-Column List Partitioning