Re: Multi-Column List Partitioning - Mailing list pgsql-hackers
From | Amul Sul |
---|---|
Subject | Re: Multi-Column List Partitioning |
Date | |
Msg-id | CAAJ_b97fTnKngnnskZ4rBar4yczHLVjvYoxEQ7THw7asT7MaUQ@mail.gmail.com Whole thread Raw |
In response to | Multi-Column List Partitioning (Nitin Jadhav <nitinjadhavpostgres@gmail.com>) |
Responses |
Re: Multi-Column List Partitioning
|
List | pgsql-hackers |
On Tue, Dec 21, 2021 at 6:34 PM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote: > > --- > > > + if (isnulls && isnulls[i]) > > + cmpval = 0; /* NULL "=" NULL */ > > + else > > + cmpval = 1; /* NULL ">" not-NULL */ > > + } > > + else if (isnulls && isnulls[i]) > > + cmpval = -1; /* not-NULL "<" NULL */ > > > > I really doubt this assumption is correct; aren't those strict operators? > > Now there are possibilities of multiple NULL values. We should have a > mechanism to sort it when the bound values contain Non NULL and NULL > values. As per the above logic we put the NULL values at the end. > Please let me know if I am wrong. Ok, but I am not sure about the comparison approach, let's see what others think. > --- [...] > > > typedef struct PartitionBoundInfoData > > { > > char strategy; /* hash, list or range? */ > > + int partnatts; /* number of partition key columns */ > > int ndatums; /* Length of the datums[] array */ > > Datum **datums; > > + bool **isnulls; > > > > Adding "partnatts" to this struct seems to be unnecessary, AFAIUC, > > added that for partition_bound_accepts_nulls(), but we can easily get > > that value from the partitioning key & pass an additional argument. > > Also, no information about the length of the "isnulls" array. > > This is required during merge_list_bounds(). AFAIK partition key > information is not available here. > You can get that as an argument, see merge_range_bounds(). > > I think it would be helpful if you could split the patch: one for > > multi-value list partitioning and another for the partition wise join, thanks. > > I have split the patch into 2 patches. One is for the multi column > list partitioning core changes and the other is for partition-wise > join support. Each patch has its respective test cases in the > regression suit and regression tests run successfully on each patch. > Kindly let me know if any other changes are required here. > Thanks, for the slit that is much helpful, I have a few comments for the 0001 patch as follow: + char **colname = (char **) palloc0(partnatts * sizeof(char *)); palloc0 is unnecessary. --- + foreach(cell2, rowexpr->args) + { + int idx = foreach_current_index(cell2); + Node *expr = lfirst(cell2); + Const *val = + transformPartitionBoundValue(pstate, expr, colname[i], + get_partition_col_typid(key, idx), + get_partition_col_typmod(key, idx), + get_partition_col_collation(key, idx)); + + values = lappend(values, val); + } Array index for colname should be "idx". --- result->scan_default = partition_bound_has_default(boundinfo); + return result; ... /* Always include the default partition if any. */ result->scan_default = partition_bound_has_default(boundinfo); - return result; ... else result->scan_default = partition_bound_has_default(boundinfo); + return result; ... - /* Add columns specified to SET NULL or SET DEFAULT if provided. */ + /* + * Add columns specified to SET NULL or SET DEFAULT if + * provided. + */ spurious change -- look like something not related to your patch. -- - * For range partitioning, we must only perform pruning with values - * for either all partition keys or a prefix thereof. + * For range partitioning and list partitioning, we must only perform + * pruning with values for either all partition keys or a prefix + * thereof. */ - if (keyno > nvalues && context->strategy == PARTITION_STRATEGY_RANGE) + if (keyno > nvalues && (context->strategy == PARTITION_STRATEGY_RANGE || + context->strategy == PARTITION_STRATEGY_LIST)) break; I think this is not true for multi-value list partitions, we might still want prune partitions for e.g. (100, IS NULL, 20). Correct me if I am missing something here. --- /* - * For range partitioning, if we have no clauses for the current key, - * we can't consider any later keys either, so we can stop here. + * For range partitioning and list partitioning, if we have no clauses + * for the current key, we can't consider any later keys either, so we + * can stop here. */ - if (part_scheme->strategy == PARTITION_STRATEGY_RANGE && + if ((part_scheme->strategy == PARTITION_STRATEGY_RANGE || + part_scheme->strategy == PARTITION_STRATEGY_LIST) && clauselist == NIL) break Similarly, why would this be true for list partitioning? How can we prune partitions if values is for e.g. (100, <not given>, 20). -- - if (bms_is_member(keyno, opstep->nullkeys)) + if (bms_is_member(keyno, opstep->nullkeys) && + context->strategy != PARTITION_STRATEGY_LIST) continue; Will that prune for all NULL partitioning key values? --- + appendStringInfoString + (buf, get_list_partbound_value_string(lfirst(cell))); Formatting is not quite right. -- +/* + * get_min_and_max_offset + * + * Fetches the minimum and maximum offset of the matching partitions. + */ ... +/* + * get_min_or_max_off + * + * Fetches either minimum or maximum offset of the matching partitions + * depending on the value of is_min parameter. + */ I am not sure we really have to have separate functions but if needed then I would prefer to have a separate function for each min and max rather than combining. --- + if (part_scheme->strategy != PARTITION_STRATEGY_LIST) + { + *clause_is_not_null = (nulltest->nulltesttype == IS_NOT_NULL); + return PARTCLAUSE_MATCH_NULLNESS; + } + + expr = makeConst(UNKNOWNOID, -1, InvalidOid, -2, (Datum) 0, true, false); + partclause = (PartClauseInfo *) palloc(sizeof(PartClauseInfo)); + + partclause->keyno = partkeyidx; + partclause->expr = (Expr *) expr; + partclause->is_null = true; + + if (nulltest->nulltesttype == IS_NOT_NULL) + { + partclause->op_is_ne = true; + partclause->op_strategy = InvalidStrategy; + } + else + { + partclause->op_is_ne = false; + partclause->op_strategy = BTEqualStrategyNumber; + } - return PARTCLAUSE_MATCH_NULLNESS; + *pc = partclause; + return PARTCLAUSE_MATCH_CLAUSE; I still believe considering NULL value for match clause is not a fundamentally correct thing. And that is only for List partitioning which isn't aligned with the other partitioning. --- Regards, Amul
pgsql-hackers by date: