Re: Multi-Column List Partitioning - Mailing list pgsql-hackers

From Amul Sul
Subject Re: Multi-Column List Partitioning
Date
Msg-id CAAJ_b96_1B+Qi9qqpQT46vuH1M=iGGvs=VQG=3OPcr0njxw_jQ@mail.gmail.com
Whole thread Raw
In response to Re: Multi-Column List Partitioning  (Nitin Jadhav <nitinjadhavpostgres@gmail.com>)
List pgsql-hackers
On Mon, Dec 6, 2021 at 7:27 PM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
>
> Thank you for reviewing the patch.
>
> > partbounds.c: In function ‘get_qual_for_list.isra.18’:
> > partbounds.c:4284:29: warning: ‘boundinfo’ may be used uninitialized
> > in this function [-Wmaybe-uninitialized]
> >         datumCopy(bound_info->datums[i][j],
> >                   ~~~~~~~~~~^~~~~~~~
> > partbounds.c:4335:21: note: ‘boundinfo’ was declared here
> >  PartitionBoundInfo boundinfo;
> >                     ^~~~~~~~~
> > partbounds.c: In function ‘partition_bounds_merge’:
> > partbounds.c:1305:12: warning: ‘inner_isnull’ may be used
> > uninitialized in this function [-Wmaybe-uninitialized]
> >   bool    *inner_isnull;
> >            ^~~~~~~~~~~~
> > partbounds.c:1304:12: warning: ‘outer_isnull’ may be used
> > uninitialized in this function [-Wmaybe-uninitialized]
> >   bool    *outer_isnull;
> >           ^~~~~~~~~~~~
>
> Fixed.
>
> > This function is unnecessarily complicated, I think you can avoid
> > inner for loops; simply replace for-loop-block with  "if
> > (equal(lfirst(cell), new_bound)) return true".
>
> Thank you for the suggestion. Fixed.
>
> > + char   **colname = (char **) palloc0(partnatts * sizeof(char *));
> > + Oid    *coltype = palloc0(partnatts * sizeof(Oid));
> > + int32    *coltypmod = palloc0(partnatts * sizeof(int));
> > + Oid    *partcollation = palloc0(partnatts * sizeof(Oid));
> > +
> > This allocation seems to be worthless, read ahead.
> >
> > I think there is no need for this separate loop inside
> > transformPartitionListBounds, you can do that same in the next loop as
> > well. And instead of  get_partition_col_* calling and storing, simply
> > use that directly as an argument to transformPartitionBoundValue().
>
> Yes. The loop can be avoided and content of the above loop can be
> included in the next loop but the next loop iterates over a list of
> multi column datums. For each iteration, we need the information of
> all the columns. The above data (colname, coltype, coltypmod and
> partcollation) remains same for each iteration of the loop, If we
> modify as suggested, then the function to fetch these information has
> to be called every-time. To avoid this situation I have made a
> separate loop outside which only runs as many number of columns and
> stores in a variable which can be reused later. Please let me correct
> if I am wrong.
>

Ok, colname can be fetched in advance but I don't think it worth it to
fetch coltype, coltypmod & partcollation;  and, store in the
explicitly allocated memory, instead, you can directly call
get_partition_col_* inline functions.

> > I think this should be inside the "else" block after "!IsA(rowexpr,
> > RowExpr)" error and you can avoid IsA() check too.
>
> This is required to handle the situation when one partition key is
> mentioned and multiple values are provided in the partition bound
> specification.
>
> > Looks difficult to understand at first glance, how about the following:
> >
> > if (b1->isnulls != b2->isnulls)
> >    return false;
> >
> > if (b1->isnulls)
> > {
> >    if (b1->isnulls[i][j] != b2->isnulls[i][j])
> >        return false;
> >    if (b1->isnulls[i][j])
> >        continue;
> > }
> >
> > See how range partitioning infinite values are handled. Also, place
> > this before the comment block that was added for the "!datumIsEqual()"
> > case.
>
> Fixed. I feel the 'continue' block is not required and hence removed it.
>
> > Nothing wrong with this but if we could have checked "dest->isnulls"
> > instead of "src->isnulls" would be much better.
>
> Here we are copying the data from 'src' to 'dest'. If there is no data
> in 'src', it is unnecessary to copy. Hence checking 'src'.
>

I am not sure how that makes a difference since you do allocate 'dest'
based on 'src'; anyway, I leave that choice to you.

> > Condition "key->strategy != PARTITION_STRATEGY_LIST" seems to be unnecessary.
>
> Fixed.
>
> > Can't be a single loop?
>
> Yes. Fixed.
>

Thanks, will have a look.

Regards,
Amul



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: pg_get_publication_tables() output duplicate relid
Next
From: Andrew Dunstan
Date:
Subject: Re: MSVC SSL test failure