Re: Parallel CREATE INDEX for GIN indexes - Mailing list pgsql-hackers

From Kirill Reshke
Subject Re: Parallel CREATE INDEX for GIN indexes
Date
Msg-id CALdSSPhNjVzZXQQFiBWZBX8xO4JOu4PDc5XvjH=19JqzEmBiFQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel CREATE INDEX for GIN indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Tue, 8 Oct 2024 at 17:06, Tomas Vondra <tomas@vondra.me> wrote:
>
> On 10/8/24 04:03, Michael Paquier wrote:
> >
> > _gin_parallel_build_main() is introduced in 0001.  Please make sure to
> > pass down a query ID.
>
> Thanks for the ping. Here's an updated patch doing that, and also fixing
> a couple whitespace issues. No other changes, but I plan to get back to
> this patch soon - before the next CF.
>
>
> regards
>
> --
> Tomas Vondra

Hi! I was looking through this series of patches because thread of
GIN&GIST amcheck patch references it.

I have spotted this in gininsert.c:
1)
>/*
>* Store shared tuplesort-private state, for which we reserved space.
>* Then, initialize opaque state using tuplesort routine.
>*/
>sharedsort = (Sharedsort *) shm_toc_allocate(pcxt->toc, estsort);
>tuplesort_initialize_shared(sharedsort, scantuplesortstates,>
pcxt->seg);
>/*
>* Store shared tuplesort-private state, for which we reserved space.
>* Then, initialize opaque state using tuplesort routine.
>*/

Is it necessary to duplicate the entire comment?

And, while we are here, isn't it " initialize the opaque state "?

2) typo :
* the TID array, and returning false if it's too large (more thant work_mem,

3) in _gin_build_tuple:

....
else if (typlen == -2)
    keylen = strlen(DatumGetPointer(key)) + 1;
else
   elog(ERROR, "invalid typlen");


Maybe `elog(ERROR, "invalid typLen: %d", typLen); ` as in `datumGetSize`?

4) in _gin_compare_tuples:

> if ((a->category == GIN_CAT_NORM_KEY) &&
        (b->category == GIN_CAT_NORM_KEY))

maybe just a->category == GIN_CAT_NORM_KEY? a->category is already
equal to b->category because of previous if statements.

5) In _gin_partition_sorted_data:

>char    fname[128];
>sprintf(fname, "worker-%d", i);

Other places use MAXPGPATH in similar cases.

Also, code `sprintf(fname, "worker-%d",...);` duplicates. This might
be error-prone. Should we have a macro/inline function for this?

I will take another look later, maybe reporting real problems, not nit-picks.

-- 
Best regards,
Kirill Reshke



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Missing INFO on client_min_messages
Next
From: David Rowley
Date:
Subject: Re: Use or not record count on examples