I have a couple of comments, I have only reviewed 0001 so far.
Thank you for reviewing!
You may need to run pgindent, it makes some changes.
Attached v4-patch has been updated after running pgindent.
+ * If table is shared, calculate the offset at which to find the the + * first partition of elements + */ + + nsegs = compute_buckets_and_segs(nelem, &nbuckets, hctl->num_partitions, hctl->ssize);
Blank line between the comment and the code.
Removed this.
/* * allocate some new elements and link them into the indicated free list */ -static bool -element_alloc(HTAB *hashp, int nelem, int freelist_idx) +static HASHELEMENT * +element_alloc(HTAB *hashp, int nelem)
Comment needs an update. This function no longer links elements into the free list.
Updated this and few other comments in the attached v4-patch.
+static int +compute_buckets_and_segs(long nelem, int *nbuckets, long num_partitions, long ssize) +{ ... + /* + * In a partitioned table, nbuckets must be at least equal to + * num_partitions; were it less, keys with apparently different partition + * numbers would map to the same bucket, breaking partition independence. + * (Normally nbuckets will be much bigger; this is just a safety check.) + */ + while ((*nbuckets) < num_partitions) + (*nbuckets) <<= 1;
I have some worries about this function, I am not sure what I said below has real life implications as you already said 'Normally nbuckets will be much bigger; this is just a safety check.'.
1- num_partitions is long and nbuckets is int, so could there be any case where num_partition is bigger than MAX_INT and cause an infinite loop? 2- Although we assume both nbuckets and num_partition initialized as the same type, (*nbuckets) <<= 1 will cause an infinite loop if num_partition is bigger than MAX_TYPE / 2.
So I think that the solution is to confirm that num_partition < MAX_NBUCKETS_TYPE / 2, what do you think?
Your concern is valid. This has been addressed in the existing code by calling next_pow2_int() on num_partitions before running the function. Additionally, I am not adding any new code to the compute_buckets_and_segs function. I am simply moving part of the init_tab() code into a separate function for reuse.
Please find attached the updated and rebased patches.