Re: [POC] A better way to expand hash indexes. - Mailing list pgsql-hackers

From Mithun Cy
Subject Re: [POC] A better way to expand hash indexes.
Date
Msg-id CAD__OujNpOBrPkoD=sHnPU83RTpoExs1P5p=Vmz4joV16ho=Ug@mail.gmail.com
Whole thread Raw
In response to [HACKERS] [POC] A better way to expand hash indexes.  (Mithun Cy <mithun.cy@enterprisedb.com>)
Responses Re: [POC] A better way to expand hash indexes.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Thanks Robert, I have tried to fix the comments given as below.

On Thu, Mar 30, 2017 at 9:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think that the macros in hash.h need some more work:
>
> - Pretty much any time you use the argument of a macro, you need to
> parenthesize it in the macro definition to avoid surprises if the
> macros is called using an expression.  That isn't done consistently
> here.

--I have tried to fix same in the latest patch.

> - The macros make extensive use of magic numbers like 1, 2, and 3.  I
> suggest something like:
>
> #define SPLITPOINT_PHASE_BITS 2
> #define SPLITPOINT_PHASES_PER_GROUP (1 << SPLITPOINT_PHASE_BITS)
>
> And then use SPLITPOINT_PHASE_BITS any place where you're currently
> saying 2.  The reference to 3 is really SPLITPOINT_PHASE_BITS + 1.

-- Taken modified same in the latest patch.

> - Many of these macros are only used in one place.  Maybe just move
> the computation to that place and get rid of the macro.  For example,
> _hash_spareindex() could be written like this:
>
> if (splitpoint_group < SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE)
>     return splitpoint_group;
>
> /* account for single-phase groups */
> splitpoint = SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE;
>
> /* account for completed groups */
> splitpoint += (splitpoint_group -
> SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE) << SPLITPOINT_PHASE_BITS;
>
> /* account for phases within current group */
> splitpoint += (bucket_num >> (SPLITPOINT_PHASE_BITS + 1)) &
> SPLITPOINT_PHASE_MASK;
>
> return splitpoint;
>
> That eliminates the only use of two complicated macros and is in my
> opinion more clear than what you've currently got.

-- Taken, also rewrote _hash_get_totalbuckets in similar lines.

With that, we will end up with only 2 macros which have some computing code
+/* defines max number of splitpoit phases a hash index can have */
+#define HASH_MAX_SPLITPOINT_GROUP 32
+#define HASH_MAX_SPLITPOINTS \
+ (((HASH_MAX_SPLITPOINT_GROUP - HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) * \
+  HASH_SPLITPOINT_PHASES_PER_GRP) + \
+ HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE)
+
+/* given a splitpoint phase get its group */
+#define HASH_SPLITPOINT_PHASE_TO_SPLITPOINT_GRP(sp_phase) \
+ (((sp_phase) < HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) ? \
+ (sp_phase) : \
+ ((((sp_phase) - HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE) >> \
+   HASH_SPLITPOINT_PHASE_BITS) + \
+  HASH_SPLITPOINT_GROUPS_WITH_ONE_PHASE))

> - Some of these macros lack clear comments explaining their purpose.

-- I have written some comments to explain the use of the macros.

> - Some of them don't include HASH anywhere in the name, which is
> essential for a header that may easily be included by non-hash index
> code.

-- Fixed, all MACROS are prefixed with HASH

> - The names don't all follow a consistent format.  Maybe that's too
> much to hope for at some level, but I think they could be more
> consistent than they are.

-- Fixed, apart from old HASH_MAX_SPLITPOINTS rest all have a prefix
HASH_SPLITPOINT.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Monitoring roles patch
Next
From: Tomas Vondra
Date:
Subject: strange parallel query behavior after OOM crashes