Home > mailing lists

Re: Speed up transaction completion faster after many relations are accessed in a transaction - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date	July 12, 2021 09:55:42
Msg-id	CAApHDvqQcX-G30=mRtzC5OOZOOStXhhCXq_uAdzEra8fZGKqgw@mail.gmail.com Whole thread Raw
In response to	Re: Speed up transaction completion faster after many relations are accessed in a transaction (Zhihong Yu <zyu@yugabyte.com>)
List	pgsql-hackers

Tree view

Thanks for having a look at this.

On Mon, 21 Jun 2021 at 05:02, Zhihong Yu <zyu@yugabyte.com> wrote:
> + *   GH_ELEMENT_TYPE defines the data type that the hashtable stores.  Each
> + *   instance of GH_ELEMENT_TYPE which is stored in the hash table is done so
> + *   inside a GH_SEGMENT.
>
> I think the second sentence can be written as (since done means stored, it is redundant):

I've rewords this entire paragraph slightly.

> Each instance of GH_ELEMENT_TYPE is stored in the hash table inside a GH_SEGMENT.
>
> + * Macros for translating a bucket's index into the segment and another to
> + * determine the item number within the segment.
> + */
> +#define GH_INDEX_SEGMENT(i)    (i) / GH_ITEMS_PER_SEGMENT
>
> into the segment -> into the segment number (in the code I see segidx but I wonder if segment index may cause slight
confusion).

I've adjusted this comment

> +   GH_BITMAP_WORD used_items[GH_BITMAP_WORDS]; /* A 1-bit for each used item
> +                                                * in the items array */
>
> 'A 1-bit' -> One bit (A and 1 mean the same)

I think you might have misread this. We're storing a 1-bit for each
used item rather than a 0-bit.  If I remove the 'A' then it's not
clear what the meaning of each bit's value is.

> +   uint32      first_free_segment;
>
> Since the segment may not be totally free, maybe name the field first_segment_with_free_slot

I don't really like that. It feels too long to me.

> + * This is similar to GH_NEXT_ONEBIT but flips the bits before operating on
> + * each GH_BITMAP_WORD.
>
> It seems the only difference from GH_NEXT_ONEBIT is in this line:
>
> +       GH_BITMAP_WORD word = ~(words[wordnum] & mask); /* flip bits */
>
> If a 4th parameter is added to signify whether the flipping should be done, these two functions can be unified.

I don't want to do that.  I'd rather have them separate to ensure the
compiler does not create any additional needless branching. Those
functions are pretty hot.

> +    * next insert will store in this segment.  If it's already pointing to an
> +    * earlier segment, then leave it be.
>
> The last sentence is confusing: first_free_segment cannot point to some segment and earlier segment at the same
time.
> Maybe drop the last sentence.

I've adjusted this comment to become:

* Check if we need to lower the next_free_segment.  We want new inserts
* to fill the lower segments first, so only change the first_free_segment
* if the removed entry was stored in an earlier segment.

Thanks for having a look at this.

I'll attach an updated patch soon.

David

pgsql-hackers by date:

From: Michael Paquier
Date: 12 July 2021, 09:42:15
Subject: Re: Introduce pg_receivewal gzip compression tests

From: David Rowley
Date: 12 July 2021, 10:23:57
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction

Re: Speed up transaction completion faster after many relations are accessed in a transaction - Mailing list pgsql-hackers

Previous

Next