Re: pglz compression performance, take two - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: pglz compression performance, take two
Date
Msg-id CAAhFRxhFvuO-yUu8VuT3k6jKeJUjzA0ff3rJS6qtPLHdq3Epzg@mail.gmail.com
Whole thread Raw
In response to Re: pglz compression performance, take two  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: pglz compression performance, take two
List pgsql-hackers
Hi Tomas,

On Sun, Nov 27, 2022 at 8:02 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> 1) For PGLZ_HISTORY_SIZE it uses literal 0x0fff, with the explanation:
>
>     /* to avoid compare in iteration */
>
> which I think means intent to use this value as a bit mask, but then the
> only place using PGLZ_HISTORY_SIZE does
>
>     if (hist_next == PGLZ_HISTORY_SIZE) ...
>
> i.e. a comparison. Maybe I misunderstand the comment, though.
>

As far as I recollect, it's a leftover from an attempt to optimize the
code into branchless version
I.e. instead of
if(hist_next>=PGLZ_HISTORY_SIZE)
 hist_next = 1;
use something like hist_next = hist_next & PGLZ_HISTORY_SIZE.
But the optimization did not show any measurable impact and was
improperly rolled back.

>
> 2) PGLZ_HistEntry was modified and replaces links (pointers) with
> indexes, but the comments still talk about "links", so maybe that needs
> to be changed.

The offsets still form a "linked list"... however I removed some
mentions of pointers, since they are not pointers anymore.

> Also, I wonder why next_id is int16 while hist_idx is
> uint16 (and also why id vs. idx)?

+1. I'd call them next and hash.

int16 next; /* instead of next_d */
uint16 hash; /* instead of hist_idx */

What do you think?
hist_idx comes from the function name... I'm not sure how far renaming
should go here.


>
> 3) minor formatting of comments
>
> 4) the comment in pglz_find_match about traversing the history seems too
> early - it's before handling invalid entries and cleanup, but it does
> not talk about that at all, and the actual while loop is after that.

Yes, this seems right for me.

PFA review fixes (step 1 is unchanged).
I did not include next_id->next and hist_idx -> hash rename.

Thank you!


Best regards, Andrey Borodin.

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Next
From: Andrey Borodin
Date:
Subject: Re: Use fadvise in wal replay