Re: pglz compression performance, take two - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: pglz compression performance, take two
Date
Msg-id 120DD821-48F4-446C-865A-093E560A2143@yandex-team.ru
Whole thread Raw
In response to Re: pglz compression performance, take two  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: pglz compression performance, take two
List pgsql-hackers
Thanks for the review Mark! Sorry it took too long to reply on my side.

> 28 июня 2021 г., в 21:05, Mark Dilger <mark.dilger@enterprisedb.com> написал(а):
>
>> #define PGLZ_HISTORY_SIZE       0x0fff - 1  /* to avoid compare in iteration */
> ...
>> static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE + 1];
> ...
>>    if (hist_next == PGLZ_HISTORY_SIZE + 1)
>
> These are the only uses of PGLZ_HISTORY_SIZE.  Perhaps you could just defined the symbol as 0x0fff and skip the -1
and+1 business? 
Fixed.

>> /* ----------
>> * pglz_compare -
>> *
>> *      Compares 4 bytes at pointers
>> * ----------
>> */
>> static inline bool
>> pglz_compare32(const void *ptr1, const void *ptr2)
>> {
>>    return memcmp(ptr1, ptr2, 4) == 0;
>> }
>
> The comment function name differs from the actual function name.
>
> Also, pglz_compare returns an offset into the string, whereas pglz_compare32 returns a boolean.  This is fairly
unintuitive. The "32" part of pglz_compare32 implies doing the same thing as pglz_compare but where the string is known
tobe 4 bytes in length.  Given that pglz_compare32 is dissimilar to pglz_compare, perhaps avoid using /pglz_compare/ in
itsname? 
I've removed pglz_compare32 entirely. It was a simple substitution for memcmp().

>
>>        /*
>>         * Determine length of match. A better match must be larger than the
>>         * best so far. And if we already have a match of 16 or more bytes,
>>         * it's worth the call overhead to use memcmp()
>
> This comment is hard to understand, given the code that follows.  The first block calls memcmp(), which seems to be
thefunction overhead you refer to.  The second block calls the static inline function pglz_compare32, which internally
callsmemcmp().  Superficially, there seems to be a memcmp() function call either way.  The difference is that in the
firstblock's call to memcmp(), the length is a runtime value, and in the second block, it is a compile-time known
value. If you are depending on the compiler to notice this distinction and optimize the second call, perhaps you can
mentionthat explicitly?  Otherwise, reading and understanding the comment takes more effort. 
I've updated comment for second branch with fixed-size memcmp(). Frankly, I'm not sure "if (memcmp(input_pos, hist_pos,
4)== 0)" worth the complexity,  internals of "pglz_compare(0, len_bound, input_pos + 0, hist_pos + 0);" would do almost
sameinstructions. 

>
> I took a quick look for other places in the code that try to beat the performance of memcmp on short strings.  In
varlena.c,rest_of_char_same() seems to do so.  We also use comparisons on NameData, which frequently contains strings
shorterthan 16 bytes.  Is it worth sharting a static inline function that uses your optimization in other places?  How
confidentare you that your optimization really helps? 
Honestly, I do not know. The overall patch effect consists of stacking up many small optimizations. They have a net
effect,but are too noisy to measure independently. That's mostly the reason why I didn't know what to reply for so
long.


> 5 нояб. 2021 г., в 01:47, Tomas Vondra <tomas.vondra@enterprisedb.com> написал(а):
>
> Andrey, can you update the patch per Mark's review? I'll do my best to get it committed sometime in this CF.

Cool! Here's the patch.

Best regards, Andrey Borodin.


Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: parallel vacuum comments
Next
From: Bharath Rupireddy
Date:
Subject: consistently use "ProcSignal" instead of "procsignal" in code comments