Hello Andrey,
On 2019-11-02 12:30, Andrey Borodin wrote:
>> On 1 Nov 2019, at 18:48, Alvaro Herrera <alvherre@2ndquadrant.com>
>> wrote:
> PFA two patches:
> v4-0001-Use-memcpy-in-pglz-decompression.patch (known as 'hacked' in
> test_pglz extension)
> v4-0001-Use-memcpy-in-pglz-decompression-for-long-matches.patch (known
> as 'hacked8')
Looking at the patches, it seems only the case of a match is changed.
But when we observe a literal byte, this is copied byte-by-byte with:
  else
  {
      /*
       * An unset control bit means LITERAL BYTE. So we just
       * copy one from INPUT to OUTPUT.
       */
      *dp++ = *sp++;
  }
Maybe we can optimize this, too. For instance, you could just increase a
counter:
  else
  {
      /*
       * An unset control bit means LITERAL BYTE. We count
       * these and copy them later.
       */
      literal_bytes++;
  }
and in the case of:
  if (ctrl & 1)
  {
      /* First copy all the literal bytes */
      if (literal_bytes > 0)
      {
          memcpy(dp, sp, literal_bytes);
          sp += literal_bytes;
          dp += literal_bytes;
          literal_bytes = 0;
      }
(Code untested!)
The same would need to be done at the very end, if the input ends
without any new CTRL-byte.
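To make the idea concrete, here is a minimal, self-contained sketch of the
batched-literal approach. It is only an illustration: it uses a simplified,
hypothetical format (one control byte, LSB first; a set bit is a two-byte
match of back-distance and length, an unset bit is one literal), and the
name toy_decompress is made up, not the real pglz code. One subtlety the
sketch has to handle: pending literals must also be flushed before reading
the next control byte, since that byte would otherwise sit inside the span
handed to memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy decompressor demonstrating deferred (batched) literal copies.
 * Literals are only counted while scanning; they are copied in one
 * memcpy when a match is hit, before each new control byte, and at
 * the very end of the input.
 */
static size_t
toy_decompress(const unsigned char *sp, size_t srclen, unsigned char *dest)
{
    const unsigned char *srcend = sp + srclen;
    unsigned char *dp = dest;
    size_t      literal_bytes = 0;  /* literals seen but not yet copied */

    while (sp < srcend)
    {
        unsigned char ctrl;
        int         bit;

        /*
         * Flush before reading the next control byte: the control byte
         * itself would otherwise land inside the copied span.
         */
        if (literal_bytes > 0)
        {
            memcpy(dp, sp - literal_bytes, literal_bytes);
            dp += literal_bytes;
            literal_bytes = 0;
        }
        ctrl = *sp++;

        for (bit = 0; bit < 8 && sp < srcend; bit++, ctrl >>= 1)
        {
            if (ctrl & 1)
            {
                size_t      dist,
                            len;

                /* First copy all pending literal bytes. */
                if (literal_bytes > 0)
                {
                    memcpy(dp, sp - literal_bytes, literal_bytes);
                    dp += literal_bytes;
                    literal_bytes = 0;
                }
                dist = *sp++;
                len = *sp++;
                /* Match copy stays byte-wise: it may overlap itself. */
                while (len-- > 0)
                {
                    *dp = *(dp - dist);
                    dp++;
                }
            }
            else
            {
                /* LITERAL BYTE: count it now, copy it later. */
                literal_bytes++;
                sp++;
            }
        }
    }
    /* Input ended without a new control byte: flush the tail. */
    if (literal_bytes > 0)
    {
        memcpy(dp, sp - literal_bytes, literal_bytes);
        dp += literal_bytes;
    }
    return (size_t) (dp - dest);
}
```

Note that this variant advances sp past each literal and flushes from
sp - literal_bytes, rather than leaving sp parked at the first pending
literal; either bookkeeping works, as long as the flush points above are
respected.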
Whether that gains us anything depends on how common literal bytes are.
It might be that highly compressible input has almost none, while input
that is a mix of incompressible strings and compressible ones might have
longer stretches. One example would be something like an SHA-256, that
is repeated twice. The first instance would be incompressible, the
second one would be just a copy. This might not happen that often in
practical inputs, though.
I wonder if you agree and what would happen if you try this variant on
your corpus tests.
Best regards,
Tels