Re: pglz performance - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: pglz performance
Date
Msg-id 4df61f6c-5800-30f0-7f84-cef4847e8a07@2ndquadrant.com
In response to Re: pglz performance  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 04/08/2019 21:20, Andres Freund wrote:
> On 2019-08-04 02:41:24 +0200, Petr Jelinek wrote:
>> Same here.
>>
>> Just so that we don't idly talk, what do you think about the attached?
> 
> Cool!
> 
>> It:
>> - adds new GUC compression_algorithm with possible values of pglz (default)
>> and lz4 (if lz4 is compiled in), requires SIGHUP
> 
> As Tomas remarked, I think it shouldn't be SIGHUP but USERSET. And I
> think lz4 should be preferred, if available.  I could see us using a
> list style guc, so we could set it to lz4, pglz, and the first available
> one would be used.
> 

Sounds reasonable.
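
To make the list-style idea concrete, here is a rough sketch of how "first available entry wins" could be resolved. It is not code from the attached patch: the function name `pick_first_available` and the way the set of compiled-in algorithms is passed in are assumptions made purely for illustration, and a real GUC would go through the usual check/assign hooks instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: walk a comma-separated preference list such as
 * "lz4, pglz" and return the first entry that is also present in the
 * array of algorithms available in this build, or NULL if none match.
 */
static const char *
pick_first_available(const char *list,
                     const char *const *available, size_t navailable)
{
    const char *p = list;

    while (*p != '\0')
    {
        const char *start;
        size_t      len;
        size_t      i;

        /* Skip separators and surrounding whitespace. */
        while (*p == ',' || isspace((unsigned char) *p))
            p++;

        /* Scan one algorithm name. */
        start = p;
        while (*p != '\0' && *p != ',' && !isspace((unsigned char) *p))
            p++;
        len = (size_t) (p - start);
        if (len == 0)
            continue;           /* only reached at end of string */

        for (i = 0; i < navailable; i++)
        {
            if (strlen(available[i]) == len &&
                strncmp(available[i], start, len) == 0)
                return available[i];
        }
    }
    return NULL;
}
```

With a build that only has pglz, a setting of "lz4, pglz" would resolve to pglz; with lz4 compiled in, the same setting would resolve to lz4.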

>> - adds 1 byte header to the compressed data where we currently store the
>> algorithm kind, that leaves us with 254 more to add :) (that's an extra
>> overhead compared to the current state)
> 
> Hm. Why do we need an additional byte?  IIRC my patch added that only
> for the case we would run out of space for compression formats without
> extending any sizes?
> 

Yeah, your patch worked differently (I didn't actually use any code from 
it). The main reason I add the byte is that I am storing the 
algorithm in the compressed value itself, not in the varlena header. I was 
mainly trying to avoid having every caller care about storing and loading 
the compression algorithm. I also can't say I particularly like that 
hack in your patch.
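
Roughly, the idea of keeping the algorithm inside the compressed value looks like the sketch below. This is only an illustration, not the code from the attached patch: the enum, the function names, and the buffer-based interface are all made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical algorithm identifiers; one byte leaves room for 256 values. */
typedef enum
{
    ALGO_PGLZ = 0,
    ALGO_LZ4 = 1
} CompressionAlgo;

/*
 * Prefix already-compressed bytes with a one-byte algorithm tag.
 * Returns the total number of bytes written, or 0 if dst is too small.
 */
static size_t
wrap_compressed(uint8_t *dst, size_t dstcap, CompressionAlgo algo,
                const uint8_t *payload, size_t len)
{
    if (dstcap == 0 || len > dstcap - 1)
        return 0;
    dst[0] = (uint8_t) algo;
    memcpy(dst + 1, payload, len);
    return len + 1;
}

/*
 * Read the tag back and return a pointer to the compressed payload,
 * so callers never have to track the algorithm separately.
 * Returns NULL if the input is too short to hold the tag.
 */
static const uint8_t *
unwrap_compressed(const uint8_t *src, size_t srclen,
                  CompressionAlgo *algo, size_t *len)
{
    if (srclen < 1)
        return NULL;
    *algo = (CompressionAlgo) src[0];
    *len = srclen - 1;
    return src + 1;
}
```

The cost is the one extra byte per compressed value mentioned above; the benefit is that the decompression side can dispatch on the tag without any help from the caller.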

However, if we want separate GUCs for TOAST and WAL, then we'll 
have to do that anyway, so maybe it does not matter anymore (we can't use 
a similar hack there AFAICS, though).

> 
>> - changes the rawsize in TOAST header to 31 bits via bit packing
>> - uses the extra bit to differentiate between old and new format
> 
> Hm. Wouldn't it be easier to just use a different vartag for this?
> 

That would only work for external TOAST pointers, right? The compressed 
varlena can also be stored inline and potentially in an index tuple.
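
For reference, the bit packing mentioned above amounts to something like the following sketch. The macro and function names are invented for illustration and are not the ones in the patch or in postgres.h. It relies on the fact that a varlena is limited to 1 GB, so existing values never have the top bit of the 32-bit raw size set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define RAWSIZE_BITS     31
#define RAWSIZE_MASK     ((UINT32_C(1) << RAWSIZE_BITS) - 1)
#define NEW_FORMAT_FLAG  (UINT32_C(1) << RAWSIZE_BITS)

/* Pack the raw (uncompressed) size and the format flag into one word. */
static inline uint32_t
pack_rawsize(uint32_t rawsize, bool new_format)
{
    assert(rawsize <= RAWSIZE_MASK);
    return rawsize | (new_format ? NEW_FORMAT_FLAG : 0);
}

/* Extract the 31-bit raw size, ignoring the flag bit. */
static inline uint32_t
unpack_rawsize(uint32_t word)
{
    return word & RAWSIZE_MASK;
}

/* Old-format data always has the top bit clear, so this is false for it. */
static inline bool
is_new_format(uint32_t word)
{
    return (word & NEW_FORMAT_FLAG) != 0;
}
```

Since the flag lives in the size word of the compressed value itself, the same check works for inline and index-tuple storage, not just for external TOAST pointers.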

> 
>> - I expect my changes to configure.in are not the greatest as I have
>> pretty much zero experience with autoconf
> 
> FWIW the configure output changes are likely because you used a modified
> version of autoconf. Unfortunately debian/ubuntu ship one with vendor
> patches.
> 

Yeah, Ubuntu here, that explains it.

-- 
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/


