On Sun, Apr 14, 2019 at 09:45:10AM -0700, Andres Freund wrote:
>Hi,
>
>On 2019-04-14 18:36:18 +0200, Tomas Vondra wrote:
>> I think those comparisons are cute and we did a fair amount of them when
>> considering a drop-in replacement for pglz, but ultimately it might be a
>> bit pointless because:
>>
>> (a) it very much depends on the dataset (one algorithm may work great on
>> one type of data, suck on another)
>>
>> (b) different systems may require different trade-offs (high ingestion
>> rate vs. best compression ratio)
>>
>> (c) decompression speed may be much more important
>>
>> What I'm trying to say is that we shouldn't obsess about picking one
>> particular algorithm too much, because it's entirely pointless. Instead,
>> we should probably design the system to support different compression
>> algorithms, ideally at column level.
>
>I think we still need to pick a default algorithm, and realistically
>that's going to be used by like 95% of the users.
>

True. Do you expect it to be specific to the column store, or should it
be set as a per-instance default (even for the regular heap)?

FWIW I think the conclusion from past dev meetings was we're unlikely to
find anything better than lz4. I doubt that has changed very much.

regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services