Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows) - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)
Date
Msg-id 87wsd8j2pf.fsf@oxford.xeocode.com
In response to Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)  ("Robert Haas" <robertmhaas@gmail.com>)
List pgsql-hackers
> "Robert Haas" <robertmhaas@gmail.com> writes:
>
>> not compressing very small datums (< 256 bytes) also seems smart,
>> since that could end up producing a lot of extra compression attempts,
>> most of which will end up saving little or no space.

That was presumably the rationale for the original logic. However, experience
shows that there are certainly databases that store a lot of compressible
short strings.

Obviously, databases with blank-padded CHAR(n) columns desperately need us to
compress them. But even plain text data is often moderately compressible, even
with our fairly weak compression algorithm.
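
As a rough illustration of the point about short, padded strings, here is a minimal sketch. It assumes Python's zlib as a stand-in compressor, which is not the algorithm the backend uses, and the helper name compressed_size is made up for the example. It only shows that a 100-byte blank-padded value, well under a 256-byte cutoff, still shrinks considerably.

```python
import zlib


def compressed_size(datum: bytes) -> int:
    """Length of the datum after zlib compression.

    zlib is only a stand-in here; the backend's own compressor differs.
    """
    return len(zlib.compress(datum))


# A CHAR(100) value holding a short name is blank-padded to 100 bytes.
padded = b"Stark".ljust(100)

if __name__ == "__main__":
    print("raw bytes:", len(padded))
    print("compressed bytes:", compressed_size(padded))
```

The exact savings depend on the compressor, so the numbers printed here say nothing precise about what the backend would achieve on the same datum.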

One other thing that bothers me about our toast mechanism is that it only
kicks in for tuples that are "too large". It seems weird that the same column
is worth compressing or not depending on what other columns are in the same
tuple.

If you store a 2000-byte tuple that's all spaces, we don't try to compress it
at all. But if you added one more attribute, we would go to great lengths
compressing and storing attributes externally -- not necessarily the attribute
you just added, but the ones that were perfectly fine previously -- to try to
get the tuple under 2k.
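
The behaviour described above can be sketched as a toy model. This is a simplified assumption-laden illustration, not the backend's toaster code: the 2000-byte threshold is taken from the "under 2k" figure in this message, zlib stands in for the real compressor, the name store_tuple is hypothetical, and moving data out of line is left out entirely.

```python
import zlib

# Assumed from the "under 2k" figure in the message; not the real constant.
TOAST_THRESHOLD = 2000


def store_tuple(attrs):
    """Return (stored attributes, flags marking which were compressed).

    Toy model: nothing is compressed unless the whole tuple exceeds the
    threshold. Once it does, attributes are tried largest first until the
    tuple fits or every attribute has been tried.
    """
    stored = list(attrs)
    compressed = [False] * len(stored)
    untried = set(range(len(stored)))
    while untried and sum(map(len, stored)) > TOAST_THRESHOLD:
        i = max(untried, key=lambda k: len(stored[k]))
        untried.discard(i)
        packed = zlib.compress(stored[i])
        if len(packed) < len(stored[i]):
            stored[i] = packed
            compressed[i] = True
    return stored, compressed


if __name__ == "__main__":
    spaces = b" " * 2000
    # Exactly at the threshold: the all-spaces attribute is left alone.
    print(store_tuple([spaces])[1])
    # One extra byte in another attribute: the spaces now get compressed.
    print(store_tuple([spaces, b"x"])[1])
```

In this model the decision to compress the all-spaces attribute depends entirely on whether a second attribute pushes the tuple over the limit, which is the oddity being pointed out.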

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

