Re: lztext and compression ratios... - Mailing list pgsql-general

From Tom Lane
Subject Re: lztext and compression ratios...
Msg-id 3701.962820757@sss.pgh.pa.us
In response to lztext and compression ratios...  (Jeffery Collins <collins@onyx-technologies.com>)
List pgsql-general
Jeffery Collins <collins@onyx-technologies.com> writes:
> My experience with attempting to compress such a relatively small
> (around 1K) text string is that the compression ratio is not very
> good.  This is because the string is not long enough for the LZ
> compression algorithm to establish really good compression patterns,
> and because the de-compression table has to be built into each
> record.  What I have done in the past to get around these problems is
> to "teach" the compression algorithm the patterns ahead of time and
> store the de-compression patterns in an external table.  Using
> this technique, I have achieved *much* better compression ratios.

(Puts on compression-guru hat...)

There is much in what you say.  Perhaps we should consider keeping the
lztext type around (currently it's slated for doom in 7.1, since the
TOAST feature will make plain text do everything lztext does and more)
and having it be different from text in that a training sample is
supplied when the column is defined.  Not quite sure how that should
look or where to store the sample, but it could be a big win for tables
having a large number of moderate-sized text entries.
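For anyone wanting to experiment with the idea before anything lands in the backend, here is a minimal sketch of the preset-dictionary technique using Python's zlib bindings.  This is not lztext's compressor, and the sample dictionary below is a made-up stand-in for whatever training text a real column would supply; it just shows why a short record compresses much better once the compressor has been primed with representative patterns:

```python
import zlib

# Hypothetical training sample: text representative of the column's
# typical entries (an illustration only, not lztext's actual format).
DICTIONARY = b"UPDATE invoice customer order shipping address status "

def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    # A preset dictionary seeds the LZ77 window, so a short input can
    # emit back-references to patterns it had no room to establish itself.
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(data) + c.flush()

def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    # The identical dictionary must be available at decompression time --
    # hence the suggestion of storing it in an external (per-column) table.
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()

record = b"UPDATE invoice for customer order at new shipping address"
plain = zlib.compress(record, 9)
primed = compress_with_dict(record, DICTIONARY)

assert decompress_with_dict(primed, DICTIONARY) == record
print(len(record), len(plain), len(primed))
```

For a record this small, plain zlib barely breaks even, while the primed stream replaces most of the words with back-references into the dictionary.  The same trade-off the original poster mentions applies here: the dictionary lives outside every record, so it has to be stored once and kept stable for the column's lifetime.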

            regards, tom lane
