Re: Select count(*), the sequel - Mailing list pgsql-performance

From Kenneth Marshall
Subject Re: Select count(*), the sequel
Date
Msg-id 20101027204115.GB27429@aart.is.rice.edu
In response to Re: Select count(*), the sequel  ("Pierre C" <lists@peufeu.com>)
Responses Re: Select count(*), the sequel
Re: Select count(*), the sequel
List pgsql-performance
On Wed, Oct 27, 2010 at 09:52:49PM +0200, Pierre C wrote:
>> Even if somebody had a
>> great idea that would make things smaller without any other penalty,
>> which I'm not sure I believe either.
>
> I'd say that the only things likely to bring an improvement significant
> enough to warrant the (quite large) hassle of implementation would be :
>
> - read-only / archive tables (get rid of row header overhead)
> - in-page compression using per-column delta storage for instance (no
> random access penalty, but hard to implement, maybe easier for read-only
> tables)
> - dumb LZO-style compression (license problems, needs parallel
> decompressor, random access penalty, hard to implement too)
>

Different algorithms have been discussed before. A quick search turned
up:

quicklz - GPL or commercial
fastlz - MIT, compatible with the BSD license
zippy - Google - no idea about the licensing
lzf - BSD-type
lzo - GPL or commercial
zlib - current algorithm

Of these, lzf compresses almost 3.7X as fast as zlib and decompresses 1.7X
as fast, while fastlz compresses 3.1X as fast as zlib and decompresses 1.9X
as fast. The same comparison put lzo at 3.0X for compression and 1.8X for
decompression. The block design of lzf/fastlz may be useful for supporting
substring access to toasted data, among other ideas that have been floated
here in the past.
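
To make the substring-access idea concrete, here is a rough sketch of
block-wise compression. It is only an illustration, not how TOAST or
lzf/fastlz actually work: zlib from the Python standard library stands in
for the compressor, and BLOCK_SIZE, compress_blocks and read_substring are
names made up for this example. Each block is compressed on its own, so a
substring read only has to decompress the blocks it overlaps.

```python
import zlib

# Hypothetical block size, chosen only for illustration.
BLOCK_SIZE = 4096


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Compress each fixed-size block of `data` independently."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_substring(blocks: list, offset: int, length: int,
                   block_size: int = BLOCK_SIZE) -> bytes:
    """Return data[offset:offset + length], touching only overlapping blocks."""
    if offset < 0 or length <= 0:
        return b""
    first = offset // block_size                  # first block that overlaps
    last = (offset + length - 1) // block_size    # last block that overlaps
    # Decompress just the needed blocks, never the whole value.
    chunk = b"".join(zlib.decompress(b) for b in blocks[first:last + 1])
    start = offset - first * block_size           # offset within the chunk
    return chunk[start:start + length]
```

With 4096-byte blocks, read_substring(blocks, 5000, 300) decompresses only
the second block instead of the whole value. The trade-off is a somewhat
worse compression ratio, since each block starts without any history from
the previous ones.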

Just keeping the hope alive for faster compression.

Cheers,
Ken
