Re: Select count(*), the sequel - Mailing list pgsql-performance

From Thomas Kellerer
Subject Re: Select count(*), the sequel
Date
Msg-id iaa3hl$pkk$1@dough.gmane.org
In response to Re: Select count(*), the sequel  (Kenneth Marshall <ktm@rice.edu>)
List pgsql-performance
Kenneth Marshall, 27.10.2010 22:41:
> Different algorithms have been discussed before. A quick search turned
> up:
>
> quicklz - GPL or commercial
> fastlz - MIT works with BSD okay
> zippy - Google - no idea about the licensing
> lzf - BSD-type
> lzo - GPL or commercial
> zlib - current algorithm
>
> Of these lzf can compress at almost 3.7X of zlib and decompress at 1.7X
> and fastlz can compress at 3.1X of zlib and decompress at 1.9X. The same
> comparison put lzo at 3.0X for compression and 1.8X decompress. The block
> design of lzf/fastlz may be useful to support substring access to toasted
> data among other ideas that have been floated here in the past.
>
> Just keeping the hope alive for faster compression.

What about dictionary-based compression (like DB2 does)?

In a nutshell: it creates a list of "words" in a page. For each word, its occurrences in the db-block are stored and the
actual word is removed from the page/block itself. This covers all rows on a page and can give very impressive overall
compression.
This compression is applied not only on disk but in memory as well (the page is loaded into memory together with its dictionary).
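
To make the idea a bit more concrete, here is a rough sketch in Python. It only illustrates the general technique and is not how DB2 actually implements it: the function names, the splitting of rows into space-separated "words", and the min_count threshold are all assumptions made up for this example.

```python
from collections import Counter


def compress_page(rows, min_count=2):
    """Hypothetical page-level dictionary compression.

    Words that occur at least min_count times on the page are moved into
    a per-page dictionary and replaced in the rows by an integer reference.
    """
    counts = Counter(word for row in rows for word in row.split(" "))
    # The dictionary is shared by all rows on the page.
    dictionary = [word for word, n in counts.items() if n >= min_count]
    index = {word: i for i, word in enumerate(dictionary)}
    # An int token references the dictionary; a str token is a literal
    # word that was not worth putting into the dictionary.
    encoded = [[index.get(word, word) for word in row.split(" ")] for row in rows]
    return dictionary, encoded


def decompress_page(dictionary, encoded):
    """Rebuild the original rows from the dictionary and the token lists."""
    return [
        " ".join(dictionary[tok] if isinstance(tok, int) else tok for tok in row)
        for row in encoded
    ]


rows = ["Smith New York", "Jones New York", "Smith Boston"]
dictionary, encoded = compress_page(rows)
restored = decompress_page(dictionary, encoded)
```

Because the dictionary travels with the page, a row can be decoded in memory without touching any other page, which is what allows the page to stay compressed in the buffer cache.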

I believe Oracle 11 does something similar.

Regards
Thomas
