Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: Compression and on-disk sorting
Date	May 17, 2006 03:48:39
Msg-id	871wuts456.fsf@stark.xeocode.com
In response to	Re: Compression and on-disk sorting (Andrew Piskorski <atp@piskorski.com>)
Responses	Re: Compression and on-disk sorting Re: Compression and on-disk sorting
List	pgsql-hackers

Andrew Piskorski <atp@piskorski.com> writes:

> The main tricks seem to be:  One, EXTREMELY lightweight compression
> schemes - basically table lookups designed to be as cpu friendly as
> possible.  Two, keep the data compressed in RAM as well so that you can
> also cache more of the data, and indeed keep it compressed until
> as late in the CPU processing pipeline as possible.
> 
> A corollary of that is to forget compression schemes like gzip - it
> reduces data size nicely but is far too slow on the cpu to be
> particularly useful in improving overall throughput rates.

There are some very fast decompression algorithms, LZO for example:

http://www.oberhumer.com/opensource/lzo/

I think most of the mileage from "lookup tables" would be better implemented
at a higher level by giving tools to data modellers that let them achieve
denser data representations. Things like convenient enum data types, 1-bit
boolean data types, short integer data types, etc.
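
To make the "denser data representations" point concrete, here is a rough sketch in Python of what a small enum, a 1-bit boolean and a short integer buy you over a naive layout. The row shape, the status names and the helper functions are all made up for illustration; this says nothing about how PostgreSQL actually stores tuples.

```python
import struct

# Hypothetical row: (status, is_active, quantity). All names are illustrative.
STATUSES = ["pending", "shipped", "delivered", "cancelled"]  # enum-like lookup table
STATUS_CODE = {name: code for code, name in enumerate(STATUSES)}


def pack_wide(status, is_active, quantity):
    """Naive layout: 16-byte padded status text, 4-byte boolean, 8-byte integer."""
    return struct.pack("<16siq", status.encode("ascii"), int(is_active), quantity)


def pack_dense(status, is_active, quantity):
    """Dense layout: 2-bit enum code and 1-bit boolean share one byte,
    followed by a 2-byte signed integer (assumes quantity fits in 16 bits)."""
    flags = STATUS_CODE[status] | (int(is_active) << 2)
    return struct.pack("<Bh", flags, quantity)


def unpack_dense(buf):
    """Reverse pack_dense: look the status back up in the enum table."""
    flags, quantity = struct.unpack("<Bh", buf)
    return STATUSES[flags & 0b11], bool(flags & 0b100), quantity


row = ("shipped", True, 1200)
wide = pack_wide(*row)
dense = pack_dense(*row)
print(len(wide), len(dense), unpack_dense(dense))
```

The enum table here is effectively the "lookup table" from the quoted text, except that the data modeller declares it up front instead of a compression layer discovering it later.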

-- 
greg
