Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Compression and on-disk sorting
Msg-id 871wuts456.fsf@stark.xeocode.com
In response to Re: Compression and on-disk sorting  (Andrew Piskorski <atp@piskorski.com>)
List pgsql-hackers
Andrew Piskorski <atp@piskorski.com> writes:

> The main tricks seem to be:  One, EXTREMELY lightweight compression
> schemes - basically table lookups designed to be as cpu friendly as
> possible.  Two, keep the data compressed in RAM as well so that you can
> also cache more of the data, and indeed keep it compressed until
> as late in the CPU processing pipeline as possible.
> 
> A corollary of that is to forget compression schemes like gzip - it
> reduces data size nicely but is far too slow on the cpu to be
> particularly useful in improving overall throughput rates.

There are some very fast decompression algorithms:

http://www.oberhumer.com/opensource/lzo/


I think most of the mileage from "lookup tables" would be better obtained
at a higher level, by giving data modellers tools that let them achieve
denser data representations: things like convenient enum data types, 1-bit
boolean data types, short integer data types, etc.
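
To make the "denser representation" point concrete, here is a rough sketch in
Python rather than anything PostgreSQL actually does. The row layout, the
Status enum, and the column names (paid, gift, qty) are all made up for
illustration. It compares a loosely modelled row (status stored as padded
text, one byte per boolean, an 8-byte integer) against one that uses an enum
code, bit flags, and a short integer.

```python
import struct
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical enum column; the values are illustrative only."""
    PENDING = 0
    SHIPPED = 1
    CANCELLED = 2


# Loose layout: 16-byte padded text, two 1-byte booleans, 8-byte integer.
# "<" disables alignment padding, so the size is exactly 16 + 1 + 1 + 8.
NAIVE = struct.Struct("<16s??q")

# Dense layout: 1-byte enum code, 1 byte of bit flags, 2-byte short integer.
DENSE = struct.Struct("<BBh")


def pack_naive(status, paid, gift, qty):
    """Store the status by name, each boolean in its own byte."""
    return NAIVE.pack(status.name.encode("ascii"), paid, gift, qty)


def pack_dense(status, paid, gift, qty):
    """Store the status as its code and both booleans as bits of one byte."""
    flags = (int(paid) << 0) | (int(gift) << 1)
    return DENSE.pack(int(status), flags, qty)


def unpack_dense(buf):
    """Inverse of pack_dense; returns (status, paid, gift, qty)."""
    code, flags, qty = DENSE.unpack(buf)
    return Status(code), bool(flags & 1), bool(flags & 2), qty


if __name__ == "__main__":
    row = (Status.SHIPPED, True, False, 12)
    print(len(pack_naive(*row)), "bytes naive")
    print(len(pack_dense(*row)), "bytes dense")
    print(unpack_dense(pack_dense(*row)))
```

The dense row needs no decompression step at all: reading it back is a couple
of shifts and masks, which is about as cpu friendly as a lookup table gets.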

-- 
greg


