Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From Zeugswetter Andreas DCP SD
Subject Re: Compression and on-disk sorting
Msg-id E1539E0ED7043848906A8FF995BDA5790105450D@m0143.s-mxs.net
In response to Compression and on-disk sorting  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
> Unfortunatly, the interface provided by pg_lzcompress.c is probably
> insufficient for this purpose. You want to be able to compress tuples
> as they get inserted and start a new block once the output reaches a

I don't think anything that compresses single tuples without context is
going to be a win under realistic circumstances.

I would at least compress whole pages: allow a maximum ratio of 1:n,
keep the pg buffer cache uncompressed, and compress only on write
(the filesystem cache then holds compressed pages).
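A minimal sketch of that compress-on-write path, using Python's zlib as a
stand-in for LZO and an assumed expansion ratio n = 2 (BLCKSZ, N, and the
padding scheme here are illustrative, not PostgreSQL's actual layout):

```python
import zlib  # stand-in for a fast compressor like LZO

BLCKSZ = 8192   # on-disk (compressed) block size
N = 2           # assumed max ratio: buffer pages are N * BLCKSZ uncompressed

def compress_on_write(page: bytes) -> bytes:
    """Compress an uncompressed N*BLCKSZ buffer page into a single
    BLCKSZ on-disk block. The caller is assumed to have already
    verified (at insert time) that the page still fits compressed."""
    assert len(page) == N * BLCKSZ
    compressed = zlib.compress(page)
    assert len(compressed) <= BLCKSZ, "page no longer fits compressed"
    # Pad to a full block so on-disk blocks stay fixed-size.
    return compressed.ljust(BLCKSZ, b"\x00")
```

Reads would do the reverse: decompress the 8k block back into an n*8k
buffer-cache page.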

The tricky part is predicting whether a tuple still fits in an n*8k-uncompressed
/ 8k-compressed page, but since LZO is fast you might even do a trial
compression in corner cases.
(That logic probably also needs to be in the available-page-freespace
calculation.)
Choosing a good n is also tricky; 2 (or 3?) is probably good.

You probably also want to always keep the header part of the page
uncompressed.
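The fit test above could be sketched roughly as follows, again with zlib
standing in for LZO; the 24-byte header size and the function name are
assumptions for illustration, and the header is left out of the compressed
body as suggested:

```python
import zlib  # stand-in for a fast compressor like LZO

BLCKSZ = 8192
HEADER = 24  # hypothetical page-header size, kept uncompressed

def tuple_fits(page: bytes, tup: bytes) -> bool:
    """Trial-compress the page body plus the candidate tuple. Only the
    body is compressed; the header stays uncompressed, so the body must
    squeeze into whatever remains of one 8k block. A fast compressor
    makes running this in corner cases affordable."""
    body = page[HEADER:] + tup
    return HEADER + len(zlib.compress(body)) <= BLCKSZ
```

The same check would back the freespace calculation: a page reports room for
a tuple only if the trial compression says it still fits.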

Andreas

