Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From Zeugswetter Andreas DCP SD
Subject Re: Compression and on-disk sorting
Msg-id E1539E0ED7043848906A8FF995BDA5790105450D@m0143.s-mxs.net
In response to Compression and on-disk sorting  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
> Unfortunatly, the interface provided by pg_lzcompress.c is probably
> insufficient for this purpose. You want to be able to compress tuples
> as they get inserted and start a new block once the output reaches a

I don't think anything that compresses single tuples without context is
going to be a win under realistic circumstances.

I would at least compress whole pages: allow a maximum ratio of 1:n,
keep the pg buffer cache uncompressed, and compress only on write
(the filesystem cache then holds compressed pages).
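A minimal sketch of that compress-on-write path, using Python's zlib as a
stand-in for LZO and an assumed expansion ratio n = 2 (BLCKSZ, N, and the
padding scheme here are illustrative, not PostgreSQL's actual layout):

```python
import zlib  # stand-in for a fast compressor like LZO

BLCKSZ = 8192   # on-disk (compressed) block size
N = 2           # assumed max ratio: buffer pages are N * BLCKSZ uncompressed

def compress_on_write(page: bytes) -> bytes:
    """Compress an uncompressed N*BLCKSZ buffer page into a single
    BLCKSZ on-disk block. The caller is assumed to have already
    verified (at insert time) that the page still fits compressed."""
    assert len(page) == N * BLCKSZ
    compressed = zlib.compress(page)
    assert len(compressed) <= BLCKSZ, "page no longer fits compressed"
    # Pad to a full block so on-disk blocks stay fixed-size.
    return compressed.ljust(BLCKSZ, b"\x00")
```

Reads would do the reverse: decompress the 8k block back into an n*8k
buffer-cache page.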

The tricky part is predicting whether a tuple still fits in an n*8k-uncompressed
/ 8k-compressed page, but since LZO is fast you might even do a trial
compression in corner cases.
(That logic probably also needs to be in the available-page-freespace
calculation.)
Choosing a good n is also tricky; 2 (or 3?) is probably good.

You probably also want to always keep the header part of the page
uncompressed.
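The fit test above could be sketched roughly as follows, again with zlib
standing in for LZO; the 24-byte header size and the function name are
assumptions for illustration, and the header is left out of the compressed
body as suggested:

```python
import zlib  # stand-in for a fast compressor like LZO

BLCKSZ = 8192
HEADER = 24  # hypothetical page-header size, kept uncompressed

def tuple_fits(page: bytes, tup: bytes) -> bool:
    """Trial-compress the page body plus the candidate tuple. Only the
    body is compressed; the header stays uncompressed, so the body must
    squeeze into whatever remains of one 8k block. A fast compressor
    makes running this in corner cases affordable."""
    body = page[HEADER:] + tup
    return HEADER + len(zlib.compress(body)) <= BLCKSZ
```

The same check would back the freespace calculation: a page reports room for
a tuple only if the trial compression says it still fits.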

Andreas

