Update on sort-compression stuff - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Update on sort-compression stuff
Msg-id 20060522192630.GI24404@svana.org
List pgsql-hackers
I'm going to be offline for a few days but there are some things I've
tested in the meantime.

Once the compression ratio drops below 4-to-1, the overhead of zlib
becomes overwhelming compared to the savings. I'm not sure how common
that is; I basically had to fill a table with random data to get it
that low.
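
For illustration, here is a minimal Python sketch (not the code from the
patch) of how the compression ratio on 32KB blocks can be measured with
zlib. The helper name compression_ratio, the compression level and the
two sample inputs are assumptions made up for this example.

```python
import os
import zlib

BLOCK_SIZE = 32 * 1024  # 32KB blocks, as in the tests described in this mail


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return uncompressed size / compressed size, compressing block by block."""
    compressed = 0
    for off in range(0, len(data), BLOCK_SIZE):
        compressed += len(zlib.compress(data[off:off + BLOCK_SIZE], level))
    return len(data) / compressed


# Assumed sample inputs: random bytes barely compress, repeated text compresses well.
random_data = os.urandom(8 * BLOCK_SIZE)
repetitive = b"some repeated tuple payload " * 10000

print("random:     %.2f" % compression_ratio(random_data))
print("repetitive: %.2f" % compression_ratio(repetitive))
```

Running something like this over a dump of real sort input would show
which side of the 4-to-1 mark the data falls on.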

I wrote a basic implementation using pg_lzcompress. It appears
that pg_lzcompress is very, very slow. I was afraid that I'd made an
infinite loop, but it was just really slow. Mind you, the overhead of
each call might have been the problem: it was being called on every
32KB block.
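
One way to check whether per-call overhead matters is to time one
compression call per 32KB block against a single compressor fed the
same blocks. The sketch below does that with Python's zlib module as a
stand-in; it says nothing about pg_lzcompress itself, and the function
names and sample data are assumptions.

```python
import time
import zlib

BLOCK_SIZE = 32 * 1024


def compress_per_block(data):
    """One independent zlib.compress() call per 32KB block."""
    return [zlib.compress(data[i:i + BLOCK_SIZE])
            for i in range(0, len(data), BLOCK_SIZE)]


def compress_streaming(data):
    """Feed the same 32KB blocks through a single compressor object."""
    comp = zlib.compressobj()
    parts = [comp.compress(data[i:i + BLOCK_SIZE])
             for i in range(0, len(data), BLOCK_SIZE)]
    parts.append(comp.flush())
    return b"".join(parts)


def timed(fn, data):
    """Return (result, elapsed seconds) for one call of fn(data)."""
    start = time.perf_counter()
    result = fn(data)
    return result, time.perf_counter() - start


sample = b"0123456789abcdef" * (64 * 1024)  # assumed 1MB of compressible data
blocks, t_block = timed(compress_per_block, sample)
stream, t_stream = timed(compress_streaming, sample)
print("per-block: %.4fs, streaming: %.4fs" % (t_block, t_stream))
```

If the two timings are close, the per-call setup cost is probably not
the dominant factor for that compressor.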

My suggestions at this point are:

- Test a way of storing tuples with less overhead than a HeapTuple
header. If that could be done for in-memory sorts, you could
fit more tuples in memory before spilling to disk. Given that the
"compression" in that case is extremely cheap, it'd be much more likely
to be beneficial.

- Consider replacing pg_lzcompress with zlib if available. Or at least
test pg_lzcompress in a more realistic environment, because it seems
quite slow.
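
To make the first suggestion concrete, here is a minimal Python sketch
of a compact tuple layout that uses only a length prefix per tuple. The
values of WORK_MEM and FULL_HEADER are assumed figures for illustration,
not measured PostgreSQL numbers, and all function names are
hypothetical.

```python
import struct

WORK_MEM = 1024 * 1024              # assumed 1MB sort memory budget
FULL_HEADER = 24                    # assumed per-tuple header size, illustrative only
LEN_PREFIX = struct.calcsize("<I")  # 4-byte length prefix for the compact form


def tuples_that_fit(payload_size, overhead, budget=WORK_MEM):
    """How many tuples of payload_size bytes fit, given per-tuple overhead."""
    return budget // (payload_size + overhead)


def pack_compact(payloads):
    """Concatenate payloads, each preceded by a 4-byte little-endian length."""
    return b"".join(struct.pack("<I", len(p)) + p for p in payloads)


def unpack_compact(buf):
    """Inverse of pack_compact: split the buffer back into payloads."""
    out, off = [], 0
    while off < len(buf):
        (n,) = struct.unpack_from("<I", buf, off)
        off += LEN_PREFIX
        out.append(buf[off:off + n])
        off += n
    return out


print("full header:   ", tuples_that_fit(16, FULL_HEADER))
print("length prefix: ", tuples_that_fit(16, LEN_PREFIX))
```

With small payloads the header dominates, so under these assumed sizes
the compact form fits roughly twice as many 16-byte tuples in the same
memory.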

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
