Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Compression and on-disk sorting
Date
Msg-id 20060518201746.GQ64371@pervasive.com
In response to Re: Compression and on-disk sorting  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Compression and on-disk sorting
List pgsql-hackers
On Thu, May 18, 2006 at 08:32:10PM +0200, Martijn van Oosterhout wrote:
> On Thu, May 18, 2006 at 11:22:46AM -0500, Jim C. Nasby wrote:
> > AFAIK logtape currently reads in much less than 256k blocks. Of course
> > if you get lucky you'll read from one tape for some time before
> > switching to another, which should have a sort-of similar effect if the
> > drives aren't very busy with other things.
> 
> Logtape works with BLCKSZ blocks. Whether it's advisable or not, I
> don't know. One thing I'm noticing with this compression-in-logtape is
> that the memory cost per tape increases considerably. Currently we rely
> heavily on the OS to cache for us.
> 
> Let's say we buffer 256KB per tape, and we assume a large sort with many
> tapes, you're going to blow all your work_mem on buffers. If using
> compression uses more memory so that you can't have as many tapes and
> thus need multiple passes, well, we need to test if this is still a
> win.

So you're sticking with 8K blocks on disk, after compression? And then
decompressing blocks as they come in?
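
To make the question concrete, here is a minimal sketch of one way that scheme could look: each tape is a single zlib stream whose compressed output is cut into 8K on-disk blocks, and the reader feeds those blocks to a streaming decompressor one at a time. This is not logtape.c code. The names compress_to_blocks and read_tape are made up for illustration, and BLCKSZ is assumed to be the default 8192.

```python
import zlib

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def compress_to_blocks(data):
    """Compress one tape's data as a single zlib stream and cut the
    compressed bytes into on-disk blocks of at most BLCKSZ bytes."""
    stream = zlib.compress(data)
    return [stream[i:i + BLCKSZ] for i in range(0, len(stream), BLCKSZ)]


def read_tape(blocks):
    """Feed compressed blocks to a streaming decompressor one at a time,
    yielding whatever plain data becomes available after each block."""
    inflater = zlib.decompressobj()
    for block in blocks:
        out = inflater.decompress(block)
        if out:
            yield out
    tail = inflater.flush()
    if tail:
        yield tail


if __name__ == "__main__":
    data = b"".join(str(i).encode() for i in range(200000))
    blocks = compress_to_blocks(data)
    restored = b"".join(read_tape(blocks))
    print(len(data), "bytes in,", len(blocks), "blocks on disk,",
          "round trip ok" if restored == data else "MISMATCH")
```

The point of the sketch is that the reader only ever holds one compressed block plus the decompressor's own state, so the on-disk unit can stay at BLCKSZ. The cost moves into that per-stream state, which every open tape has to carry.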

Actually, I guess the amount of memory used for zlib's lookback buffer
(or whatever they call it) could be pretty substantial, and I'm not sure
if there would be a way to combine that across all tapes.
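
For a rough sense of scale, the figures below come from the memory notes in zlib's own documentation (zconf.h): inflate needs about 1 << windowBits bytes plus roughly 7 KB of small objects, and deflate needs (1 << (windowBits+2)) + (1 << (memLevel+9)) bytes. Since each stream's window holds that stream's own history, the sketch assumes it cannot be shared between tapes. The function names, the tape counts, and the model of a merge pass as "N input tapes plus one output tape" are assumptions made for illustration only.

```python
KB = 1024


def inflate_bytes(window_bits=15):
    """Approximate per-stream inflate memory per zlib's documentation:
    the sliding window plus about 7 KB of small objects."""
    return (1 << window_bits) + 7 * KB


def deflate_bytes(window_bits=15, mem_level=8):
    """Approximate per-stream deflate memory per zlib's documentation
    (excludes the 'few kilobytes for small objects' it also mentions)."""
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))


def merge_memory(input_tapes, buffer_bytes):
    """Assumed model of one merge pass: every input tape has a read
    buffer and an inflate stream; one output tape has a write buffer
    and a deflate stream."""
    per_input = buffer_bytes + inflate_bytes()
    output = buffer_bytes + deflate_bytes()
    return input_tapes * per_input + output


if __name__ == "__main__":
    for tapes in (6, 50, 500):
        for buf in (8 * KB, 256 * KB):
            total = merge_memory(tapes, buf)
            print(f"{tapes:4d} tapes, {buf // KB:3d} KB buffers: "
                  f"{total / (1024 * KB):.1f} MB")
```

With the default windowBits of 15, the inflate state alone is several times the size of an 8K block, so the per-tape cost is dominated by the decompressor rather than the block buffer unless the buffer is made much larger.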
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

