Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Compression and on-disk sorting
Date
Msg-id 20060524201230.GI59464@pervasive.com
In response to Re: Compression and on-disk sorting  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Compression and on-disk sorting
List pgsql-hackers
Finally completed testing of a dataset that doesn't fit in memory with
compression enabled. Results are at
http://jim.nasby.net/misc/pgsqlcompression .

Summary:
                    work_mem    compressed  not compressed  gain
in-memory           20000       400.1       797.7           49.8%
in-memory           2000        371.4       805.7           53.9%
not in-memory       20000       8537        17436           51.0%
not in-memory       2000        8152        17820           54.3%
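
For anyone wanting to check the arithmetic: the gain column appears to be
1 - compressed / not compressed, expressed as a percentage. That formula is
inferred from the numbers in the table, not stated in the message, and the
function and variable names below are only illustrative. A minimal Python
sketch that recomputes the column from the figures above:

```python
# Timings copied from the summary table above. The units are whatever the
# original test runs reported; they are not stated in the message.
runs = [
    ("in-memory", 20000, 400.1, 797.7),
    ("in-memory", 2000, 371.4, 805.7),
    ("not in-memory", 20000, 8537.0, 17436.0),
    ("not in-memory", 2000, 8152.0, 17820.0),
]


def gain_percent(compressed, uncompressed):
    """Relative time saved by compression, as a percentage.

    Assumed formula (inferred from the table): 1 - compressed / uncompressed.
    """
    return (1.0 - compressed / uncompressed) * 100.0


for case, work_mem, compressed, uncompressed in runs:
    gain = gain_percent(compressed, uncompressed)
    print(f"{case:<14} {work_mem:>8} {gain:5.1f}%")
```

Rounded to one decimal place, this reproduces the four gain values listed in
the table.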

I find it very interesting that the gains are nearly identical even when
the tapes should fit in memory. My guess is that for some reason the OS is
flushing those to disk anyway. In fact, watching gstat during a run, I
do see write activity hitting the drives. So if there were some way to
tune that behavior, the in-memory case would probably be much, much
faster. Does anyone know FreeBSD well enough to suggest how to change
this? Does anyone want to test on Linux and see if the results are the
same? This could indicate that it might be advantageous to attempt an
in-memory sort with compressed data before spilling that compressed data
to disk...

As for CPU utilization, it was ~33% with compression and ~13% without.
That tells me that CPU could become a factor if everything were truly in
memory (including the table we were reading from), but if that's the
case there's a good chance that we wouldn't even be switching to an
on-disk sort. If everything isn't in memory then you're likely to be I/O
bound anyway...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

