On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote:
> On Thu, May 18, 2006 at 10:02:44PM -0500, Jim C. Nasby wrote:
> > http://jim.nasby.net/misc/compress_sort.txt is preliminary results.
> > I've run into a slight problem in that even at a compression level of
> > -3, zlib is cutting the on-disk size of sorts by 25x. So my pgbench sort
> > test with scale=150 that was producing a 2G on-disk sort is now
> > producing an 80M sort, which obviously fits in memory. And cuts sort
> > times by more than half.
>
> I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost
> unbelievable. What's in the table? It would seem to imply that our
> tuple format is far more compressible than we expected.

It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a;
If the tape routines were actually storing visibility information, I'd
expect that to be pretty compressible in this case since all the tuples
were presumably created in a single transaction by pgbench.
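
To illustrate why I'd expect that (this is only a sketch, not the real
tape format): the Python below builds fake fixed-width records whose
leading fields are identical in every row, the way xmin/xmax would be if
everything came from one transaction, and runs them through zlib at its
default level. The record layout, the xid value and the function names
are all made up for the example.

```python
import struct
import zlib


def make_records(n, xid=1234, per_branch=100_000):
    """Build n fake fixed-width records, already ordered by bid.

    Hypothetical layout, NOT PostgreSQL's real tuple or tape format:
    xmin, xmax, aid, bid, abalance as 32-bit ints plus an 84-byte blank
    filler, loosely modelled on pgbench's accounts rows.
    """
    filler = b" " * 84
    out = bytearray()
    for aid in range(1, n + 1):
        bid = (aid - 1) // per_branch + 1
        # xmin is the same for every row (single loading transaction),
        # xmax is 0 (never updated or deleted).
        out += struct.pack("<IIiii", xid, 0, aid, bid, 0) + filler
    return bytes(out)


def compression_ratio(data, level=-1):
    """Return raw size divided by zlib-compressed size (-1 = zlib default)."""
    return len(data) / len(zlib.compress(data, level))


raw = make_records(50_000)
ratio = compression_ratio(raw)
print(f"{len(raw)} bytes raw, compressed about {ratio:.1f}x")
```

On data like this each record differs from the previous one in only a
byte or two, so deflate can encode most of every row as a back-reference.
That is how a ratio in the range of 250,000 blocks to 9,500 (about 26x)
becomes plausible.
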
If needs be, I could try the patch against http://stats.distributed.net,
assuming that it would apply to REL_8_1.

> Do you have any stats on CPU usage? Memory usage?

I've only been taking a look at vmstat from time to time, and I have yet
to see the machine get CPU-bound. Haven't really paid much attention to
memory. Is there anything in particular you're looking for? I can log
vmstat for the next set of runs (with a scaling factor of 10000). I plan
on doing those runs tonight...
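
For the logging, something along these lines would probably do. It is a
sketch that assumes a Linux-style vmstat accepting
`vmstat <interval> [<count>]` and printing a column-name row with an
`id` (CPU idle) column; the function names and the log file name are
made up.

```python
import subprocess
import time


def parse_sample(columns, line):
    """Map one vmstat data line onto its column names (values as ints)."""
    return dict(zip(columns, (int(v) for v in line.split())))


def log_vmstat(path, interval=5, count=None):
    """Append timestamped vmstat output to path; return the lowest idle % seen.

    Assumption: Linux-style output, where the column-name row contains
    the token "id" and every data row is purely numeric.
    """
    cmd = ["vmstat", str(interval)] + ([str(count)] if count else [])
    columns = None
    min_idle = None
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc, \
            open(path, "a") as log:
        for line in proc.stdout:
            # Each line read from vmstat already ends with a newline.
            log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {line}")
            log.flush()
            fields = line.split()
            if "id" in fields:
                # Column-name row; vmstat repeats it periodically.
                columns = fields
            elif columns and fields and fields[0].isdigit():
                idle = parse_sample(columns, line)["id"]
                min_idle = idle if min_idle is None else min(min_idle, idle)
    return min_idle
```

Running `log_vmstat("vmstat-scale10000.log", interval=5)` alongside the
sort would leave a timestamped log, and a low return value would suggest
the box did go CPU-bound at some point. Note that the first sample vmstat
prints is an average since boot, so it is best ignored.
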
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461