Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Compression and on-disk sorting
Date
Msg-id 20060518161850.GI64371@pervasive.com
In response to Re: Compression and on-disk sorting  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Thu, May 18, 2006 at 11:34:51AM +0100, Simon Riggs wrote:
> On Tue, 2006-05-16 at 15:42 -0500, Jim C. Nasby wrote:
> > On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote:
> > > In any case, my curiosity is aroused, so I'm currently benchmarking
> > > pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do
> > > some benchmarks with pgsql_tmp compressed.
> >
> > Results: http://jim.nasby.net/bench.log
> > 
> > As expected, compressing $PGDATA/base was a loss. But compressing
> > pgsql_tmp and then doing some disk-based sorts did show an improvement,
> > from 366.1 seconds to 317.3 seconds, an improvement of 13.3%. This is on
> > a Windows XP laptop (Dell Latitude D600) with 512MB, so it's somewhat of
> > a worst-case scenario. On the other hand, XP's compression algorithm
> > appears to be pretty aggressive, as it cut the size of the on-disk sort
> > file from almost 700MB to 82MB. There are probably gains to be had from a
> > different compression algorithm.
> 
> Can you re-run these tests using "SELECT aid FROM accounts ..."?
> "SELECT 1 ..." is of course highly compressible.
> 
> I also note that the compressed file fits within memory and may not even
> have been written out at all. That's good, but this sounds like the very
> best case of what we can hope to achieve. We need to test a whole range
> of cases to see if it is generally applicable, or only in certain cases
> - and if so which ones.
> 
> Would you be able to write up some extensive testing of Martijn's patch?
> He's followed through on your idea and written it (on -patches now...)

Yes, I'm working on that. I'd rather test his stuff than XP's
compression anyway.
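
To illustrate Simon's point about "SELECT 1" versus "SELECT aid", here is a rough sketch in Python. It is not how PostgreSQL lays out its sort files: the 4-byte packing, the row count, and the use of zlib are all assumptions made only for this example, and compressed_ratio is a hypothetical helper. It only shows that a column of identical values compresses far better than a column of distinct ones.

```python
import struct
import zlib


def compressed_ratio(values):
    """Pack ints as 4-byte little-endian values; return compressed/raw size."""
    raw = b"".join(struct.pack("<i", v) for v in values)
    return len(zlib.compress(raw)) / len(raw)


# Assumed row count, chosen only to keep the example quick to run.
n = 100_000

# Every row holds the same value, as with "SELECT 1 ...".
constant_ratio = compressed_ratio([1] * n)

# Every row holds a different value, as with "SELECT aid ...".
distinct_ratio = compressed_ratio(range(1, n + 1))

print(f"constant column: {constant_ratio:.4f} of original size")
print(f"distinct column: {distinct_ratio:.4f} of original size")
```

The constant column should shrink to a tiny fraction of its raw size, while the distinct column shrinks much less, so a benchmark that sorts only a constant is likely to overstate the benefit.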
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

