Re: Client/Server compression? - Mailing list pgsql-hackers

From Greg Copeland
Subject Re: Client/Server compression?
Date
Msg-id 1016218040.24597.15.camel@mouse.copelandconsulting.net
In response to Re: Client/Server compression?  ("Arguile" <arguile@lucentstudios.com>)
Responses Re: Client/Server compression?  (Jan Wieck <janwieck@yahoo.com>)
Re: Client/Server compression?  (Kyle <kaf@nwlink.com>)
List pgsql-hackers
On Thu, 2002-03-14 at 14:03, Arguile wrote:

[snip]

> I'm sceptical of the benefit such compressions would provide in this setting
> though. We're dealing with sets that would have to be compressed every time
> (no caching) which might be a bit expensive on a database server. Having it
> as a default off option for psql might be nice, but I wonder if it's worth
> the time, effort, and cpu cycles.
>

I dunno.  That's a good question.  For now, I'm making what tends to be
a safe assumption (oops...that word): that most database servers will be
I/O bound rather than CPU bound.  *IF* that assumption holds true, it
sounds like it may make even more sense to implement this.  I do know
that in the past I've seen 90+% compression ratios on many databases,
and 50%-90% compression ratios on result sets using tunneled compression
schemes (which were compressing things other than the datasets
themselves, which probably hurt the overall ratios).  Depending on the
workload and the available resources on the database system, it's
possible that latency could actually be reduced, depending on where you
measure it.  That is, do you measure latency as the first packet back to
the remote, or the last packet back to the remote?  If you use the last
packet, compression may actually win.
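
Just to put a rough number behind that, here's a quick standalone sketch
(assuming zlib; the sample row and level 6 are made up purely for
illustration) that compresses a fake result-set buffer and prints the
ratio:

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <zlib.h>

  int
  main(void)
  {
      /* Fake "result set": repetitive rows, much like real query output. */
      static char src[64 * 1024];
      const char *row = "1042|widget|in stock|19.95|warehouse-07\n";
      size_t      rowlen = strlen(row);
      size_t      used = 0;
      uLongf      destlen;
      Bytef      *dest;

      while (used + rowlen < sizeof(src))
      {
          memcpy(src + used, row, rowlen);
          used += rowlen;
      }

      /* Compress at level 6, zlib's usual speed/size trade-off. */
      destlen = compressBound(used);
      dest = malloc(destlen);
      if (dest == NULL ||
          compress2(dest, &destlen, (const Bytef *) src, used, 6) != Z_OK)
      {
          fprintf(stderr, "compression failed\n");
          return 1;
      }

      printf("%lu bytes in, %lu bytes out (%.1f%% smaller)\n",
             (unsigned long) used, (unsigned long) destlen,
             100.0 * (1.0 - (double) destlen / (double) used));
      free(dest);
      return 0;
  }

Build with -lz.  Highly repetitive rows like these compress extremely
well, of course; real result sets will vary.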

My current thoughts are to allow compression to be enabled/disabled,
with a variable compression level (1-9), within a database
configuration.  Worst case, it may be fun to implement, and I'm thinking
there may actually be some surprises in the end result if it's done
properly.
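
To make those knobs concrete, here's a minimal sketch of what I have in
mind (the setting names and the pq_ prefix are purely hypothetical and
don't correspond to any existing backend code); zlib's deflateInit()
takes the configured level directly:

  #include <string.h>
  #include <zlib.h>

  /* Hypothetical settings -- real names and config plumbing TBD. */
  int pq_compress_enabled = 1;    /* per-database on/off switch */
  int pq_compress_level   = 6;    /* 1 (fastest) .. 9 (smallest) */

  /*
   * Set up a deflate stream for an outgoing connection, honoring the
   * configured level.  Returns a zlib status code (Z_OK on success).
   */
  int
  pq_start_compression(z_stream *zs)
  {
      if (!pq_compress_enabled)
          return Z_OK;            /* caller simply skips the deflate path */

      memset(zs, 0, sizeof(*zs));
      return deflateInit(zs, pq_compress_level);
  }

zlib already treats level 0 as "store only" (Z_NO_COMPRESSION), so 0
could conceivably double as the disabled setting.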

Looking at the communication code, it appears that only an 8k buffer is
used.  I'm currently looking at bumping this up to 32k, as most OSes
tend to have a throughput sweet spot with buffer sizes between 32k and
64k.  Others, depending on the devices in use, like even bigger buffers.
Because this may be only a minor optimization, especially on a heavily
loaded server, we may want to consider making it a configurable
parameter.
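
Something along these lines is what I'm picturing for the buffer change
(again, the identifiers are illustrative only, not the actual names used
in the backend): allocate the send buffer at connection startup from a
configurable size rather than a fixed 8k array:

  #include <stdio.h>
  #include <stdlib.h>

  /* Hypothetical knob; the current code effectively hard-wires 8k. */
  int pq_buffer_size = 32 * 1024;     /* bytes; tune per platform/devices */

  static char *PqSendBuf;             /* replaces the fixed-size array */
  static int   PqSendBufSize;

  /*
   * Allocate the outbound buffer once at connection startup, using the
   * configured size instead of a compile-time constant.
   */
  void
  pq_init_send_buffer(void)
  {
      PqSendBufSize = pq_buffer_size;
      PqSendBuf = malloc(PqSendBufSize);
      if (PqSendBuf == NULL)
      {
          fprintf(stderr, "out of memory\n");
          exit(1);                    /* real code would use the normal error path */
      }
  }

Defaulting it to 32k would match the sweet spot above while still
letting people experiment with 64k or larger.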

Greg



