Re: Client/Server compression? - Mailing list pgsql-hackers

From Greg Copeland
Subject Re: Client/Server compression?
Msg-id 1016251761.24597.66.camel@mouse.copelandconsulting.net
In response to Re: Client/Server compression?  (Kyle <kaf@nwlink.com>)
Responses Re: Client/Server compression?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, 2002-03-15 at 19:44, Kyle wrote:
[snip]

> Wouldn't Tom's suggestion of riding on top of ssh give similar
> results?  Anyway, it'd probably be a good proof of concept of whether
> or not it's worth the effort.  And that brings up the question: how
> would you measure the benefit?  I'd assume you'd get a good cut in
> network traffic, but you'll take a hit in cpu time.  What's an
> acceptable tradeoff?

Good question.  I've been trying to think of meaningful testing methods.
Even so, I can think of reasons all day long why it's not an issue of a
"tradeoff".  Simply put, if you have a low bandwidth connection, then as
long as there are extra cycles available on the server, who really
cares...except for the guy at the end of the slow connection.

As for SSH, well, that should be rather obvious.  It often is simply not
available.  While SSH is nice, I can think of many situations where
built-in compression is a win/win.  At least in business settings...which
is where I'm assuming the goal is to get Postgres.  Also, along those
lines, if SSH is the answer,
then surely the SSL support should be removed too...as SSH provides for
encryption too.  Simply put, removing SSL support makes about as much
sense as asserting that SSH is the final compression solution.

Also, it keeps being stated that a tangible tradeoff between CPU and
bandwidth must be realized.  This is, of course, a false assumption.
Simply put, if you need bandwidth, you need bandwidth.  The need is not
a function of CPU; rather, it comes from a lack of bandwidth.  Having said that,
I of course would still like to have something meaningful which reveals
the impact on CPU and bandwidth.
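
One way to get such numbers is sketched below.  This is only a rough illustration, not a benchmark of the backend: it assumes zlib as the compressor, an idealized link with no latency or protocol overhead, and a made-up sample payload.  The helper names (measure, transfer_seconds) and the 56 kbit/s figure are hypothetical.

```python
import time
import zlib


def measure(payload: bytes, level: int = 6) -> dict:
    """Return sizes, compression ratio, and CPU seconds spent compressing."""
    start = time.process_time()
    compressed = zlib.compress(payload, level)
    cpu_seconds = time.process_time() - start
    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        # Ratio below 1.0 means the compressed form is smaller.
        "ratio": len(compressed) / len(payload) if payload else 1.0,
        "cpu_seconds": cpu_seconds,
    }


def transfer_seconds(num_bytes: int, bits_per_second: float) -> float:
    """Idealized wire time: ignores latency and protocol overhead."""
    return num_bytes * 8 / bits_per_second


# Hypothetical sample: repetitive text rows, loosely like a query result.
sample = b"42|some user|pgsql-hackers|2002-03-15\n" * 5000
stats = measure(sample)

link = 56_000.0  # assumed slow link, in bits per second
wire_saved = (transfer_seconds(stats["raw_bytes"], link)
              - transfer_seconds(stats["compressed_bytes"], link))

print(stats)
print("wire seconds saved:", wire_saved)
print("cpu seconds spent: ", stats["cpu_seconds"])
```

Comparing the wire seconds saved against the CPU seconds spent, for real result sets and real link speeds, would show where compression pays off and where it does not.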

I'm talking about something that would be optional.  So, what's the cost
of having a little extra optional code in place?  The only issue, best I
can tell, is whether it can be implemented in a backward compatible manner.
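
A minimal sketch of what "optional and backward compatible" could look like follows.  It is not the PostgreSQL wire protocol; the flag values, the function names, and the idea of a `negotiated` switch agreed at connection startup are all assumptions made for illustration.  The point is only that a peer which never negotiates compression sees exactly the bytes it sees today.

```python
import zlib

# Hypothetical one-byte markers, used only after both sides opt in.
FLAG_RAW = 0
FLAG_ZLIB = 1


def encode_message(payload: bytes, negotiated: bool, min_size: int = 64) -> bytes:
    """Frame a payload; legacy peers get the payload unchanged."""
    if not negotiated:
        return payload
    if len(payload) >= min_size:
        compressed = zlib.compress(payload)
        # Only send the compressed form when it is actually smaller.
        if len(compressed) < len(payload):
            return bytes([FLAG_ZLIB]) + compressed
    return bytes([FLAG_RAW]) + payload


def decode_message(message: bytes, negotiated: bool) -> bytes:
    """Reverse encode_message for the same negotiated setting."""
    if not negotiated:
        return message
    if not message:
        raise ValueError("missing compression flag byte")
    flag, body = message[0], message[1:]
    if flag == FLAG_ZLIB:
        return zlib.decompress(body)
    if flag == FLAG_RAW:
        return body
    raise ValueError(f"unknown compression flag: {flag}")


row = b"42|some user|pgsql-hackers|2002-03-15\n" * 100
wire = encode_message(row, negotiated=True)
print(len(row), len(wire))
```

Small or incompressible payloads fall back to the raw form, so the extra cost for them is one flag byte.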

>
> That's one reason I was thinking about the toast stuff.  If the
> backend could serve toast, you'd get an improvement in server to
> client network traffic without the server spending cpu time on
> compression since the data has previously been compressed.
>
> Let me know if this is feasible (or slap me if this is how things
> already are): when the backend detoasts data, keep both copies in
> memory.  When it comes time to put data on the wire, instead of
> putting the whole enchilada down give the client the compressed toast
> instead.  And yeah, I guess this would require a protocol change to
> flag the compressed data.  But it seems like a way to leverage work
> already done.
>

I agree with that, however, I'm guessing that implementation would
require a significantly larger effort than what I'm suggesting...then
again, probably because I'm not aware of all the code yet.  Pretty much,
the basic implementation could be in place by the end of this weekend
with only a couple hours' worth of work...and even then, mostly because I
still don't know lots of the code.  The changes you are talking about are
going to require not only protocol changes but changes at several layers
within the engine.

Of course, something else to keep in mind is that using the TOAST
solution requires that TOAST already be in use.  What I'm suggesting
benefits (size wise) all types of data being sent back to a client.

Greg

