Thread: Question: merit / feasibility of compressing frontend <--> backend transfers w/ zlib

Hello,
   I'm new to the list, and just started working as an intern at
commandprompt.com.

   As one of my first projects I've been asked to compress with zlib
(www.gzip.org/zlib) data flowing from postgres clients to and
especially from the backend server.  Our first idea was to write a sort
of 'compression proxy' with a frontend and backend of its own. The
postgres client would connect to the compression frontend on its local
machine, which would compress and transfer to the compression backend on
the server.  Decompressed requests would be forwarded to the postgres
server. This idea was abandoned since: 1.) it means existing clients
would have to be reconfigured to talk to their local machine, and 2.) it
destroys host-based authentication since all packets arriving at the
server would be from the local decompressor.

   The current idea is to rewrite parts of postgres itself, both the
frontend libpq and the backend, so that a "compress" option could be
passed by the client.  After the startup packet and authentication, all
subsequent queries and responses would be compressed (and decompressed
when received).
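
For illustration only, here is a minimal Python sketch of the per-connection
streaming idea described above. It is not libpq or backend code; the class
name CompressedChannel and its methods are hypothetical. It assumes each side
keeps one zlib compressor and one decompressor for the life of the
connection, and flushes with Z_SYNC_FLUSH after every message so the peer can
decode that message immediately without waiting for more data.

```python
import zlib


class CompressedChannel:
    """Hypothetical helper: one zlib stream per direction of a connection."""

    def __init__(self, level: int = 6) -> None:
        self._compressor = zlib.compressobj(level)
        self._decompressor = zlib.decompressobj()

    def encode(self, message: bytes) -> bytes:
        # Z_SYNC_FLUSH pushes out all pending output, so the receiver can
        # fully decode this message as soon as these bytes arrive.
        return self._compressor.compress(message) + self._compressor.flush(
            zlib.Z_SYNC_FLUSH
        )

    def decode(self, chunk: bytes) -> bytes:
        # Returns whatever plain bytes can be recovered from this chunk.
        return self._decompressor.decompress(chunk)


# Simulated exchange: the client's compressor pairs with the server's
# decompressor, and the other way round for responses.
client = CompressedChannel()
server = CompressedChannel()

query = b"SELECT * FROM orders WHERE status = 'pending';"
wire_bytes = client.encode(query)
print(server.decode(wire_bytes) == query)

response = b"1\tpending\n2\tpending\n" * 1000
print(len(response), len(server.encode(response)))
```

Keeping the stream open across messages lets later messages reuse the
compression history of earlier ones, which tends to help with repetitive
result sets; the cost is a little per-connection state on both ends.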

My questions are: Is there any merit to this idea, i.e., would
compressing large result sets decrease the transfer time? And how
easy or difficult would it be to incorporate such a change into the
postgres frontend and backend source?

Any help appreciated,

Robert Flory
using psql-general@commandprompt.com



Re: Question: merit / feasibility of compressing frontend <--> backend transfers w/ zlib

From
nconway@klamath.dyndns.org (Neil Conway)
Date:
On Mon, Jul 15, 2002 at 12:01:03PM -0700, pgsql-general wrote:
>   As one of my first projects I've been asked to compress with zlib
> (www.gzip.org/zlib) data flowing from postgres clients to and
> especially from the backend server.  Our first idea was to write a sort
> of 'compression proxy' with a frontend and backend of its own. The
> postgres client would connect to the compression frontend on its local
> machine, which would compress and transfer to the compression backend on
> the server.  Decompressed requests would be forwarded to the postgres
> server. This idea was abandoned since: 1.) it means existing clients
> would have to be reconfigured to talk to their local machine, and 2.) it
> destroys host-based authentication since all packets arriving at the
> server would be from the local decompressor.

It also strikes me as inefficient and unnecessarily complicated.

> My questions are:  Is there any merit to this idea? i.e  would
> compressing large result sets decrease the transfer time?

I'm not too keen about it (as was Tom Lane when someone suggested it
earlier, IIRC). The vast majority of PostgreSQL installations place
both the clients and the RDBMS on the same LAN, so I'd expect
that few people would find it useful. And among those that would,
you can get that functionality in other ways (e.g. ssh forwarding,
a generic zlib tunnel if one exists -- similar to stunnel for SSL),
without needing to bloat PostgreSQL.

> How easy or difficult would it be to incorporate such change into the
> postgres frontend and backend source?

Doesn't seem like it would be very difficult, IMHO.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Does the ODBC or JDBC interface use compression?  I think these
are more likely to be used over a non-LAN connection.

The other use for compression would be for a data sync between
two database installations that are geographically distributed. The idea
is that two offices would each have a local DBMS but the link
between them is slow.  Compression could help in that case.

Compression is not all that hard to set up using port forwarding
proxies like you thought.  In fact ssh can do it already if you
specify the "-C" option.



=====
Chris Albertson
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org


Re: Question: merit / feasibility of compressing frontend

From
"Joshua D. Drake"
Date:
Hello,

  Without getting into a huge debate over which implementation is better, suffice it to say that we have seen significant demand for this solution without the obnoxiousness of ssh. SSH is great for lots of stuff, but you are adding an additional user-layer application to manage. Our implementation will make it so that you literally just say compression=yes in the connection string and boom... it's compressed.

   There is a real commercial need, when dealing with VPN's, remote users, and web based distributed applications for something like this.

Sincerely,

Joshua Drake



Re: Question: merit / feasibility of compressing frontend

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
>    There is a real commercial need, when dealing with VPN's, remote
> users, and web based distributed applications for something like this.

This unsubstantiated opinion doesn't really do much to change my
opinion.  We have seen maybe two or three prior requests for compression
(which does not qualify as a groundswell); furthermore they were all "it
would be nice if..." handwaving, with no backup data to convince anyone
that any real performance gain would emerge in common scenarios.  So I'm
less than eager to buy into the portability and interoperability
pitfalls that are likely to emerge from requiring clients and servers to
have zlib.

            regards, tom lane

Re: Question: merit / feasibility of compressing frontend

From
Bruce Momjian
Date:
Has anyone run any tests to see whether it is faster or slower?
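
One way such a test could be set up is sketched below in Python. This is only
a rough sketch: the function name measure and the synthetic sample rows are
assumptions made for illustration, not data from anyone in this thread. It
times zlib on a payload and compares the CPU time spent against the wire time
saved on a link of a given speed. Real result sets may compress quite
differently.

```python
import time
import zlib


def measure(payload: bytes, link_bits_per_sec: float, level: int = 6) -> dict:
    """Time zlib on one payload and estimate the net effect on a given link."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    compress_secs = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    decompress_secs = time.perf_counter() - start
    if restored != payload:
        raise ValueError("round trip failed")

    # Wire time saved by sending fewer bytes, minus the CPU time spent.
    bytes_saved = len(payload) - len(compressed)
    wire_secs_saved = bytes_saved * 8 / link_bits_per_sec
    cpu_secs = compress_secs + decompress_secs
    return {
        "plain_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "cpu_secs": cpu_secs,
        "net_secs_saved": wire_secs_saved - cpu_secs,
    }


# Synthetic stand-in for a text result set (an assumption, not real data).
sample = "".join(
    f"{i}\tcustomer_{i % 50}\t2002-07-15\tpending\n" for i in range(20000)
).encode()

for label, bps in [("56k modem", 56_000), ("100 Mbit LAN", 100_000_000)]:
    print(label, measure(sample, bps))
```

A negative net_secs_saved would mean the compression overhead outweighs the
bytes saved on that link, which is the case to watch for on a fast LAN.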


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Question: merit / feasibility of compressing frontend

From
jd@crazypenguins.postgresql.org
Date:
Hello,

  With all due respect, Tom, I am not asking you to. We (CMD) have specific
instances of projects that will require this feature. I have also spoken
with others who have requested that we do something like this for their
projects, although we will not benefit from them. This is why I have
authorized my programmer to implement the feature.

  We see a benefit in compressing result sets for transfer to clients. In
a lot of instances it would take less time to compress and decompress a
result set than to actually transfer the result set across the wire in
plain text.

  If you are dealing with 1 meg of text, across a distributed application
where the client connects via a VPN at 56k, we are talking 4 minutes. If we
compress and send it across, that could be 30 seconds (mileage will vary).
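
The arithmetic behind those figures can be sketched as follows. The
assumptions here are that "1 meg" means 1 MiB (1,048,576 bytes) and that
"56k" means a nominal 56,000 bit/s; the helper name transfer_seconds is
made up for this example. Under those assumptions the nominal line rate
gives roughly two and a half minutes, so the 4 minute figure corresponds to
a lower effective throughput, and getting from 4 minutes down to 30 seconds
would need about an 8:1 compression ratio (ignoring CPU time).

```python
def transfer_seconds(size_bytes: int, bits_per_sec: float) -> float:
    """Ideal transfer time, ignoring latency and protocol overhead."""
    return size_bytes * 8 / bits_per_sec


ONE_MEG = 1024 * 1024  # assumption: "1 meg" is 1 MiB

# Time at the nominal 56 kbit/s line rate.
nominal_secs = transfer_seconds(ONE_MEG, 56_000)

# Effective throughput implied by the quoted 4 minutes (240 seconds).
effective_bps = ONE_MEG * 8 / 240

# Compression ratio needed to turn 240 seconds into 30 seconds.
required_ratio = 240 / 30

print(f"nominal: {nominal_secs:.0f} s")
print(f"implied effective rate: {effective_bps:.0f} bit/s")
print(f"required compression ratio: {required_ratio:.0f}:1")
```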

  Besides, we are not asking the PostgreSQL team to implement the feature,
just to help us understand the existing code a little better (which I
realize now, my budding programmer did not word very well), so that we may
implement it within our code base.

Sincerely,

Joshua D. Drake



We are not asking the PostgreSQL team to do so.

On Tue, 16 Jul 2002, Tom Lane wrote:

> "Joshua D. Drake" <jd@commandprompt.com> writes:
> >    There is a real commercial need, when dealing with VPN's, remote
> > users, and web based distributed applications for something like this.
>
> This unsubstantiated opinion doesn't really do much to change my
> opinion.  We have seen maybe two or three prior requests for compression
> (which does not qualify as a groundswell); furthermore they were all "it
> would be nice if..." handwaving, with no backup data to convince anyone
> that any real performance gain would emerge in common scenarios.  So I'm
> less than eager to buy into the portability and interoperability
> pitfalls that are likely to emerge from requiring clients and servers to
> have zlib.
>
>             regards, tom lane
>