Thread: Re: Question: merit / feasibility of compressing frontend

Re: Question: merit / feasibility of compressing frontend

From
"Joshua D. Drake"
Date:
Hello,

  With all due respect, Tom, I am not asking you to. We (CMD) have specific
instances of projects that will require this feature. I have also spoken
with others who have requested that we do something like this for their
projects, although we will not benefit from those ourselves. This is why I
have authorized my programmer to implement the feature.

  We see a benefit in compressing result sets for transfer to clients. In
many instances it would take less time to compress and decompress a result
set than to actually transfer it across the wire in plain text.

  If you are dealing with 1 meg of text, across a distributed application
where the client connects via a VPN at 56k, we are talking 4 minutes. If we
compress and send it across that could be 30 seconds (mileage will vary).
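
  As rough back-of-the-envelope arithmetic (assuming an ideal 56 kbit/s
link and a conservative 5:1 compression ratio on plain text; real numbers
will differ):

    1 MB of text           =  8,388,608 bits
    uncompressed transfer  ~  8,388,608 / 56,000  ~ 150 s  (closer to 3-4
                              minutes with modem and protocol overhead)
    compressed at 5:1      ~  1,677,722 / 56,000  ~  30 s, plus a small
                              amount of CPU time at each end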

  Besides, we are not asking the PostgreSQL team to implement the feature,
just to help us understand the existing code a little better (which, I
realize now, my budding programmer did not word very well), so that we may
implement it within our code base.

Sincerely,

Joshua D. Drake




On Tue, 16 Jul 2002, Tom Lane wrote:

> "Joshua D. Drake" <jd@commandprompt.com> writes:
> >    There is a real commercial need, when dealing with VPN's, remote
> > users, and web based distributed applications for something like this.
>
> This unsubstantiated opinion doesn't really do much to change my
> opinion.  We have seen maybe two or three prior requests for compression
> (which does not qualify as a groundswell); furthermore they were all "it
> would be nice if..." handwaving, with no backup data to convince anyone
> that any real performance gain would emerge in common scenarios.  So I'm
> less than eager to buy into the portability and interoperability
> pitfalls that are likely to emerge from requiring clients and servers to
> have zlib.
>
>             regards, tom lane
>




Re: Question: merit / feasibility of compressing frontend

From
Bruno Wolff III
Date:
On Tue, Jul 16, 2002 at 01:59:10 -0700,
  "Joshua D. Drake" <jd@commandprompt.com> wrote:
>
>   If you are dealing with 1 meg of text, across a distributed application
> where the client connects via a VPN at 56k, we are talking 4 minutes. If we
> compress and send it across that could be 30 seconds (mileage will vary).

Shouldn't the VPN be doing compression?

Re: Question: merit / feasibility of compressing frontend

From
Doug McNaught
Date:
Bruno Wolff III <bruno@wolff.to> writes:

> On Tue, Jul 16, 2002 at 01:59:10 -0700,
>   "Joshua D. Drake" <jd@commandprompt.com> wrote:
> >
> >   If you are dealing with 1 meg of text, across a distributed application
> > where the client connects via a VPN at 56k, we are talking 4 minutes. If we
> > compress and send it across that could be 30 seconds (mileage will vary).
>
> Shouldn't the VPN be doing compression?

Most VPNs (eg ones based on IPsec) work at the IP packet level, with
no knowledge of the streams at higher levels.  I don't think the IPsec
standard addresses compression at all--that's supposed to be handled
at the link layer (eg PPP) or at higher levels.

Even if it were there, packet-by-packet compression, or that provided
by a 56K modem link, isn't going to give you nearly as big a win as
compressing at the TCP stream level, where there is much more
redundancy to take advantage of, and you don't have things like packet
headers polluting the compression dictionary.
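
For what it's worth, here is a toy zlib sketch (mine, untested, and nothing
to do with PG's code) that compresses a batch of similar rows once as a
continuous stream and once with the compressor reset for every "packet".
The stream case keeps its dictionary across rows, which is exactly the
redundancy a packet-level scheme cannot see:

    /* cc demo.c -lz */
    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    static const char row[] =
        "42|2002-07-16|jd@commandprompt.com|compressing frontend\n";

    /* Compress 1000 copies of "row"; if reset_each_time is nonzero,
     * throw away the dictionary after every row, as a per-packet
     * compressor effectively must. */
    static size_t deflate_rows(int reset_each_time)
    {
        z_stream zs;
        unsigned char out[4096];
        size_t total = 0;
        int i;

        memset(&zs, 0, sizeof(zs));
        deflateInit(&zs, Z_DEFAULT_COMPRESSION);

        for (i = 0; i < 1000; i++)
        {
            zs.next_in = (unsigned char *) row;
            zs.avail_in = sizeof(row) - 1;
            zs.next_out = out;
            zs.avail_out = sizeof(out);

            /* flush at each row boundary so it could be sent as a unit */
            deflate(&zs, Z_SYNC_FLUSH);
            total += sizeof(out) - zs.avail_out;

            if (reset_each_time)
                deflateReset(&zs);
        }
        deflateEnd(&zs);
        return total;
    }

    int main(void)
    {
        printf("raw: %lu, per-packet: %lu, stream: %lu bytes\n",
               (unsigned long) ((sizeof(row) - 1) * 1000),
               (unsigned long) deflate_rows(1),
               (unsigned long) deflate_rows(0));
        return 0;
    }

I'd expect the stream total to come out a large factor smaller than the
per-packet total, since every row after the first is mostly back-references
into the shared dictionary.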

I'm not advocating zlib-in-PG, but it does seem that some people would
find it useful.

-Doug

Re: Question: merit / feasibility of compressing frontend

From
Bruno Wolff III
Date:
On Tue, Jul 16, 2002 at 12:13:14 -0400,
  Doug McNaught <doug@wireboard.com> wrote:
>
> Most VPNs (eg ones based on IPsec) work at the IP packet level, with
> no knowledge of the streams at higher levels.  I don't think the IPsec
> standard addresses compression at all--that's supposed to be handled
> at the link layer (eg PPP) or at higher levels.

That can't be right. Once the data is encrypted, you won't be able to
compress it. That is why it is useful for the VPN software to be able
to do it.

> Even if it were there, packet-by-packet compression, or that provided
> by a 56K modem link, isn't going to give you nearly as big a win as
> compressing at the TCP stream level, where there is much more
> redundancy to take advantage of, and you don't have things like packet
> headers polluting the compression dictionary.

Maybe a generic compression tool could be put into the path without having
to change either Postgres or your VPN software.

Re: Question: merit / feasibility of compressing frontend

From
Doug McNaught
Date:
Bruno Wolff III <bruno@wolff.to> writes:

> On Tue, Jul 16, 2002 at 12:13:14 -0400,
>   Doug McNaught <doug@wireboard.com> wrote:
> >
> > Most VPNs (eg ones based on IPsec) work at the IP packet level, with
> > no knowledge of the streams at higher levels.  I don't think the IPsec
> > standard addresses compression at all--that's supposed to be handled
> > at the link layer (eg PPP) or at higher levels.
>
> That can't be right. Once the data is encrypted, you won't be able to
> compress it. That is why it is useful for the VPN software to be able
> to do it.

True enough, but my point below still stands--it just makes a lot more
sense to do it up at the stream level, if you have one.

> > Even if it were there, packet-by-packet compression, or that provided
> > by a 56K modem link, isn't going to give you nearly as big a win as
> > compressing at the TCP stream level, where there is much more
> > redundancy to take advantage of, and you don't have things like packet
> > headers polluting the compression dictionary.
>
> Maybe a generic compression tool could be put into the path without having
> to change either Postgres or your VPN software.

SSH with compression enabled works fairly well for this, but the OP
didn't see the point of using it when he already had a VPN going.
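
(For anyone who does want to go that route: something along the lines of

    ssh -C -L 5433:localhost:5432 user@dbhost

and then pointing the client at localhost:5433 gives you a compressed,
encrypted tunnel to the usual 5432 backend port; -C turns on compression
and -L sets up the port forward.  The host and ports here are just
examples.)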

The idea of a generic "compression tunnel" (without the SSH overhead)
is nice, but I've never seen one.  Wouldn't be that hard to write, I'd
think.

I think the big obstacle to putting compression into PG is needing to
extend the FE/BE protocol for negotiating compression, and the possible
client compatibility issues that raises.  We already have SSL
negotiation working, though...
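
To make that slightly more concrete: the existing SSL negotiation is just
an 8-byte startup packet carrying a magic request code (80877103), which
the server answers with a single 'S' or 'N' byte.  A compression option
could conceivably piggyback on the same pattern.  Purely as a hand-waving
sketch (the request code and the helper below are invented for
illustration and are not part of any real protocol):

    #include <stdint.h>
    #include <unistd.h>        /* read(), write() */
    #include <arpa/inet.h>     /* htonl() */

    /* Hypothetical value; the real SSLRequest code is 80877103. */
    #define COMPRESSION_REQUEST_CODE 80877104

    /* Returns 1 if the server agrees to compress, 0 to fall back to an
     * ordinary uncompressed connection, -1 on I/O error. */
    static int
    negotiate_compression(int sock)
    {
        uint32_t packet[2];
        char     answer;

        packet[0] = htonl(8);                        /* packet length   */
        packet[1] = htonl(COMPRESSION_REQUEST_CODE); /* request "magic" */

        if (write(sock, packet, sizeof(packet)) != sizeof(packet))
            return -1;

        if (read(sock, &answer, 1) != 1)
            return -1;

        return (answer == 'S') ? 1 : 0;
    }

An old server that doesn't recognize the code would presumably refuse it,
and the client would simply retry without compression, much as SSL-aware
clients already cope with servers that refuse SSL.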

-Doug

Re: Question: merit / feasibility of compressing frontend

From
Tom Lane
Date:
Doug McNaught <doug@wireboard.com> writes:
> I think the big obstacle to putting compression into PG is needing to
> extend the FE/BE protocol for negotiating compression, and the possible
> client compatibility issues that raises.  We already have SSL
> negotiation working, though...

Yup.  Seems like a more useful exercise would be to lobby the SSL people
to include compression as an option in SSL connections.  That would
solve the problem not only for PG, but every other application that uses
SSL ...

            regards, tom lane

Re: Question: merit / feasibility of compressing frontend

From
Justin Clift
Date:
Hi Tom,

Tom Lane wrote:
>
> Doug McNaught <doug@wireboard.com> writes:
> > I think the big obstacle to putting compression into PG is needing to
> > extend the FE/BE protocol for negotiating compression, and the possible
> > client compatibility issues that raises.  We already have SSL
> > negotiation working, though...
>
> Yup.  Seems like a more useful exercise would be to lobby the SSL people
> to include compression as an option in SSL connections.  That would
> solve the problem not only for PG, but every other application that uses
> SSL ...

We can all see the merits of having a compressed data stream, especially
in those situations where the byte count is more important than the CPU
cost.

However, I'd like to point out that SSL isn't feasible to use in all
situations, so having to enable SSL to gain compression would be a pain.

If someone's willing to put the time into this, then compression without
SSL feels like a good idea.  Not everyone uses SSL.  Bad network latency
has a very undesirable effect on the establishment of SSL connections,
and this is especially of interest in those cases where people need to
get short "bursty" amounts of SQL data across a connection as fast as
possible.  For example: a client using a frontend app against remote
databases over a modem, without persistent connections.

Establishment of an individual SSL session using OpenSSL can take over a
second in this case.  Not every time, but I had to time it (on fast
hardware, too) for a recent contract when deciding on network-layer
transports.
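
A rough breakdown of where that second goes, assuming a dial-up round trip
of about 250 ms (numbers are illustrative only):

    TCP connect                    ~1 RTT    ~0.25 s
    SSLRequest and server answer   ~1 RTT    ~0.25 s
    full SSL handshake             ~2 RTTs   ~0.5  s
    server-side RSA work           extra CPU time on top
                                             -------
                                             ~1 s before the first query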

Hope this gives some decent food for thought.

:-)

Regards and best wishes,

Justin Clift

>                         regards, tom lane
>

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi

Re: Question: merit / feasibility of compressing frontend

From
"Joshua D. Drake"
Date:
> If someone's willing to put the time into this, then compression without
> SSL feels like a good idea.  Not everyone uses SSL.  Bad network latency

Well, we are already putting the time into it ;). I expect to have it
complete by the end of the week. If people like, we can keep in touch
about it.

Sincerely,

Joshua Drake



> has a very undesirable effect on the establishment of SSL connections,
> and this is especially of interest in those cases where people need to
> get short "bursty" amounts of SQL data across a connection as fast as
> possible.  For example: a client using a frontend app against remote
> databases over a modem, without persistent connections.
>
> Establishment of an individual SSL session using OpenSSL can take over a
> second in this case.  Not every time, but I had to time it (on fast
> hardware, too) for a recent contract when deciding on network-layer
> transports.
>
> Hope this gives some decent food for thought.
>
> :-)
>
> Regards and best wishes,
>
> Justin Clift
>
> >                         regards, tom lane
> >
>
>


Completed Compression front end

From
"Joshua D. Drake"
Date:
Hello,

  We have successfully completed the rewrite of the connection functions
(frontend and backend) to enable compression. After testing (for which I
will provide numbers soon) we have found that compression is quite usable
and increases performance for most connections. In fact, unless you are
running on a 10 Mbit or faster link, it will probably help you.  We still
need to run some tests on connections above 384k, but it is looking quite
good.

  We did not break compatibility, and compression is a dynamic option
that can be enabled in the connection string.
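
As a purely hypothetical illustration (the hostname, database, user, and
the "compression" keyword below are all placeholders; in particular the
keyword is invented here, and whatever we ship may spell it differently),
the idea is that enabling it is one extra option in an ordinary libpq
connection string:

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        /* "compression=on" is the hypothetical new option */
        PGconn *conn = PQconnectdb("host=db.example.com dbname=sales "
                                   "user=app compression=on");

        if (PQstatus(conn) != CONNECTION_OK)
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

        PQfinish(conn);
        return 0;
    }

Clients that omit the option, and servers that don't understand it, should
behave exactly as they do today.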

Sincerely,

Joshua Drake
Command Prompt, Inc.