Thread: Re: Question: merit / feasibility of compressing frontend
Hello,

All due respect Tom, I am not asking you to. We (CMD) have specific
instances of projects that will require this feature. I have also spoken
with others who have requested that we do something like this for their
projects, although we will not benefit from them. This is why I have
authorized my programmer to implement the feature.

We see a benefit in compressing result sets for transfer to clients. In
a lot of instances it would take less time to compress and decompress a
result set than to actually transfer the result set across the wire in
plain text. If you are dealing with 1 meg of text, across a distributed
application where the client connects via a VPN at 56k, we are talking 4
minutes. If we compress and send it across, that could be 30 seconds
(mileage will vary).

Besides, we are not asking the PostgreSQL team to implement the feature,
just to help us understand the existing code a little better (which, I
realize now, my budding programmer did not word very well), so that we
may implement it within our code base.

Sincerely,

Joshua D. Drake

On Tue, 16 Jul 2002, Tom Lane wrote:

> "Joshua D. Drake" <jd@commandprompt.com> writes:
> > There is a real commercial need, when dealing with VPNs, remote
> > users, and web based distributed applications for something like this.
>
> This unsubstantiated opinion doesn't really do much to change my
> opinion. We have seen maybe two or three prior requests for compression
> (which does not qualify as a groundswell); furthermore they were all "it
> would be nice if..." handwaving, with no backup data to convince anyone
> that any real performance gain would emerge in common scenarios. So I'm
> less than eager to buy into the portability and interoperability
> pitfalls that are likely to emerge from requiring clients and servers to
> have zlib.
>
> regards, tom lane
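To put rough numbers on the 56k example above: 1 meg is about 8.4
million bits, which is roughly 150 seconds at a 56 kbit/s line rate, and
real modem throughput is lower still, so 4 minutes is the right
ballpark. A minimal sketch of the kind of one-shot result-set
compression being described, using zlib (illustrative only, and not the
actual CMD patch):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zlib.h>

    int main(void)
    {
        /* Stand-in for a 1 meg text result set; repetitive rows, like
         * most tabular output, compress extremely well. */
        uLong   srclen = 1024 * 1024;
        char   *src = malloc(srclen);
        uLongf  dstlen;
        Bytef  *dst;

        memset(src, 'x', srclen);

        dstlen = compressBound(srclen);     /* worst-case output size */
        dst = malloc(dstlen);

        /* Level 6 is zlib's default speed/ratio tradeoff. */
        if (compress2(dst, &dstlen, (Bytef *) src, srclen, 6) != Z_OK)
            return 1;

        /* Transfer time at 56 kbit/s, before vs. after. */
        printf("%lu -> %lu bytes: %.0f s -> %.0f s on the wire\n",
               srclen, (unsigned long) dstlen,
               srclen * 8.0 / 56000.0, dstlen * 8.0 / 56000.0);

        free(src);
        free(dst);
        return 0;
    }

Real result sets won't compress as well as this degenerate buffer,
hence the "mileage will vary."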
On Tue, Jul 16, 2002 at 01:59:10 -0700,
  "Joshua D. Drake" <jd@commandprompt.com> wrote:
>
> If you are dealing with 1 meg of text, across a distributed application
> where the client connects via a VPN at 56k, we are talking 4 minutes.
> If we compress and send it across, that could be 30 seconds (mileage
> will vary).

Shouldn't the VPN be doing compression?
Bruno Wolff III <bruno@wolff.to> writes:

> On Tue, Jul 16, 2002 at 01:59:10 -0700,
>   "Joshua D. Drake" <jd@commandprompt.com> wrote:
> >
> > If you are dealing with 1 meg of text, across a distributed application
> > where the client connects via a VPN at 56k, we are talking 4 minutes.
> > If we compress and send it across, that could be 30 seconds (mileage
> > will vary).
>
> Shouldn't the VPN be doing compression?

Most VPNs (eg ones based on IPsec) work at the IP packet level, with no
knowledge of the streams at higher levels. I don't think the IPsec
standard addresses compression at all--that's supposed to be handled at
the link layer (eg PPP) or at higher levels.

Even if it were there, packet-by-packet compression, or that provided by
a 56K modem link, isn't going to give you nearly as big a win as
compressing at the TCP stream level, where there is much more redundancy
to take advantage of, and you don't have things like packet headers
polluting the compression dictionary.

I'm not advocating zlib-in-PG, but it does seem that some people would
find it useful.

-Doug
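The dictionary point above is easy to demonstrate with zlib itself: a
single deflate stream keeps its 32K sliding window across calls, while
per-packet compression starts cold every ~1500 bytes. A rough sketch
(assuming zlib; not taken from any particular VPN or PG code):

    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    #define NMSG  100
    #define MSGSZ 256

    /* Push NMSG similar messages through one deflate stream; if
     * per_packet is set, reset the dictionary between messages, which
     * is effectively what packet-level compression does. Returns the
     * total number of compressed bytes produced. */
    static unsigned long run(int per_packet)
    {
        Bytef    msg[MSGSZ], out[MSGSZ * 2];
        z_stream zs;
        unsigned long total = 0;
        int i;

        memset(msg, 'x', MSGSZ);            /* stand-in for row data */
        memset(&zs, 0, sizeof(zs));         /* zalloc/zfree = Z_NULL */
        deflateInit(&zs, Z_DEFAULT_COMPRESSION);

        for (i = 0; i < NMSG; i++)
        {
            zs.next_in   = msg;
            zs.avail_in  = MSGSZ;
            zs.next_out  = out;
            zs.avail_out = sizeof(out);
            /* Z_SYNC_FLUSH: emit a decodable block, keep the window. */
            deflate(&zs, Z_SYNC_FLUSH);
            total += sizeof(out) - zs.avail_out;

            if (per_packet)
                deflateReset(&zs);          /* start cold every time */
        }
        deflateEnd(&zs);
        return total;
    }

    int main(void)
    {
        printf("stream: %lu bytes, per-packet: %lu bytes\n",
               run(0), run(1));
        return 0;
    }

In stream mode, every message after the first can be encoded as
back-references to the earlier ones; in per-packet mode it never can,
and the gap only widens on realistic data.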
On Tue, Jul 16, 2002 at 12:13:14 -0400,
  Doug McNaught <doug@wireboard.com> wrote:
>
> Most VPNs (eg ones based on IPsec) work at the IP packet level, with
> no knowledge of the streams at higher levels. I don't think the IPsec
> standard addresses compression at all--that's supposed to be handled
> at the link layer (eg PPP) or at higher levels.

That can't be right. Once the data is encrypted, you won't be able to
compress it. That is why it is useful for the VPN software to be able to
do it.

> Even if it were there, packet-by-packet compression, or that provided
> by a 56K modem link, isn't going to give you nearly as big a win as
> compressing at the TCP stream level, where there is much more
> redundancy to take advantage of, and you don't have things like packet
> headers polluting the compression dictionary.

Maybe a generic compression tool could be put into the path without
having to change either Postgres or your VPN software.
Bruno Wolff III <bruno@wolff.to> writes:

> On Tue, Jul 16, 2002 at 12:13:14 -0400,
>   Doug McNaught <doug@wireboard.com> wrote:
> >
> > Most VPNs (eg ones based on IPsec) work at the IP packet level, with
> > no knowledge of the streams at higher levels. I don't think the IPsec
> > standard addresses compression at all--that's supposed to be handled
> > at the link layer (eg PPP) or at higher levels.
>
> That can't be right. Once the data is encrypted, you won't be able to
> compress it. That is why it is useful for the VPN software to be able
> to do it.

True enough, but my point below still stands--it just makes a lot more
sense to do it up at the stream level, if you have one.

> > Even if it were there, packet-by-packet compression, or that provided
> > by a 56K modem link, isn't going to give you nearly as big a win as
> > compressing at the TCP stream level, where there is much more
> > redundancy to take advantage of, and you don't have things like packet
> > headers polluting the compression dictionary.
>
> Maybe a generic compression tool could be put into the path without
> having to change either Postgres or your VPN software.

SSH with compression enabled works fairly well for this, but the OP
didn't see the point of using it when he already had a VPN going. The
idea of a generic "compression tunnel" (without the SSH overhead) is
nice, but I've never seen one. Wouldn't be that hard to write, I'd
think.

I think the big obstacle to putting compression into PG is needing to
extend the FE/BE protocol for negotiating compression, and the possible
client compatibility issues that raises. We already have SSL negotiation
working, though...

-Doug
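For what it's worth, the core of such a "compression tunnel" really is
small. A sketch of one direction of the relay, compressing everything it
forwards (assuming zlib and POSIX sockets; error handling, the reverse
direction, and the matching inflate() end are omitted):

    #include <string.h>
    #include <unistd.h>
    #include <zlib.h>

    #define BUFSZ 8192

    /* Forward bytes from client_fd to server_fd, deflating in transit.
     * The far end of the tunnel runs the mirror image with inflate(). */
    static void pump(int client_fd, int server_fd)
    {
        z_stream zs;
        Bytef    in[BUFSZ];
        Bytef    out[BUFSZ * 2];    /* deflate never expands this much */
        ssize_t  n;

        memset(&zs, 0, sizeof(zs)); /* zalloc/zfree/opaque = Z_NULL */
        if (deflateInit(&zs, Z_DEFAULT_COMPRESSION) != Z_OK)
            return;

        while ((n = read(client_fd, in, BUFSZ)) > 0)
        {
            zs.next_in   = in;
            zs.avail_in  = (uInt) n;
            zs.next_out  = out;
            zs.avail_out = sizeof(out);

            /* Flush each chunk so the far side isn't left waiting. */
            if (deflate(&zs, Z_SYNC_FLUSH) != Z_OK)
                break;
            if (write(server_fd, out, sizeof(out) - zs.avail_out) < 0)
                break;
        }
        deflateEnd(&zs);
    }

And for anyone who can live with the SSH overhead, something like
"ssh -C -L 63333:dbhost:5432 user@gateway" plus pointing the client at
localhost:63333 gives the same effect today, encryption included (the
host names are placeholders).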
Doug McNaught <doug@wireboard.com> writes:

> I think the big obstacle to putting compression into PG is needing to
> extend the FE/BE protocol for negotiating compression, and the possible
> client compatibility issues that raises. We already have SSL
> negotiation working, though...

Yup. Seems like a more useful exercise would be to lobby the SSL people
to include compression as an option in SSL connections. That would solve
the problem not only for PG, but every other application that uses
SSL ...

			regards, tom lane
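For context, the SSL negotiation Doug refers to is a small pre-startup
exchange, and a compression option could in principle piggyback on the
same pattern. A client-side sketch (the SSL request code 80877103 is the
real one; the zlib code and its name are invented here purely for
illustration):

    #include <stdint.h>
    #include <unistd.h>      /* read, write */
    #include <arpa/inet.h>   /* htonl */

    #define NEGOTIATE_SSL_CODE  80877103  /* actual FE/BE SSLRequest code */
    #define NEGOTIATE_ZLIB_CODE 80877104  /* hypothetical; not assigned */

    /* Send the 8-byte special startup packet and read the server's
     * one-byte verdict: 'S' to proceed, 'N' if unsupported. */
    static int negotiate(int sock, uint32_t code)
    {
        uint32_t pkt[2];
        char     answer;

        pkt[0] = htonl(8);          /* packet length, self-inclusive */
        pkt[1] = htonl(code);

        if (write(sock, pkt, sizeof(pkt)) != sizeof(pkt))
            return -1;
        if (read(sock, &answer, 1) != 1)
            return -1;
        return answer == 'S';
    }

An old client never sends the request, and a new client falls back to an
uncompressed connection on 'N', which is the same compatibility story
SSL negotiation has today.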
Hi Tom,

Tom Lane wrote:
>
> Doug McNaught <doug@wireboard.com> writes:
> > I think the big obstacle to putting compression into PG is needing to
> > extend the FE/BE protocol for negotiating compression, and the possible
> > client compatibility issues that raises. We already have SSL
> > negotiation working, though...
>
> Yup. Seems like a more useful exercise would be to lobby the SSL people
> to include compression as an option in SSL connections. That would
> solve the problem not only for PG, but every other application that
> uses SSL ...
>
> 			regards, tom lane

We can all see the merits of having a compressed data stream, especially
in those situations where the byte count is more important than a CPU
cost. However, I'd like to point out that SSL isn't feasible to use in
all situations, so having to enable SSL to gain compression would be a
pain.

If someone's willing to put the time into this, then compression without
SSL feels like a good idea. Not everyone uses SSL. Bad network latency
has a very undesirable effect on the establishment of SSL connections,
and this is especially of interest in those cases where people need to
get short "bursty" amounts of SQL data across a connection as fast as
possible. E.g. a client using a frontend app against remote databases
over a modem, without persistent connections.

Establishment of an individual SSL session using OpenSSL can take over a
second in this case. Not consistently, but often enough that I had to
time it (on fast hardware, too) for a contract recently when deciding on
network layer transports.

Hope this gives some decent food for thought.

:-)

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
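A quick way to see that handshake cost from a shell, for anyone who
wants to reproduce Justin's timing, is to point OpenSSL's s_client at
any SSL-speaking server (the host here is a placeholder):

    time openssl s_client -connect some.host:443 </dev/null

On a high-latency link most of the wall-clock time is the extra
handshake round trips, which is exactly the penalty for short, bursty,
non-persistent connections described above.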
> If someone's willing to put the time into this, then compression
> without SSL feels like a good idea. Not everyone uses SSL. Bad network
> latency

Well, we are already putting the time into it ;). I expect to have it
complete by the end of the week. If people like, we can keep in touch
about it.

Sincerely,

Joshua Drake

> has a very undesirable effect on the establishment of SSL connections,
> and this is especially of interest in those cases where people need to
> get short "bursty" amounts of SQL data across a connection as fast as
> possible. E.g. a client using a frontend app against remote databases
> over a modem, without persistent connections.
>
> Establishment of an individual SSL session using OpenSSL can take over
> a second in this case. Not consistently, but often enough that I had to
> time it (on fast hardware, too) for a contract recently when deciding
> on network layer transports.
>
> Hope this gives some decent food for thought.
>
> :-)
>
> Regards and best wishes,
>
> Justin Clift
Hello,

We have successfully completed the rewrite of the connection functions
(frontend and backend) to enable compression. After testing (for which I
will provide numbers soon) we have found that compression is quite
usable and increases performance for most connections. In fact, unless
you are running on a 10Mb link or faster, it will probably help you. We
still need to run some tests on connections that are above 384k, but it
is looking quite good.

We did not break compatibility, and compression is a dynamic option that
can be set in the connection string.

Sincerely,

Joshua Drake
Command Prompt, Inc.
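Presumably, usage from libpq would then look something like the sketch
below. The compression=on keyword is a guess at the option's spelling,
since the patch's actual syntax wasn't posted; everything else is stock
libpq:

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        /* "compression=on" is hypothetical -- stock libpq would reject
         * an unknown keyword; the patched libpq would presumably
         * negotiate zlib before the normal startup packet. */
        PGconn *conn = PQconnectdb(
            "host=db.example.com dbname=template1 compression=on");

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* ... run queries; result sets travel compressed on the wire ... */

        PQfinish(conn);
        return 0;
    }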