Thread: libpq compression
Hi hackers,
One of our customers managed to improve speed about 10 times by using SSL compression for a system where the client and servers are located in different geographical regions
and query results are very large because of JSON columns. They actually do not need encryption, just compression.
I expect that it is not the only case where compression of the libpq protocol can be useful. Please notice that Postgres replication also uses the libpq protocol.
Taking into account that a vulnerability was found in SSL compression, and that SSL compression is therefore considered deprecated and insecure (http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html), it would be nice to have some alternative mechanism for reducing libpq traffic.
I have implemented a prototype of it (patch is attached).
To use zstd compression, Postgres should be configured with --with-zstd. Otherwise compression will use zlib, unless it is disabled by the --without-zlib option.
I have added a compression=on/off parameter to the connection string and a -Z option to the psql and pgbench utilities.
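For illustration, with the patch applied a client would request a compressed connection through the ordinary connection string. The following is only a minimal sketch: the compression parameter comes from this patch (it is not in stock libpq), the host name is made up, and everything else is standard libpq:

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* "compression=on" is the connection parameter added by the patch;
         * the rest is ordinary libpq usage. */
        PGconn *conn = PQconnectdb("host=db.example.com dbname=postgres compression=on");

        if (PQstatus(conn) != CONNECTION_OK)
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 0;
    }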
Below are some results:
Compression ratio (raw->compressed):
                     | libz (level=1)                     | libzstd (level=1)
pgbench -i -s 10     | 16997209->2536330                  | 16997209->268077
pgbench -t 100000 -S | 6289036->1523862, 6600338<-900293  | 6288933->1777400, 6600338<-1000318
There is no mistyping: libzstd compress COPY data about 10 times better than libz, with wonderful compression ratio 63.
Influence on execution time is minimal (I have tested local configuration when client and server are at the same host):
                     | no compression | libz (level=1) | libzstd (level=1)
pgbench -i -s 10     | 1.552          | 1.572          | 1.611
pgbench -t 100000 -S | 4.482          | 4.926          | 4.877
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
> On 30 March 2018 at 14:53, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Hi hackers,
> One of our customers was managed to improve speed about 10 times by using SSL compression for the system where client and servers are located in different geographical regions
> and query results are very large because of JSON columns. Them actually do not need encryption, just compression.
> I expect that it is not the only case where compression of libpq protocol can be useful. Please notice that Postgres replication is also using libpq protocol.
>
> Taken in account that vulnerability was found in SSL compression and so SSLComppression is considered to be deprecated and insecure (http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html), it will be nice to have some alternative mechanism of reducing libpq traffic.
>
> I have implemented some prototype implementation of it (patch is attached).
> To use zstd compression, Postgres should be configured with --with-zstd. Otherwise compression will use zlib unless it is disabled by --without-zlib option.
> I have added compression=on/off parameter to connection string and -Z option to psql and pgbench utilities.
I'm a bit confused why there was no reply to this. I mean, it wasn't sent on
1st April, the patch still can be applied on top of the master branch and looks
like it even works.
I assume the main concern here is that it's implemented in a rather non-extensible way. Also, if I understand correctly, it compresses the data stream
in both directions server <-> client; I'm not sure if there is any value in
compressing what a client sends to a server. But still I'm wondering why it
didn't start at least a discussion about how it can be implemented. Do I miss
something?
On 15.05.2018 13:23, Dmitry Dolgov wrote:
> > On 30 March 2018 at 14:53, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> > [...]
>
> I'm a bit confused why there was no reply to this. I mean, it wasn't sent on
> 1st April, the patch still can be applied on top of the master branch and looks
> like it even works.
>
> I assume the main concern here is that it's implemented in a rather non-extensible way.
> Also, if I understand correctly, it compresses the data stream in both directions
> server <-> client; I'm not sure if there is any value in compressing what a client
> sends to a server. But still I'm wondering why it didn't start at least a discussion
> about how it can be implemented. Do I miss something?
Implementation of libpq compression will be included in next release of PgProEE.
It looks like the community is not so interested in this patch. Frankly speaking, I do not understand why.
Compression of libpq traffic can significantly increase speed of:
1. COPY
2. Replication (both streaming and logical)
3. Queries returning large results sets (for example JSON) through slow connections.
It is possible to compress libpq traffic using SSL compression. But SSL compression is an unsafe and deprecated feature.
Yes, this patch is not extensible: it can use either zlib or zstd. Unfortunately the internal Postgres compression (pglz) doesn't provide a streaming API.
Maybe it is a good idea to combine it with Ildus's patch (custom compression methods): https://commitfest.postgresql.org/18/1294/
In this case it will be possible to use any custom compression algorithm. But we need to design and implement a streaming API for pglz and other compressors.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 05/15/2018 08:53 AM, Konstantin Knizhnik wrote:
> Implementation of libpq compression will be included in next release of PgProEE.
> Looks like community is not so interested in this patch. Frankly speaking I do not understand why.
> [...]

I'm sure there is plenty of interest in this. However, you guys need to understand where we are in the development cycle. We're trying to wrap up Postgres 11, which was feature frozen before this patch ever landed. So it's material for Postgres 12. That means it will probably need to wait a little while before it gets attention. It doesn't mean nobody is interested.

I disagree with Dmitry about compressing in both directions - I can think of plenty of good cases where we would want to compress traffic from the client.
cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 15 May 2018 at 21:36, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:
> To use zstd compression, Postgres should be configured with --with-zstd. Otherwise compression will use zlib unless it is disabled by --without-zlib option.
> I have added compression=on/off parameter to connection string and -Z option to psql and pgbench utilities.
> I'm a bit confused why there was no reply to this. I mean, it wasn't sent on
> 1st April, the patch still can be applied on top of the master branch and looks
> like it even works.
>
> I assume the main concern here is that it's implemented in a rather non-extensible
> way. Also, if I understand correctly, it compresses the data stream
> in both directions server <-> client, not sure if there is any value in
> compressing what a client sends to a server. But still I'm wondering why it
> didn't start at least a discussion about how it can be implemented. Do I miss
> something?
> Implementation of libpq compression will be included in next release of PgProEE.
> Looks like community is not so interested in this patch. Frankly speaking I do not understand why.
I'm definitely very interested, and simply missed the post.
I'll talk with some team mates as we're doing some PG12 planning now.
> Yes, this patch is not extensible: it can use either zlib either zstd. Unfortunately internal Postgres compression pglz doesn't provide streaming API.
> May be it is good idea to combine it with Ildus patch (custom compression methods): https://commitfest.postgresql.org/18/1294/
Given the history of issues with attempting custom/pluggable compression for toast etc, I really wouldn't want to couple those up.
pglz wouldn't make much sense for protocol compression anyway, except maybe for fast local links where it was worth a slight compression overhead but not the cpu needed for gzip. I don't think it's too exciting. zlib/gzip is likely the sweet spot for the reasonable future for protocol compression, or a heck of a lot better than what we have, anyway.
We should make sure the protocol part is extensible, but the implementation doesn't need to be pluggable.
> In this case it will be possible to use any custom compression algorithm. But we need to design and implement streaming API for pglz and other compressors.
> I'm sure there is plenty of interest in this. However, you guys need to understand where we are in the development cycle. We're trying to wrap up Postgres 11, which was feature frozen before this patch ever landed. So it's material for Postgres 12. That means it will probably need to wait a little while before it gets attention. It doesn't mean nobody is interested.
> I disagree with Dmitry about compressing in both directions - I can think of plenty of good cases where we would want to compress traffic from the client.
Agreed. The most obvious case being COPY, but there's also big bytea values, etc.
2018-05-15 9:53 GMT-03:00 Konstantin Knizhnik <k.knizhnik@postgrespro.ru>:
> Looks like community is not so interested in this patch. Frankly speaking I
> do not understand why.

AFAICS the lack of replies is due to feature freeze. I'm pretty sure people are interested in this topic (at least I am). Did you review a previous discussion [1] about this? I did a prototype a few years ago. I didn't look at your patch yet. I'll do in a few weeks. Please add your patch to the next CF [2].

[1] https://www.postgresql.org/message-id/4FD9698F.2090407%40timbira.com
[2] https://commitfest.postgresql.org/18/

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Hello!
I have noticed that psql --help lacks the -Z|--compression option.
Also it would be nice to have an option like --compression-level in psql and pgbench.
On 03/30/2018 03:53 PM, Konstantin Knizhnik wrote:
> [...]
-- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 16.05.2018 18:09, Grigory Smolkin wrote:
> Hello!
> I have noticed that psql --help lacks the -Z|--compression option.
> Also it would be nice to have an option like --compression-level in psql
> and pgbench.

Thank you for this notice. Updated and rebased patch is attached.

Concerning specification of the compression level: I have made many experiments with different data sets and both zlib/zstd, and in both cases using a compression level higher than the default doesn't give a noticeable increase in compression ratio, but quite significantly reduces speed. Moreover, for "pgbench -i" zstd provides a better compression ratio (63 times!) with compression level 1 than with the largest recommended compression level 22! This is why I decided not to allow the user to choose the compression level.
Attachment
On Thu, May 17, 2018 at 3:54 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Thank you for this notice.
> Updated and rebased patch is attached.

Hi Konstantin,

Seems very useful. +1.

+        rc = inflate(&zs->rx, Z_SYNC_FLUSH);
+        if (rc != Z_OK)
+        {
+                return ZPQ_DECOMPRESS_ERROR;
+        }

Does this actually guarantee that zs->rx.msg is set to a string? I looked at some documentation here:

https://www.zlib.net/manual.html

It looks like return value Z_DATA_ERROR means that msg is set, but for the other error codes Z_STREAM_ERROR, Z_BUF_ERROR, Z_MEM_ERROR it doesn't explicitly say that. From a casual glance at https://github.com/madler/zlib/blob/master/inflate.c I think it might be set to Z_NULL and then never set to a string except in the mode = BAD paths that produce the Z_DATA_ERROR return code. That's interesting because later we do this:

+        if (r == ZPQ_DECOMPRESS_ERROR)
+        {
+                ereport(COMMERROR,
+                        (errcode_for_socket_access(),
+                         errmsg("Failed to decompress data: %s", zpq_error(PqStream))));
+                return EOF;

... where zpq_error() returns zs->rx.msg. That might crash or show "(null)" depending on libc.

Also, message style: s/F/f/

+ssize_t zpq_read(ZpqStream* zs, void* buf, size_t size, size_t* processed)

Code style: We write "Type *foo", not "Type* var". We put the return type of a function definition on its own line.

It looks like there is at least one place where zpq_stream.o's symbols are needed but it isn't being linked in, so the build fails in some ecpg stuff reached by make check-world:

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -pthread -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS test1.o -L../../../../../src/port -L../../../../../src/common -L../../ecpglib -lecpg -L../../pgtypeslib -lpgtypes -L../../../../../src/interfaces/libpq -lpq -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags -lpgcommon -lpgport -lpthread -lz -lreadline -lrt -lcrypt -ldl -lm -o test1
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_free'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_error'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_read'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_buffered'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_create'
../../../../../src/interfaces/libpq/libpq.so: undefined reference to `zpq_write'

--
Thomas Munro
http://www.enterprisedb.com
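One possible way to make that error path robust regardless of which code inflate() returned is a small fallback helper along the following lines. This is only an illustrative sketch, not code from the patch; zpq_strerror is a hypothetical name, and zError() is zlib's generic mapping from a return code to a static description string:

    #include <zlib.h>

    /*
     * Hypothetical helper: return a printable error string for a zlib stream.
     * zlib only guarantees that strm->msg is set for Z_DATA_ERROR, so fall
     * back to zError(), which works for any zlib return code.
     */
    static const char *
    zpq_strerror(z_stream *strm, int rc)
    {
        if (strm->msg != Z_NULL && strm->msg[0] != '\0')
            return strm->msg;
        return zError(rc);
    }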
On Thu, May 17, 2018 at 3:54 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Concerning specification of compression level: I have made many experiments
> with different data sets and both zlib/zstd and in both cases using
> compression level higher than default doesn't cause some noticeable increase
> of compression ratio, but quite significantly reduce speed. Moreover, for
> "pgbench -i" zstd provides better compression ratio (63 times!) with
> compression level 1 than with the largest recommended compression level 22!
> This is why I decided not to allow user to choose compression level.

Speaking of configuration, are you planning to support multiple compression libraries at the same time? It looks like the current patch implicitly requires client and server to use the same configure option, without any attempt to detect or negotiate. Do I guess correctly that a library mismatch would produce an incomprehensible corrupt stream message?

--
Thomas Munro
http://www.enterprisedb.com
On Tue, Jun 05, 2018 at 06:04:21PM +1200, Thomas Munro wrote:
> Speaking of configuration, are you planning to support multiple
> compression libraries at the same time? It looks like the current
> patch implicitly requires client and server to use the same configure
> option, without any attempt to detect or negotiate. Do I guess
> correctly that a library mismatch would produce an incomprehensible
> corrupt stream message?

I just had a quick look at this patch, lured by the smell of your latest messages... And it seems to me that this patch needs a heavy amount of work as presented. There are a couple of things which are not really nice, like forcing the presentation of the compression option in the startup packet to begin with. The high-jacking around secure_read() is not nice either as it is aimed at being a rather high-level API on top of the method used with the backend. On top of adding some documentation, I think that you could get some inspiration from the recent GSSAPI encryption patch which has been submitted again for the v12 cycle, which has spent a large amount of time designing its set of options.

--
Michael
Attachment
On 05.06.2018 08:26, Thomas Munro wrote:
> [...]

Hi Thomas,

Thank you for review. Updated version of the patch fixing all reported problems is attached.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
On 05.06.2018 09:04, Thomas Munro wrote:
> Speaking of configuration, are you planning to support multiple
> compression libraries at the same time? It looks like the current
> patch implicitly requires client and server to use the same configure
> option, without any attempt to detect or negotiate. Do I guess
> correctly that a library mismatch would produce an incomprehensible
> corrupt stream message?

Frankly speaking, I am not sure that support of multiple compression libraries at the same time is actually needed. If we build Postgres from sources, then both frontend and backend libraries will use the same compression library. zlib is available almost everywhere and Postgres is using it in any case. zstd is faster and provides a better compression ratio. So in principle it may be useful to try zstd first and, if it is not available, fall back to zlib. That would require dynamic loading of these libraries.

libpq stream compression is not the only place where compression is used in Postgres. So I think that the problem of choosing the compression algorithm and supporting custom compression methods should be solved at some upper level. There is a patch for custom compression methods in the commitfest. Maybe it should be combined with this one.

Right now, if the client and server libpq libraries were built with different compression libraries, then a decompress error will be reported. Supporting multiple compression methods will require a more sophisticated handshake protocol so that client and server can choose a compression method which is supported by both of them. But certainly it can be done.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
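To make the handshake idea concrete, here is a purely hypothetical sketch (none of this is in the patch, and the option format is made up): the client could advertise the algorithms it was built with, for example as a startup option value like "zstd,zlib", and the server could pick the first one it also supports:

    #include <string.h>

    /* Algorithms this server build supports, in order of preference.
     * In a real implementation this list would depend on configure options. */
    static const char *server_algorithms[] = {"zstd", "zlib", NULL};

    /*
     * Return the first client-proposed algorithm that the server also
     * supports, or NULL if there is no overlap (compression stays off).
     * client_list is a comma-separated string such as "zstd,zlib";
     * it is modified in place by strtok().
     */
    static const char *
    choose_compression(char *client_list)
    {
        char   *alg;

        for (alg = strtok(client_list, ","); alg != NULL; alg = strtok(NULL, ","))
        {
            int     i;

            for (i = 0; server_algorithms[i] != NULL; i++)
                if (strcmp(alg, server_algorithms[i]) == 0)
                    return server_algorithms[i];
        }
        return NULL;
    }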
On 05.06.2018 10:09, Michael Paquier wrote:
> [...] There are a couple of things which are not really
> nice, like forcing the presentation of the compression option in the
> startup packet to begin with. The high-jacking around secure_read() is
> not nice either as it is aimed at being a rather high-level API on top
> of the method used with the backend. [...]

Thank you for the feedback. I have considered this patch mostly as a prototype to estimate the efficiency of libpq protocol compression and compare it with SSL compression, so I agree with you that there are a lot of things which should be improved.

But can you please clarify what is wrong with "forcing the presentation of the compression option in the startup packet to begin with"? Do you mean that it would be better to be able to switch compression on/off during a session?

Also I do not completely understand what you mean by "high-jacking around secure_read()". I looked at the GSSAPI patch. It does injection in secure_read:

+#ifdef ENABLE_GSS
+        if (port->gss->enc)
+        {
+                n = be_gssapi_read(port, ptr, len);
+                waitfor = WL_SOCKET_READABLE;
+        }
+        else

But the main difference between encryption and compression is that encryption does not change the data size, while compression does. To be able to use streaming compression, I need to specify some function for reading data from the stream. I am using secure_read for this purpose:

    PqStream = zpq_create((zpq_tx_func)secure_write, (zpq_rx_func)secure_read, MyProcPort);

Can you please explain what the problem with it is?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 6/5/18 03:09, Michael Paquier wrote:
> I just had a quick look at this patch, lured by the smell of your latest
> messages... And it seems to me that this patch needs a heavy amount of
> work as presented. There are a couple of things which are not really
> nice, like forcing the presentation of the compression option in the
> startup packet to begin with.

Yeah, at this point we will probably need a discussion and explanation of the protocol behavior this is adding, such as how to negotiate different compression settings.

Unrelatedly, I suggest skipping the addition of -Z options to various client-side tools. This is unnecessary, since generic connection options can already be specified via -d typically, and it creates confusion because -Z is already used to specify output compression by some programs.

--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 5 June 2018 at 13:06, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 6/5/18 03:09, Michael Paquier wrote:
> > I just had a quick look at this patch, lured by the smell of your latest
> > messages... And it seems to me that this patch needs a heavy amount of
> > work as presented. There are a couple of things which are not really
> > nice, like forcing the presentation of the compression option in the
> > startup packet to begin with.
>
> Yeah, at this point we will probably need a discussion and explanation
> of the protocol behavior this is adding, such as how to negotiate
> different compression settings.
>
> Unrelatedly, I suggest skipping the addition of -Z options to various
> client-side tools. This is unnecessary, since generic connection
> options can already be specified via -d typically, and it creates
> confusion because -Z is already used to specify output compression by
> some programs.
As the maintainer of the JDBC driver, I would think we would like to leverage this as well.
There are a few other drivers that implement the protocol as well and I'm sure they would want in as well.
I haven't looked at the patch but if we get to the point of negotiating compression please let me know.
Thanks,
On Wed, Jun 6, 2018 at 2:06 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Thank you for review. Updated version of the patch fixing all reported
> problems is attached.

Small problem on Windows[1]:

C:\projects\postgresql\src\include\common/zpq_stream.h(17): error C2143: syntax error : missing ')' before '*' [C:\projects\postgresql\libpq.vcxproj] 2395

You used ssize_t in zpq_stream.h, but Windows doesn't have that type. We have our own typedef in win32_port.h. Perhaps zpq_stream.c should include postgres.h/postgres_fe.h (depending on FRONTEND) like the other .c files in src/common, before it includes zpq_stream.h? Instead of "c.h".

[1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.1106

--
Thomas Munro
http://www.enterprisedb.com
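For reference, the conditional-include idiom Thomas is pointing at, as used by existing .c files in src/common, looks roughly like this (only a sketch of what zpq_stream.c could do, not code from the patch):

    /* At the top of zpq_stream.c, before any other include */
    #ifndef FRONTEND
    #include "postgres.h"
    #else
    #include "postgres_fe.h"
    #endif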
On Tue, Jun 05, 2018 at 06:58:42PM +0300, Konstantin Knizhnik wrote:
> I have considered this patch mostly as prototype to estimate efficiency of
> libpq protocol compression and compare it with SSL compression.
> So I agree with you that there are a lot of things which should be
> improved.

Cool. It seems that there is some meaning for such a feature with environments with spare CPU and network limitations.

> But can you please clarify what is wrong with "forcing the presentation of
> the compression option in the startup packet to begin with"?

Sure, I am referring to that in your v4:

    if (conn->replication && conn->replication[0])
        ADD_STARTUP_OPTION("replication", conn->replication);
+   if (conn->compression && conn->compression[0])
+       ADD_STARTUP_OPTION("compression", conn->compression);

There is no point in adding that as a mandatory field of the startup packet.

> Do you mean that it will be better to be able switch on/off compression
> during session?

Not really, I get that this should be defined when the session is established and remain until the session finishes. You have a couple of restrictions like what to do with the first set of messages exchanged but that could be delayed until the negotiation is done.

> But the main difference between encryption and compression is that
> encryption is not changing data size, while compression does.
> To be able to use streaming compression, I need to specify some function for
> reading data from the stream. I am using secure_read for this purpose:
>
> PqStream = zpq_create((zpq_tx_func)secure_write,
>     (zpq_rx_func)secure_read, MyProcPort);
>
> Can you please explain what is the problem with it?

Likely I have not looked at your patch sufficiently, but the point I am trying to make is that secure_read or pqsecure_read are entry points which switch method depending on the connection details. The GSSAPI encryption patch does that. Yours does not with stuff like that:

 retry4:
-    nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
-        conn->inBufSize - conn->inEnd);

This makes the whole error handling more consistent, and the new option layer as well more consistent with what happens in SSL, except that you want to be able to combine SSL and compression as well so you need an extra process which decompresses/compresses the data after doing a "raw" or "ssl" read/write. I have not actually looked much at your patch, but I am wondering if it could not be possible to make the whole footprint less invasive which really worries me as now shaped. As you need to register functions with say zpq_create(), it would be instinctively nicer to do the handling directly in secure_read() and such. Just my 2c.

--
Michael
Attachment
On 06.06.2018 10:53, Michael Paquier wrote:
> Sure, I am referring to that in your v4:
>     if (conn->replication && conn->replication[0])
>         ADD_STARTUP_OPTION("replication", conn->replication);
> +   if (conn->compression && conn->compression[0])
> +       ADD_STARTUP_OPTION("compression", conn->compression);
> There is no point in adding that as a mandatory field of the startup
> packet.

Sorry, but ADD_STARTUP_OPTION does not add a mandatory field to the startup packet. This option can be omitted. There are a lot of other options registered using ADD_STARTUP_OPTION, for example all environment-driven GUCs:

    /* Add any environment-driven GUC settings needed */
    for (next_eo = options; next_eo->envName; next_eo++)
    {
        if ((val = getenv(next_eo->envName)) != NULL)
        {
            if (pg_strcasecmp(val, "default") != 0)
                ADD_STARTUP_OPTION(next_eo->pgName, val);
        }
    }

So I do not understand what is wrong with registering "compression" as an option of the startup packet, and what the alternative for it is...

> Likely I have not looked at your patch sufficiently, but the point I am
> trying to make is that secure_read or pqsecure_read are entry points
> which switch method depending on the connection details. The GSSAPI
> encryption patch does that. Yours does not with stuff like that:
>
>  retry4:
> -    nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
> -        conn->inBufSize - conn->inEnd);
>
> [...] As you need to register functions with say zpq_create(), it would be
> instinctively nicer to do the handling directly in secure_read() and such.

Once again sorry, but I still do not understand the problem here. If compression is enabled, then I am using zpq_read instead of secure_read/pqsecure_read. But zpq_read in turn calls secure_read/pqsecure_read to fetch more raw data. So if "secure_read or pqsecure_read are entry points which switch method depending on the connection details", then compression is not preventing them from making this choice. Compression should be done prior to encryption, otherwise compression will make no sense.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 06.06.2018 02:03, Thomas Munro wrote:
> Small problem on Windows[1]:
>
> C:\projects\postgresql\src\include\common/zpq_stream.h(17): error C2143: syntax error : missing ')' before '*' [C:\projects\postgresql\libpq.vcxproj] 2395
>
> You used ssize_t in zpq_stream.h, but Windows doesn't have that type.
> We have our own typedef in win32_port.h. Perhaps zpq_stream.c should
> include postgres.h/postgres_fe.h (depending on FRONTEND) like the
> other .c files in src/common, before it includes zpq_stream.h?
> Instead of "c.h".
>
> [1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.1106

Thank you very much for reporting the problem. I attached new patch with include of postgres_fe.h added to zpq_stream.c

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
On 05.06.2018 20:06, Peter Eisentraut wrote:
> Unrelatedly, I suggest skipping the addition of -Z options to various
> client-side tools. This is unnecessary, since generic connection
> options can already be specified via -d typically, and it creates
> confusion because -Z is already used to specify output compression by
> some programs.

Sorry, but psql is using the '-d' option for specifying the database name and pgbench is using the '-d' option for toggling debug output. So maybe there is some other way to pass a generic connection option, but in any case it seems to be less convenient for users.

Also I do not see any contradiction with using the -Z option in some other tools (pg_basebackup, pg_receivewal, pg_dump) for enabling output compression. It would be bad if that option had contradictory meanings in different tools. But if it is used for toggling compression (no matter at which level), then I do not see how it can be a source of confusion.

The only problem is with pg_dump, which establishes a connection with the server to fetch data from the database and is also able to compress output data. So here we may need two options: compress input and compress output. But I do not think that because of this the -Z option should be removed from psql and pgbench.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 06.06.2018 19:33, Konstantin Knizhnik wrote:
> [...]

Well, psql really does allow specifying a complete connection string with the -d option (although it is not mentioned in the help). But still I think that it is inconvenient to require the user to write a complete connection string to be able to specify the compression option, while everybody prefers to use the -h, -U, -p options to specify the corresponding components of the connection string.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 06/06/2018 10:20 AM, Konstantin Knizhnik wrote:
> Well, psql really allows to specify complete connection string with -d
> options (although it is not mentioned in help).
> But still I think that it is inconvenient to require user to write
> complete connection string to be able to specify compression option,
> while everybody prefer to use -h, -U, -p options to specify
> correspondent components of connection string.

From a barrier to entry and simplicity sake I agree. We should have a standard flag.

JD

--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
*** A fault and talent of mine is to tell it exactly how it is. ***
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
***** Unless otherwise stated, opinions are my own. *****
On 6/6/18 13:20, Konstantin Knizhnik wrote:
> Well, psql really allows to specify complete connection string with -d
> options (although it is not mentioned in help).
> But still I think that it is inconvenient to require user to write
> complete connection string to be able to specify compression option,
> while everybody prefer to use -h, -U, -p options to specify
> correspondent components of connection string.

I recommend that you avoid derailing your effort by hinging it on this issue. You can always add command-line options after the libpq support is in.

--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wednesday, June 6, 2018, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 6/6/18 13:20, Konstantin Knizhnik wrote:
> > Well, psql really allows to specify complete connection string with -d
> > options (although it is not mentioned in help).
> > But still I think that it is inconvenient to require user to write
> > complete connection string to be able to specify compression option,
> > while everybody prefer to use -h, -U, -p options to specify
> > correspondent components of connection string.
>
> I recommend that you avoid derailing your effort by hinging it on this
> issue. You can always add command-line options after the libpq support
> is in.
It probably requires a long option. It can be debated whether a short option is warranted (as well as an environment variable).
David J.
On 7 June 2018 at 04:01, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 6/6/18 13:20, Konstantin Knizhnik wrote:
> > Well, psql really allows to specify complete connection string with -d
> > options (although it is not mentioned in help).
> > But still I think that it is inconvenient to require user to write
> > complete connection string to be able to specify compression option,
> > while everybody prefer to use -h, -U, -p options to specify
> > correspondent components of connection string.
>
> I recommend that you avoid derailing your effort by hinging it on this
> issue. You can always add command-line options after the libpq support
> is in.
Strongly agree. Let libpq handle it first with the core protocol support and connstr parsing, add convenience flags later.
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
> On 06.06.2018 02:03, Thomas Munro wrote:
>> Small problem on Windows[1]: [...]
>> You used ssize_t in zpq_stream.h, but Windows doesn't have that type.
> Thank you very much for reporting the problem.
> I attached new patch with include of postgres_fe.h added to zpq_stream.c

Hello!

Due to being in a similar place, I'm offering some code review. I'm excited that you're looking at throughput on the network stack - it's not usually what we think of in database performance. Here are some first thoughts, which have some overlap with what others have said on this thread already:

###

This build still doesn't pass Windows:
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.2277

You can find more about what the bot is doing here: http://cfbot.cputube.org/

###

I have a few misgivings about pq_configure(), starting with the name. The *only* thing this function does is set up compression, so it's mis-named. (There's no point in making something generic unless it's needed - it's just confusing.)

I also don't like that you've injected into the *startup* path - before authentication takes place. Fundamentally, authentication (if it happens) consists of exchanging some combination of short strings (e.g., usernames) and binary blobs (e.g., keys). None of this will compress well, so I don't see it as worth performing this negotiation there - it can wait. It's also another message in every startup. I'd leave it to connection parameters, personally, but up to you.

###

Documentation! You're going to need it. There needs to be enough around for other people to implement the protocol (or if you prefer, enough for us to debug the protocol as it exists).

In conjunction with that, please add information on how to set up compressed vs. uncompressed connections - similarly to how we've documentation on setting up TLS connection (though presumably compressed connection documentation will be shorter).

###

Using terminology from https://facebook.github.io/zstd/zstd_manual.html :

Right now you use streaming (ZSTD_{compress,decompress}Stream()) as the basis for your API. I think this is probably a mismatch for what we're doing here - we're doing one-shot compression/decompression of packets, not sending video or something. I think our use case is better served by the non-streaming interface, or what they call the "Simple API" (ZSTD_{decompress,compress}()). Documentation suggests it may be worth it to keep an explicit context around and use that interface instead (i.e., ZSTD_{compressCCTx,decompressDCtx}()), but that's something you'll have to figure out. You may find making this change helpful for addressing the next issue.

###

I don't like where you've put the entry points to the compression logic: it's a layering violation. A couple people have expressed similar reservations I think, so let me see if I can explain using `pqsecure_read()` as an example. In pseudocode, `pqsecure_read()` looks like this:

    if conn->is_tls:
        n = tls_read(conn, ptr, len)
    else:
        n = pqsecure_raw_read(conn, ptr, len)
    return n

I want to see this extended by your code to something like:

    if conn->is_tls:
        n = tls_read(conn, ptr, len)
    else:
        n = pqsecure_raw_read(conn, ptr, len)
    if conn->is_compressed:
        n = decompress(ptr, n)
    return n

In conjunction with the above change, this should also significantly reduce the size of the patch (I think).

###

The compression flag has proven somewhat contentious, as you've already seen on this thread. I recommend removing it for now and putting it in a separate patch to be merged later, since it's separable.

###

It's not worth flagging style violations in your code right now, but you should be aware that there are quite a few style and whitespace problems. In particular, please be sure that you're using hard tabs when appropriate, and that line lengths are fewer than 80 characters (unless they contain error messages), and that pointers are correctly declared (`void *arg`, not `void* arg`).

###

Thanks,
--Robbie
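For readers unfamiliar with the zstd "Simple API" mentioned in this review, a minimal sketch of per-message, one-shot compression with a reusable context could look like the following. This is illustrative only, not code from the patch; the function name, buffer handling, and error convention are made up:

    #include <zstd.h>

    static ZSTD_CCtx *cctx;     /* reused across messages to avoid re-allocation */

    /*
     * Compress one protocol message.  The output buffer must hold at least
     * ZSTD_compressBound(len) bytes.  Returns the compressed size, or 0 on
     * error (a caller could then fall back to sending the message
     * uncompressed).
     */
    static size_t
    compress_message(const void *msg, size_t len, void *out, size_t outsize)
    {
        size_t      n;

        if (cctx == NULL)
            cctx = ZSTD_createCCtx();
        n = ZSTD_compressCCtx(cctx, out, outsize, msg, len, 1 /* level */);
        return ZSTD_isError(n) ? 0 : n;
    }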
Attachment
On 18.06.2018 23:34, Robbie Harwood wrote:
Thank you.

> This build still doesn't pass Windows:
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.2277
>
> You can find more about what the bot is doing here: http://cfbot.cputube.org/
Looks like I found the reason: Mkvsbuild.pm has to be patched.
> ### I have a few misgivings about pq_configure(), starting with the name. The *only* thing this function does is set up compression, so it's mis-named. (There's no point in making something generic unless it's needed - it's just confusing.)

Well, my intention was that this function *may* in future perform some other configuration setting not related with compression.
And it is better to encapsulate this knowledge in pqcomm, rather than make postmaster (BackendStartup) worry about it.
But I can rename this function to pq_configure_compression() or whatever you prefer.
> I also don't like that you've injected into the *startup* path - before authentication takes place. Fundamentally, authentication (if it happens) consists of exchanging some combination of short strings (e.g., usernames) and binary blobs (e.g., keys). None of this will compress well, so I don't see it as worth performing this negotiation there - it can wait. It's also another message in every startup. I'd leave it to connection parameters, personally, but up to you.
From my point of view, compression of libpq traffic is similar to SSL and should be toggled at the same stage.
Definitely authentication parameters are not so large as to be efficiently compressed, but compression (maybe in the future password protected) can somehow protect this data.
In any case I do not think that compression of authentication data can have any influence on negotiation speed.
So I am not 100% sure that toggling compression just after receiving the startup packet is the only right solution.
But I am also not convinced that there is some better place where the compressor should be configured.
Do you have some concrete suggestions for it? In InitPostgres just after PerformAuthentication?
Also please notice that compression is useful not only for client-server communication, but also for replication channels.
Right now it is definitely used in both cases, but if we move pq_configure somewhere else, we should check that this code is invoked both for normal backends and walsender.
### Documentation! You're going to need it. There needs to be enough around for other people to implement the protocol (or if you prefer, enough for us to debug the protocol as it exists). In conjunction with that, please add information on how to set up compressed vs. uncompressed connections - similarly to how we have documentation on setting up TLS connections (though presumably the compressed-connection documentation will be shorter).
Sorry; I will definitely add documentation for configuring compression.
### Using terminology from https://facebook.github.io/zstd/zstd_manual.html : Right now you use streaming (ZSTD_{compress,decompress}Stream()) as the basis for your API. I think this is probably a mismatch for what we're doing here - we're doing one-shot compression/decompression of packets, not sending video or something. I think our use case is better served by the non-streaming interface, or what they call the "Simple API" (ZSTD_{decompress,compress}()). Documentation suggests it may be worth it to keep an explicit context around and use that interface instead (i.e., ZSTD_{compressCCtx,decompressDCtx}()), but that's something you'll have to figure out. You may find making this change helpful for addressing the next issue.
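For readers following along, here is a minimal sketch of the "Simple API" with explicit contexts that Robbie refers to: each message is compressed on its own, reusing only the context allocation, not the compression history. This is illustrative only, not code from the patch:

    #include <stddef.h>
    #include <zstd.h>

    static ZSTD_CCtx *cctx;
    static ZSTD_DCtx *dctx;

    /* Compress one message into dst; returns compressed size or 0 on error. */
    static size_t
    compress_msg(void *dst, size_t dst_cap, const void *src, size_t len)
    {
        size_t rc;

        if (cctx == NULL)
            cctx = ZSTD_createCCtx();
        rc = ZSTD_compressCCtx(cctx, dst, dst_cap, src, len, 1 /* level */);
        return ZSTD_isError(rc) ? 0 : rc;
    }

    /* Decompress one message; returns decompressed size or 0 on error. */
    static size_t
    decompress_msg(void *dst, size_t dst_cap, const void *src, size_t len)
    {
        size_t rc;

        if (dctx == NULL)
            dctx = ZSTD_createDCtx();
        rc = ZSTD_decompressDCtx(dctx, dst, dst_cap, src, len);
        return ZSTD_isError(rc) ? 0 : rc;
    }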
Sorry, but here I completely disagree with you.
What we are doing is really streaming compression, not compression of individual messages/packets.
Yes, it is not a video, but actually COPY data has the same nature as video data.
The main benefit of streaming compression is that we can use the same dictionary for compressing all messages (and adjust this dictionary based on new data).
We do not need to write a dictionary and a separate header for each record. Otherwise compression of libpq messages would be completely useless: a typical message is too short to be compressed efficiently. The main drawback of streaming compression is that you cannot decompress some particular message without decompressing all previous messages.
This is why streaming compression cannot be used to compress database pages (as is done in CFS, provided in PostgresPro EE). But for libpq this is not needed.
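To make the streaming argument concrete, here is a minimal sketch of compressing several short messages through one long-lived ZSTD_CStream, flushing after each so the peer can decode it immediately. The buffer sizes, the flush-per-message policy, and the sample messages are assumptions for illustration, not the patch's actual code; error checks are omitted for brevity:

    #include <stdio.h>
    #include <string.h>
    #include <zstd.h>

    int
    main(void)
    {
        ZSTD_CStream *cs = ZSTD_createCStream();
        char          out[4096];
        const char   *msgs[] = { "SELECT * FROM t WHERE id = 1;",
                                 "SELECT * FROM t WHERE id = 2;",
                                 "SELECT * FROM t WHERE id = 3;" };

        ZSTD_initCStream(cs, 1 /* level */);

        for (int i = 0; i < 3; i++)
        {
            ZSTD_inBuffer  in = { msgs[i], strlen(msgs[i]), 0 };
            ZSTD_outBuffer o  = { out, sizeof(out), 0 };

            /* Feed the message into the shared stream ... */
            ZSTD_compressStream(cs, &o, &in);
            /* ... and flush so the peer can decode it without waiting. */
            ZSTD_flushStream(cs, &o);

            /* Later, near-identical messages compress much better because the
             * stream already "knows" the earlier ones. */
            printf("message %d: %zu raw -> %zu compressed\n", i, in.size, o.pos);
        }

        ZSTD_freeCStream(cs);
        return 0;
    }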
### I don't like where you've put the entry points to the compression logic: it's a layering violation. A couple people have expressed similar reservations I think, so let me see if I can explain using `pqsecure_read()` as an example. In pseudocode, `pqsecure_read()` looks like this:

    if conn->is_tls:
        n = tls_read(conn, ptr, len)
    else:
        n = pqsecure_raw_read(conn, ptr, len)
    return n

I want to see this extended by your code to something like:

    if conn->is_tls:
        n = tls_read(conn, ptr, len)
    else:
        n = pqsecure_raw_read(conn, ptr, len)

    if conn->is_compressed:
        n = decompress(ptr, n)

    return n

In conjunction with the above change, this should also significantly reduce the size of the patch (I think).
Yes, it would simplify the patch, but it would make libpq compression completely useless (see my explanation above).
We need to use streaming compression, and to be able to use streaming compression I have to pass a function for fetching more data to the compression library.
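For context, the interface of the zpq layer looks roughly like the following. The zpq_read signature and the rx callback are reconstructed from the implementation quoted later in this thread; zpq_write and the zpq_create argument list are guesses shown only to illustrate the callback-driven design, not the authoritative header:

    #include <sys/types.h>
    #include <stddef.h>

    /* Approximate shape of the callback-driven streaming wrapper discussed
     * here (reconstructed from the thread; not the authoritative header). */
    typedef ssize_t (*zpq_rx_func)(void *arg, void *buf, size_t size);
    typedef ssize_t (*zpq_tx_func)(void *arg, void const *buf, size_t size);

    typedef struct ZpqStream ZpqStream;

    /* The stream owns a compression context and calls rx/tx to move raw
     * (compressed) bytes, so callers keep using the familiar read/write API. */
    extern ZpqStream *zpq_create(zpq_tx_func tx, zpq_rx_func rx, void *arg);
    extern ssize_t    zpq_read(ZpqStream *zs, void *buf, size_t size,
                               size_t *processed);
    extern ssize_t    zpq_write(ZpqStream *zs, void const *buf, size_t size);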
### The compression flag has proven somewhat contentious, as you've already seen on this thread. I recommend removing it for now and putting it in a separate patch to be merged later, since it's separable. ### It's not worth flagging style violations in your code right now, but you should be aware that there are quite a few style and whitespace problems. In particular, please be sure that you're using hard tabs when appropriate, and that line lengths are fewer than 80 characters (unless they contain error messages), and that pointers are correctly declared (`void *arg`, not `void* arg`). ###
Ok, I will fix it.
Thanks, --Robbie
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 18.06.2018 23:34, Robbie Harwood wrote: > >> I also don't like that you've injected into the *startup* path - >> before authentication takes place. Fundamentally, authentication (if >> it happens) consists of exchanging some combination of short strings >> (e.g., usernames) and binary blobs (e.g., keys). None of this will >> compress well, so I don't see it as worth performing this negotiation >> there - it can wait. It's also another message in every startup. >> I'd leave it to connection parameters, personally, but up to you. > > From my point of view compression of libpq traffic is similar with SSL > and should be toggled at the same stage. But that's not what you're doing. This isn't where TLS gets toggled. TLS negotiation happens as the very first packet: after completing the TCP handshake, the client will send a TLS negotiation request. If it doesn't happen there, it doesn't happen at all. (You *could* configure it where TLS is toggled. This is, I think, not a good idea. TLS encryption is a probe: the server can reject it, at which point the client tears everything down and connects without TLS. So if you do the same with compression, that's another point of tearing down an starting over. The scaling on it isn't good either: if we add another encryption method into the mix, you've doubled the number of teardowns.) > Definitely authentication parameter are not so large to be efficiently > compressed, by compression (may be in future password protected) can > somehow protect this data. > In any case I do not think that compression of authentication data may > have any influence on negotiation speed. > So I am not 100% sure that toggling compression just after receiving > startup package is the only right solution. > But I am not also convinced in that there is some better place where > compressor should be configured. > Do you have some concrete suggestions for it? In InitPostgres just after > PerformAuthentication ? Hmm, let me try to explain this differently. pq_configure() (as you've called it) shouldn't send a packet. At its callsite, we *already know* whether we want to use compression - that's what the port->use_compression option says. So there's no point in having a negotiation there - it's already happened. The other thing you do in pq_configure() is call zpq_create(), which does a bunch of initialization for you. I am pretty sure that all of this can be deferred until the first time you want to send a compressed message - i.e., when compress()/decompress() is called for the first time from *secure_read() or *secure_write(). > Also please notice that compression is useful not only for client-server > communication, but also for replication channels. > Right now it is definitely used in both cases, but if we move > pq_configure somewhere else, we should check that this code is invoked > in both for normal backends and walsender. "We" meaning you, at the moment, since I don't think any of the rest of us have set up tests with this code :) If there's common code to be shared around, that's great. But it's not imperative; in a lot of ways, the network stacks are very different from each other, as I'm sure you've seen. Let's not have the desire for code reuse get in the way of good, maintainable design. >> Using terminology from https://facebook.github.io/zstd/zstd_manual.html : >> >> Right now you use streaming (ZSTD_{compress,decompress}Stream()) as the >> basis for your API. 
I think this is probably a mismatch for what we're >> doing here - we're doing one-shot compression/decompression of packets, >> not sending video or something. >> >> I think our use case is better served by the non-streaming interface, or >> what they call the "Simple API" (ZSTD_{decompress,compress}()). >> Documentation suggests it may be worth it to keep an explicit context >> around and use that interface instead (i.e., >> ZSTD_{compressCCTx,decompressDCtx}()), but that's something you'll have >> to figure out. >> >> You may find making this change helpful for addressing the next issue. > > Sorry, but here I completely disagree with you. > What we are doing is really streaming compression, not compression of > individual messages/packages. > Yes, it is not a video, but actually COPY data has the same nature as > video data. > The main benefit of streaming compression is that we can use the same > dictionary for compressing all messages (and adjust this dictionary > based on new data). > We do not need to write dictionary and separate header for each record. > Otherwize compression of libpq messages will be completely useless: > typical size of message is too short to be efficiently compressed. The > main drawback of streaming compression is that you can not decompress > some particular message without decompression of all previous messages. > This is why streaming compression can not be used to compress database > pages (as it is done in CFS, provided in PostgresPro EE). But for libpq > it is no needed. That makes sense, thanks. The zstd documentation doesn't articulate that at all. >> I don't like where you've put the entry points to the compression logic: >> it's a layering violation. A couple people have expressed similar >> reservations I think, so let me see if I can explain using >> `pqsecure_read()` as an example. In pseudocode, `pqsecure_read()` looks >> like this: >> >> if conn->is_tls: >> n = tls_read(conn, ptr, len) >> else: >> n = pqsecure_raw_read(conn, ptr, len) >> return n >> >> I want to see this extended by your code to something like: >> >> if conn->is_tls: >> n = tls_read(conn, ptr, len) >> else: >> n = pqsecure_raw_read(conn, ptr, len) >> >> if conn->is_compressed: >> n = decompress(ptr, n) >> >> return n >> >> In conjunction with the above change, this should also significantly >> reduce the size of the patch (I think). > > Yes, it will simplify patch. But make libpq compression completely > useless (see my explanation above). > We need to use streaming compression, and to be able to use streaming > compression I have to pass function for fetching more data to > compression library. I don't think you need that, even with the streaming API. To make this very concrete, let's talk about ZSTD_decompressStream (I'm pulling information from https://facebook.github.io/zstd/zstd_manual.html#Chapter7 ). Using the pseudocode I'm asking for above, the decompress() function would look vaguely like this: decompress(ptr, n) ZSTD_inBuffer in = {0} ZSTD_outBuffer out = {0} in.src = ptr in.size = n while in.pos < in.size: ret = ZSTD_decompressStream(out, in) if ZSTD_isError(ret): give_up() memcpy(ptr, out.dst, out.pos) return out.pos (and compress() would follow a similar pattern, if we were to talk about it). Thanks, --Robbie
Attachment
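A rough C rendering of the decompress() step Robbie sketches above, against the zstd streaming API. This is his proposed structure, not the patch; it assumes the caller can treat a zero-byte result as "no data yet", and the question of what to do with unconsumed input is exactly the point debated below:

    #include <sys/types.h>
    #include <string.h>
    #include <zstd.h>

    /* One decompression stream per connection, created lazily. */
    static ZSTD_DStream *dstream;

    /*
     * Decompress the n compressed bytes sitting in buf, as in Robbie's
     * pseudocode.  Returns the number of decompressed bytes, 0 if the input
     * was an incomplete frame (caller must treat it like "no data yet"),
     * or -1 on error.  scratch/cap is a caller-provided output area.
     */
    static ssize_t
    decompress(char *buf, size_t n, char *scratch, size_t cap)
    {
        ZSTD_inBuffer  in = { buf, n, 0 };
        ZSTD_outBuffer out = { scratch, cap, 0 };

        if (dstream == NULL)
        {
            dstream = ZSTD_createDStream();
            ZSTD_initDStream(dstream);
        }

        while (in.pos < in.size && out.pos < out.size)
        {
            size_t rc = ZSTD_decompressStream(dstream, &out, &in);

            if (ZSTD_isError(rc))
                return -1;
        }

        /* A real implementation must keep any unconsumed input
         * (in.pos < in.size) for the next call instead of dropping it;
         * this buffering question is the crux of the discussion below. */
        memcpy(buf, scratch, out.pos);
        return (ssize_t) out.pos;
    }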
On 20.06.2018 00:04, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >> On 18.06.2018 23:34, Robbie Harwood wrote: >> >>> I also don't like that you've injected into the *startup* path - >>> before authentication takes place. Fundamentally, authentication (if >>> it happens) consists of exchanging some combination of short strings >>> (e.g., usernames) and binary blobs (e.g., keys). None of this will >>> compress well, so I don't see it as worth performing this negotiation >>> there - it can wait. It's also another message in every startup. >>> I'd leave it to connection parameters, personally, but up to you. >> From my point of view compression of libpq traffic is similar with SSL >> and should be toggled at the same stage. > But that's not what you're doing. This isn't where TLS gets toggled. > > TLS negotiation happens as the very first packet: after completing the > TCP handshake, the client will send a TLS negotiation request. If it > doesn't happen there, it doesn't happen at all. > > (You *could* configure it where TLS is toggled. This is, I think, not a > good idea. TLS encryption is a probe: the server can reject it, at > which point the client tears everything down and connects without TLS. > So if you do the same with compression, that's another point of tearing > down an starting over. The scaling on it isn't good either: if we add > another encryption method into the mix, you've doubled the number of > teardowns.) Yes, you are right. There is special message for enabling TLS procotol. But I do not think that the same think is needed for compression. This is why I prefer to specify compression in connectoin options. So compression may be enabled straight after processing of startup package. Frankly speaking I still do no see reasons to postpone enabling compression till some later moment. >> Definitely authentication parameter are not so large to be efficiently >> compressed, by compression (may be in future password protected) can >> somehow protect this data. >> In any case I do not think that compression of authentication data may >> have any influence on negotiation speed. >> So I am not 100% sure that toggling compression just after receiving >> startup package is the only right solution. >> But I am not also convinced in that there is some better place where >> compressor should be configured. >> Do you have some concrete suggestions for it? In InitPostgres just after >> PerformAuthentication ? > Hmm, let me try to explain this differently. > > pq_configure() (as you've called it) shouldn't send a packet. At its > callsite, we *already know* whether we want to use compression - that's > what the port->use_compression option says. So there's no point in > having a negotiation there - it's already happened. My idea was the following: client want to use compression. But server may reject this attempt (for any reasons: it doesn't support it, has no proper compression library, do not want to spend CPU for decompression,...) Right now compression algorithm is hardcoded. But in future client and server may negotiate to choose proper compression protocol. This is why I prefer to perform negotiation between client and server to enable compression. > > The other thing you do in pq_configure() is call zpq_create(), which > does a bunch of initialization for you. 
I am pretty sure that all of > this can be deferred until the first time you want to send a compressed > message - i.e., when compress()/decompress() is called for the first > time from *secure_read() or *secure_write(). > >> Also please notice that compression is useful not only for client-server >> communication, but also for replication channels. >> Right now it is definitely used in both cases, but if we move >> pq_configure somewhere else, we should check that this code is invoked >> in both for normal backends and walsender. > "We" meaning you, at the moment, since I don't think any of the rest of > us have set up tests with this code :) > > If there's common code to be shared around, that's great. But it's not > imperative; in a lot of ways, the network stacks are very different from > each other, as I'm sure you've seen. Let's not have the desire for code > reuse get in the way of good, maintainable design. > >>> Using terminology from https://facebook.github.io/zstd/zstd_manual.html : >>> >>> Right now you use streaming (ZSTD_{compress,decompress}Stream()) as the >>> basis for your API. I think this is probably a mismatch for what we're >>> doing here - we're doing one-shot compression/decompression of packets, >>> not sending video or something. >>> >>> I think our use case is better served by the non-streaming interface, or >>> what they call the "Simple API" (ZSTD_{decompress,compress}()). >>> Documentation suggests it may be worth it to keep an explicit context >>> around and use that interface instead (i.e., >>> ZSTD_{compressCCTx,decompressDCtx}()), but that's something you'll have >>> to figure out. >>> >>> You may find making this change helpful for addressing the next issue. >> Sorry, but here I completely disagree with you. >> What we are doing is really streaming compression, not compression of >> individual messages/packages. >> Yes, it is not a video, but actually COPY data has the same nature as >> video data. >> The main benefit of streaming compression is that we can use the same >> dictionary for compressing all messages (and adjust this dictionary >> based on new data). >> We do not need to write dictionary and separate header for each record. >> Otherwize compression of libpq messages will be completely useless: >> typical size of message is too short to be efficiently compressed. The >> main drawback of streaming compression is that you can not decompress >> some particular message without decompression of all previous messages. >> This is why streaming compression can not be used to compress database >> pages (as it is done in CFS, provided in PostgresPro EE). But for libpq >> it is no needed. > That makes sense, thanks. The zstd documentation doesn't articulate > that at all. > >>> I don't like where you've put the entry points to the compression logic: >>> it's a layering violation. A couple people have expressed similar >>> reservations I think, so let me see if I can explain using >>> `pqsecure_read()` as an example. 
In pseudocode, `pqsecure_read()` looks >>> like this: >>> >>> if conn->is_tls: >>> n = tls_read(conn, ptr, len) >>> else: >>> n = pqsecure_raw_read(conn, ptr, len) >>> return n >>> >>> I want to see this extended by your code to something like: >>> >>> if conn->is_tls: >>> n = tls_read(conn, ptr, len) >>> else: >>> n = pqsecure_raw_read(conn, ptr, len) >>> >>> if conn->is_compressed: >>> n = decompress(ptr, n) >>> >>> return n >>> >>> In conjunction with the above change, this should also significantly >>> reduce the size of the patch (I think). >> Yes, it will simplify patch. But make libpq compression completely >> useless (see my explanation above). >> We need to use streaming compression, and to be able to use streaming >> compression I have to pass function for fetching more data to >> compression library. > I don't think you need that, even with the streaming API. > > To make this very concrete, let's talk about ZSTD_decompressStream (I'm > pulling information from > https://facebook.github.io/zstd/zstd_manual.html#Chapter7 ). > > Using the pseudocode I'm asking for above, the decompress() function > would look vaguely like this: > > decompress(ptr, n) > ZSTD_inBuffer in = {0} > ZSTD_outBuffer out = {0} > > in.src = ptr > in.size = n > > while in.pos < in.size: > ret = ZSTD_decompressStream(out, in) > if ZSTD_isError(ret): > give_up() > > memcpy(ptr, out.dst, out.pos) > return out.pos > > (and compress() would follow a similar pattern, if we were to talk about > it). It will not work in this way. We do not know how much input data we need to be able to decompress message. So loop should be something like this: decompress(ptr, n) ZSTD_inBuffer in = {0} ZSTD_outBuffer out = {0} in.src = ptr in.size = n while true ret = ZSTD_decompressStream(out, in) if ZSTD_isError(ret): give_up() if out.pos != 0 // if we deomcpress soemthing return out.pos; read_input(in); -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 20.06.2018 00:04, Robbie Harwood wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> On 18.06.2018 23:34, Robbie Harwood wrote: >>> >>>> I also don't like that you've injected into the *startup* path - >>>> before authentication takes place. Fundamentally, authentication >>>> (if it happens) consists of exchanging some combination of short >>>> strings (e.g., usernames) and binary blobs (e.g., keys). None of >>>> this will compress well, so I don't see it as worth performing this >>>> negotiation there - it can wait. It's also another message in >>>> every startup. I'd leave it to connection parameters, personally, >>>> but up to you. >>> >>> From my point of view compression of libpq traffic is similar with >>> SSL and should be toggled at the same stage. >> >> But that's not what you're doing. This isn't where TLS gets toggled. >> >> TLS negotiation happens as the very first packet: after completing >> the TCP handshake, the client will send a TLS negotiation request. >> If it doesn't happen there, it doesn't happen at all. >> >> (You *could* configure it where TLS is toggled. This is, I think, >> not a good idea. TLS encryption is a probe: the server can reject >> it, at which point the client tears everything down and connects >> without TLS. So if you do the same with compression, that's another >> point of tearing down an starting over. The scaling on it isn't good >> either: if we add another encryption method into the mix, you've >> doubled the number of teardowns.) > > Yes, you are right. There is special message for enabling TLS > procotol. But I do not think that the same think is needed for > compression. This is why I prefer to specify compression in > connectoin options. So compression may be enabled straight after > processing of startup package. Frankly speaking I still do no see > reasons to postpone enabling compression till some later moment. I'm arguing for connection option only (with no additional negotiation round-trip). See below. >>> Definitely authentication parameter are not so large to be >>> efficiently compressed, by compression (may be in future password >>> protected) can somehow protect this data. In any case I do not >>> think that compression of authentication data may have any influence >>> on negotiation speed. So I am not 100% sure that toggling >>> compression just after receiving startup package is the only right >>> solution. But I am not also convinced in that there is some better >>> place where compressor should be configured. Do you have some >>> concrete suggestions for it? In InitPostgres just after >>> PerformAuthentication ? >> >> Hmm, let me try to explain this differently. >> >> pq_configure() (as you've called it) shouldn't send a packet. At its >> callsite, we *already know* whether we want to use compression - >> that's what the port->use_compression option says. So there's no >> point in having a negotiation there - it's already happened. > > My idea was the following: client want to use compression. But server > may reject this attempt (for any reasons: it doesn't support it, has > no proper compression library, do not want to spend CPU for > decompression,...) Right now compression algorithm is hardcoded. But > in future client and server may negotiate to choose proper compression > protocol. This is why I prefer to perform negotiation between client > and server to enable compression. 
Well, for negotiation you could put the name of the algorithm you want in the startup. It doesn't have to be a boolean for compression, and then you don't need an additional round-trip. >>>> I don't like where you've put the entry points to the compression >>>> logic: it's a layering violation. A couple people have expressed >>>> similar reservations I think, so let me see if I can explain using >>>> `pqsecure_read()` as an example. In pseudocode, `pqsecure_read()` >>>> looks like this: >>>> >>>> if conn->is_tls: >>>> n = tls_read(conn, ptr, len) >>>> else: >>>> n = pqsecure_raw_read(conn, ptr, len) >>>> return n >>>> >>>> I want to see this extended by your code to something like: >>>> >>>> if conn->is_tls: >>>> n = tls_read(conn, ptr, len) >>>> else: >>>> n = pqsecure_raw_read(conn, ptr, len) >>>> >>>> if conn->is_compressed: >>>> n = decompress(ptr, n) >>>> >>>> return n >>>> >>>> In conjunction with the above change, this should also >>>> significantly reduce the size of the patch (I think). >>> >>> Yes, it will simplify patch. But make libpq compression completely >>> useless (see my explanation above). We need to use streaming >>> compression, and to be able to use streaming compression I have to >>> pass function for fetching more data to compression library. >> >> I don't think you need that, even with the streaming API. >> >> To make this very concrete, let's talk about ZSTD_decompressStream (I'm >> pulling information from >> https://facebook.github.io/zstd/zstd_manual.html#Chapter7 ). >> >> Using the pseudocode I'm asking for above, the decompress() function >> would look vaguely like this: >> >> decompress(ptr, n) >> ZSTD_inBuffer in = {0} >> ZpSTD_outBuffer out = {0} >> >> in.src = ptr >> in.size = n >> >> while in.pos < in.size: >> ret = ZSTD_decompressStream(out, in) >> if ZSTD_isError(ret): >> give_up() >> >> memcpy(ptr, out.dst, out.pos) >> return out.pos >> >> (and compress() would follow a similar pattern, if we were to talk >> about it). > > It will not work in this way. We do not know how much input data we > need to be able to decompress message. Well, that's a design decision you've made. You could put lengths on chunks that are sent out - then you'd know exactly how much is needed. (For instance, 4 bytes of network-order length followed by a complete payload.) Then you'd absolutely know whether you have enough to decompress or not. > So loop should be something like this: > > decompress(ptr, n) > ZSTD_inBuffer in = {0} > ZSTD_outBuffer out = {0} > > in.src = ptr > in.size = n > > while true > ret = ZSTD_decompressStream(out, in) > if ZSTD_isError(ret): > give_up() > if out.pos != 0 > // if we deomcpress soemthing > return out.pos; > read_input(in); The last line is what I'm taking issue with. The interface we have already in postgres's network code has a notion of partial reads, or that reads might not return data. (Same with writing and partial writes.) So you'd buffer what compressed data you have and return - this is the preferred idiom so that we don't block or spin on a nonblocking socket. This is how the TLS code works already. Look at, for instance, pgtls_read(). If we get back SSL_ERROR_WANT_READ (i.e., not enough data to decrypt), we return no data and wait until the socket becomes readable again. Thanks, --Robbie
Attachment
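For reference, the explicit framing Robbie describes above ("4 bytes of network-order length followed by a complete payload") is easy to sketch. This is illustrative only; the thread has not agreed to adopt it:

    #include <arpa/inet.h>
    #include <stdint.h>
    #include <string.h>

    /*
     * Frame one compressed chunk: a 4-byte network-order length followed by
     * the payload.  Returns total bytes written into dst, or 0 if dst is
     * too small.
     */
    static size_t
    frame_chunk(char *dst, size_t dst_cap, const char *payload, uint32_t len)
    {
        uint32_t nlen = htonl(len);

        if (dst_cap < 4 + (size_t) len)
            return 0;
        memcpy(dst, &nlen, 4);
        memcpy(dst + 4, payload, len);
        return 4 + (size_t) len;
    }

    /*
     * The receiver then knows exactly how many bytes it needs before it can
     * decompress: buffer until "have" covers the whole frame, otherwise wait
     * for the next read.  Returns the payload length, or -1 if incomplete.
     */
    static long
    frame_ready(const char *buf, size_t have)
    {
        uint32_t nlen;

        if (have < 4)
            return -1;
        memcpy(&nlen, buf, 4);
        if (have < 4 + (size_t) ntohl(nlen))
            return -1;
        return (long) ntohl(nlen);
    }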
On 20.06.2018 23:34, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > > > My idea was the following: client want to use compression. But server > may reject this attempt (for any reasons: it doesn't support it, has > no proper compression library, do not want to spend CPU for > decompression,...) Right now compression algorithm is hardcoded. But > in future client and server may negotiate to choose proper compression > protocol. This is why I prefer to perform negotiation between client > and server to enable compression. > Well, for negotiation you could put the name of the algorithm you want > in the startup. It doesn't have to be a boolean for compression, and > then you don't need an additional round-trip. Sorry, I can only repeat arguments I already mentioned: - in future it may be possible to specify compression algorithm - even with boolean compression option server may have some reasons to reject client's request to use compression Extra flexibility is always good thing if it doesn't cost too much. And extra round of negotiation in case of enabling compression seems to me not to be a high price for it. > > Well, that's a design decision you've made. You could put lengths on > chunks that are sent out - then you'd know exactly how much is needed. > (For instance, 4 bytes of network-order length followed by a complete > payload.) Then you'd absolutely know whether you have enough to > decompress or not. Do you really suggest to send extra header for each chunk of data? Please notice that chunk can be as small as one message: dozen of bytes because libpq is used for client-server communication with request-reply pattern. Frankly speaking I do not completely understand the source of your concern. My primary idea was to preseve behavior of libpq function as much as possible, so there is no need to rewrite all places in Postgres code when them are used. It seems to me that I succeed in reaching this goal. Incase of enabled compression zpq_stream functions (zpq-read/write) are used instead of (pq)secure_read/write and in turn are using them to fetch/send more data. I do not see any bad flaws, encapsulation violation or some other problems in such solution. So before discussing some alternative ways of embedding compression in libpq, I will want to understand what's wrong with this approach. >> So loop should be something like this: >> >> decompress(ptr, n) >> ZSTD_inBuffer in = {0} >> ZSTD_outBuffer out = {0} >> >> in.src = ptr >> in.size = n >> >> while true >> ret = ZSTD_decompressStream(out, in) >> if ZSTD_isError(ret): >> give_up() >> if out.pos != 0 >> // if we deomcpress soemthing >> return out.pos; >> read_input(in); > The last line is what I'm taking issue with. The interface we have > already in postgres's network code has a notion of partial reads, or > that reads might not return data. (Same with writing and partial > writes.) So you'd buffer what compressed data you have and return - > this is the preferred idiom so that we don't block or spin on a > nonblocking socket. If socket is in non-blocking mode and there is available data, then secure_read function will also immediately return 0. The pseudocode above is not quite correct. 
Let me show the real implementation of zpq_read:

    ssize_t
    zpq_read(ZpqStream *zs, void *buf, size_t size, size_t *processed)
    {
        ssize_t rc;
        ZSTD_outBuffer out;

        out.dst = buf;
        out.pos = 0;
        out.size = size;

        while (1)
        {
            rc = ZSTD_decompressStream(zs->rx_stream, &out, &zs->rx);
            if (ZSTD_isError(rc))
            {
                zs->rx_error = ZSTD_getErrorName(rc);
                return ZPQ_DECOMPRESS_ERROR;
            }
            /* Return result if we fill requested amount of bytes or read operation was performed */
            if (out.pos != 0)
            {
                zs->rx_total_raw += out.pos;
                return out.pos;
            }
            if (zs->rx.pos == zs->rx.size)
            {
                zs->rx.pos = zs->rx.size = 0; /* Reset rx buffer */
            }
            rc = zs->rx_func(zs->arg, (char *) zs->rx.src + zs->rx.size, ZPQ_BUFFER_SIZE - zs->rx.size);
            if (rc > 0) /* read fetches some data */
            {
                zs->rx.size += rc;
                zs->rx_total += rc;
            }
            else /* read failed */
            {
                *processed = out.pos;
                zs->rx_total_raw += out.pos;
                return rc;
            }
        }
    }

Sorry, but I have spent quite enough time trying to provide the same behavior of zpq_read/write as of secure_read/write, both for blocking and non-blocking mode. And I hope that it is now preserved. Frankly speaking, I do not see much difference between this approach and supporting TLS.

The current implementation allows combining compression with TLS, and in some cases that may be really useful.

> This is how the TLS code works already. Look at, for instance,
> pgtls_read(). If we get back SSL_ERROR_WANT_READ (i.e., not enough data
> to decrypt), we return no data and wait until the socket becomes
> readable again.
>
> Thanks,
> --Robbie

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 20.06.2018 23:34, Robbie Harwood wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >> >> >> My idea was the following: client want to use compression. But server >> may reject this attempt (for any reasons: it doesn't support it, has >> no proper compression library, do not want to spend CPU for >> decompression,...) Right now compression algorithm is hardcoded. But >> in future client and server may negotiate to choose proper compression >> protocol. This is why I prefer to perform negotiation between client >> and server to enable compression. >> Well, for negotiation you could put the name of the algorithm you want >> in the startup. It doesn't have to be a boolean for compression, and >> then you don't need an additional round-trip. > > Sorry, I can only repeat arguments I already mentioned: > - in future it may be possible to specify compression algorithm > - even with boolean compression option server may have some reasons to > reject client's request to use compression > > Extra flexibility is always good thing if it doesn't cost too much. And > extra round of negotiation in case of enabling compression seems to me > not to be a high price for it. You already have this flexibility even without negotiation. I don't want you to lose your flexibility. Protocol looks like this: - Client sends connection option "compression" with list of algorithms it wants to use (comma-separated, or something). - First packet that the server can compress one of those algorithms (or none, if it doesn't want to turn on compression). No additional round-trips needed. >> Well, that's a design decision you've made. You could put lengths on >> chunks that are sent out - then you'd know exactly how much is needed. >> (For instance, 4 bytes of network-order length followed by a complete >> payload.) Then you'd absolutely know whether you have enough to >> decompress or not. > > Do you really suggest to send extra header for each chunk of data? > Please notice that chunk can be as small as one message: dozen of bytes > because libpq is used for client-server communication with request-reply > pattern. I want you to think critically about your design. I *really* don't want to design it for you - I have enough stuff to be doing. But again, the design I gave you doesn't necessarily need that - you just need to properly buffer incomplete data. > Frankly speaking I do not completely understand the source of your > concern. My primary idea was to preseve behavior of libpq function as > much as possible, so there is no need to rewrite all places in > Postgres code when them are used. It seems to me that I succeed in > reaching this goal. Incase of enabled compression zpq_stream functions > (zpq-read/write) are used instead of (pq)secure_read/write and in turn > are using them to fetch/send more data. I do not see any bad flaws, > encapsulation violation or some other problems in such solution. > > So before discussing some alternative ways of embedding compression in > libpq, I will want to understand what's wrong with this approach. You're destroying the existing model for no reason. If you needed to, I could understand some argument for the way you've done it, but what I've tried to tell you is that you don't need to do so. It's longer this way, and it *significantly* complicates the (already difficult to reason about) connection state machine. 
I get that rewriting code can be obnoxious, and it feels like a waste of time when we have to do so. (I've been there; I'm on version 19 of my postgres patchset.) >>> So loop should be something like this: >>> >>> decompress(ptr, n) >>> ZSTD_inBuffer in = {0} >>> ZSTD_outBuffer out = {0} >>> >>> in.src = ptr >>> in.size = n >>> >>> while true >>> ret = ZSTD_decompressStream(out, in) >>> if ZSTD_isError(ret): >>> give_up() >>> if out.pos != 0 >>> // if we deomcpress soemthing >>> return out.pos; >>> read_input(in); >> >> The last line is what I'm taking issue with. The interface we have >> already in postgres's network code has a notion of partial reads, or >> that reads might not return data. (Same with writing and partial >> writes.) So you'd buffer what compressed data you have and return - >> this is the preferred idiom so that we don't block or spin on a >> nonblocking socket. > > If socket is in non-blocking mode and there is available data, then > secure_read function will also immediately return 0. > The pseudocode above is not quite correct. Let me show the real > implementation of zpq_read: > > ssize_t > zpq_read(ZpqStream *zs, void *buf, size_t size, size_t *processed) > { > ssize_t rc; > ZSTD_outBuffer out; > out.dst = buf; > out.pos = 0; > out.size = size; > > while (1) > { > rc = ZSTD_decompressStream(zs->rx_stream, &out, &zs->rx); > if (ZSTD_isError(rc)) > { > zs->rx_error = ZSTD_getErrorName(rc); > return ZPQ_DECOMPRESS_ERROR; > } > /* Return result if we fill requested amount of bytes or read > operation was performed */ > if (out.pos != 0) > { > zs->rx_total_raw += out.pos; > return out.pos; > } > if (zs->rx.pos == zs->rx.size) > { > zs->rx.pos = zs->rx.size = 0; /* Reset rx buffer */ > } > rc = zs->rx_func(zs->arg, (char*)zs->rx.src + zs->rx.size, > ZPQ_BUFFER_SIZE - zs->rx.size); > if (rc > 0) /* read fetches some data */ > { > zs->rx.size += rc; > zs->rx_total += rc; > } > else /* read failed */ > { > *processed = out.pos; > zs->rx_total_raw += out.pos; > return rc; > } > } > } > > Sorry, but I have spent quite enough time trying to provide the same > behavior of zpq_read/write as of secure_read/write both for blocking and > non-blocking mode. > And I hope that now it is preserved. And frankly speaking I do not see > much differences of this approach with supporting TLS. > > Current implementation allows to combine compression with TLS and in > some cases it may be really useful. The bottom line, though, is that I cannot recommend the code for committer as long you have it plumbed like this. -1. Thanks, --Robbie
Attachment
On 21.06.2018 17:56, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >> On 20.06.2018 23:34, Robbie Harwood wrote: >>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> >>> >>> My idea was the following: client want to use compression. But server >>> may reject this attempt (for any reasons: it doesn't support it, has >>> no proper compression library, do not want to spend CPU for >>> decompression,...) Right now compression algorithm is hardcoded. But >>> in future client and server may negotiate to choose proper compression >>> protocol. This is why I prefer to perform negotiation between client >>> and server to enable compression. >>> Well, for negotiation you could put the name of the algorithm you want >>> in the startup. It doesn't have to be a boolean for compression, and >>> then you don't need an additional round-trip. >> Sorry, I can only repeat arguments I already mentioned: >> - in future it may be possible to specify compression algorithm >> - even with boolean compression option server may have some reasons to >> reject client's request to use compression >> >> Extra flexibility is always good thing if it doesn't cost too much. And >> extra round of negotiation in case of enabling compression seems to me >> not to be a high price for it. > You already have this flexibility even without negotiation. I don't > want you to lose your flexibility. Protocol looks like this: > > - Client sends connection option "compression" with list of algorithms > it wants to use (comma-separated, or something). > > - First packet that the server can compress one of those algorithms (or > none, if it doesn't want to turn on compression). > > No additional round-trips needed. This is exactly how it works now... Client includes compression option in connection string and server replies with special message ('Z') if it accepts request to compress traffic between this client and server. I do not know whether sending such message can be considered as "round-trip" or not, but I do not want to loose possibility to make decision at server whether to use compression or not. > >>> Well, that's a design decision you've made. You could put lengths on >>> chunks that are sent out - then you'd know exactly how much is needed. >>> (For instance, 4 bytes of network-order length followed by a complete >>> payload.) Then you'd absolutely know whether you have enough to >>> decompress or not. >> Do you really suggest to send extra header for each chunk of data? >> Please notice that chunk can be as small as one message: dozen of bytes >> because libpq is used for client-server communication with request-reply >> pattern. > I want you to think critically about your design. I *really* don't want > to design it for you - I have enough stuff to be doing. But again, the > design I gave you doesn't necessarily need that - you just need to > properly buffer incomplete data. Right now secure_read may return any number of available bytes. But in case of using streaming compression, it can happen that available number of bytes is not enough to perform decompression. This is why we may need to try to fetch additional portion of data. This is how zpq_stream is working now. I do not understand how it is possible to implement in different way and what is wrong with current implementation. > >> Frankly speaking I do not completely understand the source of your >> concern. 
My primary idea was to preseve behavior of libpq function as >> much as possible, so there is no need to rewrite all places in >> Postgres code when them are used. It seems to me that I succeed in >> reaching this goal. Incase of enabled compression zpq_stream functions >> (zpq-read/write) are used instead of (pq)secure_read/write and in turn >> are using them to fetch/send more data. I do not see any bad flaws, >> encapsulation violation or some other problems in such solution. >> >> So before discussing some alternative ways of embedding compression in >> libpq, I will want to understand what's wrong with this approach. > You're destroying the existing model for no reason. Why? Sorry, I really do not understand why adding compression in this way breaks exited model. Can you please explain it to me once again. > If you needed to, I > could understand some argument for the way you've done it, but what I've > tried to tell you is that you don't need to do so. It's longer this > way, and it *significantly* complicates the (already difficult to reason > about) connection state machine. > > I get that rewriting code can be obnoxious, and it feels like a waste of > time when we have to do so. (I've been there; I'm on version 19 of my > postgres patchset.) I am not against rewriting code many times, but first I should understand the problem which needs to be solved. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 21.06.2018 17:56, Robbie Harwood wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> On 20.06.2018 23:34, Robbie Harwood wrote: >>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>> >>>> Well, that's a design decision you've made. You could put lengths >>>> on chunks that are sent out - then you'd know exactly how much is >>>> needed. (For instance, 4 bytes of network-order length followed by >>>> a complete payload.) Then you'd absolutely know whether you have >>>> enough to decompress or not. >>> >>> Do you really suggest to send extra header for each chunk of data? >>> Please notice that chunk can be as small as one message: dozen of >>> bytes because libpq is used for client-server communication with >>> request-reply pattern. >> >> I want you to think critically about your design. I *really* don't >> want to design it for you - I have enough stuff to be doing. But >> again, the design I gave you doesn't necessarily need that - you just >> need to properly buffer incomplete data. > > Right now secure_read may return any number of available bytes. But in > case of using streaming compression, it can happen that available > number of bytes is not enough to perform decompression. This is why we > may need to try to fetch additional portion of data. This is how > zpq_stream is working now. No, you need to buffer and wait until you're called again. Which is to say: decompress() shouldn't call secure_read(). secure_read() should call decompress(). > I do not understand how it is possible to implement in different way > and what is wrong with current implementation. The most succinct thing I can say is: absolutely don't modify pq_recvbuf(). I gave you pseudocode for how to do that. All of your logic should be *below* the secure_read/secure_write functions. I cannot approve code that modifies pq_recvbuf() in the manner you currently do. >>>> My idea was the following: client want to use compression. But >>>> server may reject this attempt (for any reasons: it doesn't support >>>> it, has no proper compression library, do not want to spend CPU for >>>> decompression,...) Right now compression algorithm is >>>> hardcoded. But in future client and server may negotiate to choose >>>> proper compression protocol. This is why I prefer to perform >>>> negotiation between client and server to enable compression. Well, >>>> for negotiation you could put the name of the algorithm you want in >>>> the startup. It doesn't have to be a boolean for compression, and >>>> then you don't need an additional round-trip. >>> >>> Sorry, I can only repeat arguments I already mentioned: >>> - in future it may be possible to specify compression algorithm >>> - even with boolean compression option server may have some reasons to >>> reject client's request to use compression >>> >>> Extra flexibility is always good thing if it doesn't cost too >>> much. And extra round of negotiation in case of enabling compression >>> seems to me not to be a high price for it. >> >> You already have this flexibility even without negotiation. I don't >> want you to lose your flexibility. Protocol looks like this: >> >> - Client sends connection option "compression" with list of >> algorithms it wants to use (comma-separated, or something). >> >> - First packet that the server can compress one of those algorithms >> (or none, if it doesn't want to turn on compression). >> >> No additional round-trips needed. 
> > This is exactly how it works now... Client includes compression > option in connection string and server replies with special message > ('Z') if it accepts request to compress traffic between this client > and server. No, it's not. You don't need this message. If the server receives a compression request, it should just turn compression on (or not), and then have the client figure out whether it got compression back. This is of course made harder by your refusal to use packet framing, but still shouldn't be particularly difficult. Thanks, --Robbie
Attachment
On Thu, Jun 21, 2018 at 10:12:17AM +0300, Konstantin Knizhnik wrote: > On 20.06.2018 23:34, Robbie Harwood wrote: > >Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >Well, that's a design decision you've made. You could put lengths on > >chunks that are sent out - then you'd know exactly how much is needed. > >(For instance, 4 bytes of network-order length followed by a complete > >payload.) Then you'd absolutely know whether you have enough to > >decompress or not. > > Do you really suggest to send extra header for each chunk of data? > Please notice that chunk can be as small as one message: dozen of bytes > because libpq is used for client-server communication with request-reply > pattern. You must have lengths, yes, otherwise you're saying that the chosen compression mechanism must itself provide framing. I'm not that familiar with compression APIs and formats, but looking at RFC1950 (zlib) for example I see no framing. So I think you just have to have lengths. Now, this being about compression, I understand that you might not want to have 4-byte lengths, especially given that most messages will be under 8KB. So use a varint encoding for the lengths. Nico --
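A varint length of the kind Nico suggests costs one byte for payloads under 128 bytes and two bytes up to roughly 16 KB. A minimal LEB128-style sketch (illustrative, not part of the patch):

    #include <stdint.h>
    #include <stddef.h>

    /* Encode len as a base-128 varint (7 bits per byte, high bit = "more").
     * Returns the number of bytes written (at most 5 for a uint32_t). */
    static size_t
    varint_encode(uint8_t *dst, uint32_t len)
    {
        size_t n = 0;

        do
        {
            uint8_t b = len & 0x7F;

            len >>= 7;
            dst[n++] = b | (len ? 0x80 : 0);
        } while (len);
        return n;
    }

    /* Decode a varint from buf (have bytes available).  Returns the number
     * of bytes consumed, or 0 if the varint is not complete yet. */
    static size_t
    varint_decode(const uint8_t *buf, size_t have, uint32_t *len)
    {
        uint32_t v = 0;
        size_t   i;

        for (i = 0; i < have && i < 5; i++)
        {
            v |= (uint32_t) (buf[i] & 0x7F) << (7 * i);
            if ((buf[i] & 0x80) == 0)
            {
                *len = v;
                return i + 1;
            }
        }
        return 0; /* incomplete (or overlong) varint */
    }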
On 21.06.2018 20:14, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >> On 21.06.2018 17:56, Robbie Harwood wrote: >>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>> On 20.06.2018 23:34, Robbie Harwood wrote: >>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>> >>>>> Well, that's a design decision you've made. You could put lengths >>>>> on chunks that are sent out - then you'd know exactly how much is >>>>> needed. (For instance, 4 bytes of network-order length followed by >>>>> a complete payload.) Then you'd absolutely know whether you have >>>>> enough to decompress or not. >>>> Do you really suggest to send extra header for each chunk of data? >>>> Please notice that chunk can be as small as one message: dozen of >>>> bytes because libpq is used for client-server communication with >>>> request-reply pattern. >>> I want you to think critically about your design. I *really* don't >>> want to design it for you - I have enough stuff to be doing. But >>> again, the design I gave you doesn't necessarily need that - you just >>> need to properly buffer incomplete data. >> Right now secure_read may return any number of available bytes. But in >> case of using streaming compression, it can happen that available >> number of bytes is not enough to perform decompression. This is why we >> may need to try to fetch additional portion of data. This is how >> zpq_stream is working now. > No, you need to buffer and wait until you're called again. Which is to > say: decompress() shouldn't call secure_read(). secure_read() should > call decompress(). > I this case I will have to implement this code twice: both for backend and frontend, i.e. for secure_read/secure_write and pqsecure_read/pqsecure_write. Frankly speaking i was very upset by design of libpq communication layer in Postgres: there are two different implementations of almost the same stuff for cbackend and frontend. I do not see any meaningful argument for it except "historical reasons". The better decision was to encapsulate socket communication layer (and some other system dependent stuff) in SAL (system abstraction layer) and use it both in backend and frontend. By passing secure_read/pqsecure_read functions to zpq_stream I managed to avoid such code duplication at least for compression. >> I do not understand how it is possible to implement in different way >> and what is wrong with current implementation. > The most succinct thing I can say is: absolutely don't modify > pq_recvbuf(). I gave you pseudocode for how to do that. All of your > logic should be *below* the secure_read/secure_write functions. > > I cannot approve code that modifies pq_recvbuf() in the manner you > currently do. Well. I understand you arguments. But please also consider my argument above (about avoiding code duplication). In any case, secure_read function is called only from pq_recvbuf() as well as pqsecure_read is called only from pqReadData. So I do not think principle difference in handling compression in secure_read or pq_recvbuf functions and do not understand why it is "destroying the existing model". Frankly speaking, I will really like to destroy existed model, moving all system dependent stuff in Postgres to SAL and avoid this awful mix of code sharing and duplication between backend and frontend. But it is a another story and I do not want to discuss it here. 
If we are speaking about the "right design", then neither your suggestion, neither my implementation are good and I do not see principle differences between them. The right approach is using "decorator pattern": this is how streams are designed in .Net/Java. You can construct pipe of "secure", "compressed" and whatever else streams. Yes, it is first of all pattern for object-oriented approach and Postgres is implemented in C. But it is actually possible to use OO approach in pure C (X-Windows!). But once again, this discussion may lead other too far away from the topic of libpq compression. As far as I already wrote, the main points of my design were: 1. Minimize changes in Postgres code 2. Avoid code duplication 3. Provide abstract (compression stream) which can be used somewhere else except libpq itself. > >>>>> My idea was the following: client want to use compression. But >>>>> server may reject this attempt (for any reasons: it doesn't support >>>>> it, has no proper compression library, do not want to spend CPU for >>>>> decompression,...) Right now compression algorithm is >>>>> hardcoded. But in future client and server may negotiate to choose >>>>> proper compression protocol. This is why I prefer to perform >>>>> negotiation between client and server to enable compression. Well, >>>>> for negotiation you could put the name of the algorithm you want in >>>>> the startup. It doesn't have to be a boolean for compression, and >>>>> then you don't need an additional round-trip. >>>> Sorry, I can only repeat arguments I already mentioned: >>>> - in future it may be possible to specify compression algorithm >>>> - even with boolean compression option server may have some reasons to >>>> reject client's request to use compression >>>> >>>> Extra flexibility is always good thing if it doesn't cost too >>>> much. And extra round of negotiation in case of enabling compression >>>> seems to me not to be a high price for it. >>> You already have this flexibility even without negotiation. I don't >>> want you to lose your flexibility. Protocol looks like this: >>> >>> - Client sends connection option "compression" with list of >>> algorithms it wants to use (comma-separated, or something). >>> >>> - First packet that the server can compress one of those algorithms >>> (or none, if it doesn't want to turn on compression). >>> >>> No additional round-trips needed. >> This is exactly how it works now... Client includes compression >> option in connection string and server replies with special message >> ('Z') if it accepts request to compress traffic between this client >> and server. > No, it's not. You don't need this message. If the server receives a > compression request, it should just turn compression on (or not), and > then have the client figure out whether it got compression back. How it will managed to do it. It receives some reply and first of all it should know whether it has to be decompressed or not. > This > is of course made harder by your refusal to use packet framing, but > still shouldn't be particularly difficult. But how? > > Thanks, > --Robbie -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
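As an aside, the "decorator pattern" Konstantin mentions can indeed be expressed in plain C with a small vtable. A toy sketch, purely to illustrate the layering idea; it is not something anyone in the thread is proposing verbatim:

    #include <sys/types.h>
    #include <stddef.h>

    typedef struct Stream Stream;
    struct Stream
    {
        ssize_t (*read)(Stream *self, void *buf, size_t len);
        Stream  *lower;  /* next layer down (e.g. TLS or raw socket) */
        void    *state;  /* layer-private state (compression context, ...) */
    };

    /* A trivial decorator that counts bytes passing through; a compression
     * layer would decompress here instead, without the caller knowing it
     * exists. */
    static ssize_t
    counting_read(Stream *self, void *buf, size_t len)
    {
        size_t  *total = self->state;
        ssize_t  n = self->lower->read(self->lower, buf, len);

        if (n > 0)
            *total += (size_t) n;
        return n;
    }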
On 22.06.2018 00:34, Nico Williams wrote: > On Thu, Jun 21, 2018 at 10:12:17AM +0300, Konstantin Knizhnik wrote: >> On 20.06.2018 23:34, Robbie Harwood wrote: >>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> Well, that's a design decision you've made. You could put lengths on >>> chunks that are sent out - then you'd know exactly how much is needed. >>> (For instance, 4 bytes of network-order length followed by a complete >>> payload.) Then you'd absolutely know whether you have enough to >>> decompress or not. >> Do you really suggest to send extra header for each chunk of data? >> Please notice that chunk can be as small as one message: dozen of bytes >> because libpq is used for client-server communication with request-reply >> pattern. > You must have lengths, yes, otherwise you're saying that the chosen > compression mechanism must itself provide framing. > > I'm not that familiar with compression APIs and formats, but looking at > RFC1950 (zlib) for example I see no framing. > > So I think you just have to have lengths. > > Now, this being about compression, I understand that you might now want > to have 4-byte lengths, especially given that most messages will be > under 8KB. So use a varint encoding for the lengths. > > Nico No explicit framing and lengths are needed in case of using streaming compression. There can be certainly some kind of frames inside compression protocol itself, but it is intrinsic of compression algorithm. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
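For comparison with the zstd discussion earlier in the thread, zlib's streaming inflate() behaves the same way: it accepts arbitrary-sized chunks and keeps incomplete state inside the z_stream until the next call, which is the property the no-framing argument relies on. A minimal sketch, assuming inflateInit() was called once at connection start and with error handling trimmed:

    #include <stddef.h>
    #include <zlib.h>

    /* Feed one arbitrary-sized chunk of compressed bytes into a long-lived
     * z_stream and collect whatever decompresses so far; zlib keeps any
     * partial state internally until the next call. */
    static int
    inflate_chunk(z_stream *zs, const unsigned char *in, size_t in_len,
                  unsigned char *out, size_t out_cap, size_t *out_len)
    {
        int rc;

        zs->next_in = (unsigned char *) in;
        zs->avail_in = (uInt) in_len;
        zs->next_out = out;
        zs->avail_out = (uInt) out_cap;

        rc = inflate(zs, Z_NO_FLUSH);
        if (rc != Z_OK && rc != Z_STREAM_END && rc != Z_BUF_ERROR)
            return -1; /* real error */

        *out_len = out_cap - zs->avail_out; /* may legitimately be 0 */
        return 0;
    }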
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 21.06.2018 20:14, Robbie Harwood wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> On 21.06.2018 17:56, Robbie Harwood wrote: >>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>> On 20.06.2018 23:34, Robbie Harwood wrote: >>>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>> >>>>>> Well, that's a design decision you've made. You could put >>>>>> lengths on chunks that are sent out - then you'd know exactly how >>>>>> much is needed. (For instance, 4 bytes of network-order length >>>>>> followed by a complete payload.) Then you'd absolutely know >>>>>> whether you have enough to decompress or not. >>>>> >>>>> Do you really suggest to send extra header for each chunk of data? >>>>> Please notice that chunk can be as small as one message: dozen of >>>>> bytes because libpq is used for client-server communication with >>>>> request-reply pattern. >>>> >>>> I want you to think critically about your design. I *really* don't >>>> want to design it for you - I have enough stuff to be doing. But >>>> again, the design I gave you doesn't necessarily need that - you >>>> just need to properly buffer incomplete data. >>> >>> Right now secure_read may return any number of available bytes. But >>> in case of using streaming compression, it can happen that available >>> number of bytes is not enough to perform decompression. This is why >>> we may need to try to fetch additional portion of data. This is how >>> zpq_stream is working now. >> >> No, you need to buffer and wait until you're called again. Which is >> to say: decompress() shouldn't call secure_read(). secure_read() >> should call decompress(). > > I this case I will have to implement this code twice: both for backend > and frontend, i.e. for secure_read/secure_write and > pqsecure_read/pqsecure_write. Likely, yes. You can see that this is how TLS does it (which you should be using as a model, architecture-wise). > Frankly speaking i was very upset by design of libpq communication > layer in Postgres: there are two different implementations of almost > the same stuff for cbackend and frontend. Changing the codebases so that more could be shared is not necessarily a bad idea; however, it is a separate change from compression. >>> I do not understand how it is possible to implement in different way >>> and what is wrong with current implementation. >> >> The most succinct thing I can say is: absolutely don't modify >> pq_recvbuf(). I gave you pseudocode for how to do that. All of your >> logic should be *below* the secure_read/secure_write functions. >> >> I cannot approve code that modifies pq_recvbuf() in the manner you >> currently do. > > Well. I understand you arguments. But please also consider my > argument above (about avoiding code duplication). > > In any case, secure_read function is called only from pq_recvbuf() as > well as pqsecure_read is called only from pqReadData. So I do not > think principle difference in handling compression in secure_read or > pq_recvbuf functions and do not understand why it is "destroying the > existing model". > > Frankly speaking, I will really like to destroy existed model, moving > all system dependent stuff in Postgres to SAL and avoid this awful mix > of code sharing and duplication between backend and frontend. But it > is a another story and I do not want to discuss it here. I understand you want to avoid code duplication. 
I will absolutely agree that the current setup makes it difficult to share code between postmaster and libpq clients. But the way I see it, you have two choices: 1. Modify the code to make code sharing easier. Once this has been done, *then* build a compression patch on top, with the nice new architecture. 2. Leave the architecture as-is and add compression support. (Optionally, you can make code sharing easier at a later point.) Fundamentally, I think you value code sharing more than I do. So while I might advocate for (2), you might personally prefer (1). But right now you're not doing either of those. > If we are speaking about the "right design", then neither your > suggestion, neither my implementation are good and I do not see > principle differences between them. > > The right approach is using "decorator pattern": this is how streams > are designed in .Net/Java. You can construct pipe of "secure", > "compressed" and whatever else streams. I *strongly* disagree, but I don't think you're seriously suggesting this. >>>>>> My idea was the following: client want to use compression. But >>>>>> server may reject this attempt (for any reasons: it doesn't support >>>>>> it, has no proper compression library, do not want to spend CPU for >>>>>> decompression,...) Right now compression algorithm is >>>>>> hardcoded. But in future client and server may negotiate to choose >>>>>> proper compression protocol. This is why I prefer to perform >>>>>> negotiation between client and server to enable compression. Well, >>>>>> for negotiation you could put the name of the algorithm you want in >>>>>> the startup. It doesn't have to be a boolean for compression, and >>>>>> then you don't need an additional round-trip. >>>>> Sorry, I can only repeat arguments I already mentioned: >>>>> - in future it may be possible to specify compression algorithm >>>>> - even with boolean compression option server may have some reasons to >>>>> reject client's request to use compression >>>>> >>>>> Extra flexibility is always good thing if it doesn't cost too >>>>> much. And extra round of negotiation in case of enabling compression >>>>> seems to me not to be a high price for it. >>>> You already have this flexibility even without negotiation. I don't >>>> want you to lose your flexibility. Protocol looks like this: >>>> >>>> - Client sends connection option "compression" with list of >>>> algorithms it wants to use (comma-separated, or something). >>>> >>>> - First packet that the server can compress one of those algorithms >>>> (or none, if it doesn't want to turn on compression). >>>> >>>> No additional round-trips needed. >>> This is exactly how it works now... Client includes compression >>> option in connection string and server replies with special message >>> ('Z') if it accepts request to compress traffic between this client >>> and server. >> >> No, it's not. You don't need this message. If the server receives a >> compression request, it should just turn compression on (or not), and >> then have the client figure out whether it got compression back. > > How it will managed to do it. It receives some reply and first of all > it should know whether it has to be decompressed or not. You can tell whether a message is compressed by looking at it. The way the protocol works, every message has a type associated with it: a single byte, like 'R', that says what kind of message it is. Thanks, --Robbie
Attachment
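To make the buffering idea concrete, a minimal sketch of the approach Robbie describes might look like this (all names here are invented for illustration and the codec call is hypothetical; this is not code from the patch):

#include <string.h>
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical codec call: consumes up to in_len compressed bytes, reports
 * how many it used, and returns the number of plain bytes produced
 * (0 when it needs more input). */
extern ssize_t decompress_some(const char *in, size_t in_len,
                               size_t *consumed, void *out, size_t out_size);

static char   raw_buf[8192];    /* compressed bytes read but not yet decoded */
static size_t raw_len;

ssize_t
compressed_secure_read(int sock, void *out, size_t outsize)
{
    for (;;)
    {
        size_t  consumed = 0;
        ssize_t produced = decompress_some(raw_buf, raw_len, &consumed,
                                           out, outsize);

        if (produced != 0)
        {
            /* drop the consumed input, keep the rest for the next call */
            memmove(raw_buf, raw_buf + consumed, raw_len - consumed);
            raw_len -= consumed;
            return produced;        /* plain bytes (or an error) go upward */
        }

        /* Not enough compressed input buffered yet: read more and retry.
         * (A non-blocking variant would return "would block" here instead
         * of looping.) */
        ssize_t n = read(sock, raw_buf + raw_len, sizeof(raw_buf) - raw_len);
        if (n <= 0)
            return n;
        raw_len += (size_t) n;
    }
}

The point is only that decompression never reaches into the socket itself; it reports "no output yet" and the transport layer decides when to read more.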
On Fri, Jun 22, 2018 at 10:18:12AM +0300, Konstantin Knizhnik wrote: > On 22.06.2018 00:34, Nico Williams wrote: > >So I think you just have to have lengths. > > > >Now, this being about compression, I understand that you might now want > >to have 4-byte lengths, especially given that most messages will be > >under 8KB. So use a varint encoding for the lengths. > > No explicit framing and lengths are needed in case of using streaming > compression. > There can be certainly some kind of frames inside compression protocol > itself, but it is intrinsic of compression algorithm. I don't think that's generally true. It may be true of the compression algorithm you're working with. This is fine, of course, but plugging in other compression algorithms will require the authors to add framing.
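For illustration, one possible varint framing of chunk lengths (7 bits per byte, high bit set on continuation bytes); this is only a sketch of the idea, not part of the patch:

#include <stdint.h>
#include <stddef.h>

/* Encode len as a base-128 varint: 7 payload bits per byte, high bit = "more". */
static size_t
varint_encode(uint32_t len, unsigned char *out)
{
    size_t n = 0;

    do
    {
        unsigned char b = len & 0x7F;

        len >>= 7;
        if (len)
            b |= 0x80;
        out[n++] = b;
    } while (len);
    return n;                       /* 1 byte for lengths below 128 */
}

/* Decode: returns bytes consumed, or 0 if the buffer ends mid-varint. */
static size_t
varint_decode(const unsigned char *in, size_t avail, uint32_t *len)
{
    uint32_t v = 0;
    int      shift = 0;

    for (size_t i = 0; i < avail && shift < 32; i++, shift += 7)
    {
        v |= (uint32_t) (in[i] & 0x7F) << shift;
        if (!(in[i] & 0x80))
        {
            *len = v;
            return i + 1;
        }
    }
    return 0;                       /* incomplete: caller must buffer more */
}

With this, most chunks would pay only one extra byte of overhead.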
On 22.06.2018 18:59, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >> On 21.06.2018 20:14, Robbie Harwood wrote: >>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>> On 21.06.2018 17:56, Robbie Harwood wrote: >>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>> On 20.06.2018 23:34, Robbie Harwood wrote: >>>>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>>> >>>>>>> Well, that's a design decision you've made. You could put >>>>>>> lengths on chunks that are sent out - then you'd know exactly how >>>>>>> much is needed. (For instance, 4 bytes of network-order length >>>>>>> followed by a complete payload.) Then you'd absolutely know >>>>>>> whether you have enough to decompress or not. >>>>>> Do you really suggest to send extra header for each chunk of data? >>>>>> Please notice that chunk can be as small as one message: dozen of >>>>>> bytes because libpq is used for client-server communication with >>>>>> request-reply pattern. >>>>> I want you to think critically about your design. I *really* don't >>>>> want to design it for you - I have enough stuff to be doing. But >>>>> again, the design I gave you doesn't necessarily need that - you >>>>> just need to properly buffer incomplete data. >>>> Right now secure_read may return any number of available bytes. But >>>> in case of using streaming compression, it can happen that available >>>> number of bytes is not enough to perform decompression. This is why >>>> we may need to try to fetch additional portion of data. This is how >>>> zpq_stream is working now. >>> No, you need to buffer and wait until you're called again. Which is >>> to say: decompress() shouldn't call secure_read(). secure_read() >>> should call decompress(). >> I this case I will have to implement this code twice: both for backend >> and frontend, i.e. for secure_read/secure_write and >> pqsecure_read/pqsecure_write. > Likely, yes. You can see that this is how TLS does it (which you should > be using as a model, architecture-wise). > >> Frankly speaking i was very upset by design of libpq communication >> layer in Postgres: there are two different implementations of almost >> the same stuff for cbackend and frontend. > Changing the codebases so that more could be shared is not necessarily a > bad idea; however, it is a separate change from compression. > >>>> I do not understand how it is possible to implement in different way >>>> and what is wrong with current implementation. >>> The most succinct thing I can say is: absolutely don't modify >>> pq_recvbuf(). I gave you pseudocode for how to do that. All of your >>> logic should be *below* the secure_read/secure_write functions. >>> >>> I cannot approve code that modifies pq_recvbuf() in the manner you >>> currently do. >> Well. I understand you arguments. But please also consider my >> argument above (about avoiding code duplication). >> >> In any case, secure_read function is called only from pq_recvbuf() as >> well as pqsecure_read is called only from pqReadData. So I do not >> think principle difference in handling compression in secure_read or >> pq_recvbuf functions and do not understand why it is "destroying the >> existing model". >> >> Frankly speaking, I will really like to destroy existed model, moving >> all system dependent stuff in Postgres to SAL and avoid this awful mix >> of code sharing and duplication between backend and frontend. But it >> is a another story and I do not want to discuss it here. 
> I understand you want to avoid code duplication. I will absolutely > agree that the current setup makes it difficult to share code between > postmaster and libpq clients. But the way I see it, you have two > choices: > > 1. Modify the code to make code sharing easier. Once this has been > done, *then* build a compression patch on top, with the nice new > architecture. > > 2. Leave the architecture as-is and add compression support. > (Optionally, you can make code sharing easier at a later point.) > > Fundamentally, I think you value code sharing more than I do. So while > I might advocate for (2), you might personally prefer (1). > > But right now you're not doing either of those. > >> If we are speaking about the "right design", then neither your >> suggestion, neither my implementation are good and I do not see >> principle differences between them. >> >> The right approach is using "decorator pattern": this is how streams >> are designed in .Net/Java. You can construct pipe of "secure", >> "compressed" and whatever else streams. > I *strongly* disagree, but I don't think you're seriously suggesting > this. > >>>>>>> My idea was the following: client want to use compression. But >>>>>>> server may reject this attempt (for any reasons: it doesn't support >>>>>>> it, has no proper compression library, do not want to spend CPU for >>>>>>> decompression,...) Right now compression algorithm is >>>>>>> hardcoded. But in future client and server may negotiate to choose >>>>>>> proper compression protocol. This is why I prefer to perform >>>>>>> negotiation between client and server to enable compression. Well, >>>>>>> for negotiation you could put the name of the algorithm you want in >>>>>>> the startup. It doesn't have to be a boolean for compression, and >>>>>>> then you don't need an additional round-trip. >>>>>> Sorry, I can only repeat arguments I already mentioned: >>>>>> - in future it may be possible to specify compression algorithm >>>>>> - even with boolean compression option server may have some reasons to >>>>>> reject client's request to use compression >>>>>> >>>>>> Extra flexibility is always good thing if it doesn't cost too >>>>>> much. And extra round of negotiation in case of enabling compression >>>>>> seems to me not to be a high price for it. >>>>> You already have this flexibility even without negotiation. I don't >>>>> want you to lose your flexibility. Protocol looks like this: >>>>> >>>>> - Client sends connection option "compression" with list of >>>>> algorithms it wants to use (comma-separated, or something). >>>>> >>>>> - First packet that the server can compress one of those algorithms >>>>> (or none, if it doesn't want to turn on compression). >>>>> >>>>> No additional round-trips needed. >>>> This is exactly how it works now... Client includes compression >>>> option in connection string and server replies with special message >>>> ('Z') if it accepts request to compress traffic between this client >>>> and server. >>> No, it's not. You don't need this message. If the server receives a >>> compression request, it should just turn compression on (or not), and >>> then have the client figure out whether it got compression back. >> How it will managed to do it. It receives some reply and first of all >> it should know whether it has to be decompressed or not. > You can tell whether a message is compressed by looking at it. The way > the protocol works, every message has a type associated with it: a > single byte, like 'R', that says what kind of message it is. 
A compressed message can contain any sequence of bytes, including 'R'. :) > > Thanks, > --Robbie -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 22.06.2018 19:05, Nico Williams wrote: > On Fri, Jun 22, 2018 at 10:18:12AM +0300, Konstantin Knizhnik wrote: >> On 22.06.2018 00:34, Nico Williams wrote: >>> So I think you just have to have lengths. >>> >>> Now, this being about compression, I understand that you might now want >>> to have 4-byte lengths, especially given that most messages will be >>> under 8KB. So use a varint encoding for the lengths. >> No explicit framing and lengths are needed in case of using streaming >> compression. >> There can be certainly some kind of frames inside compression protocol >> itself, but it is intrinsic of compression algorithm. > I don't think that's generally true. It may be true of the compression > algorithm you're working with. This is fine, of course, but plugging in > other compression algorithms will require the authors to add framing. If the compression algorithm supports streaming mode (and most of them do), then you should not worry about frames. And if a compression algorithm doesn't support streaming mode, then it should not be used for libpq traffic compression. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
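For reference, this is roughly what a streaming codec gives you, using zstd as the example: the decompressor accepts an arbitrary slice of the byte stream and reports how much it consumed, so no extra framing is needed on top. A minimal sketch, with error handling trimmed, assuming the stream was created with ZSTD_createDStream()/ZSTD_initDStream():

#include <zstd.h>
#include <stdio.h>

static void
feed_decompressor(ZSTD_DStream *ds,
                  const void *net_buf, size_t net_len,
                  void *out_buf, size_t out_cap)
{
    ZSTD_inBuffer  in = { net_buf, net_len, 0 };
    ZSTD_outBuffer out = { out_buf, out_cap, 0 };

    while (in.pos < in.size)
    {
        size_t rc = ZSTD_decompressStream(ds, &out, &in);

        if (ZSTD_isError(rc))
        {
            fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(rc));
            return;
        }
        if (out.pos == out.size)
            break;              /* output buffer full: hand data upward */
    }

    /* out.pos bytes of decompressed data are now ready; any unconsumed
     * input (in.size - in.pos) stays buffered until more arrives. */
}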
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 22.06.2018 18:59, Robbie Harwood wrote: >> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>> On 21.06.2018 20:14, Robbie Harwood wrote: >>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>> On 21.06.2018 17:56, Robbie Harwood wrote: >>>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>>> On 20.06.2018 23:34, Robbie Harwood wrote: >>>>>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>>>> >>>>>>>>> My idea was the following: client want to use compression. But >>>>>>>>> server may reject this attempt (for any reasons: it doesn't >>>>>>>>> support it, has no proper compression library, do not want to >>>>>>>>> spend CPU for decompression,...) Right now compression >>>>>>>>> algorithm is hardcoded. But in future client and server may >>>>>>>>> negotiate to choose proper compression protocol. This is why >>>>>>>>> I prefer to perform negotiation between client and server to >>>>>>>>> enable compression. >>>>>>>> >>>>>>>> Well, for negotiation you could put the name of the algorithm >>>>>>>> you want in the startup. It doesn't have to be a boolean for >>>>>>>> compression, and then you don't need an additional round-trip. >>>>>>> >>>>>>> Sorry, I can only repeat arguments I already mentioned: >>>>>>> >>>>>>> - in future it may be possible to specify compression algorithm >>>>>>> >>>>>>> - even with boolean compression option server may have some >>>>>>> reasons to reject client's request to use compression >>>>>>> >>>>>>> Extra flexibility is always good thing if it doesn't cost too >>>>>>> much. And extra round of negotiation in case of enabling >>>>>>> compression seems to me not to be a high price for it. >>>>>> >>>>>> You already have this flexibility even without negotiation. I >>>>>> don't want you to lose your flexibility. Protocol looks like >>>>>> this: >>>>>> >>>>>> - Client sends connection option "compression" with list of >>>>>> algorithms it wants to use (comma-separated, or something). >>>>>> >>>>>> - First packet that the server can compress one of those algorithms >>>>>> (or none, if it doesn't want to turn on compression). >>>>>> >>>>>> No additional round-trips needed. >>>>> >>>>> This is exactly how it works now... Client includes compression >>>>> option in connection string and server replies with special >>>>> message ('Z') if it accepts request to compress traffic between >>>>> this client and server. >>>> >>>> No, it's not. You don't need this message. If the server receives >>>> a compression request, it should just turn compression on (or not), >>>> and then have the client figure out whether it got compression >>>> back. >>> >>> How it will managed to do it. It receives some reply and first of >>> all it should know whether it has to be decompressed or not. >> >> You can tell whether a message is compressed by looking at it. The >> way the protocol works, every message has a type associated with it: >> a single byte, like 'R', that says what kind of message it is. > > Compressed message can contain any sequence of bytes, including 'R':) Then tag your messages with a type byte. Or do it the other way around - look for the zstd framing within a message. Please, try to work with me on this instead of fighting every design change. Thanks, --Robbie
Attachment
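For what it's worth, the second suggestion is cheap to check: every zstd frame begins with a fixed magic number, so a heuristic along these lines is possible (a sketch only, and it assumes zstd specifically; zlib streams would need their own check):

#include <zstd.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A zstd frame starts with the little-endian magic 0xFD2FB528
 * (ZSTD_MAGICNUMBER from zstd.h); its first byte on the wire, 0x28,
 * does not look like an ASCII message-type tag such as 'R' or 'Z'. */
static bool
looks_like_zstd_frame(const unsigned char *buf, size_t len)
{
    uint32_t magic;

    if (len < 4)
        return false;
    magic = (uint32_t) buf[0]
        | ((uint32_t) buf[1] << 8)
        | ((uint32_t) buf[2] << 16)
        | ((uint32_t) buf[3] << 24);
    return magic == ZSTD_MAGICNUMBER;
}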
On 22.06.2018 20:56, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >> On 22.06.2018 18:59, Robbie Harwood wrote: >>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>> On 21.06.2018 20:14, Robbie Harwood wrote: >>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>> On 21.06.2018 17:56, Robbie Harwood wrote: >>>>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>>>> On 20.06.2018 23:34, Robbie Harwood wrote: >>>>>>>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: >>>>>>>>> >>>>>>>>>> My idea was the following: client want to use compression. But >>>>>>>>>> server may reject this attempt (for any reasons: it doesn't >>>>>>>>>> support it, has no proper compression library, do not want to >>>>>>>>>> spend CPU for decompression,...) Right now compression >>>>>>>>>> algorithm is hardcoded. But in future client and server may >>>>>>>>>> negotiate to choose proper compression protocol. This is why >>>>>>>>>> I prefer to perform negotiation between client and server to >>>>>>>>>> enable compression. >>>>>>>>> Well, for negotiation you could put the name of the algorithm >>>>>>>>> you want in the startup. It doesn't have to be a boolean for >>>>>>>>> compression, and then you don't need an additional round-trip. >>>>>>>> Sorry, I can only repeat arguments I already mentioned: >>>>>>>> >>>>>>>> - in future it may be possible to specify compression algorithm >>>>>>>> >>>>>>>> - even with boolean compression option server may have some >>>>>>>> reasons to reject client's request to use compression >>>>>>>> >>>>>>>> Extra flexibility is always good thing if it doesn't cost too >>>>>>>> much. And extra round of negotiation in case of enabling >>>>>>>> compression seems to me not to be a high price for it. >>>>>>> You already have this flexibility even without negotiation. I >>>>>>> don't want you to lose your flexibility. Protocol looks like >>>>>>> this: >>>>>>> >>>>>>> - Client sends connection option "compression" with list of >>>>>>> algorithms it wants to use (comma-separated, or something). >>>>>>> >>>>>>> - First packet that the server can compress one of those algorithms >>>>>>> (or none, if it doesn't want to turn on compression). >>>>>>> >>>>>>> No additional round-trips needed. >>>>>> This is exactly how it works now... Client includes compression >>>>>> option in connection string and server replies with special >>>>>> message ('Z') if it accepts request to compress traffic between >>>>>> this client and server. >>>>> No, it's not. You don't need this message. If the server receives >>>>> a compression request, it should just turn compression on (or not), >>>>> and then have the client figure out whether it got compression >>>>> back. >>>> How it will managed to do it. It receives some reply and first of >>>> all it should know whether it has to be decompressed or not. >>> You can tell whether a message is compressed by looking at it. The >>> way the protocol works, every message has a type associated with it: >>> a single byte, like 'R', that says what kind of message it is. >> Compressed message can contain any sequence of bytes, including 'R':) > Then tag your messages with a type byte. Or do it the other way around > - look for the zstd framing within a message. > > Please, try to work with me on this instead of fighting every design > change. Sorry, I do not want fighting. I am always vote for peace and constructive dialog. But it is hard to me to understand and accept your arguments. 
I do not understand why the secure_read function is a better place for handling compression than pq_recvbuf, or why doing it there destroys the existing model. I have already mentioned my arguments:
1. I want to use the same code for frontend and backend.
2. I think that streaming compression can be used not only for libpq. This is why I tried to make zpq_stream independent of the communication layer and pass it callbacks for sending/receiving data.
If pq_recvbuf is not the right place to perform decompression, I can introduce some other function, like read_raw or something like that, and do the decompression there. But I do not see much sense in it.
Concerning the necessity of a special message acknowledging compression by the server: I also do not understand why you do not like the idea of sending such a message, or what is wrong with it. What you are suggesting, "then tag your message", is actually the same as sending a new message: what is the difference between a tag 'Z' and a message with code 'Z'?
Sorry, but I do not understand the problems you are trying to solve, and do not see any arguments except "I cannot accept it".
> Thanks,
> --Robbie
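For readers who have not looked at the patch, the zpq_stream interface under discussion is roughly the following (paraphrased; the exact signatures in the patch may differ). The compressor only sees two I/O callbacks, so the same file can be linked into both the backend and libpq:

#include <stddef.h>
#include <sys/types.h>

/* transmit/receive callbacks injected by the caller */
typedef ssize_t (*zpq_tx_func) (void *arg, void const *data, size_t size);
typedef ssize_t (*zpq_rx_func) (void *arg, void *data, size_t size);

typedef struct ZpqStream ZpqStream;

extern ZpqStream *zpq_create(zpq_tx_func tx, zpq_rx_func rx, void *arg);
extern ssize_t    zpq_read(ZpqStream *zs, void *buf, size_t size,
                           size_t *processed);
extern ssize_t    zpq_write(ZpqStream *zs, void const *buf, size_t size,
                            size_t *processed);
extern void       zpq_free(ZpqStream *zs);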
On 18.06.2018 23:34, Robbie Harwood wrote: > > ### > > Documentation! You're going to need it. There needs to be enough > around for other people to implement the protocol (or if you prefer, > enough for us to debug the protocol as it exists). > > In conjunction with that, please add information on how to set up > compressed vs. uncompressed connections - similarly to how we've > documentation on setting up TLS connection (though presumably compressed > connection documentation will be shorter). > Document protocol changes needed for libpq compression. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 06/25/2018 05:32 AM, Konstantin Knizhnik wrote: > > > On 18.06.2018 23:34, Robbie Harwood wrote: >> >> ### >> >> Documentation! You're going to need it. There needs to be enough >> around for other people to implement the protocol (or if you prefer, >> enough for us to debug the protocol as it exists). >> >> In conjunction with that, please add information on how to set up >> compressed vs. uncompressed connections - similarly to how we've >> documentation on setting up TLS connection (though presumably compressed >> connection documentation will be shorter). >> > > Document protocol changes needed for libpq compression. > This thread appears to have gone quiet. What concerns me is that there appears to be substantial disagreement between the author and the reviewers. Since the last thing was this new patch it should really have been put back into "needs review" (my fault to some extent - I missed that). So rather than return the patch with feedfack I'm going to set it to "needs review" and move it to the next CF. However, if we can't arrive at a consensus about the direction during the next CF it should be returned with feedback. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Aug 10, 2018 at 5:55 PM, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > This thread appears to have gone quiet. What concerns me is that there > appears to be substantial disagreement between the author and the reviewers. > Since the last thing was this new patch it should really have been put back > into "needs review" (my fault to some extent - I missed that). So rather > than return the patch with feedfack I'm going to set it to "needs review" > and move it to the next CF. However, if we can't arrive at a consensus about > the direction during the next CF it should be returned with feedback. I agree with the critiques from Robbie Harwood and Michael Paquier that the way in that compression is being hooked into the existing architecture looks like a kludge. I'm not sure I know exactly how it should be done, but the current approach doesn't look natural; it looks like it was bolted on. I agree with the critique from Peter Eisentraut and others that we should not go around and add -Z as a command-line option to all of our tools; this feature hasn't yet proved itself to be useful enough to justify that. Better to let people specify it via a connection string option if they want it. I think Thomas Munro was right to ask about what will happen when different compression libraries are in use, and I think failing uncleanly is quite unacceptable. I think there needs to be some method for negotiating the compression type; the client can, for example, provide a comma-separated list of methods it supports in preference order, and the server can pick the first one it likes. In short, I think that a number of people have provided really good feedback on this patch, and I suggest to Konstantin that he should consider accepting all of those suggestions. Commit ae65f6066dc3d19a55f4fdcd3b30003c5ad8dbed tried to introduce some facilities that can be used for protocol version negotiation as new features are added, but this patch doesn't use them. It looks to me like it instead just breaks backward compatibility. The new 'compression' option won't be recognized by older servers. If it were spelled '_pq_.compression' then the facilities in that commit would cause a NegotiateProtocolVersion message to be sent by servers which have that commit but don't support the compression option. I'm not exactly sure what will happen on even-older servers that don't have that commit either, but I hope they'll treat it as a GUC name; note that setting an unknown GUC name with a namespace prefix is not an error, but setting one without such a prefix IS an ERROR. Servers which do support compression would respond with a message indicating that compression had been enabled or, maybe, just start sending back compressed-packet messages, if we go with including some framing in the libpq protocol. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
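To make the comma-separated preference list concrete, libpq could derive it from its build options, roughly like this. This is only a sketch: it assumes the HAVE_LIBZSTD macro added by the patch's configure test and the existing HAVE_LIBZ macro, and the option spelling itself ("compression" vs. "_pq_.compression") is still the open question above:

#include <stddef.h>

/* Client side: list the algorithms this build can accept, in preference
 * order, to be sent as the value of the startup option. */
static const char *
client_compression_list(void)
{
#if defined(HAVE_LIBZSTD) && defined(HAVE_LIBZ)
    return "zstd,zlib";
#elif defined(HAVE_LIBZSTD)
    return "zstd";
#elif defined(HAVE_LIBZ)
    return "zlib";
#else
    return NULL;                /* do not request compression at all */
#endif
}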
On 08/13/2018 02:47 PM, Robert Haas wrote: > On Fri, Aug 10, 2018 at 5:55 PM, Andrew Dunstan > <andrew.dunstan@2ndquadrant.com> wrote: >> This thread appears to have gone quiet. What concerns me is that there >> appears to be substantial disagreement between the author and the reviewers. >> Since the last thing was this new patch it should really have been put back >> into "needs review" (my fault to some extent - I missed that). So rather >> than return the patch with feedfack I'm going to set it to "needs review" >> and move it to the next CF. However, if we can't arrive at a consensus about >> the direction during the next CF it should be returned with feedback. > I agree with the critiques from Robbie Harwood and Michael Paquier > that the way in that compression is being hooked into the existing > architecture looks like a kludge. I'm not sure I know exactly how it > should be done, but the current approach doesn't look natural; it > looks like it was bolted on. I agree with the critique from Peter > Eisentraut and others that we should not go around and add -Z as a > command-line option to all of our tools; this feature hasn't yet > proved itself to be useful enough to justify that. Better to let > people specify it via a connection string option if they want it. I > think Thomas Munro was right to ask about what will happen when > different compression libraries are in use, and I think failing > uncleanly is quite unacceptable. I think there needs to be some > method for negotiating the compression type; the client can, for > example, provide a comma-separated list of methods it supports in > preference order, and the server can pick the first one it likes. In > short, I think that a number of people have provided really good > feedback on this patch, and I suggest to Konstantin that he should > consider accepting all of those suggestions. > > Commit ae65f6066dc3d19a55f4fdcd3b30003c5ad8dbed tried to introduce > some facilities that can be used for protocol version negotiation as > new features are added, but this patch doesn't use them. It looks to > me like it instead just breaks backward compatibility. The new > 'compression' option won't be recognized by older servers. If it were > spelled '_pq_.compression' then the facilities in that commit would > cause a NegotiateProtocolVersion message to be sent by servers which > have that commit but don't support the compression option. I'm not > exactly sure what will happen on even-older servers that don't have > that commit either, but I hope they'll treat it as a GUC name; note > that setting an unknown GUC name with a namespace prefix is not an > error, but setting one without such a prefix IS an ERROR. Servers > which do support compression would respond with a message indicating > that compression had been enabled or, maybe, just start sending back > compressed-packet messages, if we go with including some framing in > the libpq protocol. > Excellent summary, and well argued recommendations, thanks. I've changed the status to waiting on author. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Robert,
First of all, thank you for the review and for the time you spent analyzing this patch.
My comments are inside.
On 13.08.2018 21:47, Robert Haas wrote:
> On Fri, Aug 10, 2018 at 5:55 PM, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:
>> This thread appears to have gone quiet. What concerns me is that there appears to be substantial disagreement between the author and the reviewers. Since the last thing was this new patch it should really have been put back into "needs review" (my fault to some extent - I missed that). So rather than return the patch with feedfack I'm going to set it to "needs review" and move it to the next CF. However, if we can't arrive at a consensus about the direction during the next CF it should be returned with feedback.
> I agree with the critiques from Robbie Harwood and Michael Paquier that the way in that compression is being hooked into the existing architecture looks like a kludge. I'm not sure I know exactly how it should be done, but the current approach doesn't look natural; it looks like it was bolted on.
Sorry, I know that it is too impertinent to ask you or somebody else to suggest a better way of integrating compression into the libpq
frontend/backend. My primary intention was to use the same code for both backend and frontend. This is why I pass pointers to the receive/send functions
to the compression module: the secure_read/secure_write functions for the backend and the pqsecure_read/pqsecure_write functions for the frontend.
I also changed the pq_recvbuf/pqReadData and internal_flush/pqSendSome functions to call zpq_read/zpq_write when compression is enabled.
Robbie thinks that compression should be toggled in secure_read/secure_write/pqsecure_read/pqsecure_write.
I do not understand why that is better than my implementation. From my point of view it just introduces extra code redundancy and has no advantages.
The very names of these functions (secure_*) will become confusing if they handle compression as well. Maybe it is better to introduce some other set of functions
which in turn call the secure_* functions, but I also do not think that is a good idea.
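Concretely, the sharing works by each side wrapping its own transport function behind the callback type, for example (wrapper names are invented here, and this assumes the zpq_create() interface sketched earlier in the thread; the backend fragment needs the server's libpq headers, the frontend fragment libpq-int.h):

/* backend side: secure_read() is the existing transport read */
static ssize_t
be_rx(void *arg, void *dst, size_t size)
{
    return secure_read((Port *) arg, dst, size);
}

/* frontend side: pqsecure_read() is libpq's existing transport read */
static ssize_t
fe_rx(void *arg, void *dst, size_t size)
{
    return pqsecure_read((PGconn *) arg, dst, size);
}

/* either side then does something like:
 *     zs = zpq_create(tx_wrapper, rx_wrapper, (void *) conn_or_port);
 */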
> I agree with the critique from Peter Eisentraut and others that we should not go around and add -Z as a command-line option to all of our tools; this feature hasn't yet proved itself to be useful enough to justify that. Better to let people specify it via a connection string option if they want it.
It is not a big deal to remove the command-line options.
My only concern was that if somebody wants to upload data using psql + COPY,
it will be more difficult to completely change the way psql is invoked than to just add one more option
(most users specify the -h -D -U options rather than passing a complete connection string).
> I think Thomas Munro was right to ask about what will happen when different compression libraries are in use, and I think failing uncleanly is quite unacceptable. I think there needs to be some method for negotiating the compression type; the client can, for example, provide a comma-separated list of methods it supports in preference order, and the server can pick the first one it likes.
I will add a check for the supported compression method.
Right now Postgres can be configured to use either zlib or zstd streaming compression, but not both of them.
So choosing the most appropriate compression algorithm is not possible; I can only check that client and server support the same compression method.
Certainly it is possible to implement support for several compression methods and make it possible to choose one at runtime, but I do not think that is a really good idea
unless we want to support custom compression methods.
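In other words, with the current build-time choice the check degenerates to comparing a single name, roughly as in this sketch (assuming the HAVE_LIBZSTD/HAVE_LIBZ macros from configure; the string literals are illustrative):

#include <string.h>

#ifdef HAVE_LIBZSTD
#define LIBPQ_COMPRESSION_ALGORITHM "zstd"
#elif defined(HAVE_LIBZ)
#define LIBPQ_COMPRESSION_ALGORITHM "zlib"
#else
#define LIBPQ_COMPRESSION_ALGORITHM ""       /* compression not built in */
#endif

/* server side, on receipt of the client's requested algorithm name */
static int
compression_supported(const char *requested)
{
    return requested != NULL &&
           strcmp(requested, LIBPQ_COMPRESSION_ALGORITHM) == 0;
}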
> In short, I think that a number of people have provided really good feedback on this patch, and I suggest to Konstantin that he should consider accepting all of those suggestions. Commit ae65f6066dc3d19a55f4fdcd3b30003c5ad8dbed tried to introduce some facilities that can be used for protocol version negotiation as new features are added, but this patch doesn't use them. It looks to me like it instead just breaks backward compatibility. The new 'compression' option won't be recognized by older servers. If it were spelled '_pq_.compression' then the facilities in that commit would cause a NegotiateProtocolVersion message to be sent by servers which have that commit but don't support the compression option. I'm not exactly sure what will happen on even-older servers that don't have that commit either, but I hope they'll treat it as a GUC name; note that setting an unknown GUC name with a namespace prefix is not an error, but setting one without such a prefix IS an ERROR. Servers which do support compression would respond with a message indicating that compression had been enabled or, maybe, just start sending back compressed-packet messages, if we go with including some framing in the libpq protocol.
If I try to log in with a new client using the _pq_.compression option to an old server (9.6.8), I get the following error:
knizhnik@knizhnik:~/dtm-data$ psql -d "port=5432 _pq_.compression=zlib dbname=postgres"
psql: expected authentication request from server, but received v
Frankly speaking, I do not see a big problem here: the error happens only if we use a NEW client to connect to an OLD server and EXPLICITLY specify the compression=on option in the connection string.
There will be no problem if we use an old client to access a new server, or a new client to access an old server without switching on compression.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 13.08.2018 23:06, Andrew Dunstan wrote: > > > On 08/13/2018 02:47 PM, Robert Haas wrote: >> On Fri, Aug 10, 2018 at 5:55 PM, Andrew Dunstan >> <andrew.dunstan@2ndquadrant.com> wrote: >>> This thread appears to have gone quiet. What concerns me is that there >>> appears to be substantial disagreement between the author and the >>> reviewers. >>> Since the last thing was this new patch it should really have been >>> put back >>> into "needs review" (my fault to some extent - I missed that). So >>> rather >>> than return the patch with feedfack I'm going to set it to "needs >>> review" >>> and move it to the next CF. However, if we can't arrive at a >>> consensus about >>> the direction during the next CF it should be returned with feedback. >> I agree with the critiques from Robbie Harwood and Michael Paquier >> that the way in that compression is being hooked into the existing >> architecture looks like a kludge. I'm not sure I know exactly how it >> should be done, but the current approach doesn't look natural; it >> looks like it was bolted on. I agree with the critique from Peter >> Eisentraut and others that we should not go around and add -Z as a >> command-line option to all of our tools; this feature hasn't yet >> proved itself to be useful enough to justify that. Better to let >> people specify it via a connection string option if they want it. I >> think Thomas Munro was right to ask about what will happen when >> different compression libraries are in use, and I think failing >> uncleanly is quite unacceptable. I think there needs to be some >> method for negotiating the compression type; the client can, for >> example, provide a comma-separated list of methods it supports in >> preference order, and the server can pick the first one it likes. In >> short, I think that a number of people have provided really good >> feedback on this patch, and I suggest to Konstantin that he should >> consider accepting all of those suggestions. >> >> Commit ae65f6066dc3d19a55f4fdcd3b30003c5ad8dbed tried to introduce >> some facilities that can be used for protocol version negotiation as >> new features are added, but this patch doesn't use them. It looks to >> me like it instead just breaks backward compatibility. The new >> 'compression' option won't be recognized by older servers. If it were >> spelled '_pq_.compression' then the facilities in that commit would >> cause a NegotiateProtocolVersion message to be sent by servers which >> have that commit but don't support the compression option. I'm not >> exactly sure what will happen on even-older servers that don't have >> that commit either, but I hope they'll treat it as a GUC name; note >> that setting an unknown GUC name with a namespace prefix is not an >> error, but setting one without such a prefix IS an ERROR. Servers >> which do support compression would respond with a message indicating >> that compression had been enabled or, maybe, just start sending back >> compressed-packet messages, if we go with including some framing in >> the libpq protocol. >> > > > Excellent summary, and well argued recommendations, thanks. I've > changed the status to waiting on author. > New version of the patch is attached: I removed -Z options form pgbench and psql and add checking that server and client are implementing the same compression algorithm. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Mon, Aug 20, 2018 at 06:00:39PM +0300, Konstantin Knizhnik wrote: > New version of the patch is attached: I removed -Z options form pgbench and > psql and add checking that server and client are implementing the same > compression algorithm. The patch had no reviews, and does not apply anymore, so it is moved to next CF with waiting on author as status. -- Michael
Attachment
On 01.10.2018 09:49, Michael Paquier wrote: > On Mon, Aug 20, 2018 at 06:00:39PM +0300, Konstantin Knizhnik wrote: >> New version of the patch is attached: I removed -Z options form pgbench and >> psql and add checking that server and client are implementing the same >> compression algorithm. > The patch had no reviews, and does not apply anymore, so it is moved to > next CF with waiting on author as status. > -- > Michael Rebased version of the patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
> On Mon, Aug 13, 2018 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote: > > I agree with the critiques from Robbie Harwood and Michael Paquier > that the way in that compression is being hooked into the existing > architecture looks like a kludge. I'm not sure I know exactly how it > should be done, but the current approach doesn't look natural; it > looks like it was bolted on. After some time spend reading this patch and investigating different points, mentioned in the discussion, I tend to agree with that. As far as I see it's probably the biggest disagreement here, that keeps things from progressing. I'm interested in this feature, so if Konstantin doesn't mind, I'll post in the near future (after I'll wrap up the current CF) an updated patch I'm working on right now to propose another way of incorporating compression. For now I'm moving patch to the next CF.
Hi,

> > I agree with the critiques from Robbie Harwood and Michael Paquier
> > that the way in that compression is being hooked into the existing
> > architecture looks like a kludge. I'm not sure I know exactly how it
> > should be done, but the current approach doesn't look natural; it
> > looks like it was bolted on.
>
> After some time spend reading this patch and investigating different points,
> mentioned in the discussion, I tend to agree with that. As far as I see it's
> probably the biggest disagreement here, that keeps things from progressing.
> I'm interested in this feature, so if Konstantin doesn't mind, I'll post in
> the near future (after I'll wrap up the current CF) an updated patch I'm working
> on right now to propose another way of incorporating compression. For now
> I'm moving patch to the next CF.

This thread seems to have stopped. In the last e-mail, Dmitry suggested updating the patch to implement the feature in another way, and as far as I can see he has not posted an updated patch yet. (It may be because the author has not responded.) I understand there is a big disagreement here, but the status is "Needs review", and there has been no review since the author updated the patch to v9, so I will do one.

About the patch: please update it so that it applies to current master; I could not test it as is.

About the documentation: there are typos, please check them. I would also appreciate a review of the wording, because I am not so good at English.

When you add a new protocol message, it needs to carry "Length of message contents in bytes, including self.". The message currently provides the supported compression algorithm as a Byte1; I think it would be better to provide it as a list, like the NegotiateProtocolVersion protocol does.

I had a quick look at the code changes:

+ nread = conn->zstream
+     ? zpq_read(conn->zstream, conn->inBuffer + conn->inEnd,
+                conn->inBufSize - conn->inEnd, &processed)
+     : pqsecure_read(conn, conn->inBuffer + conn->inEnd,
+                     conn->inBufSize - conn->inEnd);

How about combining this into a #define macro, since the same logic appears in two places?

Have you considered anything about memory control? Typically a compression algorithm keeps its dictionary in memory, so I think it needs a reset or some similar mechanism.

Regards,
Aya Iwata
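One way to follow that suggestion is a small helper rather than a macro, so the "compressed or not" branch lives in one place (the helper name is invented here, not taken from the patch; conn->zstream and zpq_read are as shown in the quoted diff):

static ssize_t
pqReadRaw(PGconn *conn, void *buf, size_t len, size_t *processed)
{
    *processed = 0;
    return conn->zstream
        ? zpq_read(conn->zstream, buf, len, processed)
        : pqsecure_read(conn, buf, len);
}

/* call site:
 *     nread = pqReadRaw(conn, conn->inBuffer + conn->inEnd,
 *                       conn->inBufSize - conn->inEnd, &processed);
 */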
> On Wed, Jan 9, 2019 at 11:25 AM Iwata, Aya <iwata.aya@jp.fujitsu.com> wrote: > > This thread seems to be stopped. > In last e-mail, Dmitry suggest to update the patch that implements the > function in another way, and as far as I saw, he has not updated patch yet. Yep, I'm still working on it, hopefully I can submit something rather soon. But it doesn't cancel in any way all this work, done by Konstantin, so anyway thanks for the review!
On 09.01.2019 13:25, Iwata, Aya wrote: > Hi, > >>> I agree with the critiques from Robbie Harwood and Michael Paquier >>> that the way in that compression is being hooked into the existing >>> architecture looks like a kludge. I'm not sure I know exactly how it >>> should be done, but the current approach doesn't look natural; it >>> looks like it was bolted on. >> After some time spend reading this patch and investigating different points, >> mentioned in the discussion, I tend to agree with that. As far as I see it's >> probably the biggest disagreement here, that keeps things from progressing. >> I'm interested in this feature, so if Konstantin doesn't mind, I'll post in >> the near future (after I'll wrap up the current CF) an updated patch I'm working >> on right now to propose another way of incorporating compression. For now >> I'm moving patch to the next CF. > This thread seems to be stopped. > In last e-mail, Dmitry suggest to update the patch that implements the function in another way, and as far as I saw, hehas not updated patch yet. (It may be because author has not responded.) > I understand big disagreement is here, however the status is "Needs review". > There is no review after author update the patch to v9. So I will do. > > About the patch, Please update your patch to attach current master. I could not test. > > About Documentation, there are typos. Please check it. I am waiting for the reviewer of the sentence because I am not sogood at English. > > When you add new protocol message, it needs the information of "Length of message contents in bytes, including self.". > It provides supported compression algorithm as a Byte1. I think it better to provide it as a list like the NegotiateProtocolVersionprotocol. > > I quickly saw code changes. > > + nread = conn->zstream > + ? zpq_read(conn->zstream, conn->inBuffer + conn->inEnd, > + conn->inBufSize - conn->inEnd, &processed) > + : pqsecure_read(conn, conn->inBuffer + conn->inEnd, > + conn->inBufSize - conn->inEnd); > > How about combine as a #define macro? Because there are same logic in two place. > > Do you consider anything about memory control? > Typically compression algorithm keeps dictionary in memory. I think it needs reset or some method. Thank you for review. Attached please find rebased version of the patch. I fixed all issues you have reported except using list of supported compression algorithms. It will require extra round of communication between client and server to make a decision about used compression algorithm. I still not sure whether it is good idea to make it possible to user to explicitly specify compression algorithm. Right now used streaming compression algorithm is hardcoded and depends on --use-zstd ort --use-zlib configuration options. If client and server were built with the same options, then they are able to use compression. Concerning memory control: there is a call of zpq_free(PqStream) in socket_close() function which should deallocate all memory used by compressor: void zpq_free(ZpqStream *zs) { if (zs != NULL) { ZSTD_freeCStream(zs->tx_stream); ZSTD_freeDStream(zs->rx_stream); free(zs); } } -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Hi, I am sorry for my late reply. > I fixed all issues you have reported except using list of supported compression > algorithms. Sure. I confirmed that. > It will require extra round of communication between client and server to > make a decision about used compression algorithm. In beginning of this thread, Robbie Harwood said that no extra communication needed. I think so, too. > I still not sure whether it is good idea to make it possible to user to > explicitly specify compression algorithm. > Right now used streaming compression algorithm is hardcoded and depends on > --use-zstd ort --use-zlib configuration options. > If client and server were built with the same options, then they are able > to use compression. I understand that compression algorithm is hardcoded in your proposal. However given the possibility of future implementation, I think it would be better for it to have a flexibility to choose compression library. src/backend/libpq/pqcomm.c : In current Postgres source code, pq_recvbuf() calls secure_read() and pq_getbyte_if_available() also calls secure_read(). It means these functions are on the same level. However in your change, pq_getbyte_if_available() calls pq_recvbuf(), and pq_recvbuf() calls secure_read(). The level of these functions is different. I think the purpose of pq_getbyte_if_available() is to get a character if it exists and the purpose of pq_recvbuf() is to acquire data up to the expected length. In your change, pq_getbyte_if_available() may have to do unnecessary process waiting or something. So how about changing your code like this? The part that checks whether it is compressed is implemented as a #define macro(like fe_misc.c). And pq_recvbuf() and pq_getbyte_if_available()modify little, like this; - r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, - PQ_RECV_BUFFER_SIZE - PqRecvLength); + r = SOME_DEFINE_NAME_(); configure: Adding following message to the top of zlib in configure ``` {$as_echo "$as_me:${as_lineno-$LINENO}:checking whethere to build with zstd support">&5 $as_echo_n "checking whether to build with zstd suppor... ">&6;} ``` Regards, Aya Iwata
Hi, On 2019-02-08 07:01:01 +0000, Iwata, Aya wrote: > > I still not sure whether it is good idea to make it possible to user to > > explicitly specify compression algorithm. > > Right now used streaming compression algorithm is hardcoded and depends on > > --use-zstd ort --use-zlib configuration options. > > If client and server were built with the same options, then they are able > > to use compression. > I understand that compression algorithm is hardcoded in your proposal. > However given the possibility of future implementation, I think > it would be better for it to have a flexibility to choose compression library. Agreed. I think that's a hard requirement. Making explicitly not forward compatible interfaces is a bad idea. Greetings, Andres Freund
Hi, On 2018-03-30 15:53:39 +0300, Konstantin Knizhnik wrote: > Taken in account that vulnerability was found in SSL compression and so > SSLComppression is considered to be deprecated and insecure > (http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html), > it will be nice to have some alternative mechanism of reducing libpq > traffic. > > I have implemented some prototype implementation of it (patch is attached). > To use zstd compression, Postgres should be configured with --with-zstd. > Otherwise compression will use zlib unless it is disabled by --without-zlib > option. > I have added compression=on/off parameter to connection string and -Z option > to psql and pgbench utilities. > Below are some results: I think compression is pretty useful, and I'm not convinced that the threat model underlying the attacks on SSL really apply to postgres. But having said that, have you done any analysis of whether your implementation has the same issues? Greetings, Andres Freund
On 08.02.2019 10:14, Andres Freund wrote: > Hi, > > On 2018-03-30 15:53:39 +0300, Konstantin Knizhnik wrote: >> Taken in account that vulnerability was found in SSL compression and so >> SSLComppression is considered to be deprecated and insecure >> (http://www.postgresql-archive.org/disable-SSL-compression-td6010072.html), >> it will be nice to have some alternative mechanism of reducing libpq >> traffic. >> >> I have implemented some prototype implementation of it (patch is attached). >> To use zstd compression, Postgres should be configured with --with-zstd. >> Otherwise compression will use zlib unless it is disabled by --without-zlib >> option. >> I have added compression=on/off parameter to connection string and -Z option >> to psql and pgbench utilities. >> Below are some results: > I think compression is pretty useful, and I'm not convinced that the > threat model underlying the attacks on SSL really apply to postgres. But > having said that, have you done any analysis of whether your > implementation has the same issues? Sorry, I am not an expert in security area, so I cannot perform analysis whether using compression in SSL protocol is vulnerable and is it really applicable to libpq communication between Postgres client and server. The main idea of compression implementation at libpq level was not to solve this possible vulnerability (I am also not convinced that such kind of attack is applicable to postgres client-server communication) but reduce traffic without requirement to use SSL (which may not be possible or convenient because of many other reasons not only related with potential vulnerability). Also I believe (although I have not performed this test yet) that zstd compression is much more efficient than one used in SSL both in speed and compression ratio. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 08.02.2019 10:01, Iwata, Aya wrote: > Hi, > > I am sorry for my late reply. > >> I fixed all issues you have reported except using list of supported compression >> algorithms. > Sure. I confirmed that. > >> It will require extra round of communication between client and server to >> make a decision about used compression algorithm. > In beginning of this thread, Robbie Harwood said that no extra communication needed. > I think so, too. Well, I think that this problem is more complex and requires more discussion. There are three places determining choice of compression algorithm: 1. Specification of compression algorithm by client. Right now it is just boolean "compression" parameter in connection string, but it is obviously possible to specify concrete algorithm here. 2. List of compression algorithms supported by client. 3. List of compression algorithms supported by server. Concerning first option I have very serious doubt that it is good idea to let client choose compression protocol. Without extra round-trip it can be only implemented in this way: if client toggles compression option in connection string, then libpq includes in startup packet list of supported compression algorithms. Then server intersects this list with its own set of supported compression algorithms and if result is not empty, then somehow choose one of the commonly supported algorithms and sends it to the client with 'z' command. One more question: should we allow custom defined compression methods and if so, how them can be handled at client side (at server we can use standard extension dynamic loading mechanism). Frankly speaking, I do not think that such flexibility in choosing compression algorithms is really needed. I do not expect that there will be many situations where old client has to communicate with new server or visa versa. In most cases both client and server belongs to the same postgres distributive and so implements the same compression algorithm. As far as we are compressing only temporary data (traffic), the problem of providing backward compatibility seems to be not so important. > >> I still not sure whether it is good idea to make it possible to user to >> explicitly specify compression algorithm. >> Right now used streaming compression algorithm is hardcoded and depends on >> --use-zstd ort --use-zlib configuration options. >> If client and server were built with the same options, then they are able >> to use compression. > I understand that compression algorithm is hardcoded in your proposal. > However given the possibility of future implementation, I think > it would be better for it to have a flexibility to choose compression library. > > src/backend/libpq/pqcomm.c : > In current Postgres source code, pq_recvbuf() calls secure_read() > and pq_getbyte_if_available() also calls secure_read(). > It means these functions are on the same level. > However in your change, pq_getbyte_if_available() calls pq_recvbuf(), > and pq_recvbuf() calls secure_read(). The level of these functions is different. > > I think the purpose of pq_getbyte_if_available() is to get a character if it exists and > the purpose of pq_recvbuf() is to acquire data up to the expected length. > In your change, pq_getbyte_if_available() may have to do unnecessary process waiting or something. Sorry, but this change is essential. We can have some available data in compression buffer and we need to try to fetch it in pq_getbyte_if_available() instead of just returning EOF. > > So how about changing your code like this? 
> The part that checks whether it is compressed is implemented as a #define macro(like fe_misc.c). And pq_recvbuf() and pq_getbyte_if_available()modify little, like this; > > - r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, > - PQ_RECV_BUFFER_SIZE - PqRecvLength); > + r = SOME_DEFINE_NAME_(); > > configure: > Adding following message to the top of zlib in configure > ``` > {$as_echo "$as_me:${as_lineno-$LINENO}:checking whethere to build with zstd support">&5 > $as_echo_n "checking whether to build with zstd suppor... ">&6;} > ``` Sorry, but it seems to me that the following fragment of configure is doing it: + +if test "$with_zstd" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ZSTD_compress in -lzstd" >&5 +$as_echo_n "checking for ZSTD_compress in -lzstd... " >&6; } +if ${ac_cv_lib_zstd_ZSTD_compress+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_check_lib_save_LIBS=$LIBS +LIBS="-lzstd $LIBS" +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char ZSTD_compress (); +int +main () +{ +return ZSTD_compress (); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + ac_cv_lib_zstd_ZSTD_compress=yes +else + ac_cv_lib_zstd_ZSTD_compress=no +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext +LIBS=$ac_check_lib_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_zstd_ZSTD_compress" >&5 +$as_echo "$ac_cv_lib_zstd_ZSTD_compress" >&6; } +if test "x$ac_cv_lib_zstd_ZSTD_compress" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_LIBZSTD 1 +_ACEOF + + LIBS="-lzstd $LIBS" + +else + as_fn_error $? "library 'zstd' is required for ZSTD support" "$LINENO" 5 +fi + +fi > Regards, > Aya Iwata -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
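For illustration, here is a minimal sketch (not taken from the patch; all names are hypothetical) of the server-side selection step described above: the client lists the algorithms it supports in the startup packet, and the server picks the first entry from its own preference list that the client also offers.

#include <string.h>

static const char *server_algorithms[] = {
#ifdef HAVE_LIBZSTD
	"zstd",
#endif
#ifdef HAVE_ZLIB
	"zlib",
#endif
	NULL
};

/*
 * client_list is a comma-separated value taken from the startup packet,
 * e.g. "zstd,zlib".  Returns the chosen algorithm name, or NULL if there is
 * no common algorithm (fall back to an uncompressed connection or fail).
 */
static const char *
choose_compression_algorithm(const char *client_list)
{
	for (int i = 0; server_algorithms[i] != NULL; i++)
	{
		const char *p = client_list;

		while (p != NULL && *p != '\0')
		{
			const char *end = strchr(p, ',');
			size_t		len = end ? (size_t) (end - p) : strlen(p);

			if (len == strlen(server_algorithms[i]) &&
				strncmp(p, server_algorithms[i], len) == 0)
				return server_algorithms[i];	/* report this back to the client */
			p = end ? end + 1 : NULL;
		}
	}
	return NULL;
}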
> On 8 Feb 2019, at 10:15, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Frankly speaking, I do not think that such flexibility in choosing compression algorithms is really needed. > I do not expect that there will be many situations where old client has to communicate with new server or visa versa. > In most cases both client and server belongs to the same postgres distributive and so implements the same compression algorithm. > As far as we are compressing only temporary data (traffic), the problem of providing backward compatibility seems to be not so important.

I don’t think this assumption is entirely valid, and it would risk unnecessary breakage.

cheers ./daniel
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > On 08.02.2019 10:01, Iwata, Aya wrote: > >>> I fixed all issues you have reported except using list of supported >>> compression algorithms. >> >> Sure. I confirmed that. >> >>> It will require extra round of communication between client and >>> server to make a decision about used compression algorithm. >> >> In beginning of this thread, Robbie Harwood said that no extra >> communication needed. I think so, too. > > Well, I think that this problem is more complex and requires more > discussion. > There are three places determining choice of compression algorithm: > 1. Specification of compression algorithm by client. Right now it is > just boolean "compression" parameter in connection string, > but it is obviously possible to specify concrete algorithm here. > 2. List of compression algorithms supported by client. > 3. List of compression algorithms supported by server. > > Concerning first option I have very serious doubt that it is good idea > to let client choose compression protocol. > Without extra round-trip it can be only implemented in this way: > if client toggles compression option in connection string, then libpq > includes in startup packet list of supported compression algorithms. > Then server intersects this list with its own set of supported > compression algorithms and if result is not empty, then > somehow choose one of the commonly supported algorithms and sends it to > the client with 'z' command. The easiest way, which I laid out earlier in id:jlgfu1gqjbk.fsf@redhat.com, is to have the server perform selection. The client sends a list of supported algorithms in startup. Startup has a reply, so if the server wishes to use compression, then its startup reply contains which algorithm to use. Compression then begins after startup. If you really wanted to compress half the startup for some reason, you can even have the server send a packet which consists of the choice of compression algorithm and everything else in it subsequently compressed. I don't see this being useful. However, you can use a similar approach to let the client choose the algorithm if there were some compelling reason for that (there is none I'm aware of to prefer one to the other) - startup from client requests compression, reply from server lists supported algorithms, next packet from client indicates which one is in use along with compressed payload. It may help to keep in mind that you are defining your own message type here. > Frankly speaking, I do not think that such flexibility in choosing > compression algorithms is really needed. > I do not expect that there will be many situations where old client has > to communicate with new server or visa versa. > In most cases both client and server belongs to the same postgres > distributive and so implements the same compression algorithm. > As far as we are compressing only temporary data (traffic), the problem > of providing backward compatibility seems to be not so important. Your comments have been heard, but this is the model that numerous folks from project has told you we have. Your code will not pass review without algorithm agility. >> src/backend/libpq/pqcomm.c : >> In current Postgres source code, pq_recvbuf() calls secure_read() >> and pq_getbyte_if_available() also calls secure_read(). >> It means these functions are on the same level. >> However in your change, pq_getbyte_if_available() calls pq_recvbuf(), >> and pq_recvbuf() calls secure_read(). 
The level of these functions is different. >> >> I think the purpose of pq_getbyte_if_available() is to get a >> character if it exists and the purpose of pq_recvbuf() is to acquire >> data up to the expected length. In your change, >> pq_getbyte_if_available() may have to do unnecessary process waiting >> or something. > > Sorry, but this change is essential. We can have some available data > in compression buffer and we need to try to fetch it in > pq_getbyte_if_available() instead of just returning EOF. Aya is correct about the purposes of these functions. Take a look at how the buffering in TLS or GSSAPI works for an example of how to do this correctly. As with agility, this is what multiple folks from the project have told you is a hard requirement. None of us will be okaying your code without proper transport layering. Thanks, --Robbie
Attachment
On 08.02.2019 12:33, Daniel Gustafsson wrote: >> On 8 Feb 2019, at 10:15, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: >> Frankly speaking, I do not think that such flexibility in choosing compression algorithms is really needed. >> I do not expect that there will be many situations where old client has to communicate with new server or visa versa. >> In most cases both client and server belongs to the same postgres distributive and so implements the same compression algorithm. >> As far as we are compressing only temporary data (traffic), the problem of providing backward compatibility seems to be not so important. > I don’t think this assumption is entirely valid, and would risk unnecessary breakage.

I am also not sure about this assumption. We (PostgresPro) are having some issues now with CFS (file-level compression of the Postgres database): some builds use zstd, some use zlib (the default)... zstd is faster, provides a better compression ratio and is available on most platforms, but zlib is available almost everywhere and is used by Postgres by default... The only thing I know for sure is that if we implement several algorithms and let the database user choose between them, we will get many more problems. Right now Postgres uses zlib as the only supported compression algorithm in many places. So maybe libpq compression should also use only zlib and provide no other choice?

Concerning backward compatibility: assume that we allow zstd, but then Facebook changes the zstd license or some critical bug is found in it, so we have to drop the dependency on zstd. Then there is no backward compatibility in any case, even if we support more sophisticated negotiation of the compression algorithm between client and server.

What could be more interesting is to support custom compression algorithms (optimized for a particular data flow). But that seems to be a completely different and much more sophisticated story: we would have to provide some mechanism for loading foreign libraries at the client side! IMHO it is overkill.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 08.02.2019 19:26, Robbie Harwood wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > >> On 08.02.2019 10:01, Iwata, Aya wrote: >> >>>> I fixed all issues you have reported except using list of supported >>>> compression algorithms. >>> Sure. I confirmed that. >>> >>>> It will require extra round of communication between client and >>>> server to make a decision about used compression algorithm. >>> In beginning of this thread, Robbie Harwood said that no extra >>> communication needed. I think so, too. >> Well, I think that this problem is more complex and requires more >> discussion. >> There are three places determining choice of compression algorithm: >> 1. Specification of compression algorithm by client. Right now it is >> just boolean "compression" parameter in connection string, >> but it is obviously possible to specify concrete algorithm here. >> 2. List of compression algorithms supported by client. >> 3. List of compression algorithms supported by server. >> >> Concerning first option I have very serious doubt that it is good idea >> to let client choose compression protocol. >> Without extra round-trip it can be only implemented in this way: >> if client toggles compression option in connection string, then libpq >> includes in startup packet list of supported compression algorithms. >> Then server intersects this list with its own set of supported >> compression algorithms and if result is not empty, then >> somehow choose one of the commonly supported algorithms and sends it to >> the client with 'z' command. > The easiest way, which I laid out earlier in > id:jlgfu1gqjbk.fsf@redhat.com, is to have the server perform selection. > The client sends a list of supported algorithms in startup. Startup has > a reply, so if the server wishes to use compression, then its startup > reply contains which algorithm to use. Compression then begins after > startup. > > If you really wanted to compress half the startup for some reason, you > can even have the server send a packet which consists of the choice of > compression algorithm and everything else in it subsequently > compressed. I don't see this being useful. However, you can use a > similar approach to let the client choose the algorithm if there were > some compelling reason for that (there is none I'm aware of to prefer > one to the other) - startup from client requests compression, reply from > server lists supported algorithms, next packet from client indicates > which one is in use along with compressed payload. I already replied you that that next package cannot indicate which algorithm the client has choosen. Using magics or some other tricks are not reliable and acceptable solution/ It can be done only by introducing extra message type. Actually it is not really needed, because if client sends to the server list of supported algorithms in startup packet, then server can made a decision and inform client about it (using special message type as it is done now). Then client doesn't need to make a choice. I can do it if everybody think that choice of compression algorithm is so important feature. Just wonder: Postgres now is living good with hardcoded zlib and built-in LZ compression algorithm implementation. > It may help to keep in mind that you are defining your own message type > here. > >> Frankly speaking, I do not think that such flexibility in choosing >> compression algorithms is really needed. 
>> I do not expect that there will be many situations where old client has >> to communicate with new server or visa versa. >> In most cases both client and server belongs to the same postgres >> distributive and so implements the same compression algorithm. >> As far as we are compressing only temporary data (traffic), the problem >> of providing backward compatibility seems to be not so important. > Your comments have been heard, but this is the model that numerous folks > from project has told you we have. Your code will not pass review > without algorithm agility. > >>> src/backend/libpq/pqcomm.c : >>> In current Postgres source code, pq_recvbuf() calls secure_read() >>> and pq_getbyte_if_available() also calls secure_read(). >>> It means these functions are on the same level. >>> However in your change, pq_getbyte_if_available() calls pq_recvbuf(), >>> and pq_recvbuf() calls secure_read(). The level of these functions is different. >>> >>> I think the purpose of pq_getbyte_if_available() is to get a >>> character if it exists and the purpose of pq_recvbuf() is to acquire >>> data up to the expected length. In your change, >>> pq_getbyte_if_available() may have to do unnecessary process waiting >>> or something. >> Sorry, but this change is essential. We can have some available data >> in compression buffer and we need to try to fetch it in >> pq_getbyte_if_available() instead of just returning EOF. > Aya is correct about the purposes of these functions. Take a look at > how the buffering in TLS or GSSAPI works for an example of how to do > this correctly. > > As with agility, this is what multiple folks from the project have told > you is a hard requirement. None of us will be okaying your code without > proper transport layering. Guys, I wondering which layering violation you are talking about? Right now there are two cut&pasted peace of almost the same code in pqcomm.c: static int pq_recvbuf(void) ... /* Ensure that we're in blocking mode */ socket_set_nonblocking(false); /* Can fill buffer from PqRecvLength and upwards */ for (;;) { int r; r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, PQ_RECV_BUFFER_SIZE - PqRecvLength); if (r < 0) { if (errno == EINTR) continue; /* Ok if interrupted */ /* * Careful: an ereport() that tries to write to the client would * cause recursion to here, leading to stack overflow and core * dump! This message must go *only* to the postmaster log. */ ereport(COMMERROR, (errcode_for_socket_access(), errmsg("could not receive data from client: %m"))); return EOF; } if (r == 0) { /* * EOF detected. We used to write a log message here, but it's * better to expect the ultimate caller to do that. */ return EOF; } /* r contains number of bytes read, so just incr length */ PqRecvLength += r; return 0; } } int pq_getbyte_if_available(unsigned char *c) { ... /* Put the socket into non-blocking mode */ socket_set_nonblocking(true); r = secure_read(MyProcPort, c, 1); if (r < 0) { /* * Ok if no data available without blocking or interrupted (though * EINTR really shouldn't happen with a non-blocking socket). Report * other errors. */ if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) r = 0; else { /* * Careful: an ereport() that tries to write to the client would * cause recursion to here, leading to stack overflow and core * dump! This message must go *only* to the postmaster log. 
*/ ereport(COMMERROR, (errcode_for_socket_access(), errmsg("could not receive data from client: %m"))); r = EOF; } } else if (r == 0) { /* EOF detected */ r = EOF; } return r; }

The only difference between them is that the first one uses blocking mode and the second one non-blocking mode. Also, the first one has a loop handling EINTR. I have added a nowait parameter to pq_recvbuf(bool nowait) and removed the second fragment of code, so it is reduced to one line:

if (PqRecvPointer < PqRecvLength || (r = pq_recvbuf(true)) > 0)

Both functions (pq_recvbuf and pq_getbyte_if_available) are defined in the same file, so there is no layering violation here; it is just elimination of duplicated code. If the only objection is that pq_getbyte_if_available is calling pq_recvbuf, then I can add another function, compress_read (as an analog of secure_read), and call it from both functions. But frankly speaking, I do not see any advantage in such an approach: it just introduces an extra function call and gives no extra encapsulation, modularity, flexibility or whatever else.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
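To make the described refactoring concrete, here is a rough sketch of what a unified pq_recvbuf(bool nowait) might look like. It is simplified: the COMMERROR reporting and the compression-buffer handling from the quoted code are omitted, and the exact return convention in the actual patch may differ.

static int
pq_recvbuf(bool nowait)
{
	socket_set_nonblocking(nowait);

	for (;;)
	{
		int			r;

		r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
						PQ_RECV_BUFFER_SIZE - PqRecvLength);
		if (r < 0)
		{
			if (errno == EINTR)
				continue;		/* Ok if interrupted, just retry */
			if (nowait && (errno == EAGAIN || errno == EWOULDBLOCK))
				return 0;		/* no data available without blocking */
			return EOF;			/* real error, reported via COMMERROR */
		}
		if (r == 0)
			return EOF;			/* connection closed by the client */

		PqRecvLength += r;
		return r;				/* > 0: number of bytes appended */
	}
}

/* pq_getbyte_if_available() then reduces to the one-liner quoted above: */
int
pq_getbyte_if_available(unsigned char *c)
{
	int			r = 0;

	if (PqRecvPointer < PqRecvLength || (r = pq_recvbuf(true)) > 0)
	{
		*c = PqRecvBuffer[PqRecvPointer++];
		r = 1;
	}
	return r;
}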
On 2019-02-08 12:15:58 +0300, Konstantin Knizhnik wrote: > Frankly speaking, I do not think that such flexibility in choosing > compression algorithms is really needed. > I do not expect that there will be many situations where old client has to > communicate with new server or visa versa. > In most cases both client and server belongs to the same postgres > distributive and so implements the same compression algorithm. > As far as we are compressing only temporary data (traffic), the problem of > providing backward compatibility seems to be not so important. I think we should outright reject any patch without compression type negotiation.
On 08.02.2019 21:57, Andres Freund wrote: > On 2019-02-08 12:15:58 +0300, Konstantin Knizhnik wrote: >> Frankly speaking, I do not think that such flexibility in choosing >> compression algorithms is really needed. >> I do not expect that there will be many situations where old client has to >> communicate with new server or visa versa. >> In most cases both client and server belongs to the same postgres >> distributive and so implements the same compression algorithm. >> As far as we are compressing only temporary data (traffic), the problem of >> providing backward compatibility seems to be not so important. > I think we should outright reject any patch without compression type > negotiation.

Does this mean that it is necessary to support multiple compression algorithms and make it possible to switch between them at runtime? Right now the compression algorithm is linked statically. Negotiation of the compression type is currently performed, but it only checks that server and client implement the same algorithm and disables compression if they do not.

If we are going to support multiple compression algorithms, do we need dynamic loading of the corresponding compression libraries, or is static linking OK? In the case of dynamic linking we would need to somehow specify the set of available compression algorithms - perhaps a special subdirectory that I can traverse, trying to load the corresponding libraries? It is just that I find this too complicated for the problem being addressed.
On 2/8/19 11:10 PM, Konstantin Knizhnik wrote: > > > On 08.02.2019 21:57, Andres Freund wrote: >> On 2019-02-08 12:15:58 +0300, Konstantin Knizhnik wrote: >>> Frankly speaking, I do not think that such flexibility in choosing >>> compression algorithms is really needed. >>> I do not expect that there will be many situations where old client >>> has to >>> communicate with new server or visa versa. >>> In most cases both client and server belongs to the same postgres >>> distributive and so implements the same compression algorithm. >>> As far as we are compressing only temporary data (traffic), the >>> problem of >>> providing backward compatibility seems to be not so important. >> I think we should outright reject any patch without compression type >> negotiation. > Does it mean that it is necessary to support multiple compression > algorithms and make it possible to perform switch between them at > runtime? IMHO the negotiation should happen at connection time, i.e. the server should support connections compressed by different algorithms. Not sure if that's what you mean by runtime. AFAICS this is quite close to how negotiation of encryption algorithms works, in TLS and so on. Client specifies supported algorithms, server compares that to list of supported algorithms, deduces the encryption algorithm and notifies the client. To allow fall-back to uncompressed connection, use "none" as algorithm. If there's no common algorithm, fail. > Right now compression algorithm is linked statically. > Negotiation of compression type is currently performed but it only > checks that server and client are implementing the same algorithm and > disables compression if it is not true. > I don't think we should automatically fall-back to disabled compression, when a client specifies compression algorithm. > If we are going to support multiple compression algorithms, do we need > dynamic loading of correspondent compression libraries or static linking > is ok? In case of dynamic linking we need to somehow specify information > about available compression algorithms. > Some special subdirectory for them so that I can traverse this directory > and try to load correspondent libraries? > > Only I find it too complicated for the addressed problem? > I don't think we need dynamic algorithms v1, but IMHO it'd be pretty simple to do - just add a shared_preload_library which registers it in a list in memory. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 09.02.2019 1:38, Tomas Vondra wrote: > On 2/8/19 11:10 PM, Konstantin Knizhnik wrote: >> >> On 08.02.2019 21:57, Andres Freund wrote: >>> On 2019-02-08 12:15:58 +0300, Konstantin Knizhnik wrote: >>>> Frankly speaking, I do not think that such flexibility in choosing >>>> compression algorithms is really needed. >>>> I do not expect that there will be many situations where old client >>>> has to >>>> communicate with new server or visa versa. >>>> In most cases both client and server belongs to the same postgres >>>> distributive and so implements the same compression algorithm. >>>> As far as we are compressing only temporary data (traffic), the >>>> problem of >>>> providing backward compatibility seems to be not so important. >>> I think we should outright reject any patch without compression type >>> negotiation. >> Does it mean that it is necessary to support multiple compression >> algorithms and make it possible to perform switch between them at >> runtime? > IMHO the negotiation should happen at connection time, i.e. the server > should support connections compressed by different algorithms. Not sure > if that's what you mean by runtime. > > AFAICS this is quite close to how negotiation of encryption algorithms > works, in TLS and so on. Client specifies supported algorithms, server > compares that to list of supported algorithms, deduces the encryption > algorithm and notifies the client. > > To allow fall-back to uncompressed connection, use "none" as algorithm. > If there's no common algorithm, fail. It is good analogue with SSL. Yes, SSL protocol provides several ways of authentication, encryption,... And there are several different libraries implementing SSL. But Postgres is using only one of them: OpenSSL. If I want to use some other library (for example to make it possible to serialize and pass SSL session state to other process), then there is no way to achieve it. Actually zstd also includes implementations of several compression algorithms and it choose one of them best fitting particular data stream. As in case of SSL, choice of algorithm is performed internally inside zstd - not at libpq level. Sorry, if my explanation about static and dynamic (at runtime) choice were not correct. This is how compression is toggled now: #if HAVE_LIBZSTD ZpqStream* zpq_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg) { ... } #endif So if Postgres was configured with zstd, then this implementation is included inclient and server Postgres libraries. If postgres is configures with zlib, them zlib implementation will be used. This is similar with using compression and most of other configurable features in Postgres. If we want to provide dynamic choice at runtime, then we need to have array with available compression algorithms: #if HAVE_LIBZSTD static ZpqStream* zstd_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg) { ... } #endif ZpqCompressorImpl compressorImpl[] = { #if HAVE_LIBZSTD {zstd_create, zstd_read,zstd_write,...}, #endif #if HAVE_ZLIB {zlib_create, zlib_read,zslib_write,...}, #endif ... } And the most interesting case is that if we load library dynamically. Each implementation is generated in separate library (for example libpztd.so). In this case we need to somehow specify available libraries. For example by placing them in separate directory, or specifying list of libraries in postgresql.conf. Then we try to load this library using dlopen. Such library has external dependencies of correspondent compressor library (for example -lz). 
The library can be successfully loaded only if the corresponding compressor library is installed on the system. This is the most flexible approach, allowing custom compressor implementations to be provided. A compression implementation can be organized as a Postgres extension whose _PG_init function registers the implementation in some list (a sketch of such dlopen-based loading follows after this message).

This is what I am asking about. Right now approach 1) is implemented: the compression algorithm is defined by configure. It is not so difficult to extend it to support multiple algorithms. The most flexible, but more sophisticated, option is to load the libraries dynamically.

> >> Right now compression algorithm is linked statically. >> Negotiation of compression type is currently performed but it only >> checks that server and client are implementing the same algorithm and >> disables compression if it is not true. >> > I don't think we should automatically fall-back to disabled compression, > when a client specifies compression algorithm.

Compression is disabled only when client and server were configured with different compression algorithms (i.e. zstd and zlib).

> >> If we are going to support multiple compression algorithms, do we need >> dynamic loading of correspondent compression libraries or static linking >> is ok? In case of dynamic linking we need to somehow specify information >> about available compression algorithms. >> Some special subdirectory for them so that I can traverse this directory >> and try to load correspondent libraries? >> >> Only I find it too complicated for the addressed problem? >> > I don't think we need dynamic algorithms v1, but IMHO it'd be pretty > simple to do - just add a shared_preload_library which registers it in a > list in memory.

I do not think that it is necessary to include such libraries in the shared_preload_libraries list. It can be done lazily, only if compression is requested by the client. Also please notice that we need to load the compression library both at the server and the client side; shared_preload_libraries works only for the postmaster.
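A hypothetical sketch of the dlopen-based variant mentioned above: each compressor would live in its own shared library exporting a known symbol that returns its descriptor. The library path and the symbol name here are purely illustrative, not part of the patch.

#include <dlfcn.h>
#include <stdio.h>

typedef struct ZpqCompressorImpl ZpqCompressorImpl;	/* as in the array shown above */

static const ZpqCompressorImpl *
zpq_load_compressor(const char *libpath)
{
	void	   *handle = dlopen(libpath, RTLD_NOW | RTLD_LOCAL);
	const ZpqCompressorImpl *(*get_impl) (void);

	if (handle == NULL)
	{
		fprintf(stderr, "could not load %s: %s\n", libpath, dlerror());
		return NULL;
	}

	/* Each library exports one entry point returning its descriptor. */
	get_impl = (const ZpqCompressorImpl *(*) (void))
		dlsym(handle, "zpq_compressor_impl");
	if (get_impl == NULL)
	{
		dlclose(handle);
		return NULL;
	}
	return get_impl();
}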
Hi, On 2019-02-08 23:38:12 +0100, Tomas Vondra wrote: > On 2/8/19 11:10 PM, Konstantin Knizhnik wrote: > > Does it mean that it is necessary to support multiple compression > > algorithms and make it possible to perform switch between them at > > runtime? > > IMHO the negotiation should happen at connection time, i.e. the server > should support connections compressed by different algorithms. Exactly. And support client libraries with support for different compression algorithms. We have large forward and backward compatibility with libpq and even moreso with the wire protocol, we therefore shouldn't require exactly matching client / server versions or compilation options. > Not sure if that's what you mean by runtime. Same. > AFAICS this is quite close to how negotiation of encryption algorithms > works, in TLS and so on. Client specifies supported algorithms, server > compares that to list of supported algorithms, deduces the encryption > algorithm and notifies the client. > > To allow fall-back to uncompressed connection, use "none" as algorithm. > If there's no common algorithm, fail. I'm somewhat inclined to think that not compressing is preferrable. There's some reason to think that there's some cryptographical issues around compression, therefore it could make a ton of sense for specific servers to disable compression centrally. > > Right now compression algorithm is linked statically. > > Negotiation of compression type is currently performed but it only > > checks that server and client are implementing the same algorithm and > > disables compression if it is not true. > I don't think we should automatically fall-back to disabled compression, > when a client specifies compression algorithm. Why? > > If we are going to support multiple compression algorithms, do we need > > dynamic loading of correspondent compression libraries or static linking > > is ok? In case of dynamic linking we need to somehow specify information > > about available compression algorithms. > > Some special subdirectory for them so that I can traverse this directory > > and try to load correspondent libraries? > > > > Only I find it too complicated for the addressed problem? > > > > I don't think we need dynamic algorithms v1, but IMHO it'd be pretty > simple to do - just add a shared_preload_library which registers it in a > list in memory. I personally don't think that's something we need to support. There's a lot of issues around naming registries that need to synchronized between client/server. Greetings, Andres Freund
On 2/9/19 3:14 PM, Andres Freund wrote: > Hi, > > On 2019-02-08 23:38:12 +0100, Tomas Vondra wrote: >> On 2/8/19 11:10 PM, Konstantin Knizhnik wrote: >>> Does it mean that it is necessary to support multiple compression >>> algorithms and make it possible to perform switch between them at >>> runtime? >> >> IMHO the negotiation should happen at connection time, i.e. the server >> should support connections compressed by different algorithms. > > Exactly. And support client libraries with support for different > compression algorithms. We have large forward and backward compatibility > with libpq and even moreso with the wire protocol, we therefore > shouldn't require exactly matching client / server versions or > compilation options. > IMHO the main reason to want/need this is not as much backward/forward compatibility (in the sense of us adding/removing supported algorithms). A much bigger problem (IMHO) is that different systems may not have some of the libraries needed, or maybe the packager decided not to enable a particular library, etc. That's likely far more dynamic. > >> AFAICS this is quite close to how negotiation of encryption algorithms >> works, in TLS and so on. Client specifies supported algorithms, server >> compares that to list of supported algorithms, deduces the encryption >> algorithm and notifies the client. >> >> To allow fall-back to uncompressed connection, use "none" as algorithm. >> If there's no common algorithm, fail. > > I'm somewhat inclined to think that not compressing is > preferrable. There's some reason to think that there's some > cryptographical issues around compression, therefore it could make a ton > of sense for specific servers to disable compression centrally. > I agree compression is not the same as crypto in this case, so fallback to uncompressed connection might be sensible in some cases. But in other cases I probably want to be notified ASAP that the compression does not work, before pushing 10x the amount of data through the network. That's why I think why I proposed to also have "none" as compression algorithm, to allow/disallow the fallback. > >>> Right now compression algorithm is linked statically. >>> Negotiation of compression type is currently performed but it only >>> checks that server and client are implementing the same algorithm and >>> disables compression if it is not true. > >> I don't think we should automatically fall-back to disabled compression, >> when a client specifies compression algorithm. > > Why? > Because it's pretty difficult to realize it happened, particularly for non-interactive connections. Imagine you have machines where transfer between them is not free - surely you want to know when you suddenly get 10x the traffic, right? But as I said above, this should be configurable. IMHO having "none" algorithm to explicitly enable this seems like a reasonable solution. > >>> If we are going to support multiple compression algorithms, do we need >>> dynamic loading of correspondent compression libraries or static linking >>> is ok? In case of dynamic linking we need to somehow specify information >>> about available compression algorithms. >>> Some special subdirectory for them so that I can traverse this directory >>> and try to load correspondent libraries? >>> >>> Only I find it too complicated for the addressed problem? >>> >> >> I don't think we need dynamic algorithms v1, but IMHO it'd be pretty >> simple to do - just add a shared_preload_library which registers it in a >> list in memory. 
> > I personally don't think that's something we need to support. There's a > lot of issues around naming registries that need to synchronized between > client/server. > Don't we have that issue even without the dynamic registration? Let's say you have custom driver implementing the protocol (but not based on libpq). Surely that assumes the algorithm names match what we have? In the worst case the decompression fails (which may happen already anyway) and the connection dies. Not a big deal, no? But I agree v1 should not include this dynamic registration. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/9/19 3:02 PM, Konstantin Knizhnik wrote: > > > On 09.02.2019 1:38, Tomas Vondra wrote: >> On 2/8/19 11:10 PM, Konstantin Knizhnik wrote: >>> >>> On 08.02.2019 21:57, Andres Freund wrote: >>>> On 2019-02-08 12:15:58 +0300, Konstantin Knizhnik wrote: >>>>> Frankly speaking, I do not think that such flexibility in choosing >>>>> compression algorithms is really needed. >>>>> I do not expect that there will be many situations where old client >>>>> has to >>>>> communicate with new server or visa versa. >>>>> In most cases both client and server belongs to the same postgres >>>>> distributive and so implements the same compression algorithm. >>>>> As far as we are compressing only temporary data (traffic), the >>>>> problem of >>>>> providing backward compatibility seems to be not so important. >>>> I think we should outright reject any patch without compression type >>>> negotiation. >>> Does it mean that it is necessary to support multiple compression >>> algorithms and make it possible to perform switch between them at >>> runtime? >> IMHO the negotiation should happen at connection time, i.e. the server >> should support connections compressed by different algorithms. Not sure >> if that's what you mean by runtime. >> >> AFAICS this is quite close to how negotiation of encryption algorithms >> works, in TLS and so on. Client specifies supported algorithms, server >> compares that to list of supported algorithms, deduces the encryption >> algorithm and notifies the client. >> >> To allow fall-back to uncompressed connection, use "none" as algorithm. >> If there's no common algorithm, fail. > > It is good analogue with SSL. > Yes, SSL protocol provides several ways of authentication, encryption,... > And there are several different libraries implementing SSL. > But Postgres is using only one of them: OpenSSL. > If I want to use some other library (for example to make it possible to > serialize and pass SSL session state to other > process), then there is no way to achieve it. > That's rather misleading. Firstly, it's true we only support OpenSSL at the moment, but I do remember we've been working on adding support to a bunch of other TLS libraries. But more importantly, it's not the TLS library that's negotiated. It's the encryption algorithms that is negotiated. The server is oblivious which TLS library is used by the client (and vice versa), because the messages are the same - what matters is that they agree on keys, ciphers, etc. And those can differ/change between libraries or even versions of the same library. For us, the situation is the same - we have the messages specified by the FE/BE protocol, and it's the algorithms that are negotiated. > Actually zstd also includes implementations of several compression > algorithms and it choose one of them best fitting particular data > stream. As in case of SSL, choice of algorithm is performed internally > inside zstd - not at libpq level. > Really? I always thought zstd is a separate compression algorithm. There's adaptive compression feature, but AFAIK that essentially tweaks compression level based on network connection. Can you point me to the sources or docs explaining this? Anyway, this does not really change anything - it's internal zstd stuff. > Sorry, if my explanation about static and dynamic (at runtime) choice > were not correct. > This is how compression is toggled now: > > #if HAVE_LIBZSTD > ZpqStream* > zpq_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg) > { > ... 
> } > #endif > > So if Postgres was configured with zstd, then this implementation is > included inclient and server Postgres libraries. > If postgres is configures with zlib, them zlib implementation will be > used. > This is similar with using compression and most of other configurable > features in Postgres. > > If we want to provide dynamic choice at runtime, then we need to have > array with available compression algorithms: > > #if HAVE_LIBZSTD > static ZpqStream* > zstd_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg) > { > ... > } > #endif > > ZpqCompressorImpl compressorImpl[] = > { > #if HAVE_LIBZSTD > {zstd_create, zstd_read,zstd_write,...}, > #endif > #if HAVE_ZLIB > {zlib_create, zlib_read,zslib_write,...}, > #endif > ... > } > Yes, that's mostly what I've been imagining, except that you also need some sort of identifier for the algorithm - a cstring at the beginning of the struct should be enough, I guess. > And the most interesting case is that if we load library dynamically. > Each implementation is generated in separate library (for example > libpztd.so). > In this case we need to somehow specify available libraries. > For example by placing them in separate directory, or specifying list of > libraries in postgresql.conf. > Then we try to load this library using dlopen. Such library has > external dependencies of correspondent compressor library (for example > -lz). The library can be successfully loaded if there correspond > compressor implementation was install at the system. > This is most flexible approach allowing to provide custom implementation > of compressors. > Compression implementation can be organized as Postgres extension and > its PG_init function registers this implementation in some list. > How you could make them as extensions? Those are database-specific and the authentication happens before you have access to the database. As I said before, I think adding them using shared_preload_libraries and registering them in _PG_init should be sufficient. > This is what I am asking about. > Right now approach 1) is implemented: compression algorithm is defined > by configure. > It is no so difficult to extend it to support multiple algorithms. > And the most flexible but more sophisticated is to load libraries > dynamically. > Well, there's nothing stopping you from implementing the dynamic loading, but IMHO it makes v1 unnecessarily complex. >> >>> Right now compression algorithm is linked statically. >>> Negotiation of compression type is currently performed but it only >>> checks that server and client are implementing the same algorithm and >>> disables compression if it is not true. >>> >> I don't think we should automatically fall-back to disabled compression, >> when a client specifies compression algorithm. > > Compression is disabled only when client and server were configured with > different compression algorithms (i.e. zstd and zlib). > Yes, and I'm of the opinion we shouldn't do that, unless unless both sides explicitly enable that in some way. >> >>> If we are going to support multiple compression algorithms, do we need >>> dynamic loading of correspondent compression libraries or static linking >>> is ok? In case of dynamic linking we need to somehow specify information >>> about available compression algorithms. >>> Some special subdirectory for them so that I can traverse this directory >>> and try to load correspondent libraries? >>> >>> Only I find it too complicated for the addressed problem? 
>>> >> I don't think we need dynamic algorithms v1, but IMHO it'd be pretty >> simple to do - just add a shared_preload_library which registers it in a >> list in memory. > > I do not think that it is necessary to include such libraries in > preload_shared_libraries list. > It can be done lazily only of compression is requested by client. > Also please notice that we need to load compression library both at > server and client sides. > preload_shared_libraries works only for postmaster.

How would you know which libraries to load for a given compression algorithm? Surely, loading all available libraries just because they might happen to implement the requested algorithm seems bad? IMHO the shared_preload_libraries approach is a much safer (and working) one. But I'd just leave this aside, because trying to pack all of this into v1 just increases the likelihood of it not getting committed in time. And the fact that we don't have any such infrastructure in the client just increases the risk.

+1 to go with hard-coded list of supported algorithms in v1

regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
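Putting the two suggestions together (a hard-coded list for v1, plus a name identifier at the beginning of the struct), the compressor table from earlier in the thread might look roughly like this. The zstd_*/zlib_* create/read/write callbacks follow the naming used in the snippets above; the free callbacks and the lookup helper are hypothetical additions, and the zpq_* typedefs are assumed to come from the patch.

typedef struct ZpqCompressorImpl
{
	const char *name;			/* identifier used during negotiation, e.g. "zstd" */
	ZpqStream  *(*create) (zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg);
	ssize_t		(*read) (ZpqStream *zs, void *buf, size_t size);
	ssize_t		(*write) (ZpqStream *zs, void const *buf, size_t size);
	void		(*free) (ZpqStream *zs);
} ZpqCompressorImpl;

static const ZpqCompressorImpl compressors[] =
{
#if HAVE_LIBZSTD
	{"zstd", zstd_create, zstd_read, zstd_write, zstd_free},
#endif
#if HAVE_ZLIB
	{"zlib", zlib_create, zlib_read, zlib_write, zlib_free},
#endif
};

/* Look up the implementation matching the name agreed on at startup. */
static const ZpqCompressorImpl *
zpq_find_compressor(const char *name)
{
	for (size_t i = 0; i < sizeof(compressors) / sizeof(compressors[0]); i++)
	{
		if (strcmp(compressors[i].name, name) == 0)
			return &compressors[i];
	}
	return NULL;
}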
On 10.02.2019 3:25, Tomas Vondra wrote:
> +1 to go with hard-coded list of supported algorithms in v1

Ok, I will implement support for multiple configured compression algorithms.
Concerning the use of several different compression algorithms inside zstd - I was not correct.
It combines LZ77 with an entropy coding stage and can adaptively adjust the compression ratio according to the load.
I will preface this by saying that I am not a security guy and also do not know how Zstd compression works, so take anything I say with a grain of salt.

On 2/8/19 8:14 AM, Andres Freund wrote: > I think compression is pretty useful, and I'm not convinced that the > threat model underlying the attacks on SSL really apply to postgres.

I think that is only because it is usually harder to intercept traffic between the application server and the database than between the web browser and the web server.

Imagine the following query, which uses the session ID from the cookie to check if the logged-in user has access to a file.

SELECT may_download_file(session_id => $1, path => $2);

When the query with its parameters is compressed, the compressed size will depend on the similarity between the session ID and the requested path (assuming Zstd works similarly to DEFLATE). So, by tricking the web browser into making requests with specifically crafted paths while monitoring the traffic between the web server and the database, the compressed request size can be used to home in on the session ID and steal people's login sessions, just like the CRIME attack[1].

So while compression is a very useful feature, I am worried that it also opens application developers up to a new set of security vulnerabilities which they were previously protected from when compression was removed from SSL.

1. https://en.wikipedia.org/wiki/CRIME

Andreas
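To make the length side channel concrete, here is a small standalone illustration using zlib (this is not code from the patch; the secret and the guessed paths are made up). A request whose guessed path repeats part of the session ID will usually compress to slightly fewer bytes than one that does not, and that difference is all an attacker needs to observe.

#include <stdio.h>
#include <string.h>
#include <zlib.h>

static unsigned long
compressed_size(const char *session_id, const char *path)
{
	char		msg[256];
	Bytef		out[512];
	uLongf		outlen = sizeof(out);

	/* Build the query text as the server would see it, then compress it. */
	snprintf(msg, sizeof(msg),
			 "SELECT may_download_file(session_id => '%s', path => '%s');",
			 session_id, path);
	compress2(out, &outlen, (const Bytef *) msg, strlen(msg), 9);
	return (unsigned long) outlen;
}

int
main(void)
{
	const char *secret = "8f14e45fceea167a";	/* the value the attacker wants */

	printf("unrelated guess:       %lu bytes\n",
		   compressed_size(secret, "/download/0a9b8c7d6e5f4321"));
	printf("matching-prefix guess: %lu bytes\n",
		   compressed_size(secret, "/download/8f14e45fceea167a"));
	return 0;
}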
On 11.02.2019 2:36, Andreas Karlsson wrote: > I will preface this by saying that I am not a security guy and also do > not know how Zstd compression works, so take anything I say > with a grain of salt. > > On 2/8/19 8:14 AM, Andres Freund wrote: > I think compression is pretty > useful, and I'm not convinced that the >> threat model underlying the attacks on SSL really apply to postgres. > I think that is only because it is usually harder to intercept traffic between > the application server and the database than between the web browser > and the web server. > > Imagine the following query which uses the session ID from the cookie > to check if the logged in user has access to a file. > > SELECT may_download_file(session_id => $1, path => $2); > > When the query with its parameters is compressed the compressed size > will depend on the similarity between the session ID and the requested > path (assuming Zstd works similar to DEFLATE), so by tricking the web > browser into making requests with specifically crafted paths while > monitoring the traffic between the web server and the database the > compressed request size can be used to home in on the session ID and steal > people's login sessions, just like the CRIME attack[1]. > > So while compression is a very useful feature I am worried that it > also opens application developers to a new set of security > vulnerabilities which they previously were protected from when > compression was removed from SSL. > > 1. https://en.wikipedia.org/wiki/CRIME > > Andreas

Andreas, thank you for the clarification. Such an attack is indeed possible. But as far as I understand, it requires being able to observe the traffic between the application server and the database. Also, such an attack is possible only if the session_id can somehow be "guessed"; if it is just a big random number, then it is very unlikely that it can be recovered this way. But once again - I am not an expert in cryptography.

And this patch is not addressing the SSL compression vulnerabilities - I agree that compression at the libpq level is no safer than SSL-level compression. The goal was to support compression without using SSL. It seems to me that there are many cases where security is not required but reducing network traffic is desired. The best example is replication between nodes in a local network.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
I have attached a new version of the patch, which supports a choice between different compression algorithms. Right now only zstd and zlib are supported. If Postgres is configured with both of them, then zstd will be used.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Mon, Feb 11, 2019 at 05:56:24PM +0300, Konstantin Knizhnik wrote: > > Also such attack is possible only if session_id can be somehow "guessed". If > it is just big random number, then it is very unlikely that it can be hacked > in in this way. I am not arguing against compression, but this point isn't exactly true. The _uniformity_ of the key makes a big difference in the practicality of the attack, not the total entropy. For example, if the session_id was a 128 bit hex string and I knew or guessed the characters before the secret part and could send data that ended up near the secret then I can guess one character at a time and infer the guess is correct when the size of the packet gets smaller. IOW, I really only have to guess with 1/16 odds each digit (because its a hex string in this example). In the case, the 128 bit secret only provides the effective protection of an 8-bit secret because it can be guessed left to right 4 bits at a time. Garick
On 2019-Feb-11, Andreas Karlsson wrote: > Imagine the following query which uses the session ID from the cookie to > check if the logged in user has access to a file. > > SELECT may_download_file(session_id => $1, path => $2); > > When the query with its parameters is compressed the compressed size will > depend on the similarity between the session ID and the requested path > (assuming Zstd works similar to DEFLATE), so by tricking the web browser > into making requests with specifically crafted paths while monitoring the > traffic between the web server and the database the compressed request size > can be use to hone in the session ID and steal people's login sessions, just > like the CRIME attack[1]. I would have said that you'd never let the attacker eavesdrop into the traffic between webserver and DB, but then that's precisely the scenario you'd use SSL for, so I suppose that even though this attack is probably just a theoretical risk at this point, it should definitely be considered. Now, does this mean that we should forbid libpq compression completely? I'm not sure -- maybe it's still usable if you encapsulate that traffic so that the attackers cannot get at it, and there's no reason to deprive those users of the usefulness of the feature. But then we need documentation warnings pointing out the risks. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-02-11 12:46:07 -0300, Alvaro Herrera wrote: > On 2019-Feb-11, Andreas Karlsson wrote: > > > Imagine the following query which uses the session ID from the cookie to > > check if the logged in user has access to a file. > > > > SELECT may_download_file(session_id => $1, path => $2); > > > > When the query with its parameters is compressed the compressed size will > > depend on the similarity between the session ID and the requested path > > (assuming Zstd works similar to DEFLATE), so by tricking the web browser > > into making requests with specifically crafted paths while monitoring the > > traffic between the web server and the database the compressed request size > > can be use to hone in the session ID and steal people's login sessions, just > > like the CRIME attack[1]. > > I would have said that you'd never let the attacker eavesdrop into the > traffic between webserver and DB, but then that's precisely the scenario > you'd use SSL for, so I suppose that even though this attack is probably > just a theoretical risk at this point, it should definitely be > considered. Right. > Now, does this mean that we should forbid libpq compression > completely? I'm not sure -- maybe it's still usable if you encapsulate > that traffic so that the attackers cannot get at it, and there's no > reason to deprive those users of the usefulness of the feature. But > then we need documentation warnings pointing out the risks. I think it's an extremely useful feature, and can be used in situation where this kind of attack doesn't pose a significant danger. E.g. pg_dump, pg_basebackup seem candidates for that, and even streaming replication seems much less endangered than sessions with lots of very small queries. But I think that means it needs to be controllable from both the server and client, and default to off (although it doesn't seem crazy to allow it in the aforementioned cases). I suspect we'd need a febe_compression = allow | off | on type setting for both client and server, and it'd default to allow for both sides (which means it'd not be used by default, but one side can opt in). Greetings, Andres Freund
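The allow/off/on idea sketched above boils down to a very small decision table. A hypothetical helper (the setting name and the enum are invented here, not part of any patch) could look like this: compression is used only when neither side forbids it and at least one side explicitly opts in, so an allow/allow default stays uncompressed.

/*
 * Hypothetical sketch of the febe_compression = off | allow | on negotiation
 * described above; all names are invented for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

typedef enum
{
	COMPRESSION_OFF,			/* never compress */
	COMPRESSION_ALLOW,			/* compress only if the other side asks for it */
	COMPRESSION_ON				/* actively request compression */
} CompressionMode;

static bool
use_compression(CompressionMode client, CompressionMode server)
{
	if (client == COMPRESSION_OFF || server == COMPRESSION_OFF)
		return false;			/* either side can veto */
	/* allow/allow means nobody opted in, so stay uncompressed by default */
	return client == COMPRESSION_ON || server == COMPRESSION_ON;
}

int
main(void)
{
	printf("allow/allow -> %d\n", use_compression(COMPRESSION_ALLOW, COMPRESSION_ALLOW));	/* 0 */
	printf("on/allow    -> %d\n", use_compression(COMPRESSION_ON, COMPRESSION_ALLOW));		/* 1 */
	printf("on/off      -> %d\n", use_compression(COMPRESSION_ON, COMPRESSION_OFF));		/* 0 */
	return 0;
}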
On 11/02/2019 00:36, Andreas Karlsson wrote: >> threat model underlying the attacks on SSL really apply to postgres. > I think only because it is usually harder to intercept traffic between > the application server and the database than between the we bbrowser and > the web server. > > Imagine the following query which uses the session ID from the cookie to > check if the logged in user has access to a file. > > SELECT may_download_file(session_id => $1, path => $2); > > When the query with its parameters is compressed the compressed size > will depend on the similarity between the session ID and the requested > path (assuming Zstd works similar to DEFLATE), so by tricking the web > browser into making requests with specifically crafted paths while > monitoring the traffic between the web server and the database the > compressed request size can be use to hone in the session ID and steal > people's login sessions, just like the CRIME attack[1]. One mitigation is to not write code like that, that is, don't put secret parameters and user-supplied content into the same to-be-compressed chunk, or at least don't let the end user run that code at will in a tight loop. The difference in CRIME is that the attacker supplied the code. You'd trick the user to go to http://evil.com/ (via spam), and that site automatically runs JavaScript code in the user's browser that contacts https://bank.com/, which will then automatically send along any secret cookies the user had previously saved from bank.com. The evil JavaScript code can then stuff the requests to bank.com with arbitrary bytes and run the requests in a tight loop, only subject to rate controls at bank.com. The closest equivalent to that in PostgreSQL is leading a user to a fake server and having them run their \gexec-using psql script against that. However, the difference is that a web browser would then augment those outgoing requests with locally stored domain-specific cookie data. psql doesn't have such functionality. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/11/19 5:28 PM, Peter Eisentraut wrote: > One mitigation is to not write code like that, that is, don't put secret > parameters and user-supplied content into the same to-be-compressed > chunk, or at least don't let the end user run that code at will in a > tight loop. > > The difference in CRIME is that the attacker supplied the code. You'd > trick the user to go to http://evil.com/ (via spam), and that site > automatically runs JavaScript code in the user's browser that contacts > https://bank.com/, which will then automatically send along any secret > cookies the user had previously saved from bank.com. The evil > JavaScript code can then stuff the requests to bank.com with arbitrary > bytes and run the requests in a tight loop, only subject to rate > controls at bank.com. Right, CRIME is worse since it cannot be mitigated by the application developer. But even so I do not think that my query is that odd. I do not think that it is obvious to most application developer that putting user supplied data close to sensitive data is potentially dangerous. Will this attack ever be useful in practice? No idea, but I think we should be aware of what risks we open our end users to. Andreas
On 12.02.2019 17:05, Andreas Karlsson wrote: > On 2/11/19 5:28 PM, Peter Eisentraut wrote: >> One mitigation is to not write code like that, that is, don't put secret >> parameters and user-supplied content into the same to-be-compressed >> chunk, or at least don't let the end user run that code at will in a >> tight loop. >> >> The difference in CRIME is that the attacker supplied the code. You'd >> trick the user to go to http://evil.com/ (via spam), and that site >> automatically runs JavaScript code in the user's browser that contacts >> https://bank.com/, which will then automatically send along any secret >> cookies the user had previously saved from bank.com. The evil >> JavaScript code can then stuff the requests to bank.com with arbitrary >> bytes and run the requests in a tight loop, only subject to rate >> controls at bank.com. > > Right, CRIME is worse since it cannot be mitigated by the application > developer. But even so I do not think that my query is that odd. I do > not think that it is obvious to most application developer that > putting user supplied data close to sensitive data is potentially > dangerous. > > Will this attack ever be useful in practice? No idea, but I think we > should be aware of what risks we open our end users to. > > Andreas > Attached please find updated version of the patch with more comments and warning about possible vulnerabilities of using compression in documentation. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Mon, Feb 11, 2019 at 4:52 PM Andres Freund <andres@anarazel.de> wrote: > > On 2019-02-11 12:46:07 -0300, Alvaro Herrera wrote: > > On 2019-Feb-11, Andreas Karlsson wrote: > > > > > Imagine the following query which uses the session ID from the cookie to > > > check if the logged in user has access to a file. > > > > > > SELECT may_download_file(session_id => $1, path => $2); > > > > > > When the query with its parameters is compressed the compressed size will > > > depend on the similarity between the session ID and the requested path > > > (assuming Zstd works similar to DEFLATE), so by tricking the web browser > > > into making requests with specifically crafted paths while monitoring the > > > traffic between the web server and the database the compressed request size > > > can be use to hone in the session ID and steal people's login sessions, just > > > like the CRIME attack[1]. > > > > I would have said that you'd never let the attacker eavesdrop into the > > traffic between webserver and DB, but then that's precisely the scenario > > you'd use SSL for, so I suppose that even though this attack is probably > > just a theoretical risk at this point, it should definitely be > > considered. > > Right. > > > > Now, does this mean that we should forbid libpq compression > > completely? I'm not sure -- maybe it's still usable if you encapsulate > > that traffic so that the attackers cannot get at it, and there's no > > reason to deprive those users of the usefulness of the feature. But > > then we need documentation warnings pointing out the risks. > > I think it's an extremely useful feature, and can be used in situation > where this kind of attack doesn't pose a significant > danger. E.g. pg_dump, pg_basebackup seem candidates for that, and even > streaming replication seems much less endangered than sessions with lots > of very small queries. But I think that means it needs to be > controllable from both the server and client, and default to off > (although it doesn't seem crazy to allow it in the aforementioned > cases). Totally agree with this point. On Thu, Nov 29, 2018 at 8:13 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote: > > > On Mon, Aug 13, 2018 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > I agree with the critiques from Robbie Harwood and Michael Paquier > > that the way in that compression is being hooked into the existing > > architecture looks like a kludge. I'm not sure I know exactly how it > > should be done, but the current approach doesn't look natural; it > > looks like it was bolted on. > > After some time spend reading this patch and investigating different points, > mentioned in the discussion, I tend to agree with that. As far as I see it's > probably the biggest disagreement here, that keeps things from progressing. To address this point, I've spent some time playing with the patch and to see how it would look like if compression logic would not incorporate everything else. So far I've come up with a buffers juggling that you can find in the attachments. It's based on the v11 (I'm not able to keep up with Konstantin), doesn't change all relevant code (mostly stuff around secure read/write), and most likely misses a lot of details, but at least works as expected while testing manually with psql (looks like tests are broken with the original v11, so I haven't tried to run them with the attached patch either). I wonder if this approach could be more natural for the feature? 
Also I have a few questions/comments:

* In this discussion a situation was mentioned a few times where the compression logic needs to read more data before it can decompress the current buffer. Do I understand correctly that this is related to Z_BUF_ERROR, when zlib couldn't make any progress and needs more data? If yes, where is it handled?

* One of the points was that the code would have to be copy-pasted between be/fe for a proper separation, but from the attached experimental patch it doesn't look like a lot of code would have to be duplicated.

* I've noticed a problem on v11: an attempt to connect to a db without compression specified fails with the error "server is not supported requested compression algorithm", e.g. when I'm just executing createdb.

I haven't looked at the negotiation logic yet; I hope to do that soon.
Attachment
Hi Dmitry, On 13.02.2019 16:56, Dmitry Dolgov wrote: > On Mon, Feb 11, 2019 at 4:52 PM Andres Freund <andres@anarazel.de> wrote: >> On 2019-02-11 12:46:07 -0300, Alvaro Herrera wrote: >>> On 2019-Feb-11, Andreas Karlsson wrote: >>> >>>> Imagine the following query which uses the session ID from the cookie to >>>> check if the logged in user has access to a file. >>>> >>>> SELECT may_download_file(session_id => $1, path => $2); >>>> >>>> When the query with its parameters is compressed the compressed size will >>>> depend on the similarity between the session ID and the requested path >>>> (assuming Zstd works similar to DEFLATE), so by tricking the web browser >>>> into making requests with specifically crafted paths while monitoring the >>>> traffic between the web server and the database the compressed request size >>>> can be use to hone in the session ID and steal people's login sessions, just >>>> like the CRIME attack[1]. >>> I would have said that you'd never let the attacker eavesdrop into the >>> traffic between webserver and DB, but then that's precisely the scenario >>> you'd use SSL for, so I suppose that even though this attack is probably >>> just a theoretical risk at this point, it should definitely be >>> considered. >> Right. >> >> >>> Now, does this mean that we should forbid libpq compression >>> completely? I'm not sure -- maybe it's still usable if you encapsulate >>> that traffic so that the attackers cannot get at it, and there's no >>> reason to deprive those users of the usefulness of the feature. But >>> then we need documentation warnings pointing out the risks. >> I think it's an extremely useful feature, and can be used in situation >> where this kind of attack doesn't pose a significant >> danger. E.g. pg_dump, pg_basebackup seem candidates for that, and even >> streaming replication seems much less endangered than sessions with lots >> of very small queries. But I think that means it needs to be >> controllable from both the server and client, and default to off >> (although it doesn't seem crazy to allow it in the aforementioned >> cases). > Totally agree with this point. > > On Thu, Nov 29, 2018 at 8:13 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote: >>> On Mon, Aug 13, 2018 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote: >>> >>> I agree with the critiques from Robbie Harwood and Michael Paquier >>> that the way in that compression is being hooked into the existing >>> architecture looks like a kludge. I'm not sure I know exactly how it >>> should be done, but the current approach doesn't look natural; it >>> looks like it was bolted on. >> After some time spend reading this patch and investigating different points, >> mentioned in the discussion, I tend to agree with that. As far as I see it's >> probably the biggest disagreement here, that keeps things from progressing. > To address this point, I've spent some time playing with the patch and to see > how it would look like if compression logic would not incorporate everything > else. So far I've come up with a buffers juggling that you can find in the > attachments. It's based on the v11 (I'm not able to keep up with Konstantin), > doesn't change all relevant code (mostly stuff around secure read/write), and > most likely misses a lot of details, but at least works as expected > while testing > manually with psql (looks like tests are broken with the original v11, so I > haven't tried to run them with the attached patch either). I wonder if this > approach could be more natural for the feature? 
> Also I have a few questions/comments:
>
> * In this discussion a situation was mentioned a few times where the compression logic needs to read more data before it can decompress the current buffer. Do I understand correctly that this is related to Z_BUF_ERROR, when zlib couldn't make any progress and needs more data? If yes, where is it handled?
>
> * One of the points was that the code would have to be copy-pasted between be/fe for a proper separation, but from the attached experimental patch it doesn't look like a lot of code would have to be duplicated.
>
> * I've noticed a problem on v11: an attempt to connect to a db without compression specified fails with the error "server is not supported requested compression algorithm", e.g. when I'm just executing createdb.
>
> I haven't looked at the negotiation logic yet; I hope to do that soon.

First of all, thank you for attempting to push this patch forward, because there really seems to be some disagreement that blocks its progress. Unfortunately the first reviewer (Robbie Harwood) thinks that my approach causes a layering violation and should be done in some other way. Robbie several times suggested that I look at "how the buffering in TLS or GSSAPI works" and do it in a similar way. Frankly speaking, I do not see any principal difference.

Since the precise amount of data the decompression algorithm needs in order to produce some output is not known in advance, it is natural (from my point of view) to pass tx/rx functions to the stream compressor implementation and let it read or write data itself. It also allows the streaming compression to be used not only with libpq but with any other streaming data. Right now we are reading data in two places - in the frontend and in the backend. By passing tx/rx functions to the compression stream implementation we avoid code duplication. In your implementation the pair zpq_read+zpq_read_drain is called twice - in pqsecure_read and secure_read. Moreover, please notice that your implementation is still passing tx/rx functions to the stream constructor, so zpq_read is still able to read data itself. So I do not understand which problem you have solved by replacing zpq_read with the pair zpq_read_drain+zpq_read. If we are speaking about layering, then from my point of view it is not a good idea to let the secure_read function perform decompression as well. At the very least the name will be confusing.

Answering your questions:

1. When the decompressor does not have enough data to produce any extra output, it does not return an error. It just does not advance the output position in the buffer. In my implementation (and actually in yours as well, because you kept this code), it is done in the zpq_read function itself:

    if (out.pos != 0)
    {
        /* If we have some decompressed data, then we return immediately */
        zs->rx_total_raw += out.pos;
        return out.pos;
    }
    if (zs->rx.pos == zs->rx.size)
    {
        zs->rx.pos = zs->rx.size = 0; /* Reset rx buffer */
    }
    /* Otherwise we try to fetch more data using the rx function */
    rc = zs->rx_func(zs->arg, (char *) zs->rx.src + zs->rx.size, ZSTD_BUFFER_SIZE - zs->rx.size);

2. Code duplication is always bad, no matter how much code is copied. Frankly speaking, I think that the duplication of I/O code between backend and frontend is one of the most awful parts of Postgres. It is always better to avoid duplication when possible.

3. Sorry, that was really a bug in version 11 of the patch; it is fixed in version 12.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
> On Wed, Feb 13, 2019 at 3:46 PM Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: > > Moreover, please notice that your implementation is still passing functions > tx/rx functions to stream constructor and so zpq_read is still able to read > data itself. So I do not understand which problem you have solved by > replacing zpq_read with pair of zpq_read_drain+zpq_read. Nope, I've removed the call of these functions from zlib_read/write, just forgot to remove the initialization part.
> On Wed, Feb 13, 2019 at 3:52 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote: > > > On Wed, Feb 13, 2019 at 3:46 PM Konstantin Knizhnik > > <k.knizhnik@postgrespro.ru> wrote: > > > > Moreover, please notice that your implementation is still passing functions > > tx/rx functions to stream constructor and so zpq_read is still able to read > > data itself. So I do not understand which problem you have solved by > > replacing zpq_read with pair of zpq_read_drain+zpq_read. > > Nope, I've removed the call of these functions from zlib_read/write, just > forgot to remove the initialization part. Oh, I see the source of confusion. Due to lack of time I've implemented my changes only for zlib part, sorry that I didn't mention that before.
On 13.02.2019 17:54, Dmitry Dolgov wrote:
>> On Wed, Feb 13, 2019 at 3:52 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>>
>>> On Wed, Feb 13, 2019 at 3:46 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>>
>>> Moreover, please notice that your implementation is still passing tx/rx functions to the stream constructor, so zpq_read is still able to read data itself. So I do not understand which problem you have solved by replacing zpq_read with the pair zpq_read_drain+zpq_read.
>>
>> Nope, I've removed the call of these functions from zlib_read/write, just forgot to remove the initialization part.
>
> Oh, I see the source of confusion. Due to lack of time I've implemented my changes only for the zlib part, sorry that I didn't mention that before.

And I have looked at the zstd part ;)

OK, but I still think it is better to pass tx/rx functions to the stream. There are two important advantages:
1. It eliminates code duplication.
2. It allows this streaming compression to be used (in the future) not only for libpq but for other streaming data.
And I do not see any disadvantages.

Concerning the "layering violation": maybe it is better to introduce some other functions, something like inflate_read and deflate_write, and call them instead of *secure_read. But from my point of view that will not improve the readability or modularity of the code.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
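For illustration, here is a minimal self-contained version of the rx-callback ("decorator") style of interface described above, using plain zlib. It shows only the read side, and all names (zpq_reader_create, zpq_reader_read, mem_rx) are invented for this sketch rather than taken from the patch; the point is that the stream owns the "fetch more compressed bytes" loop, so any transport - socket, pipe, or the in-memory toy used here - can sit underneath it.

/*
 * Minimal sketch of the rx "decorator" interface: the decompression stream is
 * created with an rx callback and pulls compressed bytes itself whenever it
 * needs more input.  Only the read side and only zlib are shown; names are
 * illustrative, not the patch's actual zpq_* code.  Build: cc sketch.c -lz
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <zlib.h>

typedef ssize_t (*zpq_rx_func) (void *arg, void *buf, size_t size);

typedef struct
{
	z_stream	in;				/* zlib inflate state */
	zpq_rx_func rx_func;		/* transport read callback */
	void	   *arg;			/* opaque argument passed to rx_func */
	char		rx_buf[8192];	/* buffer for raw (compressed) input */
} ZpqReader;

static ZpqReader *
zpq_reader_create(zpq_rx_func rx_func, void *arg)
{
	ZpqReader  *zs = calloc(1, sizeof(ZpqReader));

	zs->rx_func = rx_func;
	zs->arg = arg;
	inflateInit(&zs->in);
	return zs;
}

/* Return up to "size" decompressed bytes, fetching more raw input as needed. */
static ssize_t
zpq_reader_read(ZpqReader *zs, void *buf, size_t size)
{
	for (;;)
	{
		int			rc;

		if (zs->in.avail_in == 0)
		{
			ssize_t		n = zs->rx_func(zs->arg, zs->rx_buf, sizeof(zs->rx_buf));

			if (n <= 0)
				return n;		/* EOF or transport error */
			zs->in.next_in = (Bytef *) zs->rx_buf;
			zs->in.avail_in = (uInt) n;
		}
		zs->in.next_out = (Bytef *) buf;
		zs->in.avail_out = (uInt) size;
		rc = inflate(&zs->in, Z_NO_FLUSH);
		if (rc != Z_OK && rc != Z_STREAM_END && rc != Z_BUF_ERROR)
			return -1;			/* hard decompression error */
		if (zs->in.avail_out < size)
			return size - zs->in.avail_out; /* produced some raw bytes */
		/* nothing produced yet: loop and fetch more compressed input */
	}
}

/* Toy transport standing in for secure_read(): dribbles bytes from memory. */
typedef struct { const unsigned char *data; size_t len; size_t pos; } MemSource;

static ssize_t
mem_rx(void *arg, void *buf, size_t size)
{
	MemSource  *src = arg;
	size_t		n = src->len - src->pos;

	if (n > 7)
		n = 7;					/* deliberately deliver tiny chunks */
	if (n > size)
		n = size;
	memcpy(buf, src->data + src->pos, n);
	src->pos += n;
	return n;
}

int
main(void)
{
	const char *msg = "hello hello hello hello libpq streaming compression";
	Bytef		comp[256];
	uLongf		clen = sizeof(comp);
	char		out[256];
	ssize_t		n;
	size_t		total = 0;

	compress2(comp, &clen, (const Bytef *) msg, strlen(msg) + 1, 1);

	MemSource	src = {comp, clen, 0};
	ZpqReader  *reader = zpq_reader_create(mem_rx, &src);

	while ((n = zpq_reader_read(reader, out + total, sizeof(out) - total)) > 0)
		total += n;
	printf("got %zu raw bytes: %s\n", total, out);
	return 0;
}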
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes: > First of all thank you for attempting to push this patch, because > there is really seems to be some disagreement which blocks progress of > this patch. Unfortunately first reviewer (Robbie Harwood) think that > my approach cause some layering violation and should be done in other > way. Robbie several times suggested me to look "how the buffering in > TLS or GSSAPI works" and do it in similar way. Frankly speaking I do > not see some principle differences differences. Hello, In order to comply with your evident desires, consider this message a courtesy notice that I will no longer be reviewing this patch or accepting future code from you. Thanks, --Robbie
Attachment
For the records, I'm really afraid of interfering with the conversation at this point, but I believe it's necessary for the sake of a good feature :) > On Wed, Feb 13, 2019 at 4:03 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > > 1. When decompressor has not enough data to produce any extra output, it > doesn't return error. It just not moving forward output position in the > buffer. I'm confused, because here is what I see in zlib: // zlib.h inflate() returns Z_OK if some progress has been made, ... , Z_BUF_ERROR if no progress was possible or if there was not enough room in the output buffer when Z_FINISH is used. Note that Z_BUF_ERROR is not fatal, and inflate() can be called again with more input and more output space to continue decompressing. So, sounds like if no moving forward happened, there should be Z_BUF_ERROR. But I haven't experimented with this yet to figure out. > Ok, but still I think that it is better to pass tx/rx function to stream. > There are two important advantages: > 1. It eliminates code duplication. Ok. > 2. It allows to use (in future) this streaming compression not only for > libpq for for other streaming data. Can you elaborate on this please? > Concerning "layering violation" may be it is better to introduce some other > functions something like inflate_read, deflate_write and call them instead of > secure_read. But from my point of view it will not improve readability and > modularity of code. If we will unwrap the current compression logic to not contain tx/rx functions, isn't it going to be the same as you describing it anyway, just from the higher point of view? What I'm saying is that there is a compression logic, for it some data would be read or written from it, just not right here an now by compression code itself, but rather by already existing machinery (which could be even beneficial for the patch implementation). > And I do not see any disadvantages. The main disadvantage, as I see it, is that there is no agreement about this approach. Probably in such situations it makes sense to experiment with different suggestions, to see how would they look like - let's be flexible. > On Wed, Feb 13, 2019 at 8:34 PM Robbie Harwood <rharwood@redhat.com> wrote: > > In order to comply with your evident desires, consider this message a > courtesy notice that I will no longer be reviewing this patch or > accepting future code from you. I'm failing to see why miscommunication should necessarily lead to such statements, but it's your decision after all.
On 14.02.2019 19:45, Dmitry Dolgov wrote: > For the records, I'm really afraid of interfering with the conversation at this > point, but I believe it's necessary for the sake of a good feature :) > >> On Wed, Feb 13, 2019 at 4:03 PM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: >> >> 1. When decompressor has not enough data to produce any extra output, it >> doesn't return error. It just not moving forward output position in the >> buffer. > I'm confused, because here is what I see in zlib: > > // zlib.h > inflate() returns Z_OK if some progress has been made, ... , Z_BUF_ERROR > if no progress was possible or if there was not enough room in the output > buffer when Z_FINISH is used. Note that Z_BUF_ERROR is not fatal, and > inflate() can be called again with more input and more output space to > continue decompressing. > > So, sounds like if no moving forward happened, there should be Z_BUF_ERROR. But > I haven't experimented with this yet to figure out. Looks like I really missed this case. I need to handle Z_BUF_ERROR in zlib_read: if (rc != Z_OK && rc != Z_BUF_ERROR) { return ZPQ_DECOMPRESS_ERROR; } Strange that it works without it. Looks like written compressed message is very rarely splitted because of socket buffer overflow. > >> Ok, but still I think that it is better to pass tx/rx function to stream. >> There are two important advantages: >> 1. It eliminates code duplication. > Ok. > >> 2. It allows to use (in future) this streaming compression not only for >> libpq for for other streaming data. > Can you elaborate on this please? All this logic with fetching enough data to perform successful decompression of data chunk is implemented in one place - in zpq_stream and it is not needed to repeat it in all places where compression is used. IMHO passing rx/tx function to compressor stream is quite natural model - it is "decorator design pattern" https://en.wikipedia.org/wiki/Decorator_pattern (it is how for example streams are implemented in Java). > >> Concerning "layering violation" may be it is better to introduce some other >> functions something like inflate_read, deflate_write and call them instead of >> secure_read. But from my point of view it will not improve readability and >> modularity of code. > If we will unwrap the current compression logic to not contain tx/rx functions, > isn't it going to be the same as you describing it anyway, just from the higher > point of view? What I'm saying is that there is a compression logic, for it > some data would be read or written from it, just not right here an now by > compression code itself, but rather by already existing machinery (which could > be even beneficial for the patch implementation). I do not understand why passing rx/tx functions to zpq_create is violating existed machinery. > >> And I do not see any disadvantages. > The main disadvantage, as I see it, is that there is no agreement about this > approach. Probably in such situations it makes sense to experiment with > different suggestions, to see how would they look like - let's be flexible. Well, from my point of view approach with rx/tx is more flexible and modular. But if most of other developers think that using read/read_drain is preferable, then I will not complaint against using your approach. > >> On Wed, Feb 13, 2019 at 8:34 PM Robbie Harwood <rharwood@redhat.com> wrote: >> >> In order to comply with your evident desires, consider this message a >> courtesy notice that I will no longer be reviewing this patch or >> accepting future code from you. 
> I'm failing to see why miscommunication should necessarily lead to such > statements, but it's your decision after all. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
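For reference, the zlib behaviour being discussed here (Z_BUF_ERROR as a non-fatal "need more input" indication) can be reproduced with a few lines of standalone code, independent of the patch:

/*
 * Small zlib-only demonstration, not patch code: when only part of a
 * compressed message has arrived, inflate() simply stops making progress and
 * returns Z_BUF_ERROR, which means "need more input" rather than failure.
 */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int
main(void)
{
	unsigned char raw[1000], comp[1200], out[1000];
	uLongf		complen = sizeof(comp);
	z_stream	zs = {0};
	int			rc;

	memset(raw, 'a', sizeof(raw));
	compress2(comp, &complen, raw, sizeof(raw), 1);
	inflateInit(&zs);

	/* feed only the first half of the compressed message */
	zs.next_in = comp;
	zs.avail_in = complen / 2;
	zs.next_out = out;
	zs.avail_out = sizeof(out);
	rc = inflate(&zs, Z_NO_FLUSH);
	printf("half input  : rc=%d, %lu bytes produced\n",
		   rc, (unsigned long) (sizeof(out) - zs.avail_out));

	/* calling again without new input cannot make progress */
	rc = inflate(&zs, Z_NO_FLUSH);
	printf("no new input: rc=%d (Z_BUF_ERROR=%d, not fatal)\n", rc, Z_BUF_ERROR);

	/* supply the remaining bytes and decompression finishes normally */
	zs.next_in = comp + complen / 2;
	zs.avail_in = complen - complen / 2;
	rc = inflate(&zs, Z_NO_FLUSH);
	printf("rest arrives: rc=%d (Z_STREAM_END=%d), %lu bytes total\n",
		   rc, Z_STREAM_END, (unsigned long) (sizeof(out) - zs.avail_out));

	inflateEnd(&zs);
	return 0;
}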
On 2018-06-19 09:54, Konstantin Knizhnik wrote: > The main drawback of streaming compression is that you can not > decompress some particular message without decompression of all previous > messages. It seems this would have an adverse effect on protocol-aware connection proxies: They would have to uncompress everything coming in and recompress everything going out. The alternative of compressing each packet individually would work much better: A connection proxy could peek into the packet header and only uncompress the (few, small) packets that it needs for state and routing. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 15.02.2019 15:42, Peter Eisentraut wrote:
> On 2018-06-19 09:54, Konstantin Knizhnik wrote:
>> The main drawback of streaming compression is that you can not decompress some particular message without decompression of all previous messages.
>
> It seems this would have an adverse effect on protocol-aware connection proxies: They would have to uncompress everything coming in and recompress everything going out.
>
> The alternative of compressing each packet individually would work much better: A connection proxy could peek into the packet header and only uncompress the (few, small) packets that it needs for state and routing.

Individual compression of each message would defeat the whole idea of libpq compression: messages are too small to compress efficiently on their own, so using a streaming compression algorithm is absolutely necessary here.

Concerning the possible problem with proxies, I do not think it is really a problem. A proxy is very rarely located somewhere in the "middle" between the client and the database server. It is usually launched either in the same network as the DBMS client (for example, when the client is an application server) or in the same network as the database server. In both cases there is not much sense in passing compressed traffic through the proxy. If the proxy and the DBMS server are located in the same network, then the proxy should perform decompression and send decompressed messages to the database server.

Thank you very much for noticing this compatibility problem between compression and protocol-aware connection proxies. I have written that the current compression implementation (zpq_stream.c) can be used not only for the libpq backend/frontend but also for compressing any other streaming data - although I could not imagine what other data sources might require compression. A proxy is exactly such a case: it also needs to compress/decompress messages. It is one more argument for making the interface of zpq_stream as simple as possible and encapsulating all inflating/deflating logic in this code. That can be achieved by passing arbitrary rx/tx functions to the zpq_create function.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 2/15/19 3:03 PM, Konstantin Knizhnik wrote: > > > On 15.02.2019 15:42, Peter Eisentraut wrote: >> On 2018-06-19 09:54, Konstantin Knizhnik wrote: >>> The main drawback of streaming compression is that you can not >>> decompress some particular message without decompression of all previous >>> messages. >> It seems this would have an adverse effect on protocol-aware connection >> proxies: They would have to uncompress everything coming in and >> recompress everything going out. >> >> The alternative of compressing each packet individually would work much >> better: A connection proxy could peek into the packet header and only >> uncompress the (few, small) packets that it needs for state and routing. >> > Individual compression of each message depreciate all idea of libpq > compression. Messages are two small to efficiently compress each of > them separately. So using streaming compression algorithm is > absolutely necessary here. > Hmmm, I see Peter was talking about "packets" while you're talking about "messages". Are you talking about the same thing? Anyway, I was going to write about the same thing - that per-message compression would likely eliminate most of the benefits - but I'm wondering if it's actually true. That is, how much will the compression ratio drop if we compress individual messages? Obviously, if there are just tiny messages, it might easily eliminate any benefits (and in fact it would add overhead). But I'd say we're way more interested in transferring large data sets (result sets, data for copy, etc.) and presumably those messages are much larger. So maybe we could compress just those, somehow? > Concerning possible problem with proxies I do not think that it is > really a problem. > Proxy is very rarely located somewhere in the "middle" between client > and database servers. > It is usually launched either in the same network as DBMS client (for > example, if client is application server) either in the same network > with database server. > In both cases there is not so much sense to pass compressed traffic > through the proxy. > If proxy and DBMS server are located in the same network, then proxy > should perform decompression and send > decompressed messages to the database server. > I don't think that's entirely true. It makes perfect sense to pass compressed traffic in various situations - even in local network the network bandwidth matters quite a lot, these days. Because "local network" may be "same availability zone" or "same DC" etc. That being said, I'm not sure it's a big deal / issue when the proxy has to deal with compression. Either it's fine to forward decompressed data, so the proxy performs just decompression, which requires much less CPU. (It probably needs to compress data in the opposite direction, but it's usually quite asymmetric - much more data is sent in one direction). Or the data has to be recompressed, because it saves enough network bandwidth. It's essentially a trade-off between using CPU and network bandwidth. IMHO it'd be nonsense to adopt the per-message compression based merely on the fact that it might be easier to handle on proxies. We need to know if we can get reasonable compression ratio with that approach, because if not then it's useless that it's more proxy-friendly. Do the proxies actually need to recompress the data? Can't they just decompress it to determine which messages are in the data, and then forward the original compressed stream? That would be much cheaper, because decompression requires much less CPU. 
Although, some proxies (like connection pools) probably have to compress the connections independently ... > Thank you very much for noticing this problem with compatibility > compression and protocol-aware connection proxies. > I have wrote that current compression implementation (zpq_stream.c) can > be used not only for libpq backend/frontend, but > also for compression any other streaming data. But I could not imaging > what other data sources can require compression. > And proxy is exactly such case: it also needs to compress/decompress > messages. > It is one more argument to make interface of zpq_stream as simple as > possible and encapsulate all inflating/deflating logic in this code. > It can be achieved by passing arbitrary rx/tx function to zpq_create > function. > regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 15.02.2019 18:26, Tomas Vondra wrote:
> On 2/15/19 3:03 PM, Konstantin Knizhnik wrote:
>> On 15.02.2019 15:42, Peter Eisentraut wrote:
>>> On 2018-06-19 09:54, Konstantin Knizhnik wrote:
>>>> The main drawback of streaming compression is that you can not decompress some particular message without decompression of all previous messages.
>>> It seems this would have an adverse effect on protocol-aware connection proxies: They would have to uncompress everything coming in and recompress everything going out.
>>>
>>> The alternative of compressing each packet individually would work much better: A connection proxy could peek into the packet header and only uncompress the (few, small) packets that it needs for state and routing.
>>>
>> Individual compression of each message would defeat the whole idea of libpq compression: messages are too small to compress efficiently on their own, so using a streaming compression algorithm is absolutely necessary here.
>>
> Hmmm, I see Peter was talking about "packets" while you're talking about "messages". Are you talking about the same thing?

Sorry, but there are no "packets" in the libpq protocol, so I assumed that packet = message. In any case, a protocol-aware proxy has to process each message.

> Anyway, I was going to write about the same thing - that per-message compression would likely eliminate most of the benefits - but I'm wondering if it's actually true. That is, how much will the compression ratio drop if we compress individual messages?

Compression of small messages without a shared dictionary gives awful results. Assume that the average record, and therefore message, size is 100 bytes. Just perform a very simple experiment: create a file with 100 equal characters and try to compress it. With zlib the result is 173 bytes, so after "compression" the size of the file increases by a factor of 1.7. This is why there is no way to compress libpq traffic efficiently without streaming compression (where the dictionary is shared and updated across all messages).

> Obviously, if there are just tiny messages, it might easily eliminate any benefits (and in fact it would add overhead). But I'd say we're way more interested in transferring large data sets (result sets, data for copy, etc.) and presumably those messages are much larger. So maybe we could compress just those, somehow?

Please notice that a copy stream consists of individual messages for each record.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
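The argument above is easy to reproduce with plain zlib. The toy program below (message contents are invented; this is not patch code) compresses 1000 similar 100-byte "messages" once per message and once through a single streaming deflate context flushed after every message; on workloads like this the streaming variant typically comes out several times smaller, because the dictionary built from earlier messages is reused:

/*
 * Per-message compression vs. one shared streaming deflate context.
 * zlib only; message contents are invented.  Build: cc msgs.c -lz
 */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define NMSG	1000
#define MSGLEN	100

/* Build a fixed-size, mildly repetitive "message". */
static void
make_msg(char *msg, int i)
{
	int			n = snprintf(msg, MSGLEN, "row %06d | name_%d | status=active |", i, i % 7);

	memset(msg + n, 'x', MSGLEN - n);
}

int
main(void)
{
	char		msg[MSGLEN];
	unsigned char out[512];
	unsigned long per_msg_total = 0;
	unsigned long stream_total = 0;
	z_stream	zs = {0};

	/* 1. Every message compressed independently: full deflate overhead each time. */
	for (int i = 0; i < NMSG; i++)
	{
		uLongf		outlen = sizeof(out);

		make_msg(msg, i);
		compress2(out, &outlen, (const Bytef *) msg, MSGLEN, 1);
		per_msg_total += outlen;
	}

	/* 2. One deflate context shared by all messages, flushed after each message. */
	deflateInit(&zs, 1);
	for (int i = 0; i < NMSG; i++)
	{
		make_msg(msg, i);
		zs.next_in = (Bytef *) msg;
		zs.avail_in = MSGLEN;
		do
		{
			zs.next_out = out;
			zs.avail_out = sizeof(out);
			deflate(&zs, Z_SYNC_FLUSH); /* flush so the message could be sent now */
			stream_total += sizeof(out) - zs.avail_out;
		} while (zs.avail_out == 0);
	}
	deflateEnd(&zs);

	printf("raw: %d bytes, per-message: %lu bytes, streaming: %lu bytes\n",
		   NMSG * MSGLEN, per_msg_total, stream_total);
	return 0;
}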
Konstantin, This patch appears to be failing tests so I have marked it Waiting on Author. I have also removed the reviewer since no review had been done. Maybe somebody else will have a look. Regards, -- -David david@pgmasters.net
On 25.03.2019 11:06, David Steele wrote:
> Konstantin,
>
> This patch appears to be failing tests so I have marked it Waiting on Author.
>
> I have also removed the reviewer since no review had been done. Maybe somebody else will have a look.
>
> Regards,

Can you please tell me which tests failed? I have run "make check-world" and there were no failed tests. Actually, if compression is not enabled (and it is disabled by default unless explicitly requested by the client), there should be no difference from vanilla Postgres. So it would be strange for any tests to fail without compression being used.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 3/25/19 1:04 PM, Konstantin Knizhnik wrote: > > > On 25.03.2019 11:06, David Steele wrote: >> Konstantin, >> >> >> This patch appears to be failing tests so I have marked it Waiting on >> Author. >> >> I have also removed the reviewer since no review had been done. Maybe >> somebody else will have a look. >> >> Regards, > > Can you please inform me which tests are failed? > I have done "make check-world" and there were no failed tests. > Actually if compression is not enabled (and it is disabled by default > unless explicitly requested by client), there should be no difference > with vanilla Postgres. > So it will be strange if some tests are failed without using compression. Check out the cfbot report at http://commitfest.cputube.org. Both platforms were failed earlier but Windows is running again now. Doubt that will make much difference, though. Regards, -- -David david@pgmasters.net
> On Mon, Mar 25, 2019 at 11:39 AM David Steele <david@pgmasters.net> wrote: > > On 3/25/19 1:04 PM, Konstantin Knizhnik wrote: > > > > > > On 25.03.2019 11:06, David Steele wrote: > >> Konstantin, > >> > >> > >> This patch appears to be failing tests so I have marked it Waiting on > >> Author. > >> > >> I have also removed the reviewer since no review had been done. Maybe > >> somebody else will have a look. > >> > >> Regards, > > > > Can you please inform me which tests are failed? > > I have done "make check-world" and there were no failed tests. > > Actually if compression is not enabled (and it is disabled by default > > unless explicitly requested by client), there should be no difference > > with vanilla Postgres. > > So it will be strange if some tests are failed without using compression. > > Check out the cfbot report at http://commitfest.cputube.org I guess it's red because the last posted patch happened to be my experimental patch (based indeed on a broken revision v11), not the one posted by Konstantin.
On 25.03.2019 13:48, Dmitry Dolgov wrote: >> On Mon, Mar 25, 2019 at 11:39 AM David Steele <david@pgmasters.net> wrote: >> >> On 3/25/19 1:04 PM, Konstantin Knizhnik wrote: >>> >>> On 25.03.2019 11:06, David Steele wrote: >>>> Konstantin, >>>> >>>> >>>> This patch appears to be failing tests so I have marked it Waiting on >>>> Author. >>>> >>>> I have also removed the reviewer since no review had been done. Maybe >>>> somebody else will have a look. >>>> >>>> Regards, >>> Can you please inform me which tests are failed? >>> I have done "make check-world" and there were no failed tests. >>> Actually if compression is not enabled (and it is disabled by default >>> unless explicitly requested by client), there should be no difference >>> with vanilla Postgres. >>> So it will be strange if some tests are failed without using compression. >> Check out the cfbot report at http://commitfest.cputube.org > I guess it's red because the last posted patch happened to be my experimental > patch (based indeed on a broken revision v11), not the one posted by Konstantin. Rebased version of my patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 25.03.2019 13:38, David Steele wrote: > On 3/25/19 1:04 PM, Konstantin Knizhnik wrote: >> >> >> On 25.03.2019 11:06, David Steele wrote: >>> Konstantin, >>> >>> >>> This patch appears to be failing tests so I have marked it Waiting >>> on Author. >>> >>> I have also removed the reviewer since no review had been done. >>> Maybe somebody else will have a look. >>> >>> Regards, >> >> Can you please inform me which tests are failed? >> I have done "make check-world" and there were no failed tests. >> Actually if compression is not enabled (and it is disabled by default >> unless explicitly requested by client), there should be no difference >> with vanilla Postgres. >> So it will be strange if some tests are failed without using >> compression. > > Check out the cfbot report at http://commitfest.cputube.org. Both > platforms were failed earlier but Windows is running again now. Doubt > that will make much difference, though. > > Regards, Yet another version of the patch which should fix problems at Windows. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
> On 26 March 2019 at 19:46, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
> Version of the patch correctly working when no compression algorithm are avaiable.

Thanks for this work, Konstantin. PFA a rebased version of this patch.

This compression seems very important for reducing network-induced replication lag. Recently I've found out that small installations suffer from huge latency spikes when archive_timeout occurs. How does this happen? archive_timeout triggers a segment switch, which pads the current segment with zeroes. These zeroes then need to be replicated to standbys, without compression. The transfer of zeroes causes long waits for synchronous replication (up to 1s with the network restricted to 16Mbps).

Also, replication compression will reduce the overall cross-AZ traffic of HA installations.

So I'm considering the following plan for 2020: implement this protocol in Odyssey and send patches to drivers to enable early access to this feature.

Best regards, Andrey Borodin.
Attachment
On 06.10.2020 9:34, Andrey M. Borodin wrote:
>> On 26 March 2019 at 19:46, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>>
>> Version of the patch correctly working when no compression algorithm are avaiable.
>
> Thanks for this work, Konstantin. PFA rebased version of this patch.
>
> This compression seems very important to reduce network-induced replication lag. Recently I've found out that small installations suffer from huge latency spikes when archive_timeout occurs. How does this happen? archive_timeout triggers a segment switch, which pads the current segment with zeroes. Now these zeroes need to be replicated to standbys, without compression. Transfer of zeroes causes long waits for synchronous replication (up to 1s with the network restricted to 16Mbps).
>
> Also replication compression will reduce overall cross-AZ traffic of HA installations.
>
> So I'm considering the following plan for 2020: implement this protocol in Odyssey and send patches to drivers to enable early access to this feature.
>
> Best regards, Andrey Borodin.

Rebased version of the patch is attached. I am going to resubmit it to the next CF.
Attachment
Remove redundant processed parameter from zpq_read function. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Rebased version of the patch. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi, thanks for the patch! I’ve made a quick review and found one issue.

If the backend sends a CompressionAck message followed by some already compressed message (for example, AuthenticationOk), then there is a chance that pqReadData() will read both messages into the read buffer at once. In this case, the CompressionAck message will be read normally, but the client will fail to recognize the next message (for example, AuthenticationOk) since it came in a compressed form but was incorrectly read as a regular message. So the client would not be able to recognize the second message and will crash.

Example of a successful launch (added some debug output):

usernamedt-osx: ~ usernamedt $ psql -d "host = x.x.x.x port = 6432 dbname = testdb user = testuser compression = 1"
NUM_READ: 6 (pqReadData read CompressionAck (6 bytes) and nothing more)
pqReadData RC: 1
NUM_READ: 346
pqReadData RC: 1
psql (14devel)
Type "help" for help.
testdb => // OK

Example of a failed launch:

usernamedt-osx: ~ usernamedt $ psql -d "host = x.x.x.x port = 6432 dbname = testdb user = testuser compression = 1"
NUM_READ: 24 (pqReadData read CompressionAck (6 bytes) and compressed AuthenticationOk (18 bytes) came after it)
pqReadData RC: 1
psql: error: could not connect to server: expected authentication request from server, but received x // FAIL

-- Daniil Zakhlystov
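The failure mode described in this review can be reproduced outside libpq with a few lines of zlib code (the one-byte "ack" framing below is invented purely for illustration): once the acknowledgement and the first compressed bytes arrive in the same read, the leftover bytes in the buffer have to be replayed through the decompressor rather than parsed as plain protocol data.

/*
 * Toy reconstruction of the mixed plain/compressed read buffer problem.
 * zlib only; the framing is invented, this is not libpq code.
 */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int
main(void)
{
	const char *next_msg = "AuthenticationOk";	/* stand-in for the next message */
	unsigned char wire[256];
	uLongf		clen = sizeof(wire) - 1;
	unsigned char out[64];
	z_stream	zs = {0};

	/* server side: one plain "ack" byte, then everything else compressed */
	wire[0] = 'z';
	compress2(wire + 1, &clen, (const Bytef *) next_msg, strlen(next_msg) + 1, 1);

	/* client side: a single read returned 1 + clen bytes; byte 0 is the ack */
	printf("ack byte: '%c', %lu leftover bytes already in the buffer\n",
		   wire[0], (unsigned long) clen);

	/* the leftover bytes are not a plain message, so naive parsing sees garbage */
	printf("naive parse of the first leftover byte: 0x%02x\n", wire[1]);

	/* they must instead be handed to the decompressor */
	inflateInit(&zs);
	zs.next_in = wire + 1;
	zs.avail_in = clen;
	zs.next_out = out;
	zs.avail_out = sizeof(out);
	inflate(&zs, Z_NO_FLUSH);
	printf("decompressed leftover: %s\n", out);
	inflateEnd(&zs);
	return 0;
}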
Hi, On 2020-10-26 19:20:46 +0300, Konstantin Knizhnik wrote: > diff --git a/configure b/configure > index ace4ed5..deba608 100755 > --- a/configure > +++ b/configure > @@ -700,6 +700,7 @@ LD > LDFLAGS_SL > LDFLAGS_EX > with_zlib > +with_zstd > with_system_tzdata > with_libxslt > XML2_LIBS I don't see a corresponding configure.ac change? > + <varlistentry id="libpq-connect-compression" xreflabel="compression"> > + <term><literal>compression</literal></term> > + <listitem> > + <para> > + Request compression of libpq traffic. Client sends to the server list of compression algorithms, supported byclient library. > + If server supports one of this algorithms, then it acknowledges use of this algorithm and then all libpq messagessend both from client to server and > + visa versa will be compressed. If server is not supporting any of the suggested algorithms, then it replies with'n' (no compression) > + message and it is up to the client whether to continue work without compression or report error. > + Supported compression algorithms are chosen at configure time. Right now two libraries are supported: zlib (default)and zstd (if Postgres was > + configured with --with-zstd option). In both cases streaming mode is used. > + </para> > + </listitem> > + </varlistentry> > + - there should be a reference to potential security impacts when used in combination with encrypted connections - What does " and it is up to the client whether to continue work without compression or report error" actually mean for a libpq parameter? - What is the point of the "streaming mode" reference? > @@ -263,6 +272,21 @@ > </varlistentry> > > <varlistentry> > + <term>CompressionAck</term> > + <listitem> > + <para> > + Server acknowledges using compression for client-server communication protocol. > + Compression can be requested by client by including "compression" option in connection string. > + Client sends to the server list of compression algorithms, supported by client library > + (compression algorithm is identified by one letter: <literal>'f'</literal> - Facebook zstd, <literal>'z'</literal>- zlib,...). > + If server supports one of this algorithms, then it acknowledges use of this algorithm and all subsequent libpqmessages send both from client to server and > + visa versa will be compressed. If server is not supporting any of the suggested algorithms, then it replies with'n' (no compression) > + algorithm identifier and it is up to the client whether to continue work without compression or report error. > + </para> > + </listitem> > + </varlistentry> Why are compression methods identified by one byte identifiers? That seems unnecessarily small, given this is commonly a once-per-connection action? The protocol sounds to me like there's no way to enable/disable compression in an existing connection. To me it seems better to have an explicit, client initiated, request to use a specific method of compression (including none). That allows to enable compression for bulk work, and disable it in other cases (e.g. for security sensitive content, or for unlikely to compress well content). I think that would also make cross-version handling easier, because a newer client driver can send the compression request and handle the error, without needing to reconnect or such. Most importantly, I think such a design is basically a necessity to make connection poolers to work in a sensible way. And lastly, wouldn't it be reasonable to allow to specify things like compression levels? 
All that doesn't have to be supported now, but I think the protocol should take that into account. > +<para> > + Used compression algorithm. Right now the following streaming compression algorithms are supported: 'f' - Facebookzstd, 'z' - zlib, 'n' - no compression. > +</para> I would prefer this just be referenced as zstd or zstandard, not facebook zstd. There's an RFC (albeit only "informational"), and it doesn't name facebook, except as an employer: https://tools.ietf.org/html/rfc8478 > +int > +pq_configure(Port* port) > +{ > + char* client_compression_algorithms = port->compression_algorithms; > + /* > + * If client request compression, it sends list of supported compression algorithms. > + * Each compression algorirthm is idetified by one letter ('f' - Facebook zsts, 'z' - xlib) > + */ s/algorirthm/algorithm/ s/idetified/identified/ s/zsts/zstd/ s/xlib/zlib/ That's, uh, quite the typo density. > + if (client_compression_algorithms) > + { > + char server_compression_algorithms[ZPQ_MAX_ALGORITHMS]; > + char compression_algorithm = ZPQ_NO_COMPRESSION; > + char compression[6] = {'z',0,0,0,5,0}; /* message length = 5 */ > + int rc; Why is this hand-rolling protocol messages? > + /* Intersect lists */ > + while (*client_compression_algorithms != '\0') > + { > + if (strchr(server_compression_algorithms, *client_compression_algorithms)) > + { > + compression_algorithm = *client_compression_algorithms; > + break; > + } > + client_compression_algorithms += 1; > + } Why isn't this is handled within zpq? > + /* Send 'z' message to the client with selectde comression algorithm ('n' if match is ont found) */ s/selectde/selected/ s/comression/compression/ s/ont/not/ > + socket_set_nonblocking(false); > + while ((rc = secure_write(MyProcPort, compression, sizeof(compression))) < 0 > + && errno == EINTR); > + if ((size_t)rc != sizeof(compression)) > + return -1; Huh? This all seems like an abstraction violation. > + /* initialize compression */ > + if (zpq_set_algorithm(compression_algorithm)) > + PqStream = zpq_create((zpq_tx_func)secure_write, (zpq_rx_func)secure_read, MyProcPort); > + } > + return 0; > +} Why is zpq a wrapper around secure_write/read? I'm a bit worried this will reduce the other places we could use zpq. > static int > -pq_recvbuf(void) > +pq_recvbuf(bool nowait) > { > + /* If srteaming compression is enabled then use correpondent comression read function. */ s/srteaming/streaming/ s/correpondent/correponding/ s/comression/compression/ Could you please try to proof-read the patch a bit? The typo density is quite high. > + r = PqStream > + ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength, > + PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed) > + : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, > + PQ_RECV_BUFFER_SIZE - PqRecvLength); > + PqRecvLength += processed; ? : doesn't make sense to me in this case. This should be an if/else. > if (r < 0) > { > + if (r == ZPQ_DECOMPRESS_ERROR) > + { > + char const* msg = zpq_error(PqStream); > + if (msg == NULL) > + msg = "end of stream"; > + ereport(COMMERROR, > + (errcode_for_socket_access(), > + errmsg("failed to decompress data: %s", msg))); > + return EOF; > + } I don't think we should error out with "failed to decompress data:" e.g. when the client closed the connection. 
> @@ -1413,13 +1457,18 @@ internal_flush(void) > char *bufptr = PqSendBuffer + PqSendStart; > char *bufend = PqSendBuffer + PqSendPointer; > > - while (bufptr < bufend) > + while (bufptr < bufend || zpq_buffered(PqStream) != 0) /* has more data to flush or unsent data in internal compressionbuffer */ > { Overly long line. > - int r; > - > - r = secure_write(MyProcPort, bufptr, bufend - bufptr); > - > - if (r <= 0) > + int r; > + size_t processed = 0; > + size_t available = bufend - bufptr; > + r = PqStream > + ? zpq_write(PqStream, bufptr, available, &processed) > + : secure_write(MyProcPort, bufptr, available); Same comment as above, re ternary expression. > +/* > + * Functions implementing streaming compression algorithm > + */ > +typedef struct > +{ > + /* > + * Returns letter identifying compression algorithm. > + */ > + char (*name)(void); > + > + /* > + * Create compression stream with using rx/tx function for fetching/sending compressed data > + */ > + ZpqStream* (*create)(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg); > + > + /* > + * Read up to "size" raw (decompressed) bytes. > + * Returns number of decompressed bytes or error code. > + * Error code is either ZPQ_DECOMPRESS_ERROR either error code returned by the rx function. > + */ > + ssize_t (*read)(ZpqStream *zs, void *buf, size_t size); > + > + /* > + * Write up to "size" raw (decompressed) bytes. > + * Returns number of written raw bytes or error code returned by tx function. > + * In the last case amount of written raw bytes is stored in *processed. > + */ > + ssize_t (*write)(ZpqStream *zs, void const *buf, size_t size, size_t *processed); This should at least specify how these functions are supposed to handle blocking/nonblocking sockets. > + > +#define ZSTD_BUFFER_SIZE (8*1024) > +#define ZSTD_COMPRESSION_LEVEL 1 Add some arguments for choosing these parameters. > + > +/* > + * Array with all supported compression algorithms. > + */ > +static ZpqAlgorithm const zpq_algorithms[] = > +{ > +#if HAVE_LIBZSTD > + {zstd_name, zstd_create, zstd_read, zstd_write, zstd_free, zstd_error, zstd_buffered}, > +#endif > +#if HAVE_LIBZ > + {zlib_name, zlib_create, zlib_read, zlib_write, zlib_free, zlib_error, zlib_buffered}, > +#endif > + {NULL} > +}; I think it's preferrable to use designated initializers. Do we really need zero terminated lists? Works fine, but brrr. > +/* > + * Index of used compression algorithm in zpq_algorithms array. > + */ > +static int zpq_algorithm_impl; This is just odd API design imo. Why doesn't the dispatch work based on an argument for zpq_create() and the ZpqStream * for the rest? What if there's two libpq connections in one process? To servers supporting different compression algorithms? This isn't going to fly. > +/* > + * Get list of the supported algorithms. > + * Each algorithm is identified by one letter: 'f' - Facebook zstd, 'z' - zlib. > + * Algorithm identifies are appended to the provided buffer and terminated by '\0'. > + */ > +void > +zpq_get_supported_algorithms(char algorithms[ZPQ_MAX_ALGORITHMS]) > +{ > + int i; > + for (i = 0; zpq_algorithms[i].name != NULL; i++) > + { > + Assert(i < ZPQ_MAX_ALGORITHMS); > + algorithms[i] = zpq_algorithms[i].name(); > + } > + Assert(i < ZPQ_MAX_ALGORITHMS); > + algorithms[i] = '\0'; > +} Uh, doesn't this bake ZPQ_MAX_ALGORITHMS into the ABI? That seems entirely unnecessary? 
> @@ -2180,6 +2257,20 @@ build_startup_packet(const PGconn *conn, char *packet, > ADD_STARTUP_OPTION("replication", conn->replication); > if (conn->pgoptions && conn->pgoptions[0]) > ADD_STARTUP_OPTION("options", conn->pgoptions); > + if (conn->compression && conn->compression[0]) > + { > + bool enabled; > + /* > + * If compressoin is enabled, then send to the server list of compression algorithms > + * supported by client > + */ s/compressoin/compression/ > + if (parse_bool(conn->compression, &enabled)) > + { > + char compression_algorithms[ZPQ_MAX_ALGORITHMS]; > + zpq_get_supported_algorithms(compression_algorithms); > + ADD_STARTUP_OPTION("compression", compression_algorithms); > + } > + } I think this needs to work in a graceful manner across server versions. You can make that work with an argument, using the _pq_ parameter stuff, but as I said earlier, I think it's a mistake to deal with this in the startup packet anyway. Greetings, Andres Freund
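For readers unfamiliar with the "_pq_ parameter stuff" mentioned above, here is a minimal sketch of how a client could use it. This is illustrative only and not part of the posted patch (which sends a plain "compression" startup option); the only assumption beyond the quoted code is the "_pq_.compression" option name.

    /*
     * In build_startup_packet(): startup options whose names begin with
     * "_pq_." are protocol extensions.  A server that does not recognize
     * them reports them in a NegotiateProtocolVersion message instead of
     * failing the connection, so an older server would simply leave
     * compression off while the connection still succeeds.
     */
    if (conn->compression && conn->compression[0])
        ADD_STARTUP_OPTION("_pq_.compression", conn->compression);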
On 2020-Oct-26, Konstantin Knizhnik wrote:

> + while (bufptr < bufend || zpq_buffered(PqStream) != 0) /* has more data to flush or unsent data in internal compression buffer */
> {
> - int r;
> -
> - r = secure_write(MyProcPort, bufptr, bufend - bufptr);
> -
> - if (r <= 0)
> + int r;
> + size_t processed = 0;
> + size_t available = bufend - bufptr;
> + r = PqStream
> + ? zpq_write(PqStream, bufptr, available, &processed)
> + : secure_write(MyProcPort, bufptr, available);
> + bufptr += processed;
> + PqSendStart += processed;

This bit is surprising to me.  I thought the whole zpq_write() thing should be hidden inside secure_write, so internal_flush would continue to call just secure_write; and it is that routine's responsibility to call zpq_write or be_tls_write or secure_raw_write etc. according to compile-time options and socket state.
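A minimal sketch of the layering being described here, purely for illustration. PqStream, zpq_write and secure_raw_write are taken from the patch and the existing code; note that the patch as posted would recurse with this exact shape, because zpq uses secure_write as its tx callback, so the sketch assumes the callback would point at the lower layer instead.

    ssize_t
    secure_write(Port *port, void *ptr, size_t len)
    {
        size_t      processed = 0;

        /* compression negotiated for this connection? */
        if (PqStream != NULL)
            return zpq_write(PqStream, ptr, len, &processed);

        /* otherwise the existing path: TLS if enabled, else the raw socket */
        return secure_raw_write(port, ptr, len);
    }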
Hi,
On Oct 29, 2020, at 12:27 AM, Andres Freund <andres@anarazel.de> wrote:
The protocol sounds to me like there's no way to enable/disable
compression in an existing connection. To me it seems better to have an
explicit, client initiated, request to use a specific method of
compression (including none). That allows to enable compression for bulk
work, and disable it in other cases (e.g. for security sensitive
content, or for unlikely to compress well content).
I think that would also make cross-version handling easier, because a
newer client driver can send the compression request and handle the
error, without needing to reconnect or such.
Most importantly, I think such a design is basically a necessity to make
connection poolers to work in a sensible way.
Can you please clarify your opinion about connection poolers? By a "sensible way", do you mean that in some cases they could simply forward the compressed stream without parsing it?
+/*
+ * Index of used compression algorithm in zpq_algorithms array.
+ */
+static int zpq_algorithm_impl;
This is just odd API design imo. Why doesn't the dispatch work based on
an argument for zpq_create() and the ZpqStream * for the rest?
What if there's two libpq connections in one process? To servers
supporting different compression algorithms? This isn't going to fly.
I agree; moving zpq_algorithm_impl into the ZpqStream struct seems like an easy fix for this issue.
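A rough sketch of what that could look like; the field names and the exact shape of the structs are illustrative, not taken from the patch.

    #include <stddef.h>
    #include <sys/types.h>

    typedef struct ZpqStream ZpqStream;

    typedef struct
    {
        char    (*name) (void);
        ssize_t (*read) (ZpqStream *zs, void *buf, size_t size);
        ssize_t (*write) (ZpqStream *zs, const void *buf, size_t size, size_t *processed);
        /* create/free/error/buffered elided for brevity */
    } ZpqAlgorithm;

    /* per-stream dispatch: the chosen algorithm travels with the stream,
     * so two libpq connections in one process can use different compressors */
    struct ZpqStream
    {
        const ZpqAlgorithm *algorithm;  /* picked once, in zpq_create() */
        void               *impl;       /* zstd/zlib specific state */
    };

    ssize_t
    zpq_read(ZpqStream *zs, void *buf, size_t size)
    {
        return zs->algorithm->read(zs, buf, size);
    }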
+ /* initialize compression */
+ if (zpq_set_algorithm(compression_algorithm))
+ PqStream = zpq_create((zpq_tx_func)secure_write, (zpq_rx_func)secure_read, MyProcPort);
+ }
+ return 0;
+}
Why is zpq a wrapper around secure_write/read? I'm a bit worried this
will reduce the other places we could use zpq.
Maybe we can just split PqStream into PqCompressStream and PqDecompressStream? It looks like they can work independently.
—
Daniil Zakhlystov
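A minimal sketch of the split suggested above. All of the type and function names here are hypothetical; nothing like this exists in the posted patch, and zpq_tx_func / zpq_rx_func are assumed to be the patch's existing I/O callback types.

    typedef struct ZpqCompressor ZpqCompressor;        /* outgoing (deflating) side */
    typedef struct ZpqDecompressor ZpqDecompressor;    /* incoming (inflating) side */

    /* each direction keeps its own state, so they can be enabled,
     * flushed and torn down independently of each other */
    extern ZpqCompressor *zpq_create_compressor(zpq_tx_func tx_func, void *arg);
    extern ZpqDecompressor *zpq_create_decompressor(zpq_rx_func rx_func, void *arg);

    extern ssize_t zpq_compress_write(ZpqCompressor *zc, const void *buf,
                                      size_t size, size_t *processed);
    extern ssize_t zpq_decompress_read(ZpqDecompressor *zd, void *buf, size_t size);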
Hi, Thank for review. On 28.10.2020 22:27, Andres Freund wrote: > I don't see a corresponding configure.ac change? Shame on me - I completely forgot that configure is actually generate from configure.ac. Fixed. > >> + <varlistentry id="libpq-connect-compression" xreflabel="compression"> >> + <term><literal>compression</literal></term> >> + <listitem> >> + <para> >> + Request compression of libpq traffic. Client sends to the server list of compression algorithms, supported byclient library. >> + If server supports one of this algorithms, then it acknowledges use of this algorithm and then all libpq messagessend both from client to server and >> + visa versa will be compressed. If server is not supporting any of the suggested algorithms, then it replies with'n' (no compression) >> + message and it is up to the client whether to continue work without compression or report error. >> + Supported compression algorithms are chosen at configure time. Right now two libraries are supported: zlib (default)and zstd (if Postgres was >> + configured with --with-zstd option). In both cases streaming mode is used. >> + </para> >> + </listitem> >> + </varlistentry> >> + > > - there should be a reference to potential security impacts when used in > combination with encrypted connections Done > - What does " and it is up to the client whether to continue work > without compression or report error" actually mean for a libpq parameter? It can not happen. The client request from server use of compressed protocol only if "compression=XXX" was specified in connection string. But XXX should be supported by client, otherwise this request will be rejected. So supported protocol string sent by client can never be empty. > - What is the point of the "streaming mode" reference? There are ways of performing compression: - block mode when each block is individually compressed (compressor stores dictionary in each compressed blocked) - stream mode Block mode allows to independently decompress each page. It is good for implementing page or field compression (as pglz is used to compress toast values). But it is not efficient for compressing client-server protocol commands. It seems to me to be important to explain that libpq is using stream mode and why there is no pglz compressor > Why are compression methods identified by one byte identifiers? That > seems unnecessarily small, given this is commonly a once-per-connection > action? It is mostly for simplicity of implementation: it is always simple to work with fixed size messages (or with array of chars rather than array of strings). And I do not think that it can somehow decrease flexibility: this one-letter algorihth codes are not visible for user. And I do not think that we sometime will support more than 127 (or even 64 different compression algorithms). > The protocol sounds to me like there's no way to enable/disable > compression in an existing connection. To me it seems better to have an > explicit, client initiated, request to use a specific method of > compression (including none). That allows to enable compression for bulk > work, and disable it in other cases (e.g. for security sensitive > content, or for unlikely to compress well content). It will significantly complicate implementation (because of buffering at different levels). Also it is not clear to me who and how will control enabling/disabling compression in this case? I can imagine that "\copy" should trigger compression. But what about (server side) "copy" command? 
Or just select returning huge results? I do not think that making user or application to enable/disable compression on the fly is really good idea. Overhead of compressing small commands is not so large. And concerning security risks... In most cases such problem is not relevant at all because both client and server are located within single reliable network. It if security of communication really matters, you should not switch compression in all cases (including COPY and other bulk data transfer). It is very strange idea to let client to decide which data is "security sensitive" and which not. > > I think that would also make cross-version handling easier, because a > newer client driver can send the compression request and handle the > error, without needing to reconnect or such. > > Most importantly, I think such a design is basically a necessity to make > connection poolers to work in a sensible way. I do not completely understand the problem with connection pooler. Right now developers of Yandex Odyssey are trying to support libpq compression in their pooler. If them will be faced with some problems, I will definitely address them. > And lastly, wouldn't it be reasonable to allow to specify things like > compression levels? All that doesn't have to be supported now, but I > think the protocol should take that into account. Well, if we want to provide the maximal flexibility, then we should allow to specify compression level. Practically, when I have implemented our CFS compressed storage for pgpro-ee and libpq_compression I have performed a lot benchmarks comparing different compression algorithms with different compression levels. Definitely I do not pretend on doing some research in this area. But IMHO default (fastest) compression level is always the preferable choice: it provides best compromise between speed and compression ratio. Higher compression levels significantly (several times) reduce compression speed, but influence on compression ratio are much smaller. More over, zstd with default compression level compresses synthetic data (i.e. strings will with spaces generated by pgbench) much better (with compression ratio 63!) than with higher compression levels. Right now in Postgres we do not allow user to specify compression level neither for compressing TOAST data, nether for WAL compression,... And IMHO for libpq protocol compression, possibility to specify compression level is even less useful. But if you think that it is so important, I will try to implement it. Many questions arise in this case: which side should control compression level? Should client affect compression level both at client side and at server side? Or it should be possible to specify separately compression level for client and for server? >> +<para> >> + Used compression algorithm. Right now the following streaming compression algorithms are supported: 'f' - Facebookzstd, 'z' - zlib, 'n' - no compression. >> +</para> > I would prefer this just be referenced as zstd or zstandard, not > facebook zstd. There's an RFC (albeit only "informational"), and it > doesn't name facebook, except as an employer: > https://tools.ietf.org/html/rfc8478 Please notice that it is internal encoding, user will specify psql -d "dbname=postgres compression=zstd" If name "zstd" is not good, I can choose any other. > > >> +int >> +pq_configure(Port* port) >> +{ >> + char* client_compression_algorithms = port->compression_algorithms; >> + /* >> + * If client request compression, it sends list of supported compression algorithms. 
>> + * Each compression algorirthm is idetified by one letter ('f' - Facebook zsts, 'z' - xlib) >> + */ > s/algorirthm/algorithm/ > s/idetified/identified/ > s/zsts/zstd/ > s/xlib/zlib/ > > That's, uh, quite the typo density. > Sorry, fixed >> + if (client_compression_algorithms) >> + { >> + char server_compression_algorithms[ZPQ_MAX_ALGORITHMS]; >> + char compression_algorithm = ZPQ_NO_COMPRESSION; >> + char compression[6] = {'z',0,0,0,5,0}; /* message length = 5 */ >> + int rc; > Why is this hand-rolling protocol messages? Sorry, I do not quite understand your concern. It seems to me that all libpq message manipulation is more or less hand-rolling, isn't it (we are not using protobuf or msgbpack)? Or do you think that calling pq_sendbyte,pq_sendint32,... is much safer in this case? > >> + /* Intersect lists */ >> + while (*client_compression_algorithms != '\0') >> + { >> + if (strchr(server_compression_algorithms, *client_compression_algorithms)) >> + { >> + compression_algorithm = *client_compression_algorithms; >> + break; >> + } >> + client_compression_algorithms += 1; >> + } > Why isn't this is handled within zpq? > It seems to be part of libpq client-server handshake protocol. It seems to me that zpq is lower level component which is just ordered which compressor to use. >> + /* Send 'z' message to the client with selectde comression algorithm ('n' if match is ont found) */ > s/selectde/selected/ > s/comression/compression/ > s/ont/not/ > :( Fixed. But looks like you are inspecting not the latest patch (libpq_compression-20.patch) Because two of this three mistypings I have already fixed. >> + socket_set_nonblocking(false); >> + while ((rc = secure_write(MyProcPort, compression, sizeof(compression))) < 0 >> + && errno == EINTR); >> + if ((size_t)rc != sizeof(compression)) >> + return -1; > Huh? This all seems like an abstraction violation. > > >> + /* initialize compression */ >> + if (zpq_set_algorithm(compression_algorithm)) >> + PqStream = zpq_create((zpq_tx_func)secure_write, (zpq_rx_func)secure_read, MyProcPort); >> + } >> + return 0; >> +} > Why is zpq a wrapper around secure_write/read? I'm a bit worried this > will reduce the other places we could use zpq. zpq has to read/write data from underlying stream. And it should be used both in client and server environment. I didn't see other ways to provide single zpq implementation without code duplication except pass to it rx/tx functions. > > >> static int >> -pq_recvbuf(void) >> +pq_recvbuf(bool nowait) >> { >> + /* If srteaming compression is enabled then use correpondent comression read function. */ > s/srteaming/streaming/ > s/correpondent/correponding/ > s/comression/compression/ > > Could you please try to proof-read the patch a bit? The typo density > is quite high. Once again, sorry Will do. > >> + r = PqStream >> + ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength, >> + PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed) >> + : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, >> + PQ_RECV_BUFFER_SIZE - PqRecvLength); >> + PqRecvLength += processed; > ? : doesn't make sense to me in this case. This should be an if/else. > Isn't it a matter of style preference? Why if/else is principle better than ?: I agree that sometimes ?: leads to more complex and obscure expressions. But to you think that if-else in this case will lead to more clear or readable code? Another question is whether conditional expression here is really good idea. I prefer to replace with indirect function call... 
But there are several reasons which makes me to prefer such straightforward and not nice way (at lease difference in function profiles). >> if (r < 0) >> { >> + if (r == ZPQ_DECOMPRESS_ERROR) >> + { >> + char const* msg = zpq_error(PqStream); >> + if (msg == NULL) >> + msg = "end of stream"; >> + ereport(COMMERROR, >> + (errcode_for_socket_access(), >> + errmsg("failed to decompress data: %s", msg))); >> + return EOF; >> + } > I don't think we should error out with "failed to decompress data:" > e.g. when the client closed the connection. Sorry, but this error is reported only when ZPQ_DECOMPRESS_ERROR is returned. It means that received data can not be decompressed but not that client connection is broken. > > > >> @@ -1413,13 +1457,18 @@ internal_flush(void) >> char *bufptr = PqSendBuffer + PqSendStart; >> char *bufend = PqSendBuffer + PqSendPointer; >> >> - while (bufptr < bufend) >> + while (bufptr < bufend || zpq_buffered(PqStream) != 0) /* has more data to flush or unsent data in internal compressionbuffer */ >> { > Overly long line. Fixed > This should at least specify how these functions are supposed to handle > blocking/nonblocking sockets. > Blocking/nonblocking control is done by upper layer. This zpq functions implementation calls underlying IO functions and do not care if this calls are blocking or nonblocking. >> + >> +#define ZSTD_BUFFER_SIZE (8*1024) >> +#define ZSTD_COMPRESSION_LEVEL 1 > Add some arguments for choosing these parameters. > What are the suggested way to specify them? I can not put them in GUCs (because them are also needed at client side). May it possible to for client to specify them in connection string: psql -d "compression='ztsd/level=10/buffer=8k" It seems to be awful and overkill, isn't it? >> + >> +/* >> + * Array with all supported compression algorithms. >> + */ >> +static ZpqAlgorithm const zpq_algorithms[] = >> +{ >> +#if HAVE_LIBZSTD >> + {zstd_name, zstd_create, zstd_read, zstd_write, zstd_free, zstd_error, zstd_buffered}, >> +#endif >> +#if HAVE_LIBZ >> + {zlib_name, zlib_create, zlib_read, zlib_write, zlib_free, zlib_error, zlib_buffered}, >> +#endif >> + {NULL} >> +}; > I think it's preferrable to use designated initializers. > > Do we really need zero terminated lists? Works fine, but brrr. Once again - a matter of taste:) Standard C practice IMHO - not invented by me and widely used in Postgres code;) > >> +/* >> + * Index of used compression algorithm in zpq_algorithms array. >> + */ >> +static int zpq_algorithm_impl; > This is just odd API design imo. Why doesn't the dispatch work based on > an argument for zpq_create() and the ZpqStream * for the rest? > > What if there's two libpq connections in one process? To servers > supporting different compression algorithms? This isn't going to fly. Fixed. > >> +/* >> + * Get list of the supported algorithms. >> + * Each algorithm is identified by one letter: 'f' - Facebook zstd, 'z' - zlib. >> + * Algorithm identifies are appended to the provided buffer and terminated by '\0'. >> + */ >> +void >> +zpq_get_supported_algorithms(char algorithms[ZPQ_MAX_ALGORITHMS]) >> +{ >> + int i; >> + for (i = 0; zpq_algorithms[i].name != NULL; i++) >> + { >> + Assert(i < ZPQ_MAX_ALGORITHMS); >> + algorithms[i] = zpq_algorithms[i].name(); >> + } >> + Assert(i < ZPQ_MAX_ALGORITHMS); >> + algorithms[i] = '\0'; >> +} > Uh, doesn't this bake ZPQ_MAX_ALGORITHMS into the ABI? That seems > entirely unnecessary? 
I tried to avoid use of dynamic memory allocation because zpq is used both in client and server environments with different memory allocation policies. >> @@ -2180,6 +2257,20 @@ build_startup_packet(const PGconn *conn, char *packet, >> ADD_STARTUP_OPTION("replication", conn->replication); >> if (conn->pgoptions && conn->pgoptions[0]) >> ADD_STARTUP_OPTION("options", conn->pgoptions); >> + if (conn->compression && conn->compression[0]) >> + { >> + bool enabled; >> + /* >> + * If compressoin is enabled, then send to the server list of compression algorithms >> + * supported by client >> + */ > s/compressoin/compression/ Fixed >> + if (parse_bool(conn->compression, &enabled)) >> + { >> + char compression_algorithms[ZPQ_MAX_ALGORITHMS]; >> + zpq_get_supported_algorithms(compression_algorithms); >> + ADD_STARTUP_OPTION("compression", compression_algorithms); >> + } >> + } > I think this needs to work in a graceful manner across server > versions. You can make that work with an argument, using the _pq_ > parameter stuff, but as I said earlier, I think it's a mistake to deal > with this in the startup packet anyway. Sorry, I think that it should be quite easy for user to toggle compression. Originally I suggest to add -z option to psql and other Postgres utilities working with libpq protocol. Adding new options was considered by reviewer as bad idea and so I left correspondent option in connection string: psql -d "dbname=postgres compression=zlib" It is IMHO less convenient than "psql -z postgres". And I afraid that using the _pq_ parameter stuff makes enabling of compression even less user friendly. So can replace it to _pq_ convention if there is consensus that adding "compression" to startup package should be avoided. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
New version of the patch with fixes is attached.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 28.10.2020 22:58, Alvaro Herrera wrote:
> On 2020-Oct-26, Konstantin Knizhnik wrote:
>
>> + while (bufptr < bufend || zpq_buffered(PqStream) != 0) /* has more data to flush or unsent data in internal compression buffer */
>> {
>> - int r;
>> -
>> - r = secure_write(MyProcPort, bufptr, bufend - bufptr);
>> -
>> - if (r <= 0)
>> + int r;
>> + size_t processed = 0;
>> + size_t available = bufend - bufptr;
>> + r = PqStream
>> + ? zpq_write(PqStream, bufptr, available, &processed)
>> + : secure_write(MyProcPort, bufptr, available);
>> + bufptr += processed;
>> + PqSendStart += processed;
> This bit is surprising to me.  I thought the whole zpq_write() thing
> should be hidden inside secure_write, so internal_flush would continue
> to call just secure_write; and it is that routine's responsibility to
> call zpq_write or be_tls_write or secure_raw_write etc according to
> compile-time options and socket state.

Sorry, maybe it is not the nicest way of coding. Ideally we would use the "decorator" design pattern here, where each layer (compression, TLS, ...) is implemented as a separate decorator class. This is how I/O streams are implemented in Java and many other SDKs. But it requires too much redesign of the Postgres code.

Also, from my point of view the core of the problem is that Postgres has two almost independent implementations of the networking code, one for the backend and one for the frontend. IMHO it would be better to have some SAL (system abstraction layer) which hides OS-specific details from the rest of the system and can be shared by backend and frontend.

In any case, I did not find a better way to implement it. The basic requirements were:
1. The zpq code should be usable both by the backend and the frontend.
2. The decompressor may need to perform multiple reads from the underlying layer to fill its buffer and be able to produce some output.
3. Minimize changes in the Postgres code.
4. Make it possible to use the zpq library in other tools (such as a pooler).

This is why the zpq_create function is given tx/rx functions to perform the I/O operations: secure_write is such a tx function for the backend (and pqsecure_write for the frontend).

You are right that the name secure_write suggests that it deals only with TLS, not with compression. It is certainly possible to rename it, or better, to introduce some other function, e.g. stream_read, which performs these checks. But please notice that it is not enough to perform all the checks in one function as you suggest: it should really be a pipe in which each component does its own job - encryption, compression, ...

Personally I would prefer indirect function calls here, but several reasons (at least the different prototypes of the functions) made me choose the current approach. In any case, there are many ways of doing the same task, and different people have their own opinion about the best one. There are of course objective criteria - encapsulation, lack of code duplication, readability - and I tried to find the best approach based on my own preferences and the requirements described above. Maybe I am wrong, but then I would like to be convinced that the suggested alternative is better. From my point of view, calling the compressor from a function named secure_read is not the right solution...

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
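For reference, a tiny sketch of the callback wiring being described. The typedef shapes and the frontend field name are assumptions made for the sketch, not copied from the patch; the secure_* and pqsecure_* functions are the existing backend/frontend I/O routines.

    typedef ssize_t (*zpq_tx_func) (void *arg, const void *data, size_t size);
    typedef ssize_t (*zpq_rx_func) (void *arg, void *data, size_t size);

    ZpqStream  *zpq_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg);

    /* backend (pq_configure): compressed bytes still travel through the
     * TLS-aware secure_* layer */
    PqStream = zpq_create((zpq_tx_func) secure_write,
                          (zpq_rx_func) secure_read, MyProcPort);

    /* frontend: the same zpq code reuses libpq's pqsecure_* functions
     * ("zstream" is a hypothetical PGconn field, used only for illustration) */
    conn->zstream = zpq_create((zpq_tx_func) pqsecure_write,
                               (zpq_rx_func) pqsecure_read, conn);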
Hi, On 2020-10-29 16:45:58 +0300, Konstantin Knizhnik wrote: > > - What does " and it is up to the client whether to continue work > > without compression or report error" actually mean for a libpq parameter? > It can not happen. > The client request from server use of compressed protocol only if > "compression=XXX" was specified in connection string. > But XXX should be supported by client, otherwise this request will be > rejected. > So supported protocol string sent by client can never be empty. I think it's pretty important for features like this to be able to fail softly when the feature is not available on the other side. Otherwise a lot of applications need to have unnecessary version dependencies coded into them. > > - What is the point of the "streaming mode" reference? > > There are ways of performing compression: > - block mode when each block is individually compressed (compressor stores > dictionary in each compressed blocked) > - stream mode > Block mode allows to independently decompress each page. It is good for > implementing page or field compression (as pglz is used to compress toast > values). > But it is not efficient for compressing client-server protocol commands. > It seems to me to be important to explain that libpq is using stream mode > and why there is no pglz compressor To me that seems like unnecessary detail in the user oriented parts of the docs at least. > > Why are compression methods identified by one byte identifiers? That > > seems unnecessarily small, given this is commonly a once-per-connection > > action? > > It is mostly for simplicity of implementation: it is always simple to work > with fixed size messages (or with array of chars rather than array of > strings). > And I do not think that it can somehow decrease flexibility: this one-letter > algorihth codes are not visible for user. And I do not think that we > sometime will support more than 127 (or even 64 different compression > algorithms). It's pretty darn trivial to have a variable width protocol message, given that we have all the tooling for that. Having a variable length descriptor allows us to extend the compression identifier with e.g. the level without needing to change the protocol. E.g. zstd:9 or zstd:level=9 or such. I suggest using a variable length string as the identifier, and split it at the first : for the lookup, and pass the rest to the compression method. > > The protocol sounds to me like there's no way to enable/disable > > compression in an existing connection. To me it seems better to have an > > explicit, client initiated, request to use a specific method of > > compression (including none). That allows to enable compression for bulk > > work, and disable it in other cases (e.g. for security sensitive > > content, or for unlikely to compress well content). > > It will significantly complicate implementation (because of buffering at > different levels). Really? We need to be able to ensure a message is flushed out to the network at precise moments - otherwise a lot of stuff will break. > Also it is not clear to me who and how will control enabling/disabling > compression in this case? > I can imagine that "\copy" should trigger compression. But what about > (server side) "copy" command? The client could still trigger that. I don't think it should ever be server controlled. > And concerning security risks... In most cases such problem is not relevant > at all because both client and server are located within single reliable > network. 
The trend is towards never trusting the network, even if internal, not the opposite. > It if security of communication really matters, you should not switch > compression in all cases (including COPY and other bulk data transfer). > It is very strange idea to let client to decide which data is "security > sensitive" and which not. Huh? > > I think that would also make cross-version handling easier, because a > > newer client driver can send the compression request and handle the > > error, without needing to reconnect or such. > > > > Most importantly, I think such a design is basically a necessity to make > > connection poolers to work in a sensible way. > > I do not completely understand the problem with connection pooler. > Right now developers of Yandex Odyssey are trying to support libpq > compression in their pooler. > If them will be faced with some problems, I will definitely address > them. It makes poolers a lot more expensive if they have to decompress and then recompress again. It'd be awesome if even the decompression could be avoided in at least some cases, but compression is the really expensive part. So if a client connects to the pooler with compression=zlib and an existing server connection is used, the pooler should be able to switch the existing connection to zlib. > But if you think that it is so important, I will try to implement it. Many > questions arise in this case: which side should control compression level? > Should client affect compression level both at client side and at server > side? Or it should be possible to specify separately compression level for > client and for server? I don't think the server needs to know about what the client does, compression level wise. > > > +<para> > > > + Used compression algorithm. Right now the following streaming compression algorithms are supported: 'f' -Facebook zstd, 'z' - zlib, 'n' - no compression. > > > +</para> > > I would prefer this just be referenced as zstd or zstandard, not > > facebook zstd. There's an RFC (albeit only "informational"), and it > > doesn't name facebook, except as an employer: > > https://tools.ietf.org/html/rfc8478 > > Please notice that it is internal encoding, user will specify > psql -d "dbname=postgres compression=zstd" > > If name "zstd" is not good, I can choose any other. All I was saying is that I think you should not name it ""Facebook zstd", just "zstd". > > > + if (client_compression_algorithms) > > > + { > > > + char server_compression_algorithms[ZPQ_MAX_ALGORITHMS]; > > > + char compression_algorithm = ZPQ_NO_COMPRESSION; > > > + char compression[6] = {'z',0,0,0,5,0}; /* message length = 5 */ > > > + int rc; > > Why is this hand-rolling protocol messages? > Sorry, I do not quite understand your concern. > It seems to me that all libpq message manipulation is more or less > hand-rolling, isn't it (we are not using protobuf or msgbpack)? > Or do you think that calling pq_sendbyte,pq_sendint32,... is much safer in > this case? I think you should just use something like pq_beginmessage(buf, 'z'); pq_sendbyte(buf, compression_method_byte); pq_endmessage(buf); like most of the backend does? And if you, as I suggest, use a variable length compression identifier, use pq_sendcountedtext(buf, compression_method, strlen(compression_method)); or such. > > > + socket_set_nonblocking(false); > > > + while ((rc = secure_write(MyProcPort, compression, sizeof(compression))) < 0 > > > + && errno == EINTR); > > > + if ((size_t)rc != sizeof(compression)) > > > + return -1; > > Huh? 
This all seems like an abstraction violation. > > > > > > > + /* initialize compression */ > > > + if (zpq_set_algorithm(compression_algorithm)) > > > + PqStream = zpq_create((zpq_tx_func)secure_write, (zpq_rx_func)secure_read, MyProcPort); > > > + } > > > + return 0; > > > +} > > Why is zpq a wrapper around secure_write/read? I'm a bit worried this > > will reduce the other places we could use zpq. > zpq has to read/write data from underlying stream. > And it should be used both in client and server environment. > I didn't see other ways to provide single zpq implementation without code > duplication > except pass to it rx/tx functions. I am not saying it's necessarily the best approach, but it doesn't seem like it'd be hard to have zpq not send / receive the data from the network, but just transform the data actually to-be-sent/received from the network. E.g. in pq_recvbuf() you could do something roughly like the following, after a successful secure_read() if (compression_enabled) { zpq_decompress(network_data, network_data_length, &input_processed, &output_data, &output_data_length); copy_data_into_recv_buffer(output_data, output_data_length); network_data += input_processed; network_data_length -= input_processed; } and the inverse on the network receive side. The advantage would be that we can then more easily use zpq stuff in other places. If we do go with the current callback model, I think we should at least extend it so that a 'context' parameter is shuffled through zpq, so that other code can work without global state. > > > + r = PqStream > > > + ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength, > > > + PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed) > > > + : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, > > > + PQ_RECV_BUFFER_SIZE - PqRecvLength); > > > + PqRecvLength += processed; > > ? : doesn't make sense to me in this case. This should be an if/else. > > > Isn't it a matter of style preference? > Why if/else is principle better than ?: ?: makes sense when it's either much shorter, or when it allows to avoid repeating a bunch of code. E.g. if it's just a conditional argument to a function with a lot of other arguments. But when, as here, it just leads to weirder formatting, it really just makes the code harder to read. > I agree that sometimes ?: leads to more complex and obscure expressions. > But to you think that if-else in this case will lead to more clear or > readable code? Yes, pretty clearly for me. r = PqStream ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength, PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed) : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, PQ_RECV_BUFFER_SIZE - PqRecvLength); vs if (PqStream) r = zpq_read(PqStream, PqRecvBuffer + PqRecvLength, PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed); else r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength, PQ_RECV_BUFFER_SIZE - PqRecvLength); the if / else are visually more clearly distinct, the sole added repetition is r =. And most importantly if / else are more common. > Another question is whether conditional expression here is really good idea. > I prefer to replace with indirect function call... A branch is cheaper than indirect calls, FWIW> > > > +#define ZSTD_BUFFER_SIZE (8*1024) > > > +#define ZSTD_COMPRESSION_LEVEL 1 > > Add some arguments for choosing these parameters. > > > What are the suggested way to specify them? > I can not put them in GUCs (because them are also needed at client side). 
I don't mean arguments as in configurable, I mean that you should add a comment explaining why 8KB / level 1 was chosen. > > > +/* > > > + * Get list of the supported algorithms. > > > + * Each algorithm is identified by one letter: 'f' - Facebook zstd, 'z' - zlib. > > > + * Algorithm identifies are appended to the provided buffer and terminated by '\0'. > > > + */ > > > +void > > > +zpq_get_supported_algorithms(char algorithms[ZPQ_MAX_ALGORITHMS]) > > > +{ > > > + int i; > > > + for (i = 0; zpq_algorithms[i].name != NULL; i++) > > > + { > > > + Assert(i < ZPQ_MAX_ALGORITHMS); > > > + algorithms[i] = zpq_algorithms[i].name(); > > > + } > > > + Assert(i < ZPQ_MAX_ALGORITHMS); > > > + algorithms[i] = '\0'; > > > +} > > Uh, doesn't this bake ZPQ_MAX_ALGORITHMS into the ABI? That seems > > entirely unnecessary? > > I tried to avoid use of dynamic memory allocation because zpq is used both > in client and server environments with different memory allocation > policies. We support palloc in both(). But even without that, you can just have one function to get the number of algorithms, and then have the caller pass in a large enough array. > And I afraid that using the _pq_ parameter stuff makes enabling of > compression even less user friendly. I didn't intend to suggest that the _pq_ stuff should be used in a client facing manner. What I suggesting is that if the connection string contains compression=, the driver would internally translate that to _pq_.compression to pass that on to the server, which would allow the connection establishment to succeed, even if the server is a bit older. Greetings, Andres Freund
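A small sketch of the "split at the first ':'" lookup suggested above. It assumes the dispatch table entries expose a string-valued name, unlike the current one-letter name() callback, so it is an illustration of the idea rather than code from the patch.

    #include <string.h>

    /*
     * Parse "zstd", "zstd:9" or "zstd:level=9": everything before the first
     * ':' selects the algorithm, the remainder is handed to the method itself.
     * zpq_algorithms[] is the patch's dispatch table; a string name field is
     * assumed here for the sake of the sketch.
     */
    static const ZpqAlgorithm *
    zpq_parse_algorithm(const char *spec, const char **options)
    {
        const char *sep = strchr(spec, ':');
        size_t      namelen = sep ? (size_t) (sep - spec) : strlen(spec);

        *options = sep ? sep + 1 : NULL;    /* e.g. "level=9" or NULL */

        for (int i = 0; zpq_algorithms[i].name != NULL; i++)
        {
            if (strlen(zpq_algorithms[i].name) == namelen &&
                strncmp(zpq_algorithms[i].name, spec, namelen) == 0)
                return &zpq_algorithms[i];
        }
        return NULL;
    }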
Hi everyone! Thanks for pushing this important topic! Just my 2 cents.

> On 31 Oct 2020, at 02:03, Andres Freund <andres@anarazel.de> wrote:
>
>>> I think that would also make cross-version handling easier, because a
>>> newer client driver can send the compression request and handle the
>>> error, without needing to reconnect or such.
>>>
>>> Most importantly, I think such a design is basically a necessity to make
>>> connection poolers to work in a sensible way.
>>
>> I do not completely understand the problem with connection pooler.
>> Right now developers of Yandex Odyssey are trying to support libpq
>> compression in their pooler.
>> If them will be faced with some problems, I will definitely address
>> them.
>
> It makes poolers a lot more expensive if they have to decompress and
> then recompress again. It'd be awesome if even the decompression could
> be avoided in at least some cases, but compression is the really
> expensive part. So if a client connects to the pooler with
> compression=zlib and an existing server connection is used, the pooler
> should be able to switch the existing connection to zlib.

The idea of reusing compressed byte ranges is neat and tantalising from a technical point of view. But the price of compression is roughly one CPU core per 500 MB/s (zstd), so with 20 Gbps network adapters the cost of recompressing all traffic is at most ~4 cores. Moreover, we tend to optimise the pooler for the case when it runs on the same host as the database.

Despite this, I believe a message for changing the compression method (including turning it off) is a nice thing to have. I can imagine that we may want to control the compression level of replication depending on what the bottleneck is right now: network or CPU.

But I do not understand how we can have a frontend message like "Now both FE and BE speak zstd". Some messages can already be in flight, and the BE cannot change them retroactively. It looks like both BE and FE can only say "Now I'm speaking zstd:6 starting from the next byte", provided the sender knows the correspondent understands zstd:6, of course.

Thanks! Best regards, Andrey Borodin.
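An entirely hypothetical sketch of that "speaking zstd:6 from the next byte" announcement, on the sending side. The message type letter and the zpq_create_by_spec helper are made up; pq_flush and pq_putmessage are the existing backend routines. The point of the sketch is only the ordering: everything queued with the old method is flushed before the compressor is swapped.

    static int
    switch_compression(const char *new_spec)   /* e.g. "zstd:6" or "none" */
    {
        /* 1. push out everything encoded with the current method */
        if (pq_flush() != 0)
            return -1;

        /* 2. announce the switch, still using the current encoding
         *    ('k' is an arbitrary placeholder message type) */
        if (pq_putmessage('k', new_spec, strlen(new_spec) + 1) != 0)
            return -1;
        if (pq_flush() != 0)
            return -1;

        /* 3. from the next byte on, this side writes with the new method */
        PqStream = zpq_create_by_spec(new_spec,             /* hypothetical */
                                      (zpq_tx_func) secure_write,
                                      (zpq_rx_func) secure_read, MyProcPort);
        return PqStream ? 0 : -1;
    }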
Hi On 31.10.2020 00:03, Andres Freund wrote: > Hi, > > On 2020-10-29 16:45:58 +0300, Konstantin Knizhnik wrote: >>> - What does " and it is up to the client whether to continue work >>> without compression or report error" actually mean for a libpq parameter? >> It can not happen. >> The client request from server use of compressed protocol only if >> "compression=XXX" was specified in connection string. >> But XXX should be supported by client, otherwise this request will be >> rejected. >> So supported protocol string sent by client can never be empty. > I think it's pretty important for features like this to be able to fail > softly when the feature is not available on the other side. Otherwise a > lot of applications need to have unnecessary version dependencies coded > into them. Sorry, may be I do not completely understand your suggestion. Right now user jut specify that he wants to use compression. Libpq client sends to the server list of supported algorithms and server choose among them the best one is supported. It sends it chose to the client and them are both using this algorithm. Sorry, that in previous mail I have used incorrect samples: client is not explicitly specifying compression algorithm - it just request compression. And server choose the most efficient algorithm which is supported both by client and server. So client should not know names of the particular algorithms (i.e. zlib, zstd) and choice is based on the assumption that server (or better say programmer) knows better than user which algorithms is more efficient. Last assumption me be contested because user better know which content will be send and which algorithm is more efficient for this content. But right know when the choice is between zstd and zlib, the first one is always better: faster and provides better quality. > >>> - What is the point of the "streaming mode" reference? >> There are ways of performing compression: >> - block mode when each block is individually compressed (compressor stores >> dictionary in each compressed blocked) >> - stream mode >> Block mode allows to independently decompress each page. It is good for >> implementing page or field compression (as pglz is used to compress toast >> values). >> But it is not efficient for compressing client-server protocol commands. >> It seems to me to be important to explain that libpq is using stream mode >> and why there is no pglz compressor > To me that seems like unnecessary detail in the user oriented parts of > the docs at least. > Ok, I will remove this phrase. >>> Why are compression methods identified by one byte identifiers? That >>> seems unnecessarily small, given this is commonly a once-per-connection >>> action? >> It is mostly for simplicity of implementation: it is always simple to work >> with fixed size messages (or with array of chars rather than array of >> strings). >> And I do not think that it can somehow decrease flexibility: this one-letter >> algorihth codes are not visible for user. And I do not think that we >> sometime will support more than 127 (or even 64 different compression >> algorithms). > It's pretty darn trivial to have a variable width protocol message, > given that we have all the tooling for that. Having a variable length > descriptor allows us to extend the compression identifier with e.g. the > level without needing to change the protocol. E.g. zstd:9 or > zstd:level=9 or such. 
> I suggest using a variable length string as the identifier, and split it
> at the first : for the lookup, and pass the rest to the compression
> method.

Yes, I agree that it provides more flexibility, and it is not a big problem to handle arbitrary strings instead of chars. But right now my intention was to prevent the user from choosing the compression algorithm and especially from specifying the compression level (which is algorithm-specific). It is a matter of the server-client handshake to choose the best compression algorithm supported by both of them. I can definitely rewrite it, but IMHO giving too much flexibility to the user will just complicate things without any positive effect.

>>> The protocol sounds to me like there's no way to enable/disable
>>> compression in an existing connection. To me it seems better to have an
>>> explicit, client initiated, request to use a specific method of
>>> compression (including none). That allows to enable compression for bulk
>>> work, and disable it in other cases (e.g. for security sensitive
>>> content, or for unlikely to compress well content).
>>
>> It will significantly complicate implementation (because of buffering at
>> different levels).
>
> Really? We need to be able to ensure a message is flushed out to the
> network at precise moments - otherwise a lot of stuff will break.

The stream is definitely flushed after writing each message. The problem is at the receiver side: several messages can be sent without waiting for a response and will be read into the buffer. If the first one is compressed and the subsequent ones are not, it is not so trivial to handle. I have already faced this problem when compression is switched on: the backend may send some message right after acknowledging the compression protocol. That message is already compressed but is delivered together with the uncompressed compression-acknowledgement message. I have solved this problem for switching compression on at the client side, but I really do not want to solve it for arbitrary moments. And once again: my opinion is that too much flexibility is not so good here - there is no sense in switching compression off for short messages (the overhead of switching can be larger than the compression itself).

>> Also it is not clear to me who and how will control enabling/disabling
>> compression in this case?
>> I can imagine that "\copy" should trigger compression. But what about
>> (server side) "copy" command?
>
> The client could still trigger that. I don't think it should ever be
> server controlled.

Sorry, but I still think that the possibility to turn compression on/off on the fly is a very doubtful idea. And it would require an extension of the protocol (and some extra functions in the libpq library to control it).

>> And concerning security risks... In most cases such problem is not relevant
>> at all because both client and server are located within single reliable
>> network.
>
> The trend is towards never trusting the network, even if internal, not
> the opposite.
>
>> It if security of communication really matters, you should not switch
>> compression in all cases (including COPY and other bulk data transfer).
>> It is very strange idea to let client to decide which data is "security
>> sensitive" and which not.
>
> Huh?

Sorry if I was unclear; I am not a specialist in security. But it seems very dangerous when the client (and not the server) decides which data is "security sensitive" and which is not. Just an example: you have VerySecreteTable, and when you perform selects on this table you switch compression off. But then the client wants to execute a COPY command for this table and, as it requires bulk data transfer, the user decides to switch compression on.

Actually, speaking about security risks makes no sense here: if you want to provide security, then you use TLS, and it provides its own compression, so there is no need to perform compression at the libpq level. libpq compression is for the cases when you do not need SSL at all.

>>> I think that would also make cross-version handling easier, because a
>>> newer client driver can send the compression request and handle the
>>> error, without needing to reconnect or such.
>>>
>>> Most importantly, I think such a design is basically a necessity to make
>>> connection poolers to work in a sensible way.
>>
>> I do not completely understand the problem with connection pooler.
>> Right now developers of Yandex Odyssey are trying to support libpq
>> compression in their pooler.
>> If them will be faced with some problems, I will definitely address
>> them.
>
> It makes poolers a lot more expensive if they have to decompress and
> then recompress again. It'd be awesome if even the decompression could
> be avoided in at least some cases, but compression is the really
> expensive part. So if a client connects to the pooler with
> compression=zlib and an existing server connection is used, the pooler
> should be able to switch the existing connection to zlib.

My expectation was that the pooler is installed in the same network as the database, so there is no need to compress traffic between the pooler and the backends. But even if it is really needed, I do not see problems here. Definitely we need the zpq library to support different compression algorithms in the same application, but that is done. To let the pooler avoid decompression+compression it would be necessary to send the command header in raw format, but that would greatly reduce the effect of stream compression.

>> But if you think that it is so important, I will try to implement it. Many
>> questions arise in this case: which side should control compression level?
>> Should client affect compression level both at client side and at server
>> side? Or it should be possible to specify separately compression level for
>> client and for server?
>
> I don't think the server needs to know about what the client does,
> compression level wise.

Sorry, I do not understand you. Compression takes place at the sender side, so the client compresses its messages before sending them to the server and the server compresses the responses it sends to the client. In both cases a compression level may be specified, and it is not clear whether the client should control the compression level of the server.

>>>> +<para>
>>>> +  Used compression algorithm. Right now the following streaming compression algorithms are supported: 'f' - Facebook zstd, 'z' - zlib, 'n' - no compression.
>>>> +</para>
>>>
>>> I would prefer this just be referenced as zstd or zstandard, not
>>> facebook zstd. There's an RFC (albeit only "informational"), and it
>>> doesn't name facebook, except as an employer:
>>> https://tools.ietf.org/html/rfc8478
>>
>> Please notice that it is internal encoding, user will specify
>> psql -d "dbname=postgres compression=zstd"
>>
>> If name "zstd" is not good, I can choose any other.
>
> All I was saying is that I think you should not name it "Facebook zstd",
> just "zstd".

Sorry, it was my error. Right now the compression name is not specified by the user at all: it is just a boolean on/off value.

> I think you should just use something like
>
> pq_beginmessage(buf, 'z');
> pq_sendbyte(buf, compression_method_byte);
> pq_endmessage(buf);
>
> like most of the backend does? And if you, as I suggest, use a variable
> length compression identifier, use
> pq_sendcountedtext(buf, compression_method, strlen(compression_method));
> or such.

Sorry, I cannot do it this way because the pq connection with the client is not yet initialized at that point, so I cannot use pq_endmessage here and have to call secure_write manually.

> I am not saying it's necessarily the best approach, but it doesn't seem
> like it'd be hard to have zpq not send / receive the data from the
> network, but just transform the data actually to-be-sent/received from
> the network.
>
> E.g. in pq_recvbuf() you could do something roughly like the following,
> after a successful secure_read()
>
> if (compression_enabled)
> {
>     zpq_decompress(network_data, network_data_length,
>                    &input_processed, &output_data, &output_data_length);
>     copy_data_into_recv_buffer(output_data, output_data_length);
>     network_data += input_processed;
>     network_data_length -= input_processed;
> }

Sorry, it is not possible because, to produce some output, the decompressor may need to perform multiple reads. And doing it in two different places, i.e. in pqsecure_read and secure_read, is worse than doing it in one place, as it is done now.

> and the inverse on the network receive side. The advantage would be that
> we can then more easily use zpq stuff in other places.
>
> If we do go with the current callback model, I think we should at least
> extend it so that a 'context' parameter is shuffled through zpq, so that
> other code can work without global state.

The callback already has a void* argument through which arbitrary data can be passed to it.

>>>> +    r = PqStream
>>>> +        ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength,
>>>> +                   PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed)
>>>> +        : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
>>>> +                      PQ_RECV_BUFFER_SIZE - PqRecvLength);
>>>> +    PqRecvLength += processed;
>>>
>>> ? : doesn't make sense to me in this case. This should be an if/else.
>>
>> Isn't it a matter of style preference?
>> Why if/else is principle better than ?:
>
> ?: makes sense when it's either much shorter, or when it allows to avoid
> repeating a bunch of code. E.g. if it's just a conditional argument to a
> function with a lot of other arguments.
>
> But when, as here, it just leads to weirder formatting, it really just
> makes the code harder to read.

From my point of view the main role of the ?: construction is not just saving some bytes/lines of code. It emphasizes that both parts of the conditional construction produce the same result - in this case the result of a read operation. If you replace it with if/else, this relation between the actions done in the two branches is lost (or at least less clear).

>> I agree that sometimes ?: leads to more complex and obscure expressions.
>> But to you think that if-else in this case will lead to more clear or
>> readable code?
>
> Yes, pretty clearly for me.
>
> r = PqStream
>     ? zpq_read(PqStream, PqRecvBuffer + PqRecvLength,
>                PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed)
>     : secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
>                   PQ_RECV_BUFFER_SIZE - PqRecvLength);
>
> vs
>
> if (PqStream)
>     r = zpq_read(PqStream, PqRecvBuffer + PqRecvLength,
>                  PQ_RECV_BUFFER_SIZE - PqRecvLength, &processed);
> else
>     r = secure_read(MyProcPort, PqRecvBuffer + PqRecvLength,
>                     PQ_RECV_BUFFER_SIZE - PqRecvLength);
>
> the if / else are visually more clearly distinct, the sole added
> repetition is r =. And most importantly if / else are more common.

I agree that ?: is less clear than if/else. In most modern programming languages there is no ?: construction, but if/else can be used as an expression. And repetition is always bad, isn't it? Not just because of writing extra code, but because of possible inconsistencies and errors (just like normal forms in databases). In any case, I do not think it is a matter of principle: if you prefer, I can replace it with if/else.

>> Another question is whether conditional expression here is really good idea.
>> I prefer to replace with indirect function call...
>
> A branch is cheaper than indirect calls, FWIW.

Really? Please notice that we are not comparing just a branch with an indirect call, but branch+direct call vs. indirect call. I didn't measure it, but in any case IMHO the performance aspect is less important here than clarity and maintainability of the code, and an indirect call is definitely better in this sense... Unfortunately I cannot use it here (certainly it is not absolutely impossible, but it would require many more changes).

>>>> +#define ZSTD_BUFFER_SIZE (8*1024)
>>>> +#define ZSTD_COMPRESSION_LEVEL 1
>>>
>>> Add some arguments for choosing these parameters.
>>
>> What are the suggested way to specify them?
>> I can not put them in GUCs (because them are also needed at client side).
>
> I don't mean arguments as in configurable, I mean that you should add a
> comment explaining why 8KB / level 1 was chosen.

Sorry, will do.

>>>> +/*
>>>> + * Get list of the supported algorithms.
>>>> + * Each algorithm is identified by one letter: 'f' - Facebook zstd, 'z' - zlib.
>>>> + * Algorithm identifies are appended to the provided buffer and terminated by '\0'.
>>>> + */
>>>> +void
>>>> +zpq_get_supported_algorithms(char algorithms[ZPQ_MAX_ALGORITHMS])
>>>> +{
>>>> +    int i;
>>>> +    for (i = 0; zpq_algorithms[i].name != NULL; i++)
>>>> +    {
>>>> +        Assert(i < ZPQ_MAX_ALGORITHMS);
>>>> +        algorithms[i] = zpq_algorithms[i].name();
>>>> +    }
>>>> +    Assert(i < ZPQ_MAX_ALGORITHMS);
>>>> +    algorithms[i] = '\0';
>>>> +}
>>>
>>> Uh, doesn't this bake ZPQ_MAX_ALGORITHMS into the ABI? That seems
>>> entirely unnecessary?
>>
>> I tried to avoid use of dynamic memory allocation because zpq is used both
>> in client and server environments with different memory allocation
>> policies.
>
> We support palloc in both(). But even without that, you can just have
> one function to get the number of algorithms, and then have the caller
> pass in a large enough array.

Certainly it is possible. But why complicate the implementation if it is an internal function used in only one place in the code?

>> And I afraid that using the _pq_ parameter stuff makes enabling of
>> compression even less user friendly.
>
> I didn't intend to suggest that the _pq_ stuff should be used in a
> client facing manner. What I suggesting is that if the connection string
> contains compression=, the driver would internally translate that to
> _pq_.compression to pass that on to the server, which would allow the
> connection establishment to succeed, even if the server is a bit older.

A new client can establish a connection with an old server if it is not using compression. And if it wants to use compression, then _pq_.compression will not help much: the old server will ignore it (unlike the compression= option), but then the client will wait for an acknowledgement from the server of the chosen compression algorithm and never receive one. Certainly it is possible to handle this situation (if we didn't receive the expected message), but why is it better than adding a compression option to the startup packet? But if you think that adding new options to the startup packet should be avoided as much as possible, then I will replace it with _pq_.compression.

> Greetings,
>
> Andres Freund

Thank you very much for the review and explanations.
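For reference, per-message flushing of a long-lived compressed stream looks roughly like this with zlib's streaming API (a minimal sketch, not code from the patch; send_raw() and the 8kB buffer are placeholders for the real network write and buffer size):

#include <sys/types.h>
#include <zlib.h>

/* Placeholder for the real network write (secure_write() or similar). */
extern ssize_t send_raw(const void *buf, size_t len);

static z_stream tx;     /* initialized once with deflateInit(&tx, 1) */

static int
compress_and_send_message(const char *msg, size_t len)
{
    char    out[8192];

    tx.next_in = (Bytef *) msg;
    tx.avail_in = len;
    do
    {
        int     rc;

        tx.next_out = (Bytef *) out;
        tx.avail_out = sizeof(out);
        /* Z_SYNC_FLUSH forces everything buffered so far onto the wire,
         * so the peer can decode this message immediately while later
         * messages still share the same compression dictionary. */
        rc = deflate(&tx, Z_SYNC_FLUSH);
        if (rc != Z_OK && rc != Z_BUF_ERROR)
            return -1;
        if (send_raw(out, sizeof(out) - tx.avail_out) < 0)
            return -1;
    } while (tx.avail_out == 0);
    return 0;
}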
Hi,
I have a couple of comments regarding the last patch, mostly these are minor issues.
In src/backend/libpq/pqcomm.c, starting from the line 1114:
int
pq_getbyte_if_available(unsigned char *c)
{
int r;
Assert(PqCommReadingMsg);
if (PqRecvPointer < PqRecvLength || (0) > 0) // not easy to understand optimization (maybe add a comment?)
{
*c = PqRecvBuffer[PqRecvPointer++];
return 1;
}
return r; // returned value is not initialized
}
In src/interfaces/libpq/fe-connect.c, starting from the line 3255:
pqGetc(&algorithm, conn);
impl = zpq_get_algorithm_impl(algorithm);
{ // I believe that if (impl < 0) condition is missing here, otherwise there is always an error
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext(
"server is not supported requested compression algorithm %c\n"), algorithm);
goto error_return;
}
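Presumably the intended shape of that block is something like the following (a sketch based on the quoted patch lines; only the added if (impl < 0) test differs):

pqGetc(&algorithm, conn);
impl = zpq_get_algorithm_impl(algorithm);
if (impl < 0)       /* only report an error when the lookup actually fails */
{
    appendPQExpBuffer(&conn->errorMessage,
                      libpq_gettext("server is not supported requested compression algorithm %c\n"),
                      algorithm);
    goto error_return;
}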
In configure, starting from the line 1587:
--without-zlib do not use Zlib
--with-zstd do not use zstd // is this correct?
Thanks
--
Daniil Zakhlystov
On Nov 1, 2020, at 12:08 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

Hi,

On 31.10.2020 00:03, Andres Freund wrote:
> On 2020-10-29 16:45:58 +0300, Konstantin Knizhnik wrote:
>
> I think it's pretty important for features like this to be able to fail
> softly when the feature is not available on the other side. Otherwise a
> lot of applications need to have unnecessary version dependencies coded
> into them.
>
> - What does "and it is up to the client whether to continue work
> without compression or report error" actually mean for a libpq parameter?

Sorry, maybe I do not completely understand your suggestion. It can not happen. The client requests use of the compressed protocol from the server only if "compression=XXX" was specified in the connection string, and XXX has to be supported by the client, otherwise the request is rejected. So the supported-protocol string sent by the client can never be empty.

Right now the user just specifies that he wants to use compression. The libpq client sends the server the list of algorithms it supports and the server chooses among them the best one it also supports. It sends its choice to the client and both sides then use this algorithm. Sorry that in the previous mail I used incorrect samples: the client does not explicitly specify a compression algorithm - it just requests compression, and the server chooses the most efficient algorithm supported by both client and server. So the client does not need to know the names of the particular algorithms (i.e. zlib, zstd), and the choice is based on the assumption that the server (or better say the programmer) knows better than the user which algorithm is more efficient. That assumption may be contested, because the user knows better which content will be sent and which algorithm is more efficient for it. But right now, when the choice is between zstd and zlib, the first one is always better: faster and with a better compression ratio.

> - What is the point of the "streaming mode" reference? To me that seems
> like unnecessary detail in the user oriented parts of the docs at least.

There are two ways of performing compression:
- block mode, when each block is individually compressed (the compressor stores a dictionary in each compressed block);
- stream mode.
Block mode allows each page to be decompressed independently. It is good for implementing page or field compression (as pglz is used to compress TOAST values), but it is not efficient for compressing client-server protocol commands. It seems important to me to explain that libpq uses stream mode and why there is no pglz compressor. But ok, I will remove this phrase.

> Why are compression methods identified by one byte identifiers? That
> seems unnecessarily small, given this is commonly a once-per-connection
> action?

It is mostly for simplicity of implementation: it is always simpler to work with fixed-size messages (or with an array of chars rather than an array of strings). And I do not think that it somehow decreases flexibility: these one-letter algorithm codes are not visible to the user, and I do not think that we will some day support more than 127 (or even 64) different compression algorithms.

> It's pretty darn trivial to have a variable width protocol message,
> given that we have all the tooling for that. Having a variable length
> descriptor allows us to extend the compression identifier with e.g. the
> level without needing to change the protocol. E.g. zstd:9 or
> zstd:level=9 or such.
>
> I suggest using a variable length string as the identifier, and split it
> at the first : for the lookup, and pass the rest to the compression
> method.

Yes, I agree that it provides more flexibility, and it is not a big problem to handle arbitrary strings instead of chars. But right now my intention was to prevent the user from choosing the compression algorithm and especially from specifying the compression level (which is algorithm-specific). It is a matter of the server-client handshake to choose the best compression algorithm supported by both of them. I can definitely rewrite it, but IMHO giving too much flexibility to the user will just complicate things without any positive effect.
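To illustrate the "algorithm:level" suggestion above, splitting the identifier at the first colon could look roughly like this (a sketch only; parse_compression_spec() is a made-up name, not something in the patch):

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "zstd:9" into an algorithm name and a level. */
static int
parse_compression_spec(const char *spec, char *algorithm, size_t algsize, int *level)
{
    const char *colon = strchr(spec, ':');
    size_t      namelen = colon ? (size_t) (colon - spec) : strlen(spec);

    if (namelen == 0 || namelen >= algsize)
        return -1;              /* empty or overly long algorithm name */

    memcpy(algorithm, spec, namelen);
    algorithm[namelen] = '\0';

    /* everything after the first ':' is handed to the compression method,
     * here interpreted simply as a numeric level */
    *level = colon ? atoi(colon + 1) : 0;   /* 0 = use the default level */
    return 0;
}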
Hi
Sorry, I don't understand it.
This is the code of pq_getbyte_if_available:
int
pq_getbyte_if_available(unsigned char *c)
{
int r;
Assert(PqCommReadingMsg);
if (PqRecvPointer < PqRecvLength || (r = pq_recvbuf(true)) > 0)
{
*c = PqRecvBuffer[PqRecvPointer++];
return 1;
}
return r;
}
So "return r" branch is executed when both conditions are false: (PqRecvPointer < PqRecvLength)
and ((r = pq_recvbuf(true)) > 0)
Last condition cause assignment of "r" variable.
I wonder how did you get this "returned value is not initialized" warning?
Is it produced by some static analyze tool or compiler?
In any case, I will initialize "r" variable to make compiler happy.
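The fix is trivial; roughly (the same function as above, with only the initializer added):

int
pq_getbyte_if_available(unsigned char *c)
{
    int r = 0;      /* silence "may be used uninitialized" warnings */

    Assert(PqCommReadingMsg);

    if (PqRecvPointer < PqRecvLength || (r = pq_recvbuf(true)) > 0)
    {
        *c = PqRecvBuffer[PqRecvPointer++];
        return 1;
    }
    return r;
}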
Sorry, I fixed this mistyping several days ago in the Git repository
git@github.com:postgrespro/libpq_compression.git
but didn't attach a new version of the patch because I plan to make more changes as a result of Andres' review.
Thank you for noting it: fixed.
Attachment
It looks there was a different version of pq_getbyte_if_available on my side, my bad.
I’ve reapplied the patch and compiler is happy now, thanks!
—
Daniil Zakhlystov
> On Nov 1, 2020, at 3:01 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> <libpq-compression-22.patch>
It seems very important to be able to measure network traffic between client and server, especially when compression is used. Although there are a lot of tools for monitoring network traffic on Linux and other OSes, I didn't find one which can easily calculate traffic for particular backends. This is why I have added a pg_stat_network_traffic view which can be used to measure the efficiency of protocol message compression for different algorithms and workloads.

This is the network traffic of two backends, one with compression enabled and the other with compression disabled, after execution of a "select * from pg_class" command:

select * from pg_stat_network_traffic;
  pid  | rx_raw_bytes | tx_raw_bytes | rx_compressed_bytes | tx_compressed_bytes
-------+--------------+--------------+---------------------+---------------------
 22272 |            0 |            0 |                   0 |                   0
 22274 |            0 |            0 |                   0 |                   0
 22276 |           29 |        86327 |                  38 |               10656
 22282 |           73 |        86327 |                   0 |                   0
 22270 |            0 |            0 |                   0 |                   0
 22269 |            0 |            0 |                   0 |                   0
 22271 |            0 |            0 |                   0 |                   0
(7 rows)

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
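For context, counters of this kind are naturally maintained in the compressed read/write path; a rough sketch of the idea (the variable and helper names below are illustrative, not the patch's actual ones):

#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-backend counters behind pg_stat_network_traffic. */
static uint64_t rx_raw_bytes;           /* command bytes after decompression */
static uint64_t rx_compressed_bytes;    /* bytes actually read from the socket */
static uint64_t tx_raw_bytes;           /* response bytes before compression */
static uint64_t tx_compressed_bytes;    /* bytes actually written to the socket */

/* Called from the compressed write path. */
static void
account_tx(size_t raw_len, size_t wire_len)
{
    tx_raw_bytes += raw_len;
    tx_compressed_bytes += wire_len;
}

/* Called from the compressed read path. */
static void
account_rx(size_t wire_len, size_t raw_len)
{
    rx_compressed_bytes += wire_len;
    rx_raw_bytes += raw_len;
}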
On 02.11.2020 19:32, Daniil Zakhlystov wrote:
> Hi,
> Currently, zpq_stream contains a check only for the tx buffered data -
> zpq_buffered().
> I think that there should be the same functionality for rx buffered
> data. For example, the zpq_buffered_rx().
> zpq_buffered_rx() returns a value greater than zero if there is any
> data that was fetched by rx_func() but haven't been decompressed yet,
> in any other case zpq_buffered_rx() returns zero.
> In this case, I think that we may also rename the existing
> zpq_buffered() to zpq_buffered_tx() for clarity.
>
> --
> Daniil Zakhlystov

Please try the attached patch v24 which adds zpq_buffered_rx and zpq_buffered_tx functions.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
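Based on the description above, the contract of the two functions is roughly the following (an interface sketch, not the actual zpq_stream code):

/*
 * zpq_buffered_rx() - returns a value greater than zero if some data has
 *                     already been fetched by rx_func() but has not been
 *                     decompressed yet; zero otherwise.
 * zpq_buffered_tx() - the renamed zpq_buffered(): pending compressed data
 *                     that has not yet been handed to tx_func().
 */
typedef struct ZpqStream ZpqStream;

extern size_t zpq_buffered_rx(ZpqStream *zs);
extern size_t zpq_buffered_tx(ZpqStream *zs);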
On Mon, 2 Nov 2020 at 15:03, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
> It seems very important to be able to measure network traffic
> between client and server, especially when compression is used.
> Although there are a lot of tools for monitoring network traffic on
> Linux and other OSes, I didn't find one which can easily calculate
> traffic for particular backends.
> This is why I have added a pg_stat_network_traffic view which can be used
> to measure the efficiency of protocol message compression for different
> algorithms and workloads.

I agree that seems like a useful feature to have, even without the rest of the patch.

> This is the network traffic of two backends, one with compression enabled
> and the other with compression disabled, after execution of a
> "select * from pg_class" command:
>
> select * from pg_stat_network_traffic;
>   pid  | rx_raw_bytes | tx_raw_bytes | rx_compressed_bytes | tx_compressed_bytes
> -------+--------------+--------------+---------------------+---------------------
>  22276 |           29 |        86327 |                  38 |               10656
>  22282 |           73 |        86327 |                   0 |                   0

The current names and values of these columns are confusing me: What column contains the amount of bytes sent to/received from the client? Is the compression method of pid 22282 extremely efficient at compressing, or does it void the data (compresses down to 0 bytes)?

I suggest having columns that contain the bytes sent to/received from the client before and after compression. If no compression was used, those numbers are expected to be equal. Example names are `rx_raw_bytes` and `rx_data_bytes`, `rx_received_bytes` and `rx_bytes_uncompressed`. Another option would be initializing / setting rx_compressed_bytes and tx_compressed_bytes to -1 or NULL for connections that do not utilize compression, to flag that compression is not used.

-Matthias
On 02.11.2020 19:53, Matthias van de Meent wrote:
> The current names and values of these columns are confusing me:
> What column contains the amount of bytes sent to/received from the
> client? Is the compression method of pid 22282 extremely efficient at
> compressing, or does it void the data (compresses down to 0 bytes)?

Names of the columns can be changed if you or somebody else propose better alternatives.

This view pg_stat_network_traffic reports traffic from the server (backend) point of view, i.e. rx_bytes (received bytes) are commands sent from the client to the server and tx_bytes (transmitted bytes) are responses sent by the server to the client.

If compression is not used then rx_compressed_bytes = tx_compressed_bytes = 0. That seems more natural than assigning them the same values as the raw bytes, because it can really happen that for BLOBs with already compressed data (video, images or sound) the compressed data will be almost the same size as the raw data even if compression is enabled. So it seems important to distinguish the situation where data cannot be compressed from the one where it is not compressed at all.

> I suggest having columns that contain the bytes sent to/received
> from the client before and after compression. If no compression was
> used, those numbers are expected to be equal. Example names are
> `rx_raw_bytes` and `rx_data_bytes`, `rx_received_bytes` and
> `rx_bytes_uncompressed`. Another option would be initializing /
> setting rx_compressed_bytes and tx_compressed_bytes to -1 or NULL for
> connections that do not utilize compression, to flag that
> compression is not used.
>
> -Matthias
Hi,

On 2020-10-31 22:25:36 +0500, Andrey Borodin wrote:
> But the price of compression is 1 cpu for 500MB/s (zstd). With a
> 20Gbps network adapters cost of recompressing all traffic is at most
> ~4 cores.

It's not quite that simple, because presumably each connection is going to be handled by one core at a time in the pooler. So it's easy to slow down peak throughput if you also have to deal with TLS etc.

Greetings,

Andres Freund
On 03.11.2020 18:08, Daniil Zakhlystov wrote:
> Hi,
>
> Looks like I found an issue: there can be a situation when libpq
> frontend hangs after zpq_read(). This may happen when the socket is
> not read ready and we can't perform another read because we wait on
> pqWait() but we actually have some buffered rx data.
>
> I think we should add a check if there is any buffered rx data before
> calling pqWait() to avoid such hangs.
>
> --
> Daniil Zakhlystov

Hi,
thank you very much for detecting the problem and especially for providing a fix for it. I committed your pull request. A patch based on this PR is attached to this mail. The latest version of the libpq sources can also be found in the git repository:
git@github.com:postgrespro/libpq_compression.git

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
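The shape of that fix on the libpq side is roughly as follows (a simplified sketch; only the zpq_buffered_rx() guard is the point, and the zpqStream field name is illustrative):

/* Before blocking, first check whether the decompressor already holds data
 * that can be consumed without another read from the socket. */
if (conn->zpqStream == NULL || zpq_buffered_rx(conn->zpqStream) == 0)
{
    /* nothing buffered: it is safe to wait until the socket is readable */
    if (pqWait(true, false, conn))
        return -1;
}
/* otherwise skip pqWait() and go straight back to zpq_read()/pqReadData() */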
On Mon, 2 Nov 2020 at 20:20, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Names of the columns can be changed if you or somebody else propose
> better alternatives.

How about Xx_logical_bytes for the raw pg command stream data, and keeping Xx_compressed_bytes for the compressed data in/out?

> This view pg_stat_network_traffic reports traffic from the server (backend)
> point of view, i.e. rx_bytes (received bytes) are commands sent from the
> client to the server and tx_bytes (transmitted bytes) are responses sent
> by the server to the client.
>
> If compression is not used then rx_compressed_bytes = tx_compressed_bytes = 0.
> That seems more natural than assigning them the same values as the raw
> bytes, because it can really happen that for BLOBs with already compressed
> data (video, images or sound) the compressed data will be almost the same
> size as the raw data even if compression is enabled.
> So it seems important to distinguish the situation where data cannot be
> compressed from the one where it is not compressed at all.

Looking at it from that viewpoint, I agree. My primary reason for suggesting this was that it would be useful to expose how much data was transferred between the client and the server, which cannot be constructed from that view for compression-enabled connections. That is because the compression methods' counting only starts after some bytes have already been transferred, and the raw/logical counter starts deviating once compression is enabled.
On 05.11.2020 15:40, Matthias van de Meent wrote:
> How about Xx_logical_bytes for the raw pg command stream data, and
> keeping Xx_compressed_bytes for the compressed data in/out?

Frankly speaking, I do not like the word "logical" in this context. They are in any case physical bytes, received from the peer. Speaking about compression or encryption, "raw" is much more widely used for uncompressed or plain data.

>> This view pg_stat_network_traffic reports traffic from the server (backend)
>> point of view [...] So it seems important to distinguish the situation
>> where data cannot be compressed from the one where it is not compressed
>> at all.
>
> Looking at it from that viewpoint, I agree. My primary reason for
> suggesting this was that it would be useful to expose how much data was
> transferred between the client and the server, which cannot be
> constructed from that view for compression-enabled connections. That is
> because the compression methods' counting only starts after some bytes
> have already been transferred, and the raw/logical counter starts
> deviating once compression is enabled.

Sorry, I do not understand your point. This view reports network traffic from the server's side, but the client's traffic information is a "mirror" of this statistic: server_tx = client_rx and vice versa.

Yes, the first few bytes exchanged by client and server during the handshake are not compressed. But they are correctly counted as "raw bytes", and certainly these few bytes cannot have any influence on the measured average compression ratio (the main goal of using this network traffic statistic, from my point of view).
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Thu, 5 Nov 2020 at 17:01, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
> Sorry, I do not understand your point.
> This view reports network traffic from the server's side, but the client's
> traffic information is a "mirror" of this statistic: server_tx = client_rx
> and vice versa.
>
> Yes, the first few bytes exchanged by client and server during the handshake
> are not compressed. But they are correctly counted as "raw bytes", and
> certainly these few bytes cannot have any influence on the measured average
> compression ratio (the main goal of using this network traffic statistic,
> from my point of view).

As I understand it, the current metrics are as follows:

    Server
      |<-            |<- Xx_raw_bytes
      |  Compression
      |              |<- Xx_compressed_bytes
    Client connection
      |
    Network

From the view's name 'pg_stat_network_traffic', to me 'Xx_raw_bytes' would indicate the amount of bytes sent/received over the client connection (e.g. measured between the Client connection and Network part, or between the Server/Client connection and Compression/Client connection sections), because that is my natural understanding of 'raw tx network traffic'. This is why I proposed 'logical' instead of 'raw', as 'raw' is quite apparently understood differently when interpreted by different people, whereas 'logical' already implies that the value is an application logic-determined value (e.g. size before compression).

The current name implies a 'network' viewpoint when observing this view, not the 'server'/'backend' viewpoint you describe. If the 'server'/'backend' viewpoint is the desired default viewpoint, then I suggest to rename the view to `pg_stat_network_compression`, as that moves the focus to the compression used, and subsequently clarifies `raw` as the raw application command data.

If instead the name `pg_stat_network_traffic` is kept, I suggest changing the metrics collected to the following scheme:

    Server
      |<-            |<- Xx_logical_bytes
      |  Compression
      |              |<- Xx_compressed_bytes (?)
      |<-            |<- Xx_raw_bytes
    Client connection
      |
    Network

This way, `raw` in the context of 'network_traffic' means "sent-over-the-connection" data, and 'logical' is 'application logic' data (as I'd expect from both a network and an application point of view). 'Xx_compressed_bytes' is a nice addition, but not strictly necessary, as you can subtract raw from logical to derive the bytes saved by compression.
On 2020-11-02 20:50, Andres Freund wrote:
> On 2020-10-31 22:25:36 +0500, Andrey Borodin wrote:
>> But the price of compression is 1 cpu for 500MB/s (zstd). With a
>> 20Gbps network adapters cost of recompressing all traffic is at most
>> ~4 cores.
>
> It's not quite that simple, because presumably each connection is going
> to be handled by one core at a time in the pooler. So it's easy to slow
> down peak throughput if you also have to deal with TLS etc.

Also, current deployments of connection poolers use rather small machine sizes. Telling users you need 4 more cores per instance now to decompress and recompress all the traffic doesn't seem very attractive. Also, it's not unheard of to have more than one layer of connection pooling. With that, this whole design sounds a bit like a heat-generation system. ;-)
On 05.11.2020 21:07, Matthias van de Meent wrote:
> As I understand it, the current metrics are as follows:
>
>     Server
>       |<-            |<- Xx_raw_bytes
>       |  Compression
>       |              |<- Xx_compressed_bytes
>     Client connection
>       |
>     Network
>
> From the view's name 'pg_stat_network_traffic', to me 'Xx_raw_bytes'
> would indicate the amount of bytes sent/received over the client
> connection, because that is my natural understanding of 'raw tx network
> traffic'. This is why I proposed 'logical' instead of 'raw', as 'raw' is
> quite apparently understood differently when interpreted by different
> people, whereas 'logical' already implies that the value is an
> application logic-determined value (e.g. size before compression).
>
> The current name implies a 'network' viewpoint when observing this
> view, not the 'server'/'backend' viewpoint you describe. If the
> 'server'/'backend' viewpoint is the desired default viewpoint, then
> I suggest to rename the view to `pg_stat_network_compression`, as
> that moves the focus to the compression used, and subsequently
> clarifies `raw` as the raw application command data.
>
> If instead the name `pg_stat_network_traffic` is kept, I suggest
> changing the metrics collected to the following scheme:
>
>     Server
>       |<-            |<- Xx_logical_bytes
>       |  Compression
>       |              |<- Xx_compressed_bytes (?)
>       |<-            |<- Xx_raw_bytes
>     Client connection
>       |
>     Network
>
> This way, `raw` in the context of 'network_traffic' means
> "sent-over-the-connection" data, and 'logical' is 'application logic'
> data. 'Xx_compressed_bytes' is a nice addition, but not strictly
> necessary, as you can subtract raw from logical to derive the bytes
> saved by compression.

Sorry, but "raw" in this context means "not transformed", i.e. not compressed. I have not used the term "uncompressed" because it assumes that there are "compressed" bytes, which is not true if compression is not used. So "raw" bytes are not the bytes which we send through the network - quite the opposite: the application writes "raw" (uncompressed) data, it is compressed, and then the compressed bytes are sent.

Maybe I am wrong, but the term "logical" is much more confusing and overloaded than "raw", especially taking into account that it is widely used in Postgres for logical replication. The antonym of "logical" is "physical", i.e. something materialized. But in the case of data exchanged between client and server, which one should be called physical and which one logical? Have you ever heard about the logical size of a file (assuming it may contain holes or be compressed by the file system)? In ZFS it is called "apparent" size.

Also, I do not understand in your picture why Xx_compressed_bytes may differ from Xx_raw_bytes.
On 05.11.2020 22:22, Peter Eisentraut wrote:
> On 2020-11-02 20:50, Andres Freund wrote:
>> On 2020-10-31 22:25:36 +0500, Andrey Borodin wrote:
>>> But the price of compression is 1 cpu for 500MB/s (zstd). With a
>>> 20Gbps network adapters cost of recompressing all traffic is at most
>>> ~4 cores.
>>
>> It's not quite that simple, because presumably each connection is going
>> to be handled by one core at a time in the pooler. So it's easy to slow
>> down peak throughput if you also have to deal with TLS etc.
>
> Also, current deployments of connection poolers use rather small
> machine sizes. Telling users you need 4 more cores per instance now
> to decompress and recompress all the traffic doesn't seem very
> attractive. Also, it's not unheard of to have more than one layer of
> connection pooling. With that, this whole design sounds a bit like a
> heat-generation system. ;-)

Compression will be mostly useful for:
1. The replication protocol
2. The COPY command
3. OLAP queries returning large result sets
4. Queries returning BLOBs/JSON

It seems not such a good idea to switch on compression for all connections, and the cases described above usually affect only a small number of backends. Also please notice that compression may significantly reduce the size of transferred data, and copying data between multiple levels of the network protocol also consumes a significant amount of CPU. The number of system calls can also be proportional to the size of transferred data, and in many cases the performance of servers is now limited by the number of system calls rather than by network throughput/latency and other factors.

So I accept your arguments, but still think that the picture is not so straightforward. This is why it is very interesting to me to know the results of using compression with Odyssey in a real production environment.
> On 6 Nov 2020, at 00:22, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
>
> On 2020-11-02 20:50, Andres Freund wrote:
>> On 2020-10-31 22:25:36 +0500, Andrey Borodin wrote:
>>> But the price of compression is 1 cpu for 500MB/s (zstd). With a
>>> 20Gbps network adapters cost of recompressing all traffic is at most
>>> ~4 cores.
>>
>> It's not quite that simple, because presumably each connection is going
>> to be handled by one core at a time in the pooler. So it's easy to slow
>> down peak throughput if you also have to deal with TLS etc.
>
> Also, current deployments of connection poolers use rather small machine sizes. Telling users you need 4 more cores per instance now to decompress and recompress all the traffic doesn't seem very attractive. Also, it's not unheard of to have more than one layer of connection pooling. With that, this whole design sounds a bit like a heat-generation system. ;-)

The user should ensure good bandwidth between the pooler and the DB. At least they must be within one availability zone. This makes compression between the pooler and the DB unnecessary; cross-datacenter traffic is many times more expensive.

I agree that switching between compression levels (including turning it off) seems like a nice feature. But:
1. The scope of its usefulness is an order of magnitude smaller than compression of the whole connection.
2. The protocol for this feature is significantly more complicated.
3. Restarted compression is much less efficient and effective.

Can we design the protocol so that this feature may be implemented in the future, currently focusing on getting things compressed? Are there any drawbacks in this approach?

Best regards, Andrey Borodin.
** this is a plaintext version of the previous HTML-formatted message **

Hi,

I've run a couple of pgbench benchmarks using this patch with the odyssey connection pooler, with client-to-pooler ZSTD compression turned on.

pgbench --builtin tpcb-like -t 75 --jobs=32 --client=1000

CPU utilization chart of the configuration above:
https://storage.yandexcloud.net/usernamedt/odyssey-compression.png
CPU overhead on average was about 10%.

pgbench -i -s 1500

CPU utilization chart of the configuration above:
https://storage.yandexcloud.net/usernamedt/odyssey-compression-i-s.png
As you can see, there was not any noticeable difference in CPU utilization with ZSTD compression enabled or disabled.

Regarding replication, I've made a couple of fixes for this patch, you can find them in this pull request: https://github.com/postgrespro/libpq_compression/pull/3

With these fixes applied, I've run some tests using this patch with streaming physical replication on some large clusters. Here is the difference in network usage on the replica with ZSTD replication compression enabled compared to the replica without replication compression:
- on pgbench -i -s 1500 there was ~23x less network usage
- on pgbench -T 300 --jobs=32 --client=640 there was ~4.5x less network usage
- on pg_restore of the ~300 GB database there was ~5x less network usage

To sum up, I think that the current version of the patch (with per-connection compression) is OK from the protocol point of view, except for the compression initialization part. As discussed, we can either do the initialization before the startup packet or move the compression setting to a _pq_ parameter to avoid issues on older backends.

Regarding switchable on-the-fly compression: although it introduces more flexibility, it seems that it will significantly increase the implementation complexity of both the frontend and the backend. To support this approach in the future, maybe we should add something like a compression mode to the protocol and name the current approach "permanent", while reserving the "switchable" compression type for a future implementation?

Thanks,

Daniil Zakhlystov

06.11.2020, 11:58, "Andrey Borodin" <x4mmm@yandex-team.ru>:
>> On 6 Nov 2020, at 00:22, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
>>
>> On 2020-11-02 20:50, Andres Freund wrote:
>>> On 2020-10-31 22:25:36 +0500, Andrey Borodin wrote:
>>>> But the price of compression is 1 cpu for 500MB/s (zstd). With a
>>>> 20Gbps network adapters cost of recompressing all traffic is at most
>>>> ~4 cores.
>>> It's not quite that simple, because presumably each connection is going
>>> to be handled by one core at a time in the pooler. So it's easy to slow
>>> down peak throughput if you also have to deal with TLS etc.
>>
>> Also, current deployments of connection poolers use rather small machine sizes. Telling users you need 4 more cores per instance now to decompress and recompress all the traffic doesn't seem very attractive. Also, it's not unheard of to have more than one layer of connection pooling. With that, this whole design sounds a bit like a heat-generation system. ;-)
>
> User should ensure good bandwidth between pooler and DB. At least they must be within one availability zone. This makes compression between pooler and DB unnecessary.
> Cross-datacenter traffic is many times more expensive.
>
> I agree that switching between compression levels (including turning it off) seems like nice feature. But
> 1. Scope of its usefulness is an order of magnitude smaller than compression of the whole connection.
> 2. Protocol for this feature is significantly more complicated.
> 3. Restarted compression is much less efficient and effective.
>
> Can we design a protocol so that this feature may be implemented in future, currently focusing on getting things compressed? Are there any drawbacks in this approach?
>
> Best regards, Andrey Borodin.
Based on Andres' review I have implemented the following changes in libpq_compression:
1. Make it possible to specify a list of compression algorithms in the connection string.
2. Make it possible to specify the compression level.
3. Use "_pq_.compression" instead of "compression" in the startup packet.
4. Use full names instead of one-character encoding for compression algorithm names.

So now it is possible to open a connection in this way:

psql "dbname=postgres compression=zstd:5,zlib"

New version of the patch is attached.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
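For completeness, the same connection string can be used through libpq directly; a minimal client sketch (assuming a libpq built with this patch, so that the compression keyword is understood):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* Prefer zstd at level 5, fall back to zlib if the server lacks zstd. */
    PGconn *conn = PQconnectdb("dbname=postgres compression=zstd:5,zlib");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* ... run queries as usual; compression is transparent to the caller ... */
    PQfinish(conn);
    return 0;
}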
On Tue, Nov 24, 2020 at 7:33 AM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > New version of the patch is attached. I read over the comments from Andres (and Peter) suggesting that this ought to be on-the-fly configurable. Here are some thoughts on making that work with the wire protocol: If the client potentially wants to use compression at some point it should include _pq_.compression in the startup message. The value associated with _pq_.compression should be a comma-separated list of compression methods which the client understands. If the server responds with a NegotiateProtocolVersion message, then it either includes _pq_.compression (in which case the server does not support compression) or it does not (in which case the server does support compression). If no NegotiateProtocolVersion message is returned, then the server is from before November 2017 (ae65f6066dc3d19a55f4fdcd3b30003c5ad8dbed) and compression is not supported. If the client requests compression and the server supports it, it should return a new SupportedCompressionTypes message following NegotiateProtocolMessage response. That should be a list of compression methods which the server understands. At this point, the clent and the server each know what methods the other understands. Each should now feel free to select a compression method the other side understands, and to switch methods whenever desired, as long as they only select from methods the other side has said that they understand. The patch seems to think that the compression method has to be the same in both directions and that it can never change, but there's no real reason for that. Let each side start out uncompressed and then let it issue a new SetCompressionMethod protocol message to switch the compression method whenever it wants. After sending that message it begins using the new compression type. The other side doesn't have to agree. That way, you don't have to worry about synchronizing the two directions. Each side is just telling the other what is choosing to do, from among the options the other side said it could understand. It's an interesting question whether it's best to "wrap" the compressed messages in some way. For example, imagine that instead of just compressing the bytes and sending them out, you send a message of some new type CompressedMessage whose payload is another protocol message. Then a piece of middleware could decide to decompress each message just enough to see whether it wants to do anything with the message and if not just forward it. It could also choose to inject its own messages into the message stream which wouldn't necessarily need to be compressed, because the wrapper allows mixing of compressed and uncompressed messages. The big disadvantage of this approach is that in many cases it will be advantageous to compress consecutive protocol messages as a unit. For example, when the extend query protocol is in use, the client will commonly send P-B-D-E-S maybe even in one network packet. It will compress better if all of those messages are compressed as one rather than separately. That consideration argues for the approach the patch actually takes (though the documentation isn't really very clear about what the patch is actually doing here so I might be misunderstanding). However, note that when the client or server does a "flush" this requires "flushing" at the compression layer also, so that all pending data can be sent. zlib can certainly do that; I assume other algorithms can too, but I don't really know. 
If there are algorithms that don't have that built in, this approach might be an issue. Another thing to think about is whether it's really beneficial to compress stuff like P-B-D-E-S. I would guess that the benefits here come mostly from compressing DataRow and CopyData messages, and that compressing messages that don't contain much payload data may just be a waste of CPU cycles. On the other hand, it's not impossible for it to win: the query might be long, and query text is probably highly compressible. Designing something that is specific to certain message types is probably a bridge too far, but at least I think this is a strong argument that the compression method shouldn't have to be the same in both directions. -- Robert Haas EDB: http://www.enterprisedb.com
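For illustration only (this is not code from the patch): "flushing at the compression layer" with zlib means calling deflate() with Z_SYNC_FLUSH, which emits all pending output and ends on a byte boundary so the peer can decode everything sent so far. zstd has an analogous ZSTD_e_flush directive for ZSTD_compressStream2(). A minimal sketch, assuming deflateInit() has already been called on the stream and that the destination buffer is large enough:

#include <sys/types.h>
#include <zlib.h>

/*
 * Sketch: compress one chunk of protocol data and force a flush point.
 * A real implementation would loop while avail_out runs short.
 * Returns the number of compressed bytes written to "dst", or -1 on error.
 */
static ssize_t
compress_and_flush(z_stream *zs, const char *src, size_t srclen,
                   char *dst, size_t dstlen)
{
    zs->next_in = (Bytef *) src;
    zs->avail_in = (uInt) srclen;
    zs->next_out = (Bytef *) dst;
    zs->avail_out = (uInt) dstlen;

    /* Z_SYNC_FLUSH emits all pending data and ends on a byte boundary. */
    if (deflate(zs, Z_SYNC_FLUSH) == Z_STREAM_ERROR)
        return -1;

    return (ssize_t) (dstlen - zs->avail_out);
}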
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: tested, failed
Documentation: tested, failed

Submission review
--
Is the patch in a patch format which has context? (eg: context diff format)
NO, need to fix
Does it apply cleanly to the current git master?
YES
Does it include reasonable tests, necessary doc patches, etc?
have docs, missing tests

Usability review
--
At the moment, the patch supports per-connection (permanent) compression. The frontend can specify the desired compression algorithms and compression levels and then negotiate the compression algorithm that is going to be used with the backend. In its current state the patch is missing the ability to enable/disable the compression on the backend side, which I think might be not great from the usability side.

Regarding on-the-fly configurable compression and different compression algorithms for each direction - these two ideas are promising but tend to make the implementation more complex. However, the current implementation can be extended to support these approaches in the future. For example, we can specify switchable on-the-fly compression as a 'switchable' algorithm and negotiate it like a regular compression algorithm (like we currently negotiate 'zstd' and 'zlib'). The 'switchable' algorithm may then introduce new specific messages to the Postgres protocol to make the on-the-fly compression magic work.

The same applies to Robert's idea of different compression algorithms for different directions - we can introduce it later as a new compression algorithm with new specific protocol messages.

Does the patch actually implement that?
YES
Do we want that?
YES
Do we already have it?
NO
Does it follow SQL spec, or the community-agreed behavior?
To be discussed
Does it include pg_dump support (if applicable)?
not applicable
Are there dangers?
theoretically possible CRIME-like attack when using with SSL enabled
Have all the bases been covered?
To be discussed

Feature test
--
I've applied the patch, compiled, and tested it with configure options --enable-cassert and --enable-debug turned on. I've tested the following scenarios:

1. make check
=======================
All 201 tests passed.
=======================

2. make check-world
initially failed with:
============== running regression test queries ==============
test postgres_fdw ... FAILED 4465 ms
============== shutting down postmaster ==============
======================
1 of 1 tests failed.
======================
The differences that caused some tests to fail can be viewed in the file "/xxx/xxx/review/postgresql/contrib/postgres_fdw/regression.diffs". A copy of the test summary that you see above is saved in the file "/xxx/xxx/review/postgresql/contrib/postgres_fdw/regression.out".

All tests passed after replacing 'gsslib, target_session_attrs' with 'gsslib, compression, target_session_attrs' in line 8914 of postgresql/contrib/postgres_fdw/expected/postgres_fdw.out

3. simple psql utility usage
psql -d "host=xxx port=5432 dbname=xxx user=xxx compression=1"

4. pgbench tpcb-like w/ SSL turned ON
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=require compression=1" --builtin tpcb-like -t 70 --jobs=32 --client=700

5. pgbench tpcb-like w/ SSL turned OFF
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=disable compression=1" --builtin tpcb-like -t 70 --jobs=32 --client=700

6. pgbench initialization w/ SSL turned ON
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=require compression=1" -i -s 500

7. pgbench initialization w/ SSL turned OFF
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=disable compression=1" -i -s 500

8. Streaming physical replication. Recovery-related parameters:
recovery_target_timeline = 'latest'
primary_conninfo = 'host=xxx port=5432 user=repl application_name=xxx compression=1'
primary_slot_name = 'xxx'
restore_command = 'some command'

9. This compression has been implemented in an experimental build of the odyssey connection pooler and tested with a ~1500 synthetic simultaneous clients configuration and ~300 GB databases. During the testing, I've reported and fixed some of the issues.

Does the feature work as advertised?
YES
Are there corner cases the author has failed to consider?
NO
Are there any assertion failures or crashes?
NO

Performance review
--
Does the patch slow down simple tests?
NO
If it claims to improve performance, does it?
YES
Does it slow down other things?
Using compression may add a CPU overhead. This mostly depends on the compression algorithm and the chosen compression level. During testing with the ZSTD algorithm and compression level 1 there was about 10% of CPU overhead in read/write balanced scenarios and almost no overhead in mostly read scenarios.

Coding review
--
In protocol.sgml:
> It can be just boolean values enabling or disabling compression
> ("true"/"false", "on"/"off", "yes"/"no", "1"/"0"), "auto" or explicit list of compression algorithms
> separated by comma with optional specification of compression level: "zlib,zstd:5".
But in fe-protocol3.c:
> if (pg_strcasecmp(value, "true") == 0 ||
> pg_strcasecmp(value, "yes") == 0 ||
> pg_strcasecmp(value, "on") == 0 ||
> pg_strcasecmp(value, "any") == 0 ||
> pg_strcasecmp(value, "1") == 0)
> {
I believe there is some mismatch - in docs, there is an "auto" parameter, but in code "auto" is missing and "any" exists. Actually, I propose to remove both the "auto" and "any" parameters because they work the same way as "true/on/yes/1" but appear like something else.

In fe-protocol3.c:
> #define pq_read_conn(conn) \
> (conn->zstream \
> ? zpq_read(conn->zstream, conn->inBuffer + conn->inEnd, \
> conn->inBufSize - conn->inEnd) \
> : pqsecure_read(conn, conn->inBuffer + conn->inEnd, \
> conn->inBufSize - conn->inEnd))
I think there should be some comment regarding the read function choosing logic. Same for the zpq_write calls. Also, pq_read_conn is defined as a macro, but there is no macro for pq_write_conn.

In configure.ac:
> if test "$with_zstd" = yes; then
> AC_CHECK_LIB(zstd, ZSTD_decompressStream, [],
> [AC_MSG_ERROR([zstd library not found
> If you have zstd already installed, see config.log for details on the
> failure. It is possible the compiler isn't looking in the proper directory.
> Use --without-zstd to disable zstd support.])])
> fi
> if test "$with_zstd" = yes; then
> AC_CHECK_HEADER(zstd.h, [], [AC_MSG_ERROR([zstd header not found
> If you have zstd already installed, see config.log for details on the
> failure. It is possible the compiler isn't looking in the proper directory.
> Use --without-zstd to disable zstd support.])])
> fi
Looks like the rows with --without-zstd are incorrect.

In fe-connect.c:
> if (index == (char)-1)
> {
> appendPQExpBuffer(&conn->errorMessage,
> libpq_gettext(
> "server is not supported requested compression algorithms %s\n"),
> conn->compression);
> goto error_return;
> }
Right now this error might be displayed in two cases:
1. Backend supports compression, but it is somehow disabled/turned off
2. Backend supports compression, but does not support the requested algorithms
I think that it is a good idea to differentiate these two cases. Maybe define the following behavior somewhere in the docs:

"When connecting to an older backend, which does not support compression, or in the case when the backend supports compression but for some reason wants to disable it, the backend will just ignore the _pq_.compression parameter and won't send the compressionAck message to the frontend."

To sum up, I think that the current implementation already introduces good benefits. As I proposed in the Usability review, we may introduce the new approaches later as separate compression 'algorithms'.

Thanks,
Daniil Zakhlystov
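On the pq_write_conn point above, a write-side counterpart could presumably just mirror the quoted macro. The following is only a sketch, assuming that zpq_write() and pqsecure_write() take the same kinds of arguments as the read-side functions quoted in the review; it is not code from the patch:

/*
 * Hypothetical write-side twin of the patch's pq_read_conn: route outgoing
 * bytes through the compression stream when one is attached, otherwise
 * write them directly.
 */
#define pq_write_conn(conn, ptr, len) \
	((conn)->zstream \
	 ? zpq_write((conn)->zstream, (ptr), (len)) \
	 : pqsecure_write((conn), (ptr), (len)))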
On Tue, Nov 24, 2020 at 12:35 PM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
> To sum up, I think that the current implementation already introduces good benefits. As I proposed in the Usability review, we may introduce the new approaches later as separate compression 'algorithms'.

I don't think the current patch is so close to being committable that we shouldn't be considering what we really want to have here. It's one thing to say, well, this patch is basically done, let's not start redesigning it now. But that's not the case here. For example, I don't see any committer accepting the comments in zpq_stream.c as adequate, or the documentation, either. Some comments that have been made previously, like Andres's remark about the non-standard message construction in pq_configure(), have not been addressed, and I do not think any committer is going to agree with the idea that the novel method chosen by the patch is superior here, not least but not only because it seems like it's endian-dependent. That function also uses goto, which anybody thinking of committing this will surely try to get rid of, and I'm pretty sure the sscanf() isn't good enough to reject trailing garbage, and the error message that follows is improperly capitalized. I'm sure there's other stuff, too: this is just based on a quick look.

Before we start worrying about any of that stuff in too much detail, I think it makes a lot of sense to step back and consider the design. Honestly, the work of changing the design might be smaller than the amount of cleanup the patch needs. But even if it's larger, it's probably not vastly larger. And in any case, I quite disagree with the idea that we should commit to a user-visible interface that exposes a subset of the functionality that we needed and then try to glue the rest of the functionality on top of it later. If we make a libpq connection option called compression that controls the type of compression that is used in both directions, then how exactly would we extend that later to allow for different compression in the two directions? Some syntax like compression=zlib/none, where the value before the slash controls one direction and the value after the slash controls the other? Maybe. But on the other hand, maybe it's better to have separate connection options for client compression and server compression. Or, maybe the kind of compression used by the server should be controlled via a GUC rather than a connection option. Or, maybe none of that is right and we should stick with the approach the patch currently takes. But it's not like we can do something for v1 and then just change things randomly later: there will be backward-compatibility to worry about. So the time to talk about the general approach here is now, before anything gets committed, before the project has committed itself to any particular design. If we decide in that discussion that certain things can be left for the future, that's fine. If we've discussed how they could be added without breaking backward compatibility, even better. But we can't just skip over having that discussion.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
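To make the user-visible alternatives being weighed above concrete, here are purely illustrative connection strings; none of these option spellings exist in libpq today, they only show the kinds of interfaces under discussion:

#include <libpq-fe.h>

int
main(void)
{
    /* one option controlling both directions (roughly the current patch) */
    PGconn *a = PQconnectdb("host=example dbname=test compression=zlib");

    /* hypothetical per-direction syntax packed into one option */
    PGconn *b = PQconnectdb("host=example dbname=test compression=zlib/none");

    /* hypothetical separate options for each direction */
    PGconn *c = PQconnectdb("host=example dbname=test "
                            "client_compression=zstd server_compression=none");

    PQfinish(a);
    PQfinish(b);
    PQfinish(c);
    return 0;
}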
I completely agree that backward-compatibility is important here.

I think that it is a good idea to clarify how the compression establishment works in the current version of the patch:

1. The frontend sends the startup packet, which may look like this:
_pq_.compression = 'zlib,zstd' (I omitted the variations with compression levels for clarity)

Then, on the backend, there are two possible cases:
2.1 If the backend is too old and doesn't know anything about the compression, or if the compression is disabled on the backend, it just ignores the compression parameter.
2.2 In the other case, the backend intersects the client compression methods with its own supported ones and responds with a compressionAck message which contains the index of the chosen compression method (or '-1' if it doesn't support any of the methods provided).

If the frontend receives the compressionAck message, there are also two cases:
3.1 If compressionAck contains '-1', do not initiate compression.
3.2 In the other case, initialize the chosen compression method immediately.

My idea is that we can add new compression approaches in the future and initialize them differently on step 3.2.

For example, in the case of switchable compression:

1. The client sends a startup packet with _pq_.compression = 'switchable,zlib,zstd' - it means that the client wants switchable compression or permanent zlib/zstd compression.

Again, two main cases on the backend:
2.1 The backend doesn't know about any compression or compression is turned off => ignore the _pq_.compression.
2.2.1 If the backend doesn't have switchable compression implemented, it won't have 'switchable' in its supported methods. So it will simply discard this method in the process of intersecting the client and backend compression methods and respond with some compressionAck message - choose permanent zlib, zstd, or nothing (-1).
2.2.2 If the backend supports switchable on-the-fly compression, it will have 'switchable' in its supported methods, so it may choose 'switchable' in its compressionAck response.

After that, on the frontend side:
3.1 If compressionAck contains '-1', do not initiate compression.
3.2.1 If compressionAck has 'zstd' or 'zlib' as the chosen compression method, init permanent streaming compression immediately.
3.2.2 If compressionAck has 'switchable' as the chosen compression method, init the switchable compression. Initialization may involve sending some additional messages to the backend to negotiate details like the supported switchable on-the-fly compression methods or any other details.

The same applies to compression with different algorithms in each direction. We can call it, for example, 'direction-specific' and init it differently on step 3.2. The key is that we don't even have to decide the exact initialization protocol for 'switchable' and 'direction-specific'. It may be added in the future.

Basically, this is what I've meant in my previous message about the future expansion of the current design; I hope that I managed to clarify it.

Thanks,

Daniil Zakhlystov
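To restate step 3 of the flow Daniil describes above in code form, here is a rough sketch of the frontend's handling of compressionAck. The structure, the zpq_create() signature, and the field names are all assumptions made up for illustration; they are not the patch's actual identifiers:

typedef struct
{
    const char *algorithm;      /* e.g. "zstd" or "zlib" */
    int         level;
} OfferedMethod;

typedef struct
{
    OfferedMethod offered[8];   /* what went into _pq_.compression */
    int           n_offered;
    void         *zstream;      /* active compression stream, NULL if none */
} CompressionNego;

/* assumed signature for the stream constructor */
extern void *zpq_create(const char *algorithm, int level);

static int
handle_compression_ack(CompressionNego *nego, int chosen)
{
    if (chosen == -1)
        return 0;               /* 3.1: server declined, stay uncompressed */

    if (chosen < 0 || chosen >= nego->n_offered)
        return -1;              /* server picked an index we never offered */

    /* 3.2: initialize the negotiated streaming compression immediately */
    nego->zstream = zpq_create(nego->offered[chosen].algorithm,
                               nego->offered[chosen].level);
    return (nego->zstream != NULL) ? 0 : -1;
}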
On 24.11.2020 21:35, Robert Haas wrote:
On Tue, Nov 24, 2020 at 12:35 PM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:To sum up, I think that the current implementation already introduces good benefits. As I proposed in the Usability review, we may introduce the new approaches later as separate compression 'algorithms'.I don't think the current patch is so close to being committable that we shouldn't be considering what we really want to have here. It's one thing to say, well, this patch is basically done, let's not start redesigning it now. But that's not the case here. For example, I don't see any committer accepting the comments in zpq_stream.c as adequate, or the documentation, either. Some comments that have been made previously, like Andres's remark about the non-standard message construction in pq_configure(), have not been addressed, and I do not think any committer is going to agree with the idea that the novel method chosen by the patch is superior here, not least but not only because it seems like it's endian-dependent. That function also uses goto, which anybody thinking of committing this will surely try to get rid of, and I'm pretty sure the sscanf() isn't good enough to reject trailing garbage, and the error message that follows is improperly capitalized. I'm sure there's other stuff, too: this is just based on a quick look. Before we start worrying about any of that stuff in too much detail, I think it makes a lot of sense to step back and consider the design. Honestly, the work of changing the design might be smaller than the amount of cleanup the patch needs. But even if it's larger, it's probably not vastly larger. And in any case, I quite disagree with the idea that we should commit to a user-visible interface that exposes a subset of the functionality that we needed and then try to glue the rest of the functionality on top of it later. If we make a libpq connection option called compression that controls the type of compression that is used in both direction, then how exactly would we extend that later to allow for different compression in the two directions? Some syntax like compression=zlib/none, where the value before the slash controls one direction and the value after the slash controls the other? Maybe. But on the other hand, maybe it's better to have separate connection options for client compression and server compression. Or, maybe the kind of compression used by the server should be controlled via a GUC rather than a connection option. Or, maybe none of that is right and we should stick with the approach the patch currently takes. But it's not like we can do something for v1 and then just change things randomly later: there will be backward-compatibility to worry about. So the time to talk about the general approach here is now, before anything gets committed, before the project has committed itself to any particular design. If we decide in that discussion that certain things can be left for the future, that's fine. If we've have discussed how they could be added without breaking backward compatibility, even better. But we can't just skip over having that discussion. -- Robert Haas EDB: http://www.enterprisedb.com
First of all, thank you for the review.
I completely agree with you that this patch is not ready for committing -
at least the documentation, written in my bad English, has to be checked and fixed. Also, I have not tested it much on Windows
and other non-Unix systems.
I do not want to discuss small technical things like sending the compression message
in pq_configure. I have already answered Andres about why I cannot use the standard functions in this case:
this function is called during initialization of the connection, so the connection handle is not ready yet.
But this can definitely be changed (although it is not endian-dependent: libpq messages use big-endian byte order).
Also, it seems strange to me to discuss the presence of "goto" in the code: we are not "puritans", are we? ;)
Yes, I also try to avoid gotos as much as possible, since they can make code less readable.
But completely prohibiting gotos and replacing them with artificial loops and redundant checks seems to be
the kind of radical approach which rarely leads to something good. By the way, there are about three thousand gotos in the Postgres code
(maybe some of them are in generated code - I have not checked :)
So let's discuss more fundamental things, like your suggestion to completely redesign the compression support.
I am not against such a discussion, although I personally do not think that there are a lot of topics for discussion here.
I definitely do not want to say that my implementation is perfect and cannot be reimplemented in a better way. Certainly it can.
But compression itself, especially compression of protocol messages, is not a novel area. It has been implemented many times in many
different systems (recently it was added to MySQL), and it is hard to find some "afflatus" here.
We can suggest many different things which can make compression more complex and flexible:
- be able to switch it on/off on the fly
- use different compression algorithms in different directions
- be able to change compression algorithm and compression level on the fly
- dynamically choose the best compression algorithm based on data stream content
...
I am sure that after thinking for a few moments any of us can add several more items to this list.
If we try to develop a protocol which will be able to support all of them,
then most likely it will be too complicated, too expensive and inefficient.
There is a general rule: the more flexible something is, the less efficient it is...
Before suggesting any of these advanced features, it is necessary to have strong arguments for why it is actually needed
and which problem it is trying to fix.
Right now, the status quo is that we have a few well-known compression algorithms - zlib, zstd, lz4, ... - whose characteristics are not so
different. For some of them it is possible to say that algorithm A is almost always better than algorithm B; for example,
modern zstd is almost always better than old zlib. But zlib is available almost everywhere and is already supported by default by Postgres.
Some algorithms are faster, some provide a better compression ratio. But for small enough protocol messages, speed is much more
important than compression quality.
Sorry for my perhaps too "chaotic" arguments. I really do not think that there is much sense in supporting
a lot of different compression algorithms.
Both zstd and lz4 provide comparable speed and compression ratios, and zlib can be used as a universal solution.
And any of them will definitely be much better than the lack of any compression.
What really worries me is that there are several places in Postgres which require compression, and all of them are implemented
in their own way: TOAST compression, WAL compression, backup compression, ... zedheap compression.
There is now a custom compression methods patch in the commitfest, which proposes ... a very flexible way of fixing just one of
these cases. And it is completely orthogonal to the proposed libpq compression.
My main concern is this awful code duplication and the fact that Postgres is still using its own pglz implementation,
which is much worse (according to zedstore benchmarks and my own measurements) than lz4 and other popular algorithms.
And I do not believe that this situation will change in PG14 or PG15.
So why do we need to be able to switch compression on/off?
High CPU overhead? Daniil's measurements don't prove it... And compression should not be used for all connections anyway.
There are some well-known kinds of connections for which it will be useful: first of all, the replication protocol.
If a client uploads data to the database using the COPY command or performs some complex OLAP queries returning huge results,
then compression may also be useful.
I cannot imagine a scenario in which it is necessary to be able to switch compression on/off on the fly.
And it is not clear who would control it (the user, the libpq library, the JDBC driver, ...?) and how.
The possibility of using different algorithms in different directions seems even less sensible,
especially if we do not use data-specific compression algorithms. And using such algorithms is very
doubtful, taking into account that data between client and server is now mostly transferred in text format
(OK, in the case of replication it is not true).
Dynamically switching the compression algorithm depending on the content of the data stream is also an interesting
but practically useless idea, because modern compression algorithms already do their best to adapt to
the input data.
Finally, I just want to note once again that from my point of view compression is not an area where we need to invent
something new. It is enough to provide functionality similar to what is available in most other communication protocols.
IMHO there is no sense in allowing the user to specify a particular algorithm and compression level.
It should be like SSL: you can switch compression on or off, but you cannot specify the algorithm or compression level,
let alone somehow dynamically control compression on the fly.
Unfortunately, your proposal to throw away this patch and start the design discussion from scratch most likely means
that we will not have compression in the observable future. And it is a pity, because I see many real use cases
where compression is actually needed and where Oracle beats Postgres several times over just because it sends less data
over the network.
On 24.11.2020 20:34, Daniil Zakhlystov wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, failed > Implements feature: tested, passed > Spec compliant: tested, failed > Documentation: tested, failed > > Submission review > -- > Is the patch in a patch format which has context? (eg: context diff format) > NO, need to fix > Does it apply cleanly to the current git master? > YES > Does it include reasonable tests, necessary doc patches, etc? > have docs, missing tests > > Usability review > -- > At the moment, the patch supports per-connection (permanent) compression. The frontend can specify the desired compressionalgorithms and compression levels and then negotiate the compression algorithm that is going to be used with thebackend. In current state patch is missing the ability to enable/disable the compression on the backend side, I thinkit might be not great from the usability side. > > Regarding on-the-fly configurable compression and different compression algorithms for each direction - these two ideasare promising but tend to make the implementation more complex. However, the current implementation can be extendedto support these approaches in the future. For example, we can specify switchable on-the-fly compression as ‘switchable’algorithm and negotiate it like the regular compression algorithm (like we currently negotiate ‘zstd’ and ‘zlib’).‘switchable’ algorithm may then introduce new specific messages to Postgres protocol to make the on-the-fly compressionmagic work. > > The same applies to Robert’s idea of the different compression algorithms for different directions - we can introduce itlater as a new compression algorithm with new specific protocol messages. > > Does the patch actually implement that? > YES > > Do we want that? > YES > > Do we already have it? > NO > > Does it follow SQL spec, or the community-agreed behavior? > To be discussed > > Does it include pg_dump support (if applicable)? > not applicable > > Are there dangers? > theoretically possible CRIME-like attack when using with SSL enabled > > Have all the bases been covered? > To be discussed > > > Feature test > -- > > I’ve applied the patch, compiled, and tested it with configure options --enable-cassert and --enable-debug turned on. I’vetested the following scenarios: > > 1. make check > ======================= > All 201 tests passed. > ======================= > > 2. make check-world > initially failed with: > ============== running regression test queries ============== > test postgres_fdw ... FAILED 4465 ms > ============== shutting down postmaster ============== > ====================== > 1 of 1 tests failed. > ====================== > The differences that caused some tests to fail can be viewed in the > file "/xxx/xxx/review/postgresql/contrib/postgres_fdw/regression.diffs". A copy of the test summary that you see aboveis saved in the file "/xxx/xxx/review/postgresql/contrib/postgres_fdw/regression.out". > > All tests passed after replacing ‘gsslib, target_session_attrs’ with ‘gsslib, compression, target_session_attrs’ in line8914 of postgresql/contrib/postgres_fdw/expected/postgres_fdw.out > > 3. simple psql utility usage > psql -d "host=xxx port=5432 dbname=xxx user=xxx compression=1" > > 4. pgbench tpcb-like w/ SSL turned ON > pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=require compression=1" --builtin tpcb-like -t 70 --jobs=32 --client=700 > > 5. 
pgbench tpcb-like w/ SSL turned OFF > pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=disable compression=1" --builtin tpcb-like -t 70 --jobs=32 --client=700 > > 6. pgbench initialization w/ SSL turned ON > pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=require compression=1" -i -s 500 > > 7. pgbench initialization w/ SSL turned OFF > pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=disable compression=1" -i -s 500 > > 8. Streaming physical replication. Recovery-related parameters > recovery_target_timeline = 'latest' > primary_conninfo = 'host=xxx port=5432 user=repl application_name=xxx compression=1' > primary_slot_name = 'xxx' > restore_command = 'some command' > > 9. This compression has been implemented in an experimental build of odyssey connection pooler and tested with ~1500 syntheticsimultaneous clients configuration and ~300 GB databases. > During the testing, I’ve reported and fixed some of the issues. > > Does the feature work as advertised? > YES > Are there corner cases the author has failed to consider? > NO > Are there any assertion failures or crashes? > NO > > > Performance review > -- > > Does the patch slow down simple tests? > NO > > If it claims to improve performance, does it? > YES > > Does it slow down other things? > Using compression may add a CPU overhead. This mostly depends on compression algorithm and chosen compression level. Duringtesting with ZSTD algorithm and compression level 1 there was about 10% of CPU overhead in read/write balanced scenariosand almost no overhead in mostly read scenarios. > > > Coding review > -- > > In protocol.sgml: >> It can be just boolean values enabling or disabling compression >> ("true"/"false", "on"/"off", "yes"/"no", "1"/"0"), "auto" or explicit list of compression algorithms >> separated by comma with optional specification of compression level: "zlib,zstd:5". > But in fe-protocol3.c: >> if (pg_strcasecmp(value, "true") == 0 || >> pg_strcasecmp(value, "yes") == 0 || >> pg_strcasecmp(value, "on") == 0 || >> pg_strcasecmp(value, "any") == 0 || >> pg_strcasecmp(value, "1") == 0) >> { > I believe there is some mismatch - in docs, there is an “auto” parameter, but in code “auto” is missing, but “any” exists.Actually, I propose to remove both “auto” and “any” parameters because they work the same way as “true/on/yes/1” butappear like something else. > > > In fe-protocol3.c: >> #define pq_read_conn(conn) \ >> (conn->zstream \ >> ? zpq_read(conn->zstream, conn->inBuffer + conn->inEnd, \ >> conn->inBufSize - conn->inEnd) \ >> : pqsecure_read(conn, conn->inBuffer + conn->inEnd, \ >> conn->inBufSize - conn->inEnd)) > I think there should be some comment regarding the read function choosing logic. Same for zpq_write calls. Also, pq_read_connis defined as a macros, but there is no macros for pq_write_conn. > > In configure.ac: >> if test "$with_zstd" = yes; then >> AC_CHECK_LIB(zstd, ZSTD_decompressStream, [], >> [AC_MSG_ERROR([zstd library not found >> If you have zstd already installed, see config.log for details on the >> failure. It is possible the compiler isn't looking in the proper directory. >> Use --without-zstd to disable zstd support.])]) >> fi >> if test "$with_zstd" = yes; then >> AC_CHECK_HEADER(zstd.h, [], [AC_MSG_ERROR([zstd header not found >> If you have zstd already installed, see config.log for details on the >> failure. It is possible the compiler isn't looking in the proper directory. 
>> Use --without-zstd to disable zstd support.])]) >> fi > Looks like the rows with --without-zstd are incorrect. > > In fe-connect.c: >> if (index == (char)-1) >> { >> appendPQExpBuffer(&conn->errorMessage, >> libpq_gettext( >> "server is not supported requested compression algorithms %s\n"), >> conn->compression); >> goto error_return; >> } > Right now this error might be displayed in two cases: > Backend support compression, but it is somehow disabled/turned off > Backend support compression, but does not support requested algorithms > > I think that it is a good idea to differentiate these two cases. Maybe define the following behavior somewhere in docs: > > “When connecting to an older backend, which does not support compression, or in the case when the backend support compression but for some reason wants to disable it, the backend will just ignore the _pq_.compression parameter and won’t send the compressionAck message to the frontend.” > > > To sum up, I think that the current implementation already introduces good benefits. As I proposed in the Usability review, we may introduce the new approaches later as separate compression 'algorithms'. > > Thanks, > Daniil Zakhlystov Thank you for the review. A new version of the patch addressing the reported issues is attached. I added a libpq_compression GUC to be able to prohibit compression on the server side if for some reason (security, high CPU load, ...) it is not desired.
Attachment
Hi,

> On Nov 24, 2020, at 11:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> So the time to talk about the general approach here is now, before anything gets committed, before the project has committed itself to any particular design. If we decide in that discussion that certain things can be left for the future, that's fine. If we've discussed how they could be added without breaking backward compatibility, even better. But we can't just skip over having that discussion.
>
> If the client requests compression and the server supports it, it should return a new SupportedCompressionTypes message following the NegotiateProtocolVersion response. That should be a list of compression methods which the server understands. At this point, the client and the server each know what methods the other understands. Each should now feel free to select a compression method the other side understands, and to switch methods whenever desired, as long as they only select from methods the other side has said that they understand. The patch seems to think that the compression method has to be the same in both directions and that it can never change, but there's no real reason for that. Let each side start out uncompressed and then let it issue a new SetCompressionMethod protocol message to switch the compression method whenever it wants. After sending that message it begins using the new compression type. The other side doesn't have to agree. That way, you don't have to worry about synchronizing the two directions. Each side is just telling the other what it is choosing to do, from among the options the other side said it could understand.

I've read your suggestions about switchable on-the-fly compression, independent for each direction. While the proposed protocol seems straightforward, the ability to switch the compression mode at an arbitrary moment significantly complicates the implementation, which may lead to lower adoption of this really useful feature in custom frontends/backends.

However, I don't mean by this that we shouldn't support switchable compression. I propose that we can offer two compression modes: permanent (which is implemented in the current state of the patch) and switchable on-the-fly. Permanent compression allows us to deliver a robust solution that is already present in some databases. Switchable compression allows us to support more complex scenarios in cases when the frontend and backend really need it and can afford the development effort to implement it.

I've made a draft of the protocol that may cover both of these compression modes; the following protocol also supports independent frontend and backend compression.

In the StartupPacket _pq_.compression parameter, the frontend will specify:

1. Supported compression modes in the order of preference.
For example: "permanent, switchable" means that the frontend supports both permanent and switchable modes and prefers to use the permanent mode.

2. List of the compression algorithms which the frontend is able to decompress, in the order of preference.
For example: "zlib:1,3,5;zstd:7,8;uncompressed" means that the frontend is able to:
- decompress zlib with 1, 3, or 5 compression levels
- decompress zstd with 7 or 8 compression levels
- "uncompressed" at the end means that the frontend agrees to receive uncompressed messages. If there is no "uncompressed" compression algorithm specified, it means that compression is required.
After receiving the StartupPacket message from the frontend, the backend will either ignore the _pq_.compression as an unknown parameter (if the backend is from before November 2017) or respond with a CompressionAck message which will include:

1. Index of the chosen compression mode, or -1 if it doesn't support any of the compression modes sent by the frontend.
In the case of the startup packet from the previous example: it may be '0' if the server chose permanent mode, '1' if switchable, or '-1' if the server doesn't support any of these.

2. List of the compression algorithms which the backend is able to decompress, in the order of preference.
For example, "zstd:2,4;uncompressed;zlib:7" means that the backend is able to:
- decompress zstd with 2 and 4 compression levels
- work in uncompressed mode
- decompress zlib with compression level 7

After sending the CompressionAck message, the backend will also send a SetCompressionMethod message with one of the following:
- Index of the chosen backend compression algorithm, followed by the index of the chosen compression level. In this case, the frontend should now use the chosen decompressor for incoming messages, and the backend should use the chosen compressor for outgoing messages.
- '-1', if the backend doesn't support compression using any of the algorithms sent by the frontend. In this case, the frontend must terminate the connection after receiving this message.

After receiving the SetCompressionMethod message from the backend, the frontend should also reply with a SetCompressionMethod message with one of the following:
- Index of the chosen frontend compression algorithm, followed by the index of the chosen compression level. In this case, the backend should now use the chosen decompressor for incoming messages, and the frontend should use the chosen compressor for outgoing messages.
- '-1', if the frontend doesn't support compression using any of the algorithms sent by the backend. In this case, the frontend should terminate the connection after sending this message.

After that sequence of messages, the frontend and backend may continue the usual conversation. In the case of permanent compression mode, further use of SetCompressionMethod is prohibited on both the frontend and backend sides.

Supported compression and decompression methods are configured using GUC parameters:

compress_algorithms = '...' // default value is 'uncompressed'
decompress_algorithms = '...' // default value is 'uncompressed'

Please let me know if I was unclear somewhere in the protocol description so I can clarify the things that I might have missed. I would appreciate hearing your opinion on the proposed protocol.

Thanks,
Daniil Zakhlystov
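For what it's worth, the list format proposed above is easy to take apart. Here is a small, self-contained sketch (not patch code) that splits a value like "zlib:1,3,5;zstd:7,8;uncompressed" into algorithms and their optional level lists:

#include <stdio.h>
#include <string.h>

static void
parse_compression_list(char *value)
{
    char   *saveptr;
    char   *alg;

    /* algorithms are separated by ';', levels follow an optional ':' */
    for (alg = strtok_r(value, ";", &saveptr); alg != NULL;
         alg = strtok_r(NULL, ";", &saveptr))
    {
        char   *levels = strchr(alg, ':');

        if (levels != NULL)
            *levels++ = '\0';
        printf("algorithm=%s levels=%s\n", alg, levels ? levels : "(default)");
    }
}

int
main(void)
{
    char value[] = "zlib:1,3,5;zstd:7,8;uncompressed";

    parse_compression_list(value);
    return 0;
}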
On 24.11.2020 22:47, Daniil Zakhlystov wrote: > I completely agree that backward-compatibility is important here. > > I think that it is a good idea to clarify how the compression establishment works in the current version of the patch: > > 1. Frontend send the startup packet which may look like this: > > _pq_.compression = 'zlib,zstd' (I omitted the variations with compression levels for clarity) > > Then, on the backend, there are two possible cases: > 2.1 If the backend is too old and doesn't know anything about the compression or if the compression is disabled on thebackend, it just ignores the compression parameter > 2.2 In the other case, the backend intersects the client compression method with its own supported ones and responds withcompressionAck message which contains the index of the chosen compression method (or '-1' if it doesn't support any ofthe methods provided). > > If the frontend receives the compressionAck message, there is also two cases: > 3.1 If compressionAck contains '-1', do not initiate compression > 3.2 In the other case, initialize the chosen compression method immediately. > > My idea is that we can add new compression approaches in the future and initialize them differently on step 3.2. > > For example, in the case of switchable compression: > > 1. Client sends a startup packet with _pq_.compression = 'switchable,zlib,zstd' - it means that client wants switchablecompression or permanent zlib/zstd compression. > > Again, two main cases on the backend: > 2.1 Backend doesn't know about any compression or compression turned off => ignore the _pq_.compression > > 2.2.1 If the backend doesn't have switchable compression implemented, it won't have 'switchable' in his supported methods.So it will simply discard this method in the process of the intersection of the client and frontend compression methodsand respond with some compressionAck message - choose permanent zlib, zstd, or nothing (-1). > > 2.2.2 If the backend supports switchable on the fly compression, it will have 'switchable' in his supported methods soit may choose 'switchable' in his compressionAck response. > > After that, on the frontend side: > 3.1 If compressionAck contains '-1', do not initiate compression > > 3.2.1 If compressionAck has 'zstd' or 'zlib' as the chosen compression method, init permanent streaming compression immediately. > > 3.2.2 If compressionAck has 'switchable' as the chosen compression method, init the switchable compression. Initializationmay involve sending some additional messages to the backend to negotiate the details like the supported switchableon the fly compression methods or any other details. > > The same applies to the compression with the different algorithms in each direction. We can call it, for example, 'directional-specific'and init differently on step 3.2. The key is that we don't even have to decide the exact initializationprotocol for 'switchable' and 'direction-specific'. It may be added in the future. > > Basically, this is what I’ve meant in my previous message about the future expansion of the current design, I hope thatI managed to clarify it. > > Thanks, > > Daniil Zakhlystov > >> On Nov 24, 2020, at 11:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Tue, Nov 24, 2020 at 12:35 PM Daniil Zakhlystov >> <usernamedt@yandex-team.ru> wrote: >>> To sum up, I think that the current implementation already introduces good benefits. As I proposed in the Usability review,we may introduce the new approaches later as separate compression 'algorithms'. 
>> I don't think the current patch is so close to being committable that >> we shouldn't be considering what we really want to have here. It's one >> thing to say, well, this patch is basically done, let's not start >> redesigning it now. But that's not the case here. For example, I don't >> see any committer accepting the comments in zpq_stream.c as adequate, >> or the documentation, either. Some comments that have been made >> previously, like Andres's remark about the non-standard message >> construction in pq_configure(), have not been addressed, and I do not >> think any committer is going to agree with the idea that the novel >> method chosen by the patch is superior here, not least but not only >> because it seems like it's endian-dependent. That function also uses >> goto, which anybody thinking of committing this will surely try to get >> rid of, and I'm pretty sure the sscanf() isn't good enough to reject >> trailing garbage, and the error message that follows is improperly >> capitalized. I'm sure there's other stuff, too: this is just based on >> a quick look. >> >> Before we start worrying about any of that stuff in too much detail, I >> think it makes a lot of sense to step back and consider the design. >> Honestly, the work of changing the design might be smaller than the >> amount of cleanup the patch needs. But even if it's larger, it's >> probably not vastly larger. And in any case, I quite disagree with the >> idea that we should commit to a user-visible interface that exposes a >> subset of the functionality that we needed and then try to glue the >> rest of the functionality on top of it later. If we make a libpq >> connection option called compression that controls the type of >> compression that is used in both direction, then how exactly would we >> extend that later to allow for different compression in the two >> directions? Some syntax like compression=zlib/none, where the value >> before the slash controls one direction and the value after the slash >> controls the other? Maybe. But on the other hand, maybe it's better to >> have separate connection options for client compression and server >> compression. Or, maybe the kind of compression used by the server >> should be controlled via a GUC rather than a connection option. Or, >> maybe none of that is right and we should stick with the approach the >> patch currently takes. But it's not like we can do something for v1 >> and then just change things randomly later: there will be >> backward-compatibility to worry about. So the time to talk about the >> general approach here is now, before anything gets committed, before >> the project has committed itself to any particular design. If we >> decide in that discussion that certain things can be left for the >> future, that's fine. If we've have discussed how they could be added >> without breaking backward compatibility, even better. But we can't >> just skip over having that discussion. >> >> -- >> Robert Haas >> EDB: http://www.enterprisedb.com I attach new version of the patch with one more bug fix and minimal support of future protocol extension. Now if you define environment PGCOMPRESSION='zstd' (or zlib) then it is possible to pass all regression tests with enabled compression. I wonder if we are going to support protocol compression in PG14? If so, I will try to fix all reported issues with code style and include in handshake protocol some general mechanism which allow in future to support some advanced features. 
Right now the client sends the server a list of the supported compression algorithms (as a comma-separated list with an optional compression level, i.e. "zstd:10,zlib") and the server replies with the index of the compression algorithm used from this list (or -1 if compression is disabled).
The minimal change in the handshake protocol, from my point of view, is to add to the client message an arbitrary extension part which is not currently interpreted by the server: "zstd:10,zlib;arbitrary text"
So all text after ';' will currently be ignored by the backend but may be interpreted in the future.
When an old client connects to a new server (the extended part is missing), the server responds with the algorithm index (as it is done now).
When a new client connects to an old server, the server just ignores the extension and responds with the algorithm index.
When a new client connects to a new server, the server may interpret the extended part and return either the algorithm index or some larger response; based on the message size, the new client will recognize that the server is using the extended response format and properly interpret the response.
I do not think that we really need to discuss advanced features now (like switching the algorithm or compression level on the fly, or being able to use different algorithms in different directions). But I am open to this discussion.
I have only one suggestion: before discussing any advanced feature, let's try to formulate
1. Why do we need it (what are the expected advantages)?
2. How can it be used?
Assume that we want to implement smarter algorithm choice. Question number 1 is which algorithms we are going to support. Right now I have supported zstd (demonstrating the best quality/speed results on most payloads I have tested) and zlib (available almost everywhere and supported by default by Postgres).
We can also add lz4, which is also very fast and may be more compact/popular than zstd. I do not know of other algorithms which are much better for traffic compression than these two.
There are certainly type-specific compression algorithms, but the only really useful one I know is compression of monotonic sequences of integers and floats. I am not sure that it is relevant for libpq.
So there might be some sense in switching between lz4 and zstd on the fly, or in using lz4 for compressing traffic from server to client and zstd from client to server (or vice versa).
And the more important question - if we really want to switch algorithms on the fly: who will do it, and how? Do we want the user to explicitly control it (something like a "\compression on" psql command)? Or should there be some API for applications? How can it be supported, for example, by the JDBC driver?
I do not have answers to these questions...

P.S. I am grateful for the criticism. Sorry if some of my answers come across as impolite. I am just a developer, and for me writing code is much easier than writing e-mails.
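One possible reading of the backward-compatible extension described above, sketched very roughly (the ';' convention is only a proposal here, nothing in this snippet is in the patch): a server that knows nothing about the extension simply cuts the client value at the first ';' and parses the comma-separated list it already understands.

#include <string.h>

/* Sketch: drop a future ";..." extension from the client's value. */
static void
strip_compression_extension(char *value)
{
    char   *ext = strchr(value, ';');

    if (ext != NULL)
        *ext = '\0';    /* "zstd:10,zlib;future stuff" -> "zstd:10,zlib" */
}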
Attachment
On Thu, Nov 26, 2020 at 8:15 AM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
> However, I don’t mean by this that we shouldn’t support switchable compression. I propose that we can offer two compression modes: permanent (which is implemented in the current state of the patch) and switchable on-the-fly. Permanent compression allows us to deliver a robust solution that is already present in some databases. Switchable compression allows us to support more complex scenarios in cases when the frontend and backend really need it and can afford development effort to implement it.

I feel that one thing that may be getting missed here is that my suggestions were intended to make this simpler, not more complicated. Like, in the design I proposed, switchable compression is not a separate form of compression and doesn't require any special support. Both sides are just allowed to set the compression method; theoretically, they could set it more than once. Similarly, I don't intend the possibility of using different compression algorithms in the two directions as a request for an advanced feature so much as a way of simplifying the protocol.

Like, in the protocol that you proposed previously, you've got a four-phase handshake to set up compression. The startup packet carries initial information from the client, and the server then sends CompressionAck, and then the client sends SetCompressionMethod, and then the server sends SetCompressionMethod. This system is fairly complex, and it requires some form of interlocking. Once the client has sent a SetCompressionMethod message, it cannot send any other protocol message until it receives a SetCompressionMethod message back from the server. Otherwise, it doesn't know whether the server actually responded with SetCompressionMethod as well, or whether it sent say ErrorResponse or NoticeResponse or something. In the former case it needs to send compressed data going forward; in the latter uncompressed; but it can't know which until it sees the server message. And keep in mind that control isn't necessarily with libpq at this point, because non-blocking mode could be in use. This is all solvable, but the way I proposed it, you don't have that problem. You never need to wait for a message from the other end before being able to send a message yourself.

Similarly, allowing different compression methods in the two directions may seem to make things more complicated, but I don't think it really is. Arguably it's simpler. The instant the server gets the startup packet, it can issue SetCompressionMethod. The instant the client gets SupportedCompressionTypes, it can issue SetCompressionMethod. So there's practically no hand-shaking at all. You get a single protocol message and you immediately respond by setting the compression method, and then you just send compressed messages after that. Perhaps the time at which you begin receiving compressed data will be a little different than the time at which you begin sending it, or perhaps compression will only ever be used in one direction. But so what? The code really doesn't need to care. You just need to keep track of the active compression mode in each direction, and that's it.

And again, if you allow the compression method to be switched at any time, you just have to know what to do when you get a SetCompressionMethod. If you only allow it to be changed once, to set it initially, then you have to ADD code to reject that message the next time it's sent. 
If that ends up avoiding a significant amount of complexity somewhere else then I don't have a big problem with it, but if it doesn't, it's simpler to allow it whenever than to restrict it to only once. > 2. List of the compression algorithms which the frontend is able to decompress in the order of preference. > For example: > “zlib:1,3,5;zstd:7,8;uncompressed” means that frontend is able to: > - decompress zlib with 1,3 or 5 compression levels > - decompress zstd with 7 or 8 compression levels > - “uncompressed” at the end means that the frontend agrees to receive uncompressed messages. If there is no “uncompressed”compression algorithm specified it means that the compression is required. I think that there's no such thing as being able to decompress some compression levels with the same algorithm but not others. The level only controls behavior on the compression side. So, the client can only send zlib data if the server can decompress it, but the server need not advertise which levels it can decompress, because it's all or nothing. > Supported compression and decompression methods are configured using GUC parameters: > > compress_algorithms = ‘...’ // default value is ‘uncompressed’ > decompress_algorithms = ‘...’ // default value is ‘uncompressed’ This raises an interesting question which I'm not quite sure about. It doesn't seem controversial to assert that the client must be able to advertise which algorithms it does and does not support, and likewise for the server. After all, just because we offer lz4, say, as an option doesn't mean every PostgreSQL build will be performed --with-lz4. But, how should the compression algorithm that actually gets used be controlled? One can imagine that the client is in charge of the compression algorithm and the compression level in both directions. If we insist on those being the same, the client says something like compression=lz4:1 and then it uses that algorithm and instructs the server to do the same; otherwise there might be separate connection parameters for client-compression and server-compression, or some kind of syntax that lets you specify both using a single parameter. On the other hand, one could take a whole different approach and imagine the server being in charge of both directions, like having a GUC that is set on the server. Clients advertise what they can support, and the server tells them to do whatever the GUC says they must. That sounds awfully heavy-handed, but it has the advantage of letting the server administrator set site policy. One can also imagine combination approaches, like letting the server GUC define the default but allowing the client to override using a connection parameter. Or even putting each side in charge of what it sends: the GUC controls the what the server tries to do, provided the client can support it; and the connection parameter controls the client behavior, provided the server can support it. I am not really sure what's best here, but it's probably something we need to think about a bit before we get too deep into this. I'm tentatively inclined to think that the server should have a GUC that defines the *allowable* compression algorithms so that the administrator can disable algorithms that are compiled into the binary but which she does not want to permit (e.g. because a security problem was discovered in a relevant library). The default can simply be 'all', meaning everything the binary supports. 
And then the rest of the control should be on the client side, so that the server GUC can never influence the selection of which algorithm is actually chosen, but only rule things out. But that is just a tentative opinion; maybe it's not the right idea. -- Robert Haas EDB: http://www.enterprisedb.com
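To make the per-direction bookkeeping concrete, here is a minimal sketch, not taken from the patch and with purely hypothetical names, of what "keep track of the active compression mode in each direction" could look like. Each direction is updated independently, whenever its own SetCompressionMethod is decided on (sending side) or observed in the input stream (receiving side):

    /* Hypothetical per-connection compression bookkeeping; not the patch's code. */
    typedef enum
    {
        PQ_COMPRESSION_NONE = 0,
        PQ_COMPRESSION_ZLIB,
        PQ_COMPRESSION_ZSTD
    } PqCompressionMethod;

    typedef struct
    {
        PqCompressionMethod send_method;    /* applied to outgoing bytes */
        PqCompressionMethod recv_method;    /* expected on incoming bytes */
    } PqCompressionState;

    /* Our side decided what to compress outgoing data with; receiving is untouched. */
    static void
    pq_set_send_compression(PqCompressionState *state, PqCompressionMethod m)
    {
        state->send_method = m;
    }

    /* The peer's SetCompressionMethod arrived; only decoding of incoming data changes. */
    static void
    pq_set_recv_compression(PqCompressionState *state, PqCompressionMethod m)
    {
        state->recv_method = m;
    }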
Hi, Robert!

First of all, thanks for your detailed reply.

> On Dec 3, 2020, at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Like, in the protocol that you proposed previously, you've got a
> four-phase handshake to set up compression. The startup packet carries
> initial information from the client, and the server then sends
> CompressionAck, and then the client sends SetCompressionMethod, and
> then the server sends SetCompressionMethod. This system is fairly
> complex, and it requires some form of interlocking.

I proposed a slightly different handshake (three-phase):

1. At first, the client sends the _pq_.compression parameter in the startup packet.
2. The server replies with CompressionAck, followed by a SetCompressionMethod message. These two might be combined, but I left them like this for symmetry reasons. In most cases they will arrive as one piece without any additional delay.
3. The client replies with a SetCompressionMethod message.

A handshake like the above allows forbidding uncompressed client-to-server and/or server-to-client communication. For example, if the client did not explicitly specify ‘uncompressed’ in the supported decompression methods list, and the server does not support any of the other compression algorithms sent by the client, the server will send back SetCompressionMethod with the ‘-1’ index. After receiving this message, the client will terminate the connection.

> On Dec 3, 2020, at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> And again, if you allow the compression method to be switched at any
> time, you just have to know what to do when you get a
> SetCompressionMethod. If you only allow it to be changed once, to set
> it initially, then you have to ADD code to reject that message the
> next time it's sent. If that ends up avoiding a significant amount of
> complexity somewhere else then I don't have a big problem with it, but
> if it doesn't, it's simpler to allow it whenever than to restrict it
> to only once.

Yes, there is actually some amount of complexity involved in implementing the switchable on-the-fly compression. Currently, compression itself operates on a different level, independently of the libpq protocol. By allowing the compression to be switchable on the fly, we need to solve these tasks:

1. When the new portion of bytes comes to the decompressor from the socket.read() call, there may be a situation when the first part of these bytes is a compressed fragment and the other part is uncompressed, or worse, in a single portion of new bytes, there may be the end of some ZLIB compressed message and the beginning of the ZSTD compressed message. The problem is that we don’t know the exact end of the ZLIB compressed message before decompressing the entire chunk of new bytes and reading the SetCompressionMethod message. Moreover, streaming compression by itself may involve some internal buffering, which also complicates this problem.

2. When sending the new portion of bytes, it may not be sufficient to keep track of only the current compression method. There may be a situation when there could be multiple SetCompressionMethod messages in PqSendBuffer (backend) or conn->outBuffer (frontend). It means that it is not enough to simply track the current compression method but also keep track of all compression method switches in PqSendBuffer or conn->outBuffer. Also, same as for decompression, the internal buffering of streaming compression makes the situation more complex in this case too.
Even though the above two problems might be solvable, I doubt that we should oblige ourselves to solve them not only in libpq, but in all other third-party Postgres protocol libraries, since the exact areas of application for switchable compression are not clear yet. I agree with Konstantin’s point of view on this one:

> And more important question - if we really want to switch algorithms on
> the fly: who and how will do it?
> Do we want user to explicitly control it (something like "\compression
> on" psql command)?
> Or there should be some API for application?
> How it can be supported for example by JDBC driver?
> I do not have answers for this questions...

However, as previously mentioned in the thread, it might be useful in the future and we should design a protocol that supports it so we won’t have any problems with backward compatibility. So, basically, this was the only reason to introduce the two separate compression modes - switchable and permanent. In the latest patch, Konstantin introduced the extension part. So in future versions, we can introduce the switchable compression handling in this extension part. By now, let the permanent compression be the default mode.

> On Dec 3, 2020, at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I think that there's no such thing as being able to decompress some
> compression levels with the same algorithm but not others. The level
> only controls behavior on the compression side. So, the client can
> only send zlib data if the server can decompress it, but the server
> need not advertise which levels it can decompress, because it's all or
> nothing.

Depending on the chosen compression algorithm, the compression level may affect the decompression speed and memory usage. That's why I think that it may be nice for the server to forbid some compression levels with high CPU / memory usage required for decompression.

> On Dec 3, 2020, at 2:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On the other hand, one could take a whole different
> approach and imagine the server being in charge of both directions,
> like having a GUC that is set on the server. Clients advertise what
> they can support, and the server tells them to do whatever the GUC
> says they must. That sounds awfully heavy-handed, but it has the
> advantage of letting the server administrator set site policy.

I personally think that this approach is the most practical one. For example:

In the server’s postgresql.conf:

compress_algorithms = ‘uncompressed' // means that the server forbids any server-to-client compression
decompress_algorithms = 'zstd:7,8;uncompressed' // means that the server can only decompress zstd with compression ratio 7 and 8 or communicate with uncompressed messages

In the client connection string:

“… compression=zlib:1,3,5;zstd:6,7,8;uncompressed …” // means that the client is able to compress/decompress zlib, zstd, or communicate with uncompressed messages

For the sake of simplicity, the client’s “compression” parameter in the connection string is basically an analog of the server’s compress_algorithms and decompress_algorithms. So the negotiation process for the above example would look like this:

1. Client sends startup packet with “algorithms=zlib:1,3,5;zstd:6,7,8;uncompressed;”
Since there is no compression mode specified, assume that the client wants permanent compression. In future versions, the client can then request the switchable compression after the ‘;’ at the end of the message.
2. Server replies with two messages:
- CompressionAck message containing “algorithms=zstd:7,8;uncompressed;”
Where the algorithms section basically matches the “decompress_algorithms” server GUC parameter. In future versions, the server can specify the chosen compression mode after the ‘;’ at the end of the message.
- Following SetCompressionMethod message containing “alg_idx=1;level_idx=1”, which essentially means that the server chose zstd with compression level 7 for server-to-client compression. Every next message from the server is now compressed with zstd.

3. Client replies with SetCompressionMethod message containing “alg_idx=0”, which means that the client chose the uncompressed client-to-server messaging. Actually, the client had no other options, because “uncompressed” was the only option left after the intersection of compression algorithms from the connection string and algorithms received from the server in the CompressionAck message. Every next message from the client is now being sent uncompressed.

— Daniil Zakhlystov
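A minimal sketch of the selection step this example describes, assuming both sides' preference lists have already been parsed into arrays of algorithm names (the function and parameter names here are hypothetical, not the patch's): the first client-preferred algorithm that the server also accepts wins, and -1 means there is no overlap, i.e. the connection must be rejected.

    #include <string.h>

    /*
     * Return the index (into the client's preference list) of the first
     * algorithm the server is also willing to use, or -1 if there is none.
     */
    static int
    choose_compression(const char *const *client_algs, int n_client,
                       const char *const *server_algs, int n_server)
    {
        for (int i = 0; i < n_client; i++)
            for (int j = 0; j < n_server; j++)
                if (strcmp(client_algs[i], server_algs[j]) == 0)
                    return i;
        return -1;
    }

With client_algs = {"zlib", "zstd", "uncompressed"} and server_algs = {"zstd", "uncompressed"}, this returns 1, i.e. zstd, matching the alg_idx=1 chosen for the server-to-client direction in the example above.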
On Tue, Dec 8, 2020 at 9:42 AM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
> I proposed a slightly different handshake (three-phase):
>
> 1. At first, the client sends the _pq_.compression parameter in the startup packet.
> 2. The server replies with CompressionAck, followed by a SetCompressionMethod message.
> These two might be combined, but I left them like this for symmetry reasons. In most cases they
> will arrive as one piece without any additional delay.
> 3. The client replies with a SetCompressionMethod message.

I think that's pretty similar to what I proposed, actually, except I think that SetCompressionMethod in one direction should be decoupled from SetCompressionMethod in the other direction, so that those things don't have to be synchronized with respect to each other. Each affects its own direction only.

> Yes, there is actually some amount of complexity involved in implementing the switchable on-the-fly compression.
> Currently, compression itself operates on a different level, independently of the libpq protocol. By allowing
> the compression to be switchable on the fly, we need to solve these tasks:
>
> 1. When the new portion of bytes comes to the decompressor from the socket.read() call, there may be
> a situation when the first part of these bytes is a compressed fragment and the other part is uncompressed, or worse,
> in a single portion of new bytes, there may be the end of some ZLIB compressed message and the beginning of the ZSTD compressed message.
> The problem is that we don’t know the exact end of the ZLIB compressed message before decompressing the entire chunk of new bytes
> and reading the SetCompressionMethod message. Moreover, streaming compression by itself may involve some internal buffering,
> which also complicates this problem.
>
> 2. When sending the new portion of bytes, it may not be sufficient to keep track of only the current compression method.
> There may be a situation when there could be multiple SetCompressionMethod messages in PqSendBuffer (backend) or conn->outBuffer (frontend).
> It means that it is not enough to simply track the current compression method but also keep track of all compression method
> switches in PqSendBuffer or conn->outBuffer. Also, same as for decompression,
> internal buffering of streaming compression makes the situation more complex in this case too.

Good points. I guess you need to arrange to "flush" at the compression layer as well as the libpq layer so that you don't end up with data stuck in the compression buffers.

Another idea is that you could have a new message type that says "hey, the payload of this is 1 or more compressed messages." It uses the most-recently set compression method. This would make switching compression methods easier since the SetCompressionMethod message itself could always be sent uncompressed and/or not take effect until the next compressed message. It also allows for a prudential decision not to bother compressing messages that are short anyway, which might be useful. On the downside it adds a little bit of overhead. Andres was telling me on a call that he liked this approach; I'm not sure if it's actually best, but have you considered this sort of approach?

> I personally think that this approach is the most practical one.
> For example:
>
> In the server’s postgresql.conf:
>
> compress_algorithms = ‘uncompressed' // means that the server forbids any server-to-client compression
> decompress_algorithms = 'zstd:7,8;uncompressed' // means that the server can only decompress zstd with compression ratio 7 and 8 or communicate with uncompressed messages
>
> In the client connection string:
>
> “… compression=zlib:1,3,5;zstd:6,7,8;uncompressed …” // means that the client is able to compress/decompress zlib, zstd, or communicate with uncompressed messages
>
> For the sake of simplicity, the client’s “compression” parameter in the connection string is basically an analog of the server’s compress_algorithms and decompress_algorithms.
> So the negotiation process for the above example would look like this:
>
> 1. Client sends startup packet with “algorithms=zlib:1,3,5;zstd:6,7,8;uncompressed;”
> Since there is no compression mode specified, assume that the client wants permanent compression.
> In future versions, the client can then request the switchable compression after the ‘;’ at the end of the message.
>
> 2. Server replies with two messages:
> - CompressionAck message containing “algorithms=zstd:7,8;uncompressed;”
> Where the algorithms section basically matches the “decompress_algorithms” server GUC parameter.
> In future versions, the server can specify the chosen compression mode after the ‘;’ at the end of the message.
>
> - Following SetCompressionMethod message containing “alg_idx=1;level_idx=1”, which
> essentially means that the server chose zstd with compression level 7 for server-to-client compression. Every next message from the server is now compressed with zstd.
>
> 3. Client replies with SetCompressionMethod message containing “alg_idx=0”, which means that the client chose the uncompressed
> client-to-server messaging. Actually, the client had no other options, because “uncompressed” was the only option left after the intersection of
> compression algorithms from the connection string and algorithms received from the server in the CompressionAck message.
> Every next message from the client is now being sent uncompressed.

I still think this is excessively baroque and basically useless. Nobody wants to allow compression levels 1, 3, and 5 but disallow 2 and 4. At the very most, somebody might want to set a maximum or minimum level. But even that I think is pretty pointless. Check out the "Decompression Time" and "Decompression Speed" sections from this link:

https://www.rootusers.com/gzip-vs-bzip2-vs-xz-performance-comparison/

This shows that decompression time and speed are basically independent of compression level for all three of these compressors; to the extent that there is a difference, higher compression levels are generally slightly faster to decompress. I don't really see the argument for letting either side be proscriptive here. Deciding which algorithms you're willing to accept is totally reasonable, since different things may be supported, security concerns, etc., but deciding you're only willing to accept certain levels seems unuseful. It's also unenforceable, I think, since the receiving side has no way of knowing what the sender actually did.

-- Robert Haas EDB: http://www.enterprisedb.com
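Regarding "flush at the compression layer": with libzstd's streaming API that flush is a single ZSTD_e_flush directive. A rough sketch (error handling and the actual socket write elided; assumes the ZSTD_CCtx was created elsewhere with ZSTD_createCCtx()) of draining the encoder so nothing stays buffered before a compression-method change takes effect:

    #include <zstd.h>

    /*
     * Compress whatever is pending in "in" and force everything the encoder
     * has buffered out into "out", so the peer can decode up to this point.
     * A real implementation would send "out" and reset out->pos whenever it
     * fills up; a return value of 0 means the flush is complete.
     */
    static size_t
    compress_and_flush(ZSTD_CCtx *cctx, ZSTD_inBuffer *in, ZSTD_outBuffer *out)
    {
        size_t      remaining;

        do
        {
            remaining = ZSTD_compressStream2(cctx, out, in, ZSTD_e_flush);
        } while (remaining != 0 && out->pos < out->size);

        return remaining;
    }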
> On Dec 10, 2020, at 1:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I still think this is excessively baroque and basically useless.
> Nobody wants to allow compression levels 1, 3, and 5 but disallow 2
> and 4. At the very most, somebody might want to set a maximum or
> minimum level. But even that I think is pretty pointless. Check out
> the "Decompression Time" and "Decompression Speed" sections from this
> link:
>
> https://www.rootusers.com/gzip-vs-bzip2-vs-xz-performance-comparison/
>
> This shows that decompression time and speed are basically independent
> of compression level for all three of these compressors; to the
> extent that there is a difference, higher compression levels are
> generally slightly faster to decompress. I don't really see the
> argument for letting either side be proscriptive here. Deciding which
> algorithms you're willing to accept is totally reasonable since
> different things may be supported, security concerns, etc. but
> deciding you're only willing to accept certain levels seems unuseful.
> It's also unenforceable, I think, since the receiving side has no way
> of knowing what the sender actually did.

I agree that decompression time and speed are basically the same for different compression ratios for most algorithms. But it seems that this may not be true for memory usage. Check out these links:

http://mattmahoney.net/dc/text.html
and
https://community.centminmod.com/threads/round-4-compression-comparison-benchmarks-zstd-vs-brotli-vs-pigz-vs-bzip2-vs-xz-etc.18669/

According to these sources, zstd uses significantly more memory while decompressing data which has been compressed with high compression ratios. So I’ll test the different ZSTD compression ratios with the current version of the patch and post the results later this week.

> On Dec 10, 2020, at 1:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Good points. I guess you need to arrange to "flush" at the compression
> layer as well as the libpq layer so that you don't end up with data
> stuck in the compression buffers.

I think that “flushing” the libpq and compression buffers before setting the new compression method will help to solve issues only at the compressing (sender) side but won't help much on the decompressing (receiver) side.

In the current version of the patch, the decompressor acts as a proxy between secure_read and PqRecvBuffer / conn->inBuffer. It is unaware of the Postgres protocol and will fail to do anything other than decompressing the bytes received from the secure_read function and appending them to the PqRecvBuffer. So the problem is that we can’t decouple the compressed bytes from the uncompressed ones (actually ZSTD detects the compressed block end, but some other algorithms don’t). We may introduce some hinges to control the decompressor behavior from the underlying levels after reading the SetCompressionMethod message from PqRecvBuffer, but I don’t think that it is the correct approach.

> On Dec 10, 2020, at 1:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Another idea is that you could have a new message type that says "hey,
> the payload of this is 1 or more compressed messages." It uses the
> most-recently set compression method. This would make switching
> compression methods easier since the SetCompressionMethod message
> itself could always be sent uncompressed and/or not take effect until
> the next compressed message.
> It also allows for a prudential decision
> not to bother compressing messages that are short anyway, which might
> be useful. On the downside it adds a little bit of overhead. Andres
> was telling me on a call that he liked this approach; I'm not sure if
> it's actually best, but have you considered this sort of approach?

This may help to solve the above issue. For example, we may introduce the CompressedData message:

CompressedData (F & B)

Byte1(‘m’) // I am not so sure about the ‘m’ identifier :)
Identifies the message as compressed data.

Int32
Length of message contents in bytes, including self.

Byten
Data that forms part of a compressed data stream.

Basically, it wraps some chunk of compressed data (like the CopyData message).

On the sender side, the compressor will wrap all outgoing message chunks into CompressedData messages.

On the receiver side, some intermediate component between secure_read and the decompressor will do the following:

1. Read the next 5 bytes (type and length) from the buffer.
2.1 If the message type is other than CompressedData, forward it straight to the PqRecvBuffer / conn->inBuffer.
2.2 If the message type is CompressedData, forward its contents to the current decompressor.

What do you think of this approach?

— Daniil Zakhlystov
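A minimal sketch of the receiver-side routing described in steps 1-2.2 above (the 'm' type byte and the two helper functions are hypothetical; this is not the patch's code):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>          /* ntohl */

    /* Hypothetical helpers: pass raw protocol bytes through, or feed the decompressor. */
    extern void forward_to_recv_buffer(const char *data, size_t len);
    extern void feed_decompressor(const char *data, size_t len);

    /*
     * Route one complete message starting at "buf" (the caller guarantees
     * the whole message is available).  Returns the number of bytes consumed.
     */
    static size_t
    route_message(const char *buf)
    {
        char        type = buf[0];
        uint32_t    len;            /* length field: includes itself, not the type byte */

        memcpy(&len, buf + 1, sizeof(len));
        len = ntohl(len);

        if (type == 'm')            /* CompressedData, as proposed above */
            feed_decompressor(buf + 5, len - 4);        /* payload only */
        else
            forward_to_recv_buffer(buf, 1 + len);       /* whole message, untouched */

        return 1 + len;
    }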
Hello all,

I’ve finally read the whole thread (it was huge). It is extremely sad that this patch has hung without progress for such a long time. It seems that the main problem in the discussion is that everyone has their own view of what problems should be solved with this patch. Here are some of the positions (not all of them):

1. Add compression for networks with a bad bandwidth (and make the patch as simple and maintainable as possible) - the author’s position.
2. Don’t change the current network protocol and related code much.
3. Refactor the compression API (and network compression as well).
4. Solve cloud providers’ problems: trade CPU utilisation for network bandwidth on demand, and vice versa.

All of these requirements have a different nature and sometimes conflict with each other. Without clearly formed requirements this patch would never be released.

Anyway, I have rebased it to the current master branch, applied pgindent, tested on MacOS and fixed a MacOS-specific problem with strcpy in build_compressors_list(): it has undefined behaviour when source and destination strings overlap.

- *client_compressors = src = dst = strdup(value);
+ *client_compressors = src = strdup(value);
+ dst = strdup(value);

According to my very simple tests with randomly generated data, zstd gives about 3x compression (zlib has a little worse compression ratio and a little bigger CPU utilisation). It seems to be a normal ratio for any streaming data - Greenplum also uses zstd/zlib to compress append-optimised tables and the compression ratio is usually about 3-5x. Also, according to my Greenplum experience, the most commonly used zstd ratio is 1, while for zlib it is usually in a range of 1-5. CPU and execution time were not affected much compared to uncompressed data (but my tests were very simple and they should not be treated as reliable).

Best regards, Denis Smirnov | Developer sd@arenadata.io Arenadata | Godovikova 9-17, Moscow 129085 Russia
Attachment
On 17.12.2020 16:39, Denis Smirnov wrote:
Hello all, I’ve finally read the whole thread (it was huge). It is extremely sad that this patch hang without progress for such a long time. It seems that the main problem in discussion is that everyone has its own view what problems should be solve with this patch. Here are some of positions (not all of them): 1. Add a compression for networks with a bad bandwidth (and make a patch as simple and maintainable as possible) - author’s position. 2. Don’t change current network protocol and related code much. 3. Refactor compression API (and network compression as well) 4. Solve cloud provider’s problems: on demand buy network bandwidth with CPU utilisation and vice versa. All of these requirements have a different nature and sometimes conflict with each other. Without clearly formed requirements this patch would never be released. Anyway, I have rebased it to the current master branch, applied pgindent, tested on MacOS and fixed a MacOS specific problem with strcpy in build_compressors_list(): it has an undefined behaviour when source and destination strings overlap. - *client_compressors = src = dst = strdup(value); + *client_compressors = src = strdup(value); + dst = strdup(value); According to my very simple tests with randomly generated data, zstd gives about 3x compression (zlib has a little worse compression ratio and a little bigger CPU utilisation). It seems to be a normal ratio for any streaming data - Greenplum also uses zstd/zlib to compress append optimised tables and compression ratio is usually about 3-5x. Also according to my Greenplum experience, the most commonly used zstd ratio is 1, while for zlib it is usually in a range of 1-5. CPU and execution time were not affected much according to uncompressed data (but my tests were very simple and they should not be treated as reliable).Best regards, Denis Smirnov | Developer sd@arenadata.io Arenadata | Godovikova 9-17, Moscow 129085 Russia
Thank you very much for reporting the problem.
Sorry, but your fix is not entirely correct: it is necessary to assign dst, not src, to *client_compressors. Also, the extra strdup requires a corresponding memory deallocation.
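For reference, the underlying portability hazard is easy to demonstrate in isolation: copying within one buffer must use memmove (which is defined for overlapping regions) rather than strcpy, whose behaviour is undefined when source and destination overlap. A tiny self-contained illustration (not the patch's build_compressors_list() code):

    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        char        list[] = "zstd:1, zlib:5";
        char       *p = strchr(list, ' ');

        /*
         * Remove the blank after the comma in place.  strcpy(p, p + 1) would
         * be undefined behaviour because the regions overlap; memmove is safe.
         */
        if (p != NULL)
            memmove(p, p + 1, strlen(p + 1) + 1);

        puts(list);             /* prints "zstd:1,zlib:5" */
        return 0;
    }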
I have prepared a new version of the patch with this fix plus some other code refactoring requested by reviewers (such as sending the ack message in a more standard way).
I am maintaining this code in the git@github.com:postgrespro/libpq_compression.git repository.
I will be pleased if anybody who wants to suggest bug fixes/improvements of libpq compression creates pull requests: it will be much easier for me to merge them.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Hi!

I’ve fixed an issue with compression level parsing in this PR: https://github.com/postgrespro/libpq_compression/pull/4

Also, I did a couple of pgbench benchmarks to measure database resource usage with different compression levels.

Firstly, I measured the bidirectional compression scenario, i.e. the database had to do both compression and decompression:

Database setup:
pgbench "host=xxx dbname=xxx port=5432 user=xxx" -i -s 500

Test run:
pgbench "host=xxx dbname=xxx port=5432 user=xxx compression=zstd:(1/3/5/7/9/11/13/15/17/19/20)" --builtin tpcb-like -t 50 --jobs=64 --client=700

When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example, here is the PostgreSQL application memory usage:

No compression - 1.2 GiB

ZSTD
zstd:1 - 1.4 GiB
zstd:7 - 4.0 GiB
zstd:13 - 17.7 GiB
zstd:19 - 56.3 GiB
zstd:20 - 109.8 GiB - did not succeed
zstd:21, zstd:22 > 140 GiB
Postgres process crashes (out of memory)

ZLIB
zlib:1 - 1.35 GiB
zlib:5 - 1.35 GiB
zlib:9 - 1.35 GiB

Full report with CPU/Memory/Network consumption graphs is available here: https://docs.google.com/document/d/1qakHcsabZhV70GfSEOjFxmlUDBe21p7DRoPrDPAjKNg

Then, I disabled the compression for the backend and decompression for the frontend and measured the resource usage for the single-directional compression scenario (frontend compression only, backend decompression only):

ZSTD
For all ZSTD compression levels, database host resource usage was roughly the same, except the Committed Memory (Committed_AS):

no compression - 44.4 GiB
zstd:1 - 45.0 GiB
zstd:3 - 46.1 GiB
zstd:5 - 46.1 GiB
zstd:7 - 46.0 GiB
zstd:9 - 46.0 GiB
zstd:11 - 47.4 GiB
zstd:13 - 47.4 GiB
zstd:15 - 47.4 GiB
zstd:17 - 50.3 GiB
zstd:19 - 50.1 GiB
zstd:20 - 66.8 GiB
zstd:21 - 88.7 GiB
zstd:22 - 123.9 GiB

ZLIB
For all ZLIB compression levels, database host resource usage was roughly the same.

Full report with CPU/Memory/Network consumption graphs is available here: https://docs.google.com/document/d/1gI7c3_YvcL5-PzeK65P0pIY-4BI9KBDwlfPpGhYxrNg

To sum up, there is actually almost no difference when decompressing the different compression levels, except the Committed_AS size.

— Daniil Zakhlystov
On Mon, Dec 14, 2020 at 12:53 PM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
> > On Dec 10, 2020, at 1:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Good points. I guess you need to arrange to "flush" at the compression
> > layer as well as the libpq layer so that you don't end up with data
> > stuck in the compression buffers.
>
> I think that “flushing” the libpq and compression buffers before setting the new compression method will help to solve issues only at the compressing (sender) side
> but won't help much on the decompressing (receiver) side.

Hmm, I assumed that if the compression buffers were flushed on the sending side, and if all the data produced on the sending side were transmitted to the receiver, the receiving side would then return everything up to the point of the flush. However, now that I think about it, there's no guarantee that any particular compression library would actually behave that way. I wonder what actually happens in practice with the libraries we care about?

> This may help to solve the above issue. For example, we may introduce the CompressedData message:
>
> CompressedData (F & B)
>
> Byte1(‘m’) // I am not so sure about the ‘m’ identifier :)
> Identifies the message as compressed data.
>
> Int32
> Length of message contents in bytes, including self.
>
> Byten
> Data that forms part of a compressed data stream.
>
> Basically, it wraps some chunk of compressed data (like the CopyData message).
>
> On the sender side, the compressor will wrap all outgoing message chunks into CompressedData messages.
>
> On the receiver side, some intermediate component between secure_read and the decompressor will do the following:
> 1. Read the next 5 bytes (type and length) from the buffer.
> 2.1 If the message type is other than CompressedData, forward it straight to the PqRecvBuffer / conn->inBuffer.
> 2.2 If the message type is CompressedData, forward its contents to the current decompressor.
>
> What do you think of this approach?

I'm not sure about the details, but the general idea seems like it might be worth considering. If we choose a compression method that is intended for streaming compression and decompression and whose library handles compression flushes sensibly, then we might not really need to go this way to make it work. But, on the other hand, this method has a certain elegance that just compressing everything lacks, and might allow some useful flexibility. On the third hand, restarting compression for every new set of messages might really hurt the compression ratio in some scenarios. I'm not sure what is best.

-- Robert Haas EDB: http://www.enterprisedb.com
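For zlib specifically, the answer is reasonably well defined: deflate() with Z_SYNC_FLUSH emits all pending output and aligns it on a byte boundary, so the peer's inflate() can return everything produced up to that point. A rough sketch (error handling elided; assumes zs was set up with deflateInit(), and send_bytes() is a hypothetical socket-write helper):

    #include <stddef.h>
    #include <zlib.h>

    /* Hypothetical helper that writes bytes to the socket. */
    extern void send_bytes(const unsigned char *buf, size_t len);

    /*
     * Compress "data" and force a sync flush, so the receiver can decode
     * everything up to this point even though the stream stays open.
     */
    static void
    compress_and_sync_flush(z_stream *zs, const unsigned char *data, unsigned len)
    {
        unsigned char out[8192];

        zs->next_in = (unsigned char *) data;
        zs->avail_in = len;

        do
        {
            zs->next_out = out;
            zs->avail_out = sizeof(out);
            (void) deflate(zs, Z_SYNC_FLUSH);   /* error handling elided */
            send_bytes(out, sizeof(out) - zs->avail_out);
        } while (zs->avail_out == 0);
    }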
On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
> When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example, here is the PostgreSQL application memory usage:
>
> No compression - 1.2 GiB
>
> ZSTD
> zstd:1 - 1.4 GiB
> zstd:7 - 4.0 GiB
> zstd:13 - 17.7 GiB
> zstd:19 - 56.3 GiB
> zstd:20 - 109.8 GiB - did not succeed
> zstd:21, zstd:22 > 140 GiB
> Postgres process crashes (out of memory)

Good grief. So, suppose we add compression and support zstd. Then, can an unprivileged user capable of connecting to the database negotiate for zstd level 1 and then choose to actually send data compressed at zstd level 22, crashing the server if it doesn't have a crapton of memory? Honestly, I wouldn't blame somebody for filing a CVE if we allowed that sort of thing to happen. I'm not sure what the solution is, but we can't leave a way for a malicious client to consume 140GB of memory on the server *per connection*. I assumed decompression memory was going to be measured in kB or MB, not GB. Honestly, even at say L7, if you've got max_connections=100 and a user who wants to make trouble, you have a really big problem.

Perhaps I'm being too pessimistic here, but man that's a lot of memory.

-- Robert Haas EDB: http://www.enterprisedb.com
On 12/22/20 6:56 PM, Robert Haas wrote: > On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov > <usernamedt@yandex-team.ru> wrote: >> When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example,here is the Postgresql application memory usage: >> >> No compression - 1.2 GiB >> >> ZSTD >> zstd:1 - 1.4 GiB >> zstd:7 - 4.0 GiB >> zstd:13 - 17.7 GiB >> zstd:19 - 56.3 GiB >> zstd:20 - 109.8 GiB - did not succeed >> zstd:21, zstd:22 > 140 GiB >> Postgres process crashes (out of memory) > > Good grief. So, suppose we add compression and support zstd. Then, can > unprivileged user capable of connecting to the database can negotiate > for zstd level 1 and then choose to actually send data compressed at > zstd level 22, crashing the server if it doesn't have a crapton of > memory? Honestly, I wouldn't blame somebody for filing a CVE if we > allowed that sort of thing to happen. I'm not sure what the solution > is, but we can't leave a way for a malicious client to consume 140GB > of memory on the server *per connection*. I assumed decompression > memory was going to measured in kB or MB, not GB. Honestly, even at > say L7, if you've got max_connections=100 and a user who wants to make > trouble, you have a really big problem. > > Perhaps I'm being too pessimistic here, but man that's a lot of memory. > Maybe I'm just confused, but my assumption was this means there's a memory leak somewhere - that we're not resetting/freeing some piece of memory, or so. Why would zstd need so much memory? It seems like a pretty serious disadvantage, so how could it become so popular? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Dec 22, 2020 at 07:15:23PM +0100, Tomas Vondra wrote: > > > On 12/22/20 6:56 PM, Robert Haas wrote: > >On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov > ><usernamedt@yandex-team.ru> wrote: > >>When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example,here is the Postgresql application memory usage: > >> > >>No compression - 1.2 GiB > >> > >>ZSTD > >>zstd:1 - 1.4 GiB > >>zstd:7 - 4.0 GiB > >>zstd:13 - 17.7 GiB > >>zstd:19 - 56.3 GiB > >>zstd:20 - 109.8 GiB - did not succeed > >>zstd:21, zstd:22 > 140 GiB > >>Postgres process crashes (out of memory) > > > >Good grief. So, suppose we add compression and support zstd. Then, can > >unprivileged user capable of connecting to the database can negotiate > >for zstd level 1 and then choose to actually send data compressed at > >zstd level 22, crashing the server if it doesn't have a crapton of > >memory? Honestly, I wouldn't blame somebody for filing a CVE if we > >allowed that sort of thing to happen. I'm not sure what the solution > >is, but we can't leave a way for a malicious client to consume 140GB > >of memory on the server *per connection*. I assumed decompression > >memory was going to measured in kB or MB, not GB. Honestly, even at > >say L7, if you've got max_connections=100 and a user who wants to make > >trouble, you have a really big problem. > > > >Perhaps I'm being too pessimistic here, but man that's a lot of memory. > > > > Maybe I'm just confused, but my assumption was this means there's a > memory leak somewhere - that we're not resetting/freeing some piece > of memory, or so. Why would zstd need so much memory? It seems like > a pretty serious disadvantage, so how could it become so popular? > > > regards > Hi, It looks like the space needed for decompression is between 1kb and 3.75tb: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#window_descriptor Sheesh! Looks like it would definitely need to be bounded to control resource use. Regards, Ken
> On 22 Dec 2020, at 23:15, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> On 12/22/20 6:56 PM, Robert Haas wrote:
>> On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov
>> <usernamedt@yandex-team.ru> wrote:
>>> When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example, here is the PostgreSQL application memory usage:
>>>
>>> No compression - 1.2 GiB
>>>
>>> ZSTD
>>> zstd:1 - 1.4 GiB
>>> zstd:7 - 4.0 GiB
>>> zstd:13 - 17.7 GiB
>>> zstd:19 - 56.3 GiB
>>> zstd:20 - 109.8 GiB - did not succeed
>>> zstd:21, zstd:22 > 140 GiB
>>> Postgres process crashes (out of memory)
>> Good grief. So, suppose we add compression and support zstd. Then, can
>> an unprivileged user capable of connecting to the database negotiate
>> for zstd level 1 and then choose to actually send data compressed at
>> zstd level 22, crashing the server if it doesn't have a crapton of
>> memory? Honestly, I wouldn't blame somebody for filing a CVE if we
>> allowed that sort of thing to happen. I'm not sure what the solution
>> is, but we can't leave a way for a malicious client to consume 140GB
>> of memory on the server *per connection*. I assumed decompression
>> memory was going to be measured in kB or MB, not GB. Honestly, even at
>> say L7, if you've got max_connections=100 and a user who wants to make
>> trouble, you have a really big problem.
>> Perhaps I'm being too pessimistic here, but man that's a lot of memory.
>
> Maybe I'm just confused, but my assumption was this means there's a memory leak somewhere - that we're not resetting/freeing some piece of memory, or so. Why would zstd need so much memory? It seems like a pretty serious disadvantage, so how could it become so popular?

AFAIK it's 700 clients. Does not seem like a super high price for a big traffic/latency reduction.

Best regards, Andrey Borodin.
On 12/22/20 7:31 PM, Andrey Borodin wrote:
>
>> On 22 Dec 2020, at 23:15, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 12/22/20 6:56 PM, Robert Haas wrote:
>>> On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov
>>> <usernamedt@yandex-team.ru> wrote:
>>>> When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example, here is the PostgreSQL application memory usage:
>>>>
>>>> No compression - 1.2 GiB
>>>>
>>>> ZSTD
>>>> zstd:1 - 1.4 GiB
>>>> zstd:7 - 4.0 GiB
>>>> zstd:13 - 17.7 GiB
>>>> zstd:19 - 56.3 GiB
>>>> zstd:20 - 109.8 GiB - did not succeed
>>>> zstd:21, zstd:22 > 140 GiB
>>>> Postgres process crashes (out of memory)
>>> Good grief. So, suppose we add compression and support zstd. Then, can
>>> an unprivileged user capable of connecting to the database negotiate
>>> for zstd level 1 and then choose to actually send data compressed at
>>> zstd level 22, crashing the server if it doesn't have a crapton of
>>> memory? Honestly, I wouldn't blame somebody for filing a CVE if we
>>> allowed that sort of thing to happen. I'm not sure what the solution
>>> is, but we can't leave a way for a malicious client to consume 140GB
>>> of memory on the server *per connection*. I assumed decompression
>>> memory was going to be measured in kB or MB, not GB. Honestly, even at
>>> say L7, if you've got max_connections=100 and a user who wants to make
>>> trouble, you have a really big problem.
>>> Perhaps I'm being too pessimistic here, but man that's a lot of memory.
>>
>> Maybe I'm just confused, but my assumption was this means there's a memory leak somewhere - that we're not resetting/freeing some piece of memory, or so. Why would zstd need so much memory? It seems like a pretty serious disadvantage, so how could it become so popular?
>
> AFAIK it's 700 clients. Does not seem like a super high price for a big traffic/latency reduction.
>

I don't see any benchmark results in this thread allowing me to make that conclusion, and I find it hard to believe that 200MB/client is a sensible trade-off.

It assumes you have that much memory, and it may allow an easy DoS attack (although maybe it's not worse than e.g. generating a lot of I/O or running an expensive function). Maybe allowing limiting the compression level / decompression buffer size in postgresql.conf would be enough. Or maybe allow disabling such compression algorithms altogether.

regards

-- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > I don't see aby benchmark results in this thread, allowing me to make > that conclusion, and I find it hard to believe that 200MB/client is a > sensible trade-off. > It assumes you have that much memory, and it may allow easy DoS attack > (although maybe it's not worse than e.g. generating a lot of I/O or > running expensive function). Maybe allowing limiting the compression > level / decompression buffer size in postgresql.conf would be enough. Or > maybe allow disabling such compression algorithms altogether. The link Ken pointed at suggests that restricting the window size to 8MB is a common compromise. It's not clear to me what that does to the achievable compression ratio. Even 8MB could be an annoying cost if it's being paid per-process, on both the server and client sides. regards, tom lane
On 12/22/20 8:03 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@enterprisedb.com> writes: >> I don't see aby benchmark results in this thread, allowing me to make >> that conclusion, and I find it hard to believe that 200MB/client is a >> sensible trade-off. > >> It assumes you have that much memory, and it may allow easy DoS attack >> (although maybe it's not worse than e.g. generating a lot of I/O or >> running expensive function). Maybe allowing limiting the compression >> level / decompression buffer size in postgresql.conf would be enough. Or >> maybe allow disabling such compression algorithms altogether. > > The link Ken pointed at suggests that restricting the window size to > 8MB is a common compromise. It's not clear to me what that does to > the achievable compression ratio. Even 8MB could be an annoying cost > if it's being paid per-process, on both the server and client sides. > Possibly, but my understanding is that's merely a recommendation for the decoder library (e.g. libzstd), and it's not clear to me if/how that relates to the compression level or how to influence it. From the results shared by Daniil, the per-client overhead seems way higher than 8MB, so either libzstd does not respect this recommendation or maybe there's something else going on. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > On 12/22/20 8:03 PM, Tom Lane wrote: >> The link Ken pointed at suggests that restricting the window size to >> 8MB is a common compromise. It's not clear to me what that does to >> the achievable compression ratio. Even 8MB could be an annoying cost >> if it's being paid per-process, on both the server and client sides. > Possibly, but my understanding is that's merely a recommendation for the > decoder library (e.g. libzstd), and it's not clear to me if/how that > relates to the compression level or how to influence it. > From the results shared by Daniil, the per-client overhead seems way > higher than 8MB, so either libzstd does not respect this recommendation > or maybe there's something else going on. I'd assume that there's a direct correlation between the compression level setting and the window size; but I've not studied the libzstd docs in enough detail to know what it is. regards, tom lane
On Tue, Dec 22, 2020 at 2:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'd assume that there's a direct correlation between the compression level > setting and the window size; but I've not studied the libzstd docs in > enough detail to know what it is. But there is a privilege boundary between the sender and the receiver. What's alleged here is that the sender can do a thing which causes the receiver to burn through tons of memory. It doesn't help anything to say, well, the sender ought to use a window size of N or less. What if they don't? -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Dec 22, 2020 at 2:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'd assume that there's a direct correlation between the compression level >> setting and the window size; but I've not studied the libzstd docs in >> enough detail to know what it is. > But there is a privilege boundary between the sender and the receiver. > What's alleged here is that the sender can do a thing which causes the > receiver to burn through tons of memory. It doesn't help anything to > say, well, the sender ought to use a window size of N or less. What if > they don't? The receiver rejects the data as though it were corrupt. regards, tom lane
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> But there is a privilege boundary between the sender and the receiver. >> What's alleged here is that the sender can do a thing which causes the >> receiver to burn through tons of memory. It doesn't help anything to >> say, well, the sender ought to use a window size of N or less. What if >> they don't? > The receiver rejects the data as though it were corrupt. (Having said that, I don't know whether it's possible for the user of libzstd to specify such behavior. But if it isn't, that's a CVE-worthy problem in libzstd.) regards, tom lane
On 22.12.2020 22:03, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> I don't see any benchmark results in this thread allowing me to make
>> that conclusion, and I find it hard to believe that 200MB/client is a
>> sensible trade-off.
>> It assumes you have that much memory, and it may allow an easy DoS attack
>> (although maybe it's not worse than e.g. generating a lot of I/O or
>> running an expensive function). Maybe allowing limiting the compression
>> level / decompression buffer size in postgresql.conf would be enough. Or
>> maybe allow disabling such compression algorithms altogether.
> The link Ken pointed at suggests that restricting the window size to
> 8MB is a common compromise. It's not clear to me what that does to
> the achievable compression ratio. Even 8MB could be an annoying cost
> if it's being paid per-process, on both the server and client sides.
>
> regards, tom lane

Please notice that my original intention was not to give a user (client) the possibility to choose the compression algorithm and compression level at all. All my previous experiments demonstrate that using a compression level larger than the default only significantly decreases speed without improving the compression ratio. This is especially true for compressing protocol messages. Moreover, on some dummy data (like that generated by pgbench), zstd with the default compression level (1) shows a better compression ratio than with higher levels.

I had to add the possibility to specify the compression level and the set of suggested compression algorithms because it was requested by reviewers. But I still think that it was a wrong idea, and these results just prove it. More flexibility is not always good...

Now there is a discussion concerning a way to switch the compression algorithm on the fly (a particular case: toggling compression for individual libpq messages). IMHO it is once again excessive flexibility which just increases complexity and gives nothing good in practice.
Hi!

I’ve contacted Yann Collet (developer of ZSTD) and told him about our discussion. Here is his comment:

> Hi Daniil
>
> • Is this an expected behavior of ZSTD to consume more memory during the decompression of data that was compressed with a high compression ratio?
>
> I assume that the target application is employing the streaming mode.
> In which case, yes, the memory usage is directly dependent on the Window size, and the Window size tends to increase with compression level.
>
> • how we can restrict the maximal memory usage during decompression?
>
> There are several ways.
>
> • From a decompression perspective
>
> The first method is to _not_ use the streaming mode,
> and employ the direct buffer-to-buffer compression instead,
> like ZSTD_decompress() for example.
> In which case, the decompressor will not need additional memory, it will only employ the provided buffers.
>
> This however entirely depends on the application and can therefore be unpractical.
> It’s fine when decompressing small blocks, it’s not when decompressing gigantic streams of data.
>
> The second method is more straightforward: set a limit to the window size that the decoder accepts to decode.
> This is the ZSTD_d_windowLogMax parameter, documented here: https://github.com/facebook/zstd/blob/v1.4.7/lib/zstd.h#L536
>
> This can be set to any arbitrary power of 2 limit.
> A frame requiring more than this value will be rejected by the decoder, precisely to avoid sustaining large memory requirements.
>
> Lastly, note that, in presence of a large window size requirement, the decoder will allocate a correspondingly large buffer,
> but will not necessarily use it.
> For example, if a frame generated with streaming mode at level 22 declares a 128 MB window size, but effectively only contains ~200 KB of data,
> the buffer will only use 200 KB.
> The rest of the buffer is “allocated” from an address space perspective but is not “used” and therefore does not really occupy physical RAM space.
> This is a capability of all modern OS and contributes to minimizing the impact of outsized window sizes.
>
> • From a compression perspective
>
> Knowing the set limitation, the compressor should be compliant, and avoid going above the threshold.
> One way to do it is to limit the compression level to those which remain below the set limit.
> For example, if the limit is 8 MB, all levels <= 19 will be compatible, as they require 8 MB max (and generally less).
>
> Another method is to manually set a window size, so that it doesn’t exceed the limit.
> This is the ZSTD_c_windowLog parameter, which is documented here: https://github.com/facebook/zstd/blob/v1.4.7/lib/zstd.h#L289
>
> Another complementary way is to provide the source size when it’s known.
> By default, the streaming mode doesn’t know the input size, since it’s supposed to receive it in multiple blocks.
> It will only discover it at the end, by which point it’s too late to use this information in the frame header.
> This can be solved by providing the source size upfront, before starting compression.
> This is the function ZSTD_CCtx_setPledgedSrcSize(), documented here: https://github.com/facebook/zstd/blob/v1.4.7/lib/zstd.h#L483
> Of course, then the total amount of data in the frame must be exact, otherwise it’s detected as an error.
>
> Taking again the previous example of compressing 200 KB with level 22, on knowing the source size,
> the compressor will resize the window to fit the input, and therefore employ 200 KB, instead of 128 MB.
> This information will be present in the header, and the decompressor will also be able to use 200 KB instead of 128 MB.
> Also, presuming the decompressor has a hard limit set to 8 MB (for example), the header using a 200 KB window size will pass and be properly decoded, while the header using 128 MB will be rejected.
> This method is cumulative with the one setting a manual window size (the compressor will select the smallest of both).
>
> So yes, memory consumption is a serious topic, and there are tools in the `zstd` library to deal with it.
>
> Hope it helps
>
> Best Regards
>
> Yann Collet

After reading Yann’s advice, I repeated yesterday's single-directional decompression benchmarks with ZSTD_d_windowLogMax set to 23, i.e. an 8 MB max window size.

Total committed memory (Committed_AS) size for ZSTD compression levels 1-19 was pretty much the same.

Committed_AS baseline (size without any benchmark running) - 42.4 GiB

Scenario | Committed_AS | Committed_AS - Baseline
no compression | 44.36 GiB | 1.05 GiB
ZSTD:1 | 45.03 GiB | 1.06 GiB
ZSTD:5 | 46.06 GiB | 1.09 GiB
ZSTD:9 | 46.00 GiB | 1.08 GiB
ZSTD:13 | 47.46 GiB | 1.12 GiB
ZSTD:17 | 50.23 GiB | 1.18 GiB
ZSTD:19 | 50.21 GiB | 1.18 GiB

As for ZSTD levels higher than 19, the decompressor returned the appropriate error (excerpt from the PostgreSQL server log):

LOG: failed to decompress data: Frame requires too much memory for decoding

Full benchmark report: https://docs.google.com/document/d/1LI8hPzMkzkdQLf7pTN-LXPjIJdjN33bEAqVJj0PLnHA

Pull request with max window size limit: https://github.com/postgrespro/libpq_compression/pull/5

This should fix the possible attack vectors related to high ZSTD compression levels.

— Daniil Zakhlystov
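For reference, both knobs Yann describes are single parameter calls in libzstd 1.4+. A minimal sketch of bounding the window on both ends, using the 2^23 (8 MB) limit from the benchmark above (error handling elided):

    #include <zstd.h>

    /* Decompressing side: refuse frames that need more than a 2^23-byte window. */
    static ZSTD_DCtx *
    create_bounded_dctx(void)
    {
        ZSTD_DCtx  *dctx = ZSTD_createDCtx();

        ZSTD_DCtx_setParameter(dctx, ZSTD_d_windowLogMax, 23);
        return dctx;
    }

    /* Compressing side: never produce frames that require more than that. */
    static ZSTD_CCtx *
    create_bounded_cctx(int level)
    {
        ZSTD_CCtx  *cctx = ZSTD_createCCtx();

        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, level);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 23);
        return cctx;
    }

With the decoder limit in place, an over-sized frame is rejected with an error rather than allocated for, which is exactly the "Frame requires too much memory for decoding" message seen in the server log above.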
On Thu, Dec 17, 2020 at 05:54:28PM +0300, Konstantin Knizhnik wrote:
> I am maintaining this code in the
> git@github.com:postgrespro/libpq_compression.git repository.
> I will be pleased if anybody who wants to suggest bug
> fixes/improvements of libpq compression creates pull requests: it will be
> much easier for me to merge them.

Thanks for working on this. I have a patch for zstd compression in pg_dump so I looked at your patch. I'm attaching some language fixes.

> +zstd_create(int level, zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg, char* rx_data, size_t rx_data_size)
> +zlib_create(int level, zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg, char* rx_data, size_t rx_data_size)
> +build_compressors_list(PGconn *conn, char** client_compressors, bool build_descriptors)

Are you able to run pgindent to fix all the places where "*" is before the space? (And similar style issues).

There are several compression patches in the commitfest, I'm not sure how much they need to be coordinated, but for sure we should coordinate the list of compressions available at compile time.

Maybe there should be a central structure for this, or maybe just the ID numbers of compressors should be a common enum/define. In my patch, I have:

+struct compressLibs {
+	const CompressionAlgorithm alg;
+	const char *name;	/* Name in -Z alg= */
+	const char *suffix;	/* file extension */
+	const int defaultlevel;	/* Default compression level */
+};

Maybe we'd also want to store the "magic number" of each compression library. Maybe there'd also be a common parsing of compression options.

You're supporting a syntax like zlib,zstd:5, but zstd also supports long-range, checksum, and rsyncable modes (rsyncable is relevant to pg_dump, but not to libpq).

I think your patch has an issue here. You have this:

src/interfaces/libpq/fe-connect.c

+	pqGetc(&resp, conn);
+	index = resp;
+	if (index == (char)-1)
+	{
+		appendPQExpBuffer(&conn->errorMessage,
+			libpq_gettext(
+			"server is not supported requested compression algorithms%s\n"),
+			conn->compression);
+		goto error_return;
+	}
+	Assert(!conn->zstream);
+	conn->zstream = zpq_create(conn->compressors[index].impl,
+		conn->compressors[index].level,
+		(zpq_tx_func)pqsecure_write, (zpq_rx_func)pqsecure_read, conn,
+		&conn->inBuffer[conn->inCursor], conn->inEnd-conn->inCursor);

This takes the "index" returned by the server and then accesses conn->compressors[index] without first checking if the index is out of range, so a malicious server could (at least) crash the client by returning index=666.

I suggest that there should be an enum of algorithms, which is constant across all servers. They would be unconditionally included and not #ifdef depending on compilation options.

That would affect the ZpqAlgorithm data structure, which would include an ID number similar to src/bin/pg_dump/compress_io.h:typedef enum...CompressionAlgorithm;

The CompressionAck would send the ID rather than the "index". A protocol analyzer like wireshark could show "Compression: Zstd". You'd have to verify that the ID is supported (and not bogus).

Right now, when I try to connect to an unpatched server, I get:
psql: error: expected authentication request from server, but received v

+/*
+ * Array with all supported compression algorithms.
+ */
+static ZpqAlgorithm const zpq_algorithms[] =
+{
+#if HAVE_LIBZSTD
+	{zstd_name, zstd_create, zstd_read, zstd_write, zstd_free, zstd_error, zstd_buffered_tx, zstd_buffered_rx},
+#endif
+#if HAVE_LIBZ
+	{zlib_name, zlib_create, zlib_read, zlib_write, zlib_free, zlib_error, zlib_buffered_tx, zlib_buffered_rx},
+#endif
+	{no_compression_name}
+};

In config.sgml, it says that libpq_compression defaults to on (on the server side), but in libpq.sgml it says that it defaults to off (on the client side). Is that what's intended? I would've thought the defaults would match, or that the server would enforce a default more conservative than the client's (the DBA should probably have to explicitly enable compression, and need to "opt-in" rather than "opt-out").

Maybe instead of a boolean, this should be a list of permitted compression algorithms. This allows the admin to set a "policy" rather than using the server's hard-coded preferences. This could be important to disable an algorithm at run-time if there's a vulnerability found, or performance problem, or buggy client, or for diagnostics, or performance testing. Actually, I think it may be important to allow the admin to disable not just an algorithm, but also its options. It should be possible for the server to allow "zstd" compression but not "zstd --long", or zstd:99, or arbitrarily large window sizes. This seems similar to SSL cipher strings. I think we'd want to be able to allow (or prohibit) at least alg:* (with options) or alg (with default options).

Your patch documents "libpq_compression=auto" but that doesn't work:
WARNING: none of specified algirthms auto is supported by client
I guess you mean "any" which is what's implemented. I suggest to just remove that.

I think maybe your patch should include a way to trivially change the client's compile-time default: "if (!conn->compression) conn->compression = DefaultCompression"

Your patch warns if *none* of the specified algorithms are supported, but I wonder if it should warn if *any* of them are unsupported (like if someone writes libz instead of zlib, or zst/libzstd instead of zstd, which I guess about half of us will do).

$ LD_LIBRARY_PATH=./src/interfaces/libpq PGCOMPRESSION=abc,def src/bin/psql/psql 'host=localhost port=1111'
WARNING: none of the specified algorithms are supported by client: abc,def
$ LD_LIBRARY_PATH=./src/interfaces/libpq PGCOMPRESSION=abc,zlib src/bin/psql/psql 'host=localhost port=1111'
(no warning)

The libpq_compression GUC can be set for a user, like ALTER ROLE .. SET ... Is there any utility in making it configurable by client address? Or by encryption? I'm thinking of a policy like "do not allow compression from LOCAL connection" or "do not allow compression on encrypted connection". Maybe this would be somehow integrated into pg_hba. But maybe it's not needed (a separate user could be used to allow/disallow compression).

-- Justin
Attachment
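[Editor's note] To make the shared-ID suggestion above concrete, here is a minimal sketch of a compression descriptor whose ID values stay fixed regardless of build options. It is not taken from any of the patches; the header path, type names and fields are hypothetical.

    /* Hypothetical shared header, e.g. src/include/common/compression.h */
    #include <stdbool.h>

    typedef enum pg_compression_algorithm
    {
        PG_COMPRESSION_NONE = 0,
        PG_COMPRESSION_ZLIB = 1,    /* IDs are fixed across builds and versions */
        PG_COMPRESSION_ZSTD = 2
    } pg_compression_algorithm;

    typedef struct pg_compression_spec
    {
        pg_compression_algorithm algorithm; /* wire/file-format ID, never reused */
        const char *name;            /* "zlib", "zstd" */
        int         default_level;   /* default compression level */
        bool        supported;       /* built with HAVE_LIBZ / HAVE_LIBZSTD? */
    } pg_compression_spec;

With such IDs the CompressionAck message could carry the algorithm ID instead of an index into a build-dependent array, which is what the review above argues for.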
On 09.01.2021 23:31, Justin Pryzby wrote: > On Thu, Dec 17, 2020 at 05:54:28PM +0300, Konstantin Knizhnik wrote: >> I am maintaining this code in >> git@github.com:postgrespro/libpq_compression.git repository. >> I will be pleased if anybody, who wants to suggest any bug >> fixes/improvements of libpq compression, create pull requests: it will be >> much easier for me to merge them. > Thanks for working on this. > I have a patch for zstd compression in pg_dump so I looked at your patch. > I'm attaching some language fixes. Thank you very much. I applied your patch on top of pull request of Daniil Zakhlystov who has implemented support of using different compressors in different direction. Frankly speaking I still very skeptical concerning too much flexibility in compression configuration: - toggle compression on the fly - using different compression algorithms in both directions - toggle compression on the fly According to Daniil's results there is only 30% differences in compression ration between zstd:1 and zstd:19. But making it possible to specify arbitrary compression level we give user of a simple tool to attack server (cause CPU/memory exhaustion). > >> +zstd_create(int level, zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg, char* rx_data, size_t rx_data_size) >> +zlib_create(int level, zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg, char* rx_data, size_t rx_data_size) >> +build_compressors_list(PGconn *conn, char** client_compressors, bool build_descriptors) > Are you able to run pg_indent to fix all the places where "*" is before the > space ? (And similar style issues). Also done by Daniil. > > There are several compression patches in the commitfest, I'm not sure how much > they need to be coordinated, but for sure we should coordinate the list of > compressions available at compile time. > > Maybe there should be a central structure for this, or maybe just the ID > numbers of compressors should be a common enum/define. In my patch, I have: > > +struct compressLibs { > + const CompressionAlgorithm alg; > + const char *name; /* Name in -Z alg= */ > + const char *suffix; /* file extension */ > + const int defaultlevel; /* Default compression level */ > +}; > > Maybe we'd also want to store the "magic number" of each compression library. > Maybe there'd also be a common parsing of compression options. > > You're supporting a syntax like zlib,zstd:5, but zstd also supports long-range, > checksum, and rsyncable modes (rsyncable is relevant to pg_dump, but not to > libpq). There are at least three places in Postgres where compression is used right : 1. TOAST and extened attributes compression. 2. pg_basebackup compression 3. pg_dump compression And there are also patches for 4. page compression 5. protocol compression It is awful that all this five places are using compression in their own way. It seems to me that compression (as well as all other system dependent stuff as socket IO, file IO, synchronization primitives,...) should be extracted to SAL (system-abstract layer) where then can be used both by backend, frontend and utilities. Including external utilities, like pg_probackup, pg_bouncer, Odyssey, ... Unfortunately such refactoring requires so much efforts, that it can have any chance to be committed if this work will be coordinated by one of the core committers. > I think your patch has an issue here. 
You have this: > > src/interfaces/libpq/fe-connect.c > > + pqGetc(&resp, conn); > + index = resp; > + if (index == (char)-1) > + { > + appendPQExpBuffer(&conn->errorMessage, > + libpq_gettext( > + "server is not supported requested compression algorithms%s\n"), > + conn->compression); > + goto error_return; > + } > + Assert(!conn->zstream); > + conn->zstream = zpq_create(conn->compressors[index].impl, > + conn->compressors[index].level, > + (zpq_tx_func)pqsecure_write, (zpq_rx_func)pqsecure_read,conn, > + &conn->inBuffer[conn->inCursor], conn->inEnd-conn->inCursor); > > This takes the "index" returned by the server and then accesses > conn->compressors[index] without first checking if the index is out of range, > so a malicious server could (at least) crash the client by returning index=666. Thank you for pointed this problem. I will add the check, although I think that problem of malicious server is less critical than malicious client. Also such "byzantine" server may return wrong data, which in many cases is more fatal than crash of a client. > I suggest that there should be an enum of algorithms, which is constant across > all servers. They would be unconditionally included and not #ifdef depending > on compilation options. I do not think that it is possible (even right now, it is possible to build Postgres without zlib support). Also if new compression algorithms are added, then in any case we have to somehow handle situation when old client is connected to new server and visa versa. > > That would affect the ZpqAlgorithm data structure, which would include an ID > number similar to > src/bin/pg_dump/compress_io.h:typedef enum...CompressionAlgorithm; > > The CompressionAck would send the ID rather than the "index". > A protocol analyzer like wireshark could show "Compression: Zstd". > You'd have to verify that the ID is supported (and not bogus). > > Right now, when I try to connect to an unpatched server, I get: > psql: error: expected authentication request from server, but received v Thank you for pointing it: fixed. > +/* > + * Array with all supported compression algorithms. > + */ > +static ZpqAlgorithm const zpq_algorithms[] = > +{ > +#if HAVE_LIBZSTD > + {zstd_name, zstd_create, zstd_read, zstd_write, zstd_free, zstd_error, zstd_buffered_tx, zstd_buffered_rx}, > +#endif > +#if HAVE_LIBZ > + {zlib_name, zlib_create, zlib_read, zlib_write, zlib_free, zlib_error, zlib_buffered_tx, zlib_buffered_rx}, > +#endif > + {no_compression_name} > +}; > > In config.sgml, it says that libpq_compression defaults to on (on the server > side), but in libpq.sgml it says that it defaults to off (on the client side). > Is that what's intended ? I would've thought the defaults would match, or that > the server would enforce a default more conservative than the client's (the DBA > should probably have to explicitly enable compression, and need to "opt-in" > rather than "opt-out"). Yes, it is intended behavior: libpq_compression GUC allows to prohibit compression requests fro clients if due to some reasons (security, CPU consumption is not desired). But by default server should support compression if it is requested by client. But client should not request compression by default: it makes sense only for queries returning large result sets or transferring a lot of data (liek COPY). > > Maybe instead of a boolean, this should be a list of permitted compression > algorithms. This allows the admin to set a "policy" rather than using the > server's hard-coded preferences. 
This could be important to disable an > algorithm at run-time if there's a vulnerability found, or performance problem, > or buggy client, or for diagnostics, or performance testing. Actually, I think > it may be important to allow the admin to disable not just an algorithm, but > also its options. It should be possible for the server to allow "zstd" > compression but not "zstd --long", or zstd:99 or arbitrarily large window > sizes. This seems similar to SSL cipher strings. I think we'd want to be able > to allow (or prohibit) at least alg:* (with options) or alg (with default > options). Sorry, may be you are looking not at the latest version of the patch? Right now "compression" parameter accepts not only boolean values but also list of suggested algorithms with optional compression level, like "zstd:1,zlib" > Your patch documents "libpq_compression=auto" but that doesn't work: > WARNING: none of specified algirthms auto is supported by client > I guess you mean "any" which is what's implemented. > I suggest to just remove that. It is some inconsistency with documentation. It seems to me that I have already fixed it. "auto" was renamed to "any", > I think maybe your patch should include a way to trivially change the client's > compile-time default: > "if (!conn->compression) conn->compression = DefaultCompression" It can be done using PG_COMPRESSION environment variable. Do we need some other mechanism for it? > > Your patch warns if *none* of the specified algorithms are supported, but I > wonder if it should warn if *any* of them are unsupported (like if someone > writes libz instead of zlib, or zst/libzstd instead of zstd, which I guess > about half of us will do). Ugh... Handling mistyping in connection string seems to be not so good idea (from my point of view). And situation when some of algorithms is not supported by server seems to normal (if new client connects to old server). So I do not want to produce warning in this case. I once again want to repeat my opinion: choosing of compression algorithms should be done automatically. Client should just make an intention to use compression (using compression=on or compression=any) and server should choose most efficient algorithm which is supported by both of them. > $ LD_LIBRARY_PATH=./src/interfaces/libpq PGCOMPRESSION=abc,def src/bin/psql/psql 'host=localhost port=1111' > WARNING: none of the specified algorithms are supported by client: abc,def > $ LD_LIBRARY_PATH=./src/interfaces/libpq PGCOMPRESSION=abc,zlib src/bin/psql/psql 'host=localhost port=1111' > (no warning) > > The libpq_compression GUC can be set for a user, like ALTER ROLE .. SET ... > Is there any utility in making it configurable by client address? Or by > encryption? I'm thinking of a policy like "do not allow compression from LOCAL > connection" or "do not allow compression on encrypted connection". Maybe this > would be somehow integrated into pg_hba. But maybe it's not needed (a separate > user could be used to allow/disallow compression). There is definitely no sense to use compression together with encryption: if compression is desired in this case, it cane be done at SSL level. But I do not think that we need more sophisticated mechanism to prohibit compression requests at server level. Yes, it is possible to grant privileges to use compression to some particular role. Or to prohibit it for SSL connections. But I can't imagine some natural arguments for it. May be I am wrong. 
I will be pleased if somebody can describe some realistic scenarios where any of these features may be needed (taking into account that compression of libpq traffic is not something very bad, insecure or expensive that can harm the server). Misuse of libpq compression may just consume some extra CPU (but not so much as to overload the server).

In any case: we do not have such a mechanism to restrict the use of SSL connections to some particular roles or to disable it for local connections. Why do we need it for compression, which is very similar to encryption?

New version of libpq compression patch is attached.
It can also be found at git@github.com:postgrespro/libpq_compression.git

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
On 1/11/21 2:53 PM, Konstantin Knizhnik wrote:
>
> ...
>
> New version of libpq compression patch is attached.
> It can be also be found at git@github.com:postgrespro/libpq_compression.git
>

Seems it bit-rotted already, so here's a slightly fixed version.

1) Fixes the MSVC makefile. The list of files is sorted alphabetically,
so I've added the file at the end.

2) Fixes duplicate OID. It's a good practice to assign OIDs from the end
of the range, to prevent collisions during development.

Other than that, I wonder what's the easiest way to run all tests with
compression enabled. ISTM it'd be nice to add a pg_regress option forcing
a particular compression algorithm to be used, or something similar. I'd
like a convenient way to pass this through valgrind, for example. Or
how do we get this tested on a buildfarm?

I'm not convinced it's very user-friendly to not have a psql option
enabling compression. It's true it can be enabled in a connection
string, but I doubt many people will notice that.

The sgml docs need a bit more love / formatting. The lines in libpq.sgml
are far too long, and there are no tags whatsoever. Presumably zlib/zstd
should be marked as <literal>, and so on.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
On Mon, Jan 11, 2021 at 04:53:51PM +0300, Konstantin Knizhnik wrote: > On 09.01.2021 23:31, Justin Pryzby wrote: > > I suggest that there should be an enum of algorithms, which is constant across > > all servers. They would be unconditionally included and not #ifdef depending > > on compilation options. > > I do not think that it is possible (even right now, it is possible to build > Postgres without zlib support). > Also if new compression algorithms are added, then in any case we have to > somehow handle situation when > old client is connected to new server and visa versa. I mean an enum of all compression supported in the master branch, starting with ZLIB = 1. I think this applies to libpq compression (which requires client and server to handle a common compression algorithm), and pg_dump (output of which needs to be read by pg_restore), but maybe not TOAST, the patch for which supports extensions, with dynamically allocated OIDs. > > In config.sgml, it says that libpq_compression defaults to on (on the server > > side), but in libpq.sgml it says that it defaults to off (on the client side). > > Is that what's intended ? I would've thought the defaults would match, or that > > the server would enforce a default more conservative than the client's (the DBA > > should probably have to explicitly enable compression, and need to "opt-in" > > rather than "opt-out"). > > Yes, it is intended behavior: libpq_compression GUC allows to prohibit > compression requests fro clients if due to some reasons (security, CPU > consumption is not desired). > But by default server should support compression if it is requested by > client. It's not clear to me if that's true.. It may be what's convenient for you, especially during development, but that doesn't mean it's safe or efficient or what's generally desirable to everyone. > But client should not request compression by default: it makes sense only for > queries returning large result sets or transferring a lot of data (liek > COPY). I think you're making assumptions about everyone's use of the tools, and it's better if the DBA makes that determination. The clients aren't generally under the admin's control, and if they don't request compression, then it's not used. If they request compression, then the DBA still has control over whether it's allowed. We agree that it should be disabled by default, but I suggest that it's most flexible if client's makes the request and allow the server to decide. By default the server should deny/ignore the request, with the client gracefully falling back to no compression. Compression would have little effect on most queries, especially at default level=1. > > Maybe instead of a boolean, this should be a list of permitted compression > > algorithms. This allows the admin to set a "policy" rather than using the > > server's hard-coded preferences. This could be important to disable an > > algorithm at run-time if there's a vulnerability found, or performance problem, > > or buggy client, or for diagnostics, or performance testing. Actually, I think > > it may be important to allow the admin to disable not just an algorithm, but > > also its options. It should be possible for the server to allow "zstd" > > compression but not "zstd --long", or zstd:99 or arbitrarily large window > > sizes. This seems similar to SSL cipher strings. I think we'd want to be able > > to allow (or prohibit) at least alg:* (with options) or alg (with default > > options). 
> > Sorry, may be you are looking not at the latest version of the patch? > Right now "compression" parameter accepts not only boolean values but also > list of suggested algorithms with optional compression level, like > "zstd:1,zlib" You're talking about the client compression param. I'm suggesting that the server should allow fine-grained control of what compression algs are permitted at *runtime*. This would allow distributions to compile with all compression libraries enabled at configure time, and still allow an DBA to disable one without recompiling. > > Your patch documents "libpq_compression=auto" but that doesn't work: > > WARNING: none of specified algirthms auto is supported by client > > I guess you mean "any" which is what's implemented. > > I suggest to just remove that. > It is some inconsistency with documentation. > It seems to me that I have already fixed it. "auto" was renamed to "any", > > I think maybe your patch should include a way to trivially change the client's > > compile-time default: > > "if (!conn->compression) conn->compression = DefaultCompression" > > It can be done using PG_COMPRESSION environment variable. > Do we need some other mechanism for it? That's possible with environment variable or connection string. I'm proposing to simplify change of its default when recompiling, just like this one: src/interfaces/libpq/fe-connect.c:#define DefaultHost "localhost" Think this would be a 2 line change, and makes the default less hardcoded. > > Your patch warns if *none* of the specified algorithms are supported, but I > > wonder if it should warn if *any* of them are unsupported (like if someone > > writes libz instead of zlib, or zst/libzstd instead of zstd, which I guess > > about half of us will do). > Ugh... Handling mistyping in connection string seems to be not so good idea > (from my point of view). > And situation when some of algorithms is not supported by server seems to > normal (if new client connects to old server). > So I do not want to produce warning in this case. I'm not talking about warning if an alg is "not supported by server", but rather if it's "not supported by client". I think there should be a warning if a connection string specifies two libraries, but one is misspelled. > I once again want to repeat my opinion: choosing of compression algorithms > should be done automatically. > Client should just make an intention to use compression (using > compression=on or compression=any) > and server should choose most efficient algorithm which is supported by both > of them. I think it's good if it *can* be automatic. But it's also good if the DBA can implement a simple "policy" about what's (not) supported. If the user requests a compression and it's not known on the *client* side I think it should warn. 
BTW I think the compression should be shown in psql \conninfo > > src/interfaces/libpq/fe-connect.c > > > > + pqGetc(&resp, conn); > > + index = resp; > > + if (index == (char)-1) > > + { > > + appendPQExpBuffer(&conn->errorMessage, > > + libpq_gettext( > > + "server is not supported requested compression algorithms%s\n"), > > + conn->compression); > > + goto error_return; > > + } > > + Assert(!conn->zstream); > > + conn->zstream = zpq_create(conn->compressors[index].impl, > > + conn->compressors[index].level, > > + (zpq_tx_func)pqsecure_write, (zpq_rx_func)pqsecure_read,conn, > > + &conn->inBuffer[conn->inCursor], conn->inEnd-conn->inCursor); > > > > This takes the "index" returned by the server and then accesses > > conn->compressors[index] without first checking if the index is out of range, > > so a malicious server could (at least) crash the client by returning index=666. > > Thank you for pointed this problem. I will add the check, although I think > that problem of malicious server is less critical than malicious client. Also > such "byzantine" server may return wrong data, which in many cases is more > fatal than crash of a client. + if ((unsigned)index >= conn->n_compressors) + { + appendPQExpBuffer(&conn->errorMessage, + libpq_gettext( + "server returns incorrect compression aslogirhm index:%d\n"), Now, you're checking that the index is not too large, but not that it's not too small. > > Right now, when I try to connect to an unpatched server, I get: > > psql: error: expected authentication request from server, but received v > > Thank you for pointing it: fixed. I tested that I can connect to unpatced server, thanks. I have to confess that I don't know how this works, or what it has to do with what the commit message claims it does ? Are you saying I can use zlib in one direction and zstd in the other ? How would I do that ? > Author: Konstantin Knizhnik <knizhnik@garret.ru> > Date: Mon Jan 11 16:41:52 2021 +0300 > Making it possible to specify different compresion algorithms in both directions + else if (conn->n_compressors != 0 && beresp == 'v') /* negotiate protocol version*/ + { + appendPQExpBuffer(&conn->errorMessage, + libpq_gettext( + "server is not supporting libpqcompression\n")); + goto error_return; -- Justin
On 11.01.2021 20:38, Tomas Vondra wrote: > On 1/11/21 2:53 PM, Konstantin Knizhnik wrote: >> ... >> >> New version of libpq compression patch is attached. >> It can be also be found at git@github.com:postgrespro/libpq_compression.git >> > Seems it bit-rotted already, so here's a slightly fixed version. > > 1) Fixes the MSVC makefile. The list of files is sorted alphabetically, > so I've added the file at the end. > > 2) Fixes duplicate OID. It's a good practice to assign OIDs from the end > of the range, to prevent collisions during development. Thank you > > Other than that, I wonder what's the easiest way to run all tests with > compression enabled. ISTM it'd be nice to add pg_regress option forcing > a particular compression algorithm to be used, or something similar. I'd > like a convenient way to pass this through a valgrind, for example. Or > how do we get this tested on a buildfarm? I run regression tests with PG_COMPRESSION environment variable set to "true". Do we need some other way (like pg_regress options to run tests with compression enabled? > I'm not convinced it's very user-friendly to not have a psql option > enabling compression. It's true it can be enabled in a connection > string, but I doubt many people will notice that. > > The sgml docs need a bit more love / formatting. The lines in libpq.sgml > are far too long, and there are no tags whatsoever. Presumably zlib/zstd > should be marked as <literal>, and so on. > > > regards > Thank you, I will fix it.
On 12.01.2021 4:20, Justin Pryzby wrote:
> On Mon, Jan 11, 2021 at 04:53:51PM +0300, Konstantin Knizhnik wrote:
>> On 09.01.2021 23:31, Justin Pryzby wrote:
>>> I suggest that there should be an enum of algorithms, which is constant across
>>> all servers. They would be unconditionally included and not #ifdef depending
>>> on compilation options.
>> I do not think that it is possible (even right now, it is possible to build
>> Postgres without zlib support).
>> Also if new compression algorithms are added, then in any case we have to
>> somehow handle situation when old client is connected to new server and visa versa.
> I mean an enum of all compression supported in the master branch, starting with
> ZLIB = 1. I think this applies to libpq compression (which requires client and
> server to handle a common compression algorithm), and pg_dump (output of which
> needs to be read by pg_restore), but maybe not TOAST, the patch for which
> supports extensions, with dynamically allocated OIDs.
Sorry, I do not understand the goal of introducing this enum.
Algorithms are in any case specified by name, and internally there is no need for such an enum,
at least in libpq compression.
>>> In config.sgml, it says that libpq_compression defaults to on (on the server
>>> side), but in libpq.sgml it says that it defaults to off (on the client side).
>>> Is that what's intended ? I would've thought the defaults would match, or that
>>> the server would enforce a default more conservative than the client's (the DBA
>>> should probably have to explicitly enable compression, and need to "opt-in"
>>> rather than "opt-out").
>> Yes, it is intended behavior: libpq_compression GUC allows to prohibit
>> compression requests fro clients if due to some reasons (security, CPU
>> consumption is not desired).
>> But by default server should support compression if it is requested by client.
> It's not clear to me if that's true.. It may be what's convenient for you,
> especially during development, but that doesn't mean it's safe or efficient or
> what's generally desirable to everyone.
Certainly this is only my point of view; DBAs may have different opinions.
I do not think that compression is a dangerous or resource-consuming feature that has to be disabled by default.
At least there is no GUC prohibiting SSL connections, and they are even more expensive.
In any case, I will be glad to collect votes on whether compression requests should be rejected by the server by default or not.
>> But client should not request compression by default: it makes sense only for
>> queries returning large result sets or transferring a lot of data (liek COPY).
> I think you're making assumptions about everyone's use of the tools, and it's
> better if the DBA makes that determination. The clients aren't generally under
> the admin's control, and if they don't request compression, then it's not used.
> If they request compression, then the DBA still has control over whether it's
> allowed.
> We agree that it should be disabled by default, but I suggest that it's most
> flexible if client's makes the request and allow the server to decide. By
> default the server should deny/ignore the request, with the client gracefully
> falling back to no compression.
I think that compression will be most efficient for internal connections (i.e. replication, bulk data loading, ...),
which are mostly controlled by the DBA.
> Compression would have little effect on most queries, especially at default level=1.
I mostly agree with you here, except for the claim that compression level 1 gives little effect.
All my experiments, both with page-level compression and libpq compression, show that the default compression level
provides the optimal balance between compression ratio and compression speed.
In Daniil's benchmark, the difference in compression ratio between zstd levels 1 and 19 is only 30%.
>> Right now "compression" parameter accepts not only boolean values but also
>> list of suggested algorithms with optional compression level, like "zstd:1,zlib"
> You're talking about the client compression param. I'm suggesting that the
> server should allow fine-grained control of what compression algs are permitted
> at *runtime*. This would allow distributions to compile with all compression
> libraries enabled at configure time, and still allow an DBA to disable one
> without recompiling.
Frankly speaking I do not see much sense in it.
My point of view is the following: there are several well known and widely used compression algorithms now:
zlib, zstd, lz4. There are a few others, but the results of my various benchmarks (mostly for page-level compression)
show that zstd provides the best compression ratio/speed balance and lz4 the best speed.
zlib is available almost everywhere and postgres by default is configured with zlib.
So it can be considered as default compression algorithm available everywhere.
If Postgres was built with zstd or lz4 (not currently supported for libpq compression) then it should be used instead of zlib
because it is faster and provides better compression quality.
So I do not see strong arguments for letting the DBA enable or disable particular compression algorithms, or for letting the client suggest them.
> That's possible with environment variable or connection string. I'm proposing
> to simplify change of its default when recompiling, just like this one:
> src/interfaces/libpq/fe-connect.c:#define DefaultHost "localhost"
> Think this would be a 2 line change, and makes the default less hardcoded.
Sorry, my intention was the following: there is a list of supported algorithms in order of decreasing efficiency
(yes, I realize that in the general case it may not always be possible to define a partial order on the set of supported compression algorithms,
but right now with zstd and zlib it is trivial).
The client sends the server its list of supported algorithms (or the ones explicitly specified by the user), and the server chooses the most efficient one supported by both sides, as sketched below.
So there is no need to specify a default in this scheme. We can change the order of algorithms in the array to affect the choice of default algorithm.
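[Editor's note] A minimal sketch of that negotiation, with hypothetical names rather than the patch code: the server walks its own preference-ordered list and picks the first algorithm the client also offered.

    #include <string.h>

    /* Server-side list, ordered by the server's own preference. */
    static const char *const server_algorithms[] = {"zstd", "zlib", NULL};

    /*
     * Return the most-preferred server algorithm that the client also offered,
     * or NULL if there is no common algorithm.
     */
    static const char *
    choose_compression(const char *const *client_algorithms)
    {
        for (int i = 0; server_algorithms[i] != NULL; i++)
            for (int j = 0; client_algorithms[j] != NULL; j++)
                if (strcmp(server_algorithms[i], client_algorithms[j]) == 0)
                    return server_algorithms[i];
        return NULL;
    }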
>>> Your patch warns if *none* of the specified algorithms are supported, but I
>>> wonder if it should warn if *any* of them are unsupported (like if someone
>>> writes libz instead of zlib, or zst/libzstd instead of zstd, which I guess
>>> about half of us will do).
>> Ugh... Handling mistyping in connection string seems to be not so good idea
>> (from my point of view). And situation when some of algorithms is not
>> supported by server seems to normal (if new client connects to old server).
>> So I do not want to produce warning in this case.
> I'm not talking about warning if an alg is "not supported by server", but
> rather if it's "not supported by client". I think there should be a warning if
> a connection string specifies two libraries, but one is misspelled.
Makes sense. I added this check.
> I think it's good if it *can* be automatic. But it's also good if the DBA can
> implement a simple "policy" about what's (not) supported.
> If the user requests a compression and it's not known on the *client* side I
> think it should warn.
> BTW I think the compression should be shown in psql \conninfo
Good point: done.
> + if ((unsigned)index >= conn->n_compressors)
> + {
> + appendPQExpBuffer(&conn->errorMessage,
> + libpq_gettext(
> + "server returns incorrect compression aslogirhm index: %d\n"),
>
> Now, you're checking that the index is not too large, but not that it's not
> too small.
Sorry, it was my habit of writing the most "optimal" code, which is completely irrelevant in this place.
I deliberately used an unsigned comparison to perform the check with just one comparison instead of two.
I have added comment here.
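[Editor's note] For readers unfamiliar with the trick, a minimal standalone sketch of that single-comparison range check (names are illustrative, not from the patch):

    #include <stdbool.h>

    /*
     * One comparison covers both bounds: casting a signed index to unsigned
     * turns any negative value into a very large one, which then fails the
     * "< n_compressors" test just like an index that is too large.
     */
    static bool
    compression_index_is_valid(int index, unsigned int n_compressors)
    {
        return (unsigned int) index < n_compressors;
    }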
>>> Right now, when I try to connect to an unpatched server, I get:
>>> psql: error: expected authentication request from server, but received v
>> Thank you for pointing it: fixed.
> I tested that I can connect to unpatced server, thanks.
> I have to confess that I don't know how this works, or what it has to do with
> what the commit message claims it does ? Are you saying I can use zlib in one
> direction and zstd in the other ? How would I do that ?
Sorry, this feature was suggested by Andres and partly implemented by Daniil.
I have always thought that it is a useless and overkill feature.
New version of the patch with the suggested changes is attached.
Attachment
On Tue, Jan 12, 2021 at 08:44:43AM +0300, Konstantin Knizhnik wrote:
> On 11.01.2021 20:38, Tomas Vondra wrote:
>> 1) Fixes the MSVC makefile. The list of files is sorted alphabetically,
>> so I've added the file at the end.
> Thank you

This is still failing the windows build.

I think you need something like this, which I have in my zstd/pg_dump patch.

--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -307,6 +307,7 @@ sub GenerateFiles
 HAVE_LIBXML2 => undef,
 HAVE_LIBXSLT => undef,
 HAVE_LIBZ => $self->{options}->{zlib} ? 1 : undef,
+HAVE_LIBZSTD => $self->{options}->{zstd} ? 1 : undef,

I think we should come up with a minimal, preliminary 0001 patch which is
common between the 3 compression patches (or at least the two using zstd). The
./configure changes and a compressionlibs struct would also be included. I'm
planning to do something like this with the next revision of my patchset.

--
Justin
On 12.01.2021 18:38, Justin Pryzby wrote: > On Tue, Jan 12, 2021 at 08:44:43AM +0300, Konstantin Knizhnik wrote: >> On 11.01.2021 20:38, Tomas Vondra wrote: >>> 1) Fixes the MSVC makefile. The list of files is sorted alphabetically, >>> so I've added the file at the end. >> Thank you > This is still failing the windows build. > > I think you need something like this, which I have in my zstd/pg_dump patch. > > --- a/src/tools/msvc/Solution.pm > +++ b/src/tools/msvc/Solution.pm > @@ -307,6 +307,7 @@ sub GenerateFiles > HAVE_LIBXML2 => undef, > HAVE_LIBXSLT => undef, > HAVE_LIBZ => $self->{options}->{zlib} ? 1 : undef, > + HAVE_LIBZSTD => $self->{options}->{zstd} ? 1 : undef, > Thank you. > I think we should come up with an minimal, prelimininary 0001 patch which is > common between the 3 compression patches (or at least the two using zstd). The > ./configure changes and a compressionlibs struct would also be included. I'm > planning to do something like this with the next revision of my patchset. > It will be great to have such common preliminary patch including zstd support. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
> On 12 Jan 2021, at 20:47, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
>> I think we should come up with a minimal, preliminary 0001 patch which is
>> common between the 3 compression patches (or at least the two using zstd). The
>> ./configure changes and a compressionlibs struct would also be included. I'm
>> planning to do something like this with the next revision of my patchset.
>>
> It will be great to have such common preliminary patch including zstd support.

+1. I'd also rebase my WAL FPI patch on top of common code.

Best regards, Andrey Borodin.
Hi everyone,

I’ve been making some experiments with an on-the-fly compression switch lately and have some updates.

> On Dec 22, 2020, at 10:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Hmm, I assumed that if the compression buffers were flushed on the
> sending side, and if all the data produced on the sending side were
> transmitted to the receiver, the receiving side would then return
> everything up to the point of the flush. However, now that I think
> about it, there's no guarantee that any particular compression library
> would actually behave that way. I wonder what actually happens in
> practice with the libraries we care about?
> I'm not sure about the details, but the general idea seems like it
> might be worth considering. If we choose a compression method that is
> intended for streaming compression and decompression and whose library
> handles compression flushes sensibly, then we might not really need to
> go this way to make it work. But, on the other hand, this method has a
> certain elegance that just compressing everything lacks, and might
> allow some useful flexibility. On the third hand, restarting
> compression for every new set of messages might really hurt the
> compression ratio in some scenarios. I'm not sure what is best.

Earlier in the thread, we’ve discussed introducing a new message type (CompressedMessage), so I came up with the two main approaches to send compressed data:

1. Sending the compressed message type without the message length, followed by continuous compressed data.
2. Sending the compressed data packed into messages with specified length (pretty much like CopyData).

The first approach allows sending raw compressed data without any additional framing, but has some downsides:

- to determine the end of compressed data, it is required to decompress the entire compressed data
- in most cases (at least in ZSTD and ZLIB), it is required to end the compression stream so the decompressor can determine the end of compressed data on the receiving side. After that, it is required to init a new compression context (for example, in case of ZSTD, start a new frame) which may lead to a worse compression ratio.
The second approach has some overhead because it requires framing the compressed data into messages with a specified length (chunks), but I see the following advantages:

- CompressedMessage is sent like any other Postgres protocol message and we always know the size of the compressed data from the message header, so it is not required to actually decompress the data to determine the end of the compressed data
- This approach does not require resetting the compression context, so compression can continue even if there are some uncompressed messages between two CompressedMessage messages

So I’ve implemented the second approach with the following message compression criterion: if the message type is CopyData or DataRow, it should be compressed; otherwise, send the message uncompressed.

I’ve compared this approach with permanent compression in the following scenarios:

- pg_restore of IMDB database (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2QYZBT)
- pgbench "host=x dbname=testdb port=5432 user=testuser compression=zstd:1" --builtin tpcb-like -t 400 --jobs=64 --client=600

The detailed report with CPU/memory/network load is available here:
https://docs.google.com/document/d/13qEUpIjh2NNOOW_8NZOFUohRSEIro0R2xdVs-2Ug8Ts

pg_restore of IMDB database test results:

Chunked compression with only CopyData or DataRow compression (second approach):
time:
real    2m27.947s
user    0m45.453s
sys     0m3.113s
RX bytes diff, human: 1.8837M
TX bytes diff, human: 1.2810G

Permanent compression:
time:
real    2m15.810s
user    0m42.822s
sys     0m2.022s
RX bytes diff, human: 2.3274M
TX bytes diff, human: 1.2761G

Without compression:
time:
real    2m38.245s
user    0m18.946s
sys     0m2.443s
RX bytes diff, human: 5.6117M
TX bytes diff, human: 3.8227G

Also, I’ve run pgbench tests and measured the CPU load. Since chunked compression did not compress any messages except for CopyData or DataRow, it demonstrated lower CPU usage compared to permanent compression; the full report with graphs is available in the Google doc above.

Pull request with the second approach implemented:
https://github.com/postgrespro/libpq_compression/pull/7

Also, in this pull request, I’ve made the following changes:

- extracted the general-purpose streaming compression API into a separate structure (ZStream) so it can be used in other places without tx_func and rx_func; maybe the other compression patches can utilize it?
- made some refactoring of ZpqStream
- moved the SSL and ZPQ buffered read data checks into a separate function, pqReadPending

What do you think of the results above? I think that the implemented approach is viable, but maybe I missed something in my tests. Maybe we can choose other compression criteria (for example, compress only messages with length more than X bytes); I am not sure if the current criterion provides the best results.

Thanks,

Daniil Zakhlystov
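[Editor's note] To illustrate the second approach described above, here is a minimal sketch of how an already-compressed chunk could be framed as a length-prefixed protocol message. The 'm' message type byte and the send_raw callback are assumptions for the example, not part of the patch.

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>          /* htonl */

    typedef int (*send_raw_fn) (const void *buf, size_t len, void *arg);

    /*
     * Wrap an already-compressed chunk into a CopyData-style message:
     * one type byte followed by a 4-byte big-endian length field that
     * counts itself plus the payload.
     */
    static int
    send_compressed_message(const void *payload, uint32_t payload_len,
                            send_raw_fn send_raw, void *arg)
    {
        char        header[5];
        uint32_t    netlen = htonl(payload_len + 4);

        header[0] = 'm';            /* hypothetical CompressedMessage type byte */
        memcpy(header + 1, &netlen, 4);

        if (send_raw(header, sizeof(header), arg) < 0)
            return -1;
        return send_raw(payload, payload_len, arg);
    }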
On 08.02.2021 22:23, Daniil Zakhlystov wrote: > Hi everyone, > > I’ve been making some experiments with an on-the-fly compression switch lately and have some updates. > > ... > pg_restore of IMDB database test results > > Chunked compression with only CopyData or DataRow compression (second approach): > time: > real 2m27.947s > user 0m45.453s > sys 0m3.113s > RX bytes diff, human: 1.8837M > TX bytes diff, human: 1.2810G > > Permanent compression: > time: > real 2m15.810s > user 0m42.822s > sys 0m2.022s > RX bytes diff, human: 2.3274M > TX bytes diff, human: 1.2761G > > Without compression: > time: > real 2m38.245s > user 0m18.946s > sys 0m2.443s > RX bytes diff, human: 5.6117M > TX bytes diff, human: 3.8227G > > > Also, I’ve run pgbench tests and measured the CPU load. Since chunked compression did not compress any messages > except for CopyData or DataRow, it demonstrated lower CPU usage compared to the permanent compression, full report withgraphs > is available in the Google doc above. > > Pull request with the second approach implemented: > https://github.com/postgrespro/libpq_compression/pull/7 > > Also, in this pull request, I’ve made the following changes: > - extracted the general-purpose streaming compression API into the separate structure (ZStream) so it can be used in otherplaces without tx_func and rx_func, > maybe the other compression patches can utilize it? > - made some refactoring of ZpqStream > - moved the SSL and ZPQ buffered read data checks into separate function pqReadPending > > What do you think of the results above? I think that the implemented approach is viable, but maybe I missed something inmy tests. Sorry, but my interpretation of your results is completely different: permanent compression is faster than chunked compression (2m15 vs. 2m27) and consumes less CPU (44 vs 48 sec). Size of RX data is slightly larger - 0.5Mb but TX size is smaller - 5Mb. So permanent compression is better from all points of view: it is faster, consumes less CPU and reduces network traffic! From my point of view your results just prove my original opinion that possibility to control compression on the fly and use different compression algorithm for TX/RX data just complicates implementation and given no significant advantages. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Hi!

> On 09.02.2021 09:06, Konstantin Knizhnik wrote:
>
> Sorry, but my interpretation of your results is completely different:
> permanent compression is faster than chunked compression (2m15 vs. 2m27)
> and consumes less CPU (44 vs 48 sec).
> Size of RX data is slightly larger - 0.5Mb but TX size is smaller - 5Mb.
> So permanent compression is better from all points of view: it is
> faster, consumes less CPU and reduces network traffic!
>
> From my point of view your results just prove my original opinion that
> possibility to control compression on the fly and use different
> compression algorithm for TX/RX data
> just complicates implementation and given no significant advantages.

When I mentioned the lower CPU usage, I was referring to the pgbench test results in the attached
google doc, where chunked compression demonstrated lower CPU usage compared to permanent compression.

I made another (a little bit larger) pgbench test to demonstrate this:

Pgbench test parameters:

Data load
pgbench -i -s 100

Run configuration
pgbench --builtin tpcb-like -t 1500 --jobs=64 --client=600

Pgbench test results:

No compression
latency average = 247.793 ms
tps = 2421.380067 (including connections establishing)
tps = 2421.660942 (excluding connections establishing)

real    6m11.818s
user    1m0.620s
sys     2m41.087s
RX bytes diff, human: 703.9221M
TX bytes diff, human: 772.2580M

Chunked compression (compress only CopyData and DataRow messages)
latency average = 249.123 ms
tps = 2408.451724 (including connections establishing)
tps = 2408.719578 (excluding connections establishing)

real    6m13.819s
user    1m18.800s
sys     2m39.941s
RX bytes diff, human: 707.3872M
TX bytes diff, human: 772.1594M

Permanent compression
latency average = 250.312 ms
tps = 2397.005945 (including connections establishing)
tps = 2397.279338 (excluding connections establishing)

real    6m15.657s
user    1m54.281s
sys     2m37.187s
RX bytes diff, human: 610.6932M
TX bytes diff, human: 513.2225M

As you can see in the above results, user CPU time (1m18.800s vs 1m54.281s) is significantly smaller with
chunked compression because it doesn’t try to compress all of the packets.

Here is the summary from my POV, according to these and previous test results:

1. Permanent compression always brings the highest compression ratio
2. Permanent compression might not be worthwhile for loads different from COPY data / Replication / BLOBs / JSON queries
3. Chunked compression allows compressing only well-compressible messages and saves CPU cycles by not compressing the others
4. Chunked compression introduces some traffic overhead compared to the permanent one (1.2810G vs 1.2761G TX data on pg_restore of the IMDB database dump, according to the results in my previous message)
5. From the protocol point of view, chunked compression seems a little bit more flexible:
 - we can inject some uncompressed messages at any time without the need to decompress/compress the compressed data
 - we can potentially switch the compression algorithm at any time (but I think that this might be over-engineering)

Given the summary above, I think it’s time to make a decision on which path we should take and make the final list of goals that need to be reached in this patch to make it committable.

Thanks,

Daniil Zakhlystov
On 11.02.2021 16:09, Daniil Zakhlystov wrote: > Hi! > >> On 09.02.2021 09:06, Konstantin Knizhnik wrote: >> >> Sorry, but my interpretation of your results is completely different: >> permanent compression is faster than chunked compression (2m15 vs. 2m27) >> and consumes less CPU (44 vs 48 sec). >> Size of RX data is slightly larger - 0.5Mb but TX size is smaller - 5Mb. >> So permanent compression is better from all points of view: it is >> faster, consumes less CPU and reduces network traffic! >> >> From my point of view your results just prove my original opinion that >> possibility to control compression on the fly and use different >> compression algorithm for TX/RX data >> just complicates implementation and given no significant advantages. > When I mentioned the lower CPU usage, I was referring to the pgbench test results in attached > google doc, where chunked compression demonstrated lower CPU usage compared to the permanent compression. > > I made another (a little bit larger) pgbench test to demonstrate this: > > Pgbench test parameters: > > Data load > pgbench -i -s 100 > > Run configuration > pgbench --builtin tpcb-like -t 1500 --jobs=64 --client==600" > > Pgbench test results: > > No compression > latency average = 247.793 ms > tps = 2421.380067 (including connections establishing) > tps = 2421.660942 (excluding connections establishing) > > real 6m11.818s > user 1m0.620s > sys 2m41.087s > RX bytes diff, human: 703.9221M > TX bytes diff, human: 772.2580M > > Chunked compression (compress only CopyData and DataRow messages) > latency average = 249.123 ms > tps = 2408.451724 (including connections establishing) > tps = 2408.719578 (excluding connections establishing) > > real 6m13.819s > user 1m18.800s > sys 2m39.941s > RX bytes diff, human: 707.3872M > TX bytes diff, human: 772.1594M > > Permanent compression > latency average = 250.312 ms > tps = 2397.005945 (including connections establishing) > tps = 2397.279338 (excluding connections establishing) > > real 6m15.657s > user 1m54.281s > sys 2m37.187s > RX bytes diff, human: 610.6932M > TX bytes diff, human: 513.2225M > > > As you can see in the above results, user CPU time (1m18.800s vs 1m54.281s) is significantly smaller in > chunked compression because it doesn’t try to compress all of the packets. Well, but permanent compression provides some (not so large) reducing of traffic, while for chunked compression network traffic is almost the same as with no-compression, but it consumes more CPU. Definitely pgbench queries are not the case where compression should be used: both requests and responses are too short to make compression efficient. So in this case compression should not be used at all. From my point of view, "chunked compression" is not a good compromise between no-compression and permanent-compression cases, but it combines drawbacks of two approaches: doesn't reduce traffic but consume more CPU. > > Here is the summary from my POV, according to these and previous tests results: > > 1. Permanent compression always brings the highest compression ratio > 2. Permanent compression might be not worthwhile in load different from COPY data / Replication / BLOBs/JSON queries > 3. Chunked compression allows to compress only well compressible messages and save the CPU cycles by not compressing theothers > 4. Chunked compression introduces some traffic overhead compared to the permanent (1.2810G vs 1.2761G TX data on pg_restoreof IMDB database dump, according to results in my previous message) > 5. 
From the protocol point of view, chunked compression seems a little bit more flexible: > - we can inject some uncompressed messages at any time without the need to decompress/compress the compressed data > - we can potentially switch the compression algorithm at any time (but I think that this might be over-engineering) > > Given the summary above, I think it’s time to make a decision on which path we should take and make the final list of goalsthat need to be reached in this patch to make it committable. > > Thanks, > > Daniil Zakhlystov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi,

I've compared the different libpq compression approaches in the streaming physical replication scenario.

Test setup

Three hosts: the first is used for the pg_restore run, the second is the master, the third is the standby replica.
In each test run, I've run the pg_restore of the IMDB database (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2QYZBT) and measured the received traffic on the standby replica.
Also, I've enlarged the ZPQ_BUFFER_SIZE buffer in all versions because too small a buffer size (8192 bytes) leads to more system calls for socket read/write and poor compression in the chunked-reset scenario.

Scenarios:

chunked
use streaming compression, wrap compressed data into CompressedData messages and preserve the compression context between multiple CompressedData messages.
https://github.com/usernamedt/libpq_compression/tree/chunked-compression

chunked-reset
use streaming compression, wrap compressed data into CompressedData messages and reset the compression context on each CompressedData message.
https://github.com/usernamedt/libpq_compression/tree/chunked-reset

permanent
use streaming compression, send raw compressed stream without any wrapping.
https://github.com/usernamedt/libpq_compression/tree/permanent-w-enlarged-buffer

Tested compression levels: ZSTD level 1, ZSTD level 5, ZSTD level 9

Scenario      | Replica rx, mean, MB
uncompressed  | 6683.6

ZSTD, level 1
Scenario      | Replica rx, mean, MB
chunked-reset | 2726
chunked       | 2694
permanent     | 2694.3

ZSTD, level 5
Scenario      | Replica rx, mean, MB
chunked-reset | 2234.3
chunked       | 2123
permanent     | 2115.3

ZSTD, level 9
Scenario      | Replica rx, mean, MB
chunked-reset | 2153.6
chunked       | 1943
permanent     | 1941.6

Full report with additional data and resource usage graphs is available here:
https://docs.google.com/document/d/1a5bj0jhtFMWRKQqwu9ag1PgDF5fLo7Ayrw3Uh53VEbs

Based on these results, I suggest sticking with the chunked compression approach, which introduces more flexibility and has almost no overhead compared to permanent compression. Also, later we may introduce a setting to control whether the compression context should be reset on each message, without breaking backward compatibility.

--
Daniil Zakhlystov

The new status of this patch is: Ready for Committer
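[Editor's note] For readers unfamiliar with the zstd streaming API, here is a minimal sketch of what distinguishes the "chunked" and "chunked-reset" scenarios benchmarked above: whether the ZSTD_CCtx keeps its history across chunks or is reset before each one. This assumes libzstd 1.4 or later and is not code from any of the branches.

    #include <stdbool.h>
    #include <zstd.h>

    /*
     * Compress one chunk.  With reset_per_chunk = false the ZSTD_CCtx keeps
     * its history across calls ("chunked"), which compresses better; with
     * true, the session is reset and a fresh frame is started for every
     * chunk ("chunked-reset").  Assumes dst is large enough, e.g.
     * ZSTD_compressBound(src_size).  Returns a zstd error code on failure
     * (check with ZSTD_isError), otherwise the number of bytes written.
     */
    static size_t
    compress_chunk(ZSTD_CCtx *cctx, void *dst, size_t dst_size,
                   const void *src, size_t src_size, bool reset_per_chunk)
    {
        ZSTD_outBuffer out = {dst, dst_size, 0};
        ZSTD_inBuffer in = {src, src_size, 0};
        size_t      rc;

        if (reset_per_chunk)
            ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);

        rc = ZSTD_compressStream2(cctx, &out, &in,
                                  reset_per_chunk ? ZSTD_e_end : ZSTD_e_flush);
        if (ZSTD_isError(rc))
            return rc;
        return out.pos;
    }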
On Thu, Mar 18, 2021 at 07:30:09PM +0000, Daniil Zakhlystov wrote: > The new status of this patch is: Ready for Committer The CF bot is failing , because the last patch sent to the list is from January: | Latest attachment (libpq-compression-31.patch) at 2021-01-12 14:05:22 from Konstantin Knizhnik ... The most recent messages have all had links to github repos without patches attached. Also, the branches I've looked at on the github repos all have messy history, and need to be squished into a coherent commit or set of commits. Would you send a patch to the list with the commits as you propose to merge them ? Also, I'm not sure, but I think you may find that the zstd configure.ac should use pkgconfig. This allowed the CIs to compile these patches. Without pkg-config, the macos CI couldn't find (at least) LZ4.k https://commitfest.postgresql.org/32/2813/ https://commitfest.postgresql.org/32/3015/ Also, in those patches, we have separate "not for merge" patches which enable the compression library by default. This allows the CIs to exercise the feature. -- Justin
On Thu, Mar 18, 2021 at 08:02:32PM -0500, Justin Pryzby wrote: > On Thu, Mar 18, 2021 at 07:30:09PM +0000, Daniil Zakhlystov wrote: > > The new status of this patch is: Ready for Committer > > The CF bot is failing , because the last patch sent to the list is from January: > | Latest attachment (libpq-compression-31.patch) at 2021-01-12 14:05:22 from Konstantin Knizhnik ... > > The most recent messages have all had links to github repos without patches > attached. > > Also, the branches I've looked at on the github repos all have messy history, > and need to be squished into a coherent commit or set of commits. > > Would you send a patch to the list with the commits as you propose to merge > them ? This needs some significant effort: - squish commits; * maybe make an 0001 commit supporting zlib, and an 0002 commit adding zstd; * add an 0003 patch to enable zlib compression by default (for CI - not for merge); - rebase to current master; - compile without warnings; - assign OIDs at the end of the ranges given by find-unused-oids to avoid conflicts with patches being merged to master. Currently, make check-world gets stuck in several places when I use zlib. There's commit messages claiming that the compression can be "asymmetric" between client-server vs server-client, but the commit seems to be unrelated, and the protocol documentation doesn't describe how this works. Previously, I suggested that the server should have a "policy" GUC defining which compression methods are allowed. Possibly including compression "level". For example, the default might be to allow zstd, but only up to level 9. This: + /* Initialise values and NULL flags arrays */ + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); can be just: + bool nulls[PG_STAT_NETWORK_TRAFFIC_COLS] = {false}; since values is fully populated below. typo: aslogirhm I wrote about Solution.pm earlier in this thread, and I see that it's in Konstantin's repo since Jan 12, but it's not in yours (?) so I think windows build will fail. Likewise, commit 1a946e14e in his branch adds compression into to psql \connninfo based on my suggestion, but it's missing in your branch. I've added Daniil as a 2nd author and set back to "waiting on author".
Hi, thanks for your review! > On Mar 19, 2021, at 11:28 AM, Justin Pryzby <pryzby@telsasoft.com> wrote: > > This needs some significant effort: > > - squish commits; > * maybe make an 0001 commit supporting zlib, and an 0002 commit adding zstd; > * add an 0003 patch to enable zlib compression by default (for CI - not for merge); > - rebase to current master; I’ve rebased the chunked compression branch and attached it to this message as two diff patches: 0001-Add-zlib-and-zstd-streaming-compression.patch - this patch introduces general functionality for zlib and zstd streamingcompression 0002-Implement-libpq-compression.patch - this patch introduces libpq chunked compression > - compile without warnings; > - assign OIDs at the end of the ranges given by find-unused-oids to avoid > conflicts with patches being merged to master. Done > Currently, make check-world gets stuck in several places when I use zlib. > > There's commit messages claiming that the compression can be "asymmetric" > between client-server vs server-client, but the commit seems to be unrelated, > and the protocol documentation doesn't describe how this works. > > Previously, I suggested that the server should have a "policy" GUC defining > which compression methods are allowed. Possibly including compression "level". > For example, the default might be to allow zstd, but only up to level 9. Support for different compression methods in each direction is implemented in zpq_stream.c, the only thing left to do is to define the GUC settings to control this behavior and make adjustments to the compressionhandshake process. Earlier in the thread, I’ve discussed the introduction of compress_algorithms and decompress_algorithms GUC settings withRobert Haas. The main issue with the decompress_algorithms setting is that the receiving side can’t effectively enforce the actual compressionlevel chosen by the sending side. So I propose to add the two options to the client and server GUC: 1. compress_algorithms setting with the ability to specify the exact compression level for each algorithm 2. decompress_algorithms to control which algorithms are supported for decompression of the incoming messages For example: Server compress_algorithms = ’zstd:2; zlib:5’ // use the zstd with compression level 2 or zlib with compression level 5 for outgoingmessages decompress_algorithms = ‘zstd; zlib’ // allow the zstd and zlib algorithms for incoming messages Client compress_algorithms = ’zstd; zlib:3’ // use the zstd with default compression level (1) or zlib with compression level 3for outgoing messages decompress_algorithms = ‘zstd’ // allow the zstd algorithm for incoming messages Robert, If I missed something from our previous discussion, please let me know. If this approach is OK, I'll implement it. > This: > + /* Initialise values and NULL flags arrays */ > + MemSet(values, 0, sizeof(values)); > + MemSet(nulls, 0, sizeof(nulls)); > > can be just: > + bool nulls[PG_STAT_NETWORK_TRAFFIC_COLS] = {false}; > since values is fully populated below. > The current implementation matches the other functions in pgstatfuncs.c so I think that it is better not to change it. > typo: aslogirhm Fixed > I wrote about Solution.pm earlier in this thread, and I see that it's in > Konstantin's repo since Jan 12, but it's not in yours (?) so I think windows > build will fail. > > Likewise, commit 1a946e14e in his branch adds compression into to psql > \connninfo based on my suggestion, but it's missing in your branch. I've rebased the patch. 
Now it includes Konstantin's fixes. — Daniil Zakhlystov
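To make the proposed GUC syntax concrete, here is a rough standalone sketch of how a value such as "zstd:2; zlib:5" could be split into algorithm/level pairs; the helper and struct names are made up for illustration and do not come from the patch:

================
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical parsed entry: algorithm name plus requested level (-1 = default) */
typedef struct
{
	char		name[16];
	int			level;
} CompressSpec;

/* split a list like "zstd:2; zlib:5" into specs[]; returns the number parsed */
static int
parse_compress_algorithms(const char *value, CompressSpec *specs, int max)
{
	char	   *copy = malloc(strlen(value) + 1);
	char	   *tok;
	int			n = 0;

	if (copy == NULL)
		return 0;
	strcpy(copy, value);

	for (tok = strtok(copy, ";"); tok != NULL && n < max; tok = strtok(NULL, ";"))
	{
		char	   *colon;

		while (*tok == ' ')
			tok++;				/* skip spaces after ';' */

		colon = strchr(tok, ':');
		specs[n].level = colon ? atoi(colon + 1) : -1;
		snprintf(specs[n].name, sizeof(specs[n].name), "%.*s",
				 colon ? (int) (colon - tok) : (int) strlen(tok), tok);
		n++;
	}
	free(copy);
	return n;
}

int
main(void)
{
	CompressSpec specs[8];
	int			n = parse_compress_algorithms("zstd:2; zlib:5", specs, 8);

	for (int i = 0; i < n; i++)
		printf("%s -> level %d\n", specs[i].name, specs[i].level);
	return 0;
}
================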
Attachment
Updated patch version with fixed compiler issues, sorry for noise.
Attachment
On Wed, Apr 21, 2021 at 2:38 PM Ian Zagorskikh <izagorskikh@cloudlinux.com> wrote:
>
> Hi all!
>
> I took a look at the proposed patches and found several typos/mistakes. Where should I send my comments? In this thread or directly to the authors?
>

I feel it is good to send comments here. This is what we normally do for all the patches developed in this mailing list.

--
With Regards,
Amit Kapila.
All, thanks!
On Wed, Apr 21, 2021 at 10:47 AM Michael Paquier <michael@paquier.xyz> wrote:
> Patch reviews had better be posted on the community lists. This way,
> if the patch is left dead by the authors (things happen in life), then
> somebody else could move on with the patch without having to worry
> about this kind of issue twice.
> --
> Michael
Let me drop my humble 2 cents in this thread. At this moment I checked only the 0001-Add-zlib-and-zstd-streaming-compression patch. In no particular order:
* No checks for failures. For example, the value from malloc() is not checked for NULL and is used as-is right after the call. As a result, possibly-NULL values are passed into ZSTD functions which are explicitly not NULL-tolerant and dereference pointers without checks.
* The memory management does not follow the scheme used in the common module. Direct calls to the standard C malloc/free are hard-coded. AFAIU this is not backend-friendly. Looking at the surrounding code, I believe they should be wrapped in ALLOC/FREE local macros that depend on the FRONTEND define, as is done, for example, in src/common/hmac.c.
* If we're going to make our code palloc/pfree-friendly, there's also a way to pass custom allocator callbacks into ZSTD functions. By default ZSTD uses malloc/free, but this can be overridden by the caller via the ZSTD_createDStream_advanced(ZSTD_customMem customMem) versions of the API. IMHO that would be good. If a 3rd-party component allows us to inject a custom memory allocator and we have one, why not use this feature?
* Most zs_foo() functions do not check pointer arguments, though some, like zs_free(), do. If we're speaking about a "common API" that can be used by a wide range of modules around the project, such a relaxed treatment of input arguments can IMHO be harmful. At least add Assert(ptr) assertions so problems can be caught in debug builds.
* For the zs_create() function, which must be called first to create a context for further operations, the compression method is passed as an integer. This value is used inside zs_create() as an index into an array of compression implementation descriptors. There are several problems:
1) The method ID value is not checked for validity. By passing an invalid method ID we can easily read out of the array bounds.
2) The content of the array depends on configuration options, e.g. HAVE_LIBZSTD, HAVE_LIBZ, etc., so an index into this array is not stable. For some combinations of config options, method ID 0 may refer to ZSTD and for others to zlib.
3) There are no defines/enums/etc. in the public z_stream.h header defining which value should be passed to create a specific compressor. Users have to guess or check the code.
* After reading the comments for the ZSTD compress/decompress functions, I have a feeling that their return values are handled incorrectly, covering only the success path. Though I need more checks/POCs on that.
In general, IMHO the idea of a generic compress/decompress API is very good and useful, but the specific implementation needs some code cleanup/refactoring.
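As a rough sketch of the two memory-management points above, the FRONTEND-dependent ALLOC/FREE wrappers (in the style of src/common/hmac.c) can be combined with zstd's custom-allocator hook; the zs_zstd_* names below are illustrative and not taken from the patch:

================
/*
 * Sketch only: FRONTEND-aware allocation wrappers hooked into zstd's
 * custom allocator API.
 */
#ifndef FRONTEND
#include "postgres.h"
#define ALLOC(size) palloc(size)
#define FREE(ptr)   pfree(ptr)
#else
#include "postgres_fe.h"
#define ALLOC(size) malloc(size)
#define FREE(ptr)   free(ptr)
#endif

/* the *_advanced() constructors are exposed under ZSTD_STATIC_LINKING_ONLY */
#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>

static void *
zs_zstd_alloc(void *opaque, size_t size)
{
	return ALLOC(size);
}

static void
zs_zstd_free(void *opaque, void *address)
{
	if (address)
		FREE(address);
}

static ZSTD_DStream *
zs_zstd_create_dstream(void)
{
	ZSTD_customMem cmem = {zs_zstd_alloc, zs_zstd_free, NULL};
	ZSTD_DStream *ds = ZSTD_createDStream_advanced(cmem);

	/* frontend malloc() can fail, so don't let a NULL reach ZSTD functions */
	if (ds == NULL)
		return NULL;
	ZSTD_initDStream(ds);
	return ds;
}
================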
One little fix to 0001-Add-zlib-and-zstd-streaming-compression patch for configure.ac
================
@@ -1455,6 +1456,7 @@ fi
if test "$with_lz4" = yes; then
AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])])
+fi
================
Otherwise autoconf generates a broken configure script.
--
> On 21 Apr 2021, at 20:35, Ian Zagorskikh <izagorskikh@cloudlinux.com> wrote:
>
> Let me drop my humble 2 cents in this thread. At this moment I checked only the 0001-Add-zlib-and-zstd-streaming-compression patch. In no particular order:
>
> * No checks for failures. For example, the value from malloc() is not checked for NULL and is used as-is right after the call. As a result, possibly-NULL values are passed into ZSTD functions which are explicitly not NULL-tolerant and dereference pointers without checks.
>
> * The memory management does not follow the scheme used in the common module. Direct calls to the standard C malloc/free are hard-coded. AFAIU this is not backend-friendly. Looking at the surrounding code, I believe they should be wrapped in ALLOC/FREE local macros that depend on the FRONTEND define, as is done, for example, in src/common/hmac.c.
>
> * If we're going to make our code palloc/pfree-friendly, there's also a way to pass custom allocator callbacks into ZSTD functions. By default ZSTD uses malloc/free, but this can be overridden by the caller via the ZSTD_createDStream_advanced(ZSTD_customMem customMem) versions of the API. IMHO that would be good. If a 3rd-party component allows us to inject a custom memory allocator and we have one, why not use this feature?
>
> * Most zs_foo() functions do not check pointer arguments, though some, like zs_free(), do. If we're speaking about a "common API" that can be used by a wide range of modules around the project, such a relaxed treatment of input arguments can IMHO be harmful. At least add Assert(ptr) assertions so problems can be caught in debug builds.
>
> * For the zs_create() function, which must be called first to create a context for further operations, the compression method is passed as an integer. This value is used inside zs_create() as an index into an array of compression implementation descriptors. There are several problems:
> 1) The method ID value is not checked for validity. By passing an invalid method ID we can easily read out of the array bounds.
> 2) The content of the array depends on configuration options, e.g. HAVE_LIBZSTD, HAVE_LIBZ, etc., so an index into this array is not stable. For some combinations of config options, method ID 0 may refer to ZSTD and for others to zlib.
> 3) There are no defines/enums/etc. in the public z_stream.h header defining which value should be passed to create a specific compressor. Users have to guess or check the code.
>
> * After reading the comments for the ZSTD compress/decompress functions, I have a feeling that their return values are handled incorrectly, covering only the success path. Though I need more checks/POCs on that.
>
> In general, IMHO the idea of a generic compress/decompress API is very good and useful, but the specific implementation needs some code cleanup/refactoring.

Hi,

thank you for the detailed review. Previously in the thread, Justin Pryzby mentioned the preliminary patch which will be common between the different compression patches. Basically, this is a quick prototype of that patch and it definitely needs some additional effort. It would also be great to have a top-level overview of the 0001-Add-zlib-and-zstd-streaming-compression patch from Justin to make sure it can potentially be used for the other compression patches.

Currently, I’m in the process of implementing the negotiation of asymmetric compression, but I can take a look at these issues after I finish with it.
> One little fix to 0001-Add-zlib-and-zstd-streaming-compression patch for configure.ac
>
> ================
> @@ -1455,6 +1456,7 @@ fi
>
> if test "$with_lz4" = yes; then
> AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])])
> +fi
> ================
>
> Otherwise autoconf generates a broken configure script.

Added your fix to the patch and rebased it onto the current master.

—
Daniil Zakhlystov
Attachment
**sorry for the noise, but I need to re-send the message because one of the recipients is blocked on the pgsql-hackers for some reason**

Hi!

Done, the patch should apply to the current master now.

Actually, I have an almost-finished version of the patch with the previously requested asymmetric compression negotiation. I plan to attach it soon.

Thanks,

Daniil Zakhlystov

> On 14 Jul 2021, at 16:37, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Apr 22, 2021 at 6:34 PM Daniil Zakhlystov
> <usernamedt@yandex-team.ru> wrote:
>>
>>> On 21 Apr 2021, at 20:35, Ian Zagorskikh <izagorskikh@cloudlinux.com> wrote:
>>>
>>> Let me drop my humble 2 cents in this thread. At this moment I checked only the 0001-Add-zlib-and-zstd-streaming-compression patch. In no particular order:
>>>
>>> * No checks for failures. For example, the value from malloc() is not checked for NULL and is used as-is right after the call. As a result, possibly-NULL values are passed into ZSTD functions which are explicitly not NULL-tolerant and dereference pointers without checks.
>>>
>>> * The memory management does not follow the scheme used in the common module. Direct calls to the standard C malloc/free are hard-coded. AFAIU this is not backend-friendly. Looking at the surrounding code, I believe they should be wrapped in ALLOC/FREE local macros that depend on the FRONTEND define, as is done, for example, in src/common/hmac.c.
>>>
>>> * If we're going to make our code palloc/pfree-friendly, there's also a way to pass custom allocator callbacks into ZSTD functions. By default ZSTD uses malloc/free, but this can be overridden by the caller via the ZSTD_createDStream_advanced(ZSTD_customMem customMem) versions of the API. IMHO that would be good. If a 3rd-party component allows us to inject a custom memory allocator and we have one, why not use this feature?
>>>
>>> * Most zs_foo() functions do not check pointer arguments, though some, like zs_free(), do. If we're speaking about a "common API" that can be used by a wide range of modules around the project, such a relaxed treatment of input arguments can IMHO be harmful. At least add Assert(ptr) assertions so problems can be caught in debug builds.
>>>
>>> * For the zs_create() function, which must be called first to create a context for further operations, the compression method is passed as an integer. This value is used inside zs_create() as an index into an array of compression implementation descriptors. There are several problems:
>>> 1) The method ID value is not checked for validity. By passing an invalid method ID we can easily read out of the array bounds.
>>> 2) The content of the array depends on configuration options, e.g. HAVE_LIBZSTD, HAVE_LIBZ, etc., so an index into this array is not stable. For some combinations of config options, method ID 0 may refer to ZSTD and for others to zlib.
>>> 3) There are no defines/enums/etc. in the public z_stream.h header defining which value should be passed to create a specific compressor. Users have to guess or check the code.
>>>
>>> * After reading the comments for the ZSTD compress/decompress functions, I have a feeling that their return values are handled incorrectly, covering only the success path. Though I need more checks/POCs on that.
>>>
>>> In general, IMHO the idea of a generic compress/decompress API is very good and useful, but the specific implementation needs some code cleanup/refactoring.
>>
>> Hi,
>>
>> thank you for the detailed review. Previously in the thread, Justin Pryzby mentioned the preliminary patch which will be common between the different compression patches. Basically, this is a quick prototype of that patch and it definitely needs some additional effort. It would also be great to have a top-level overview of the 0001-Add-zlib-and-zstd-streaming-compression patch from Justin to make sure it can potentially be used for the other compression patches.
>>
>> Currently, I’m in the process of implementing the negotiation of asymmetric compression, but I can take a look at these issues after I finish with it.
>>
>>> One little fix to 0001-Add-zlib-and-zstd-streaming-compression patch for configure.ac
>>>
>>> ================
>>> @@ -1455,6 +1456,7 @@ fi
>>>
>>> if test "$with_lz4" = yes; then
>>> AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])])
>>> +fi
>>> ================
>>>
>>> Otherwise autoconf generates a broken configure script.
>>
>> Added your fix to the patch and rebased it onto the current master.
>
> The patch does not apply on Head anymore, could you rebase and post a
> patch. I'm changing the status to "Waiting for Author".
>
> Regards,
> Vignesh
Attachment
On Wed, Jul 14, 2021 at 6:31 PM Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
>
> **sorry for the noise, but I need to re-send the message because one of the recipients is blocked on the pgsql-hackers for some reason**
>
> Hi!
>
> Done, the patch should apply to the current master now.
>
> Actually, I have an almost-finished version of the patch with the previously requested asymmetric compression negotiation. I plan to attach it soon.

Thanks for providing the patch quickly, I have changed the status to "Need Review".

Regards,
Vignesh
Hi!

I made some noticeable changes to the patch and fixed the previously mentioned issues.

On Fri, Mar 19, 2021 at 16:28 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Previously, I suggested that the server should have a "policy" GUC defining
> which compression methods are allowed, possibly including compression "level".
> For example, the default might be to allow zstd, but only up to level 9.

Now the libpq_compression GUC server setting controls the available client-server traffic compression methods. It allows specifying the exact list of the allowed compression algorithms. For example, to allow only zstd and zlib, set the setting to "zstd,zlib". Also, the maximal allowed compression level can be specified for each method, e.g. the "zstd:1,zlib:2" setting will set the maximal compression level for zstd to 1 and zlib to 2. If a client requests the compression with a higher compression level, it will be set to the maximal allowed one. The default (and recommended) maximal compression level for each algorithm is 1.

On Fri, Mar 19, 2021 at 16:28 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> There are commit messages claiming that the compression can be "asymmetric"
> between client-server vs server-client, but the commit seems to be unrelated,
> and the protocol documentation doesn't describe how this works.

On Dec 10, 2020, at 1:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Another idea is that you could have a new message type that says "hey,
> the payload of this is 1 or more compressed messages." It uses the
> most-recently set compression method. This would make switching
> compression methods easier since the SetCompressionMethod message
> itself could always be sent uncompressed and/or not take effect until
> the next compressed message

I rewrote the compression initialization logic. Now it is asymmetric and the compression method is changeable on the fly.

Now compression initialization consists of only two phases:

1. The client sends a startup packet with _pq_.compression containing the list of compression algorithms specified by the client, with an optional specification of compression level (e.g. “zstd:2,zlib:1”)
2. The server intersects the requested compression algorithms with the allowed ones (controlled via the libpq_compression server config setting).

If the intersection is not empty, the server responds with CompressionAck containing the final list of the compression algorithms that can be used for the compression of libpq messages between the client and server.

If the intersection is empty (the server does not accept any of the requested algorithms), then it replies with CompressionAck containing the empty list.

After sending the CompressionAck message, the server can send the SetCompressionMethod message to set the current compression algorithm for server-to-client traffic compression.

After receiving the CompressionAck message, the client can send the SetCompressionMethod message to set the current compression algorithm for client-to-server traffic compression.

The SetCompressionMethod message contains the index of the compression algorithm in the final list of the compression algorithms which is generated during compression initialization (described above).

Compressed data is wrapped into CompressedData messages. Rules for compressing the protocol messages are defined in zpq_stream.c. For each protocol message, the preferred compression algorithm can be chosen.

On Wed, Apr 21, 2021 at 15:35 AM Ian Zagorskikh <izagorskikh@cloudlinux.com> wrote:
> Let me drop my humble 2 cents in this thread. At this moment I checked only the 0001-Add-zlib-and-zstd-streaming-compression patch. In no particular order:
>
> * No checks for failures. For example, the value from malloc() is not checked for NULL and is used as-is right after the call. As a result, possibly-NULL values are passed into ZSTD functions which are explicitly not NULL-tolerant and dereference pointers without checks.
>
> * The memory management does not follow the scheme used in the common module. Direct calls to the standard C malloc/free are hard-coded. AFAIU this is not backend-friendly. Looking at the surrounding code, I believe they should be wrapped in ALLOC/FREE local macros that depend on the FRONTEND define, as is done, for example, in src/common/hmac.c.
>
> * If we're going to make our code palloc/pfree-friendly, there's also a way to pass custom allocator callbacks into ZSTD functions. By default ZSTD uses malloc/free, but this can be overridden by the caller via the ZSTD_createDStream_advanced(ZSTD_customMem customMem) versions of the API. IMHO that would be good. If a 3rd-party component allows us to inject a custom memory allocator and we have one, why not use this feature?
>
> * Most zs_foo() functions do not check pointer arguments, though some, like zs_free(), do. If we're speaking about a "common API" that can be used by a wide range of modules around the project, such a relaxed treatment of input arguments can IMHO be harmful. At least add Assert(ptr) assertions so problems can be caught in debug builds.
>
> * For the zs_create() function, which must be called first to create a context for further operations, the compression method is passed as an integer. This value is used inside zs_create() as an index into an array of compression implementation descriptors. There are several problems:
> 1) The method ID value is not checked for validity. By passing an invalid method ID we can easily read out of the array bounds.
> 2) The content of the array depends on configuration options, e.g. HAVE_LIBZSTD, HAVE_LIBZ, etc., so an index into this array is not stable. For some combinations of config options, method ID 0 may refer to ZSTD and for others to zlib.
> 3) There are no defines/enums/etc. in the public z_stream.h header defining which value should be passed to create a specific compressor. Users have to guess or check the code.

Fixed almost all of these issues, except the malloc/free-related stuff (will fix this later).

On Fri, Mar 19, 2021 at 11:02 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> Also, I'm not sure, but I think you may find that the zstd configure.ac should
> use pkgconfig. This allowed the CIs to compile these patches. Without
> pkg-config, the macos CI couldn't find (at least) LZ4.
> https://commitfest.postgresql.org/32/2813/
> https://commitfest.postgresql.org/32/3015/

Now --with-zstd uses pkg-config to link the ZSTD library and works correctly on macos.

I would appreciate hearing your thoughts on the new version of the patch,

Daniil Zakhlystov
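As an illustration of the intersection-and-clamping step described above, here is a tiny standalone sketch; this is not code from the patch, and all names and limits are made up:

================
#include <stdio.h>
#include <string.h>

/* illustrative only: names and limits are hypothetical */
typedef struct
{
	const char *name;
	int			level;			/* requested level, or maximum allowed level */
} AlgSpec;

/*
 * Intersect the client's requested algorithms with the server's allowed ones,
 * clamping each requested level to the allowed maximum.  Returns the length
 * of the final list; zero means CompressionAck carries an empty list.
 */
static int
negotiate(const AlgSpec *requested, int nreq,
		  const AlgSpec *allowed, int nallow, AlgSpec *final)
{
	int			n = 0;

	for (int i = 0; i < nreq; i++)
	{
		for (int j = 0; j < nallow; j++)
		{
			if (strcmp(requested[i].name, allowed[j].name) == 0)
			{
				final[n] = requested[i];
				if (final[n].level > allowed[j].level)
					final[n].level = allowed[j].level;	/* clamp to server max */
				n++;
				break;
			}
		}
	}
	return n;
}

int
main(void)
{
	AlgSpec		requested[] = {{"zstd", 5}, {"zlib", 1}};	/* from _pq_.compression */
	AlgSpec		allowed[] = {{"zstd", 1}, {"zlib", 2}};		/* e.g. libpq_compression = 'zstd:1,zlib:2' */
	AlgSpec		final[2];
	int			n = negotiate(requested, 2, allowed, 2, final);

	for (int i = 0; i < n; i++)
		printf("%s:%d\n", final[i].name, final[i].level);	/* prints zstd:1 and zlib:1 */
	return 0;
}
================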
Forgot to attach the updated patch :)
Attachment
> On 29 Jul 2021, at 16:57, Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
>
> Forgot to attach the updated patch :)

This fails to build on Windows due to the use of strcasecmp:

+ if (strcasecmp(supported_algorithms[zpq->compressors[i].impl], "zstd") ==

Was that meant to be pg_strcasecmp?

--
Daniel Gustafsson		https://vmware.com/
> On 2 Sep 2021, at 00:29, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 29 Jul 2021, at 16:57, Daniil Zakhlystov <usernamedt@yandex-team.ru> wrote:
>>
>> Forgot to attach the updated patch :)
>
> This fails to build on Windows due to the use of strcasecmp:
>
> + if (strcasecmp(supported_algorithms[zpq->compressors[i].impl], "zstd") ==
>
> Was that meant to be pg_strcasecmp?

To keep this thread from stalling, attached is a rebased patchset with the above mentioned fix to try and get this working on Windows.

--
Daniel Gustafsson		https://vmware.com/
Attachment
On Fri, Oct 01, 2021 at 11:20:09PM +0200, Daniel Gustafsson wrote:
> To keep this thread from stalling, attached is a rebased patchset with the
> above mentioned fix to try and get this working on Windows.

This patch has been waiting on author for two months now, so I have marked it as RwF in the CF app.
--
Michael