Re: libpq compression - Mailing list pgsql-hackers
From | Justin Pryzby |
---|---|
Subject | Re: libpq compression |
Date | |
Msg-id | 20210109203119.GL1849@telsasoft.com Whole thread Raw |
In response to | Re: libpq compression (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Responses |
Re: libpq compression
|
List | pgsql-hackers |
On Thu, Dec 17, 2020 at 05:54:28PM +0300, Konstantin Knizhnik wrote: > I am maintaining this code in > git@github.com:postgrespro/libpq_compression.git repository. > I will be pleased if anybody, who wants to suggest any bug > fixes/improvements of libpq compression, create pull requests: it will be > much easier for me to merge them. Thanks for working on this. I have a patch for zstd compression in pg_dump so I looked at your patch. I'm attaching some language fixes. > +zstd_create(int level, zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg, char* rx_data, size_t rx_data_size) > +zlib_create(int level, zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg, char* rx_data, size_t rx_data_size) > +build_compressors_list(PGconn *conn, char** client_compressors, bool build_descriptors) Are you able to run pg_indent to fix all the places where "*" is before the space ? (And similar style issues). There are several compression patches in the commitfest, I'm not sure how much they need to be coordinated, but for sure we should coordinate the list of compressions available at compile time. Maybe there should be a central structure for this, or maybe just the ID numbers of compressors should be a common enum/define. In my patch, I have: +struct compressLibs { + const CompressionAlgorithm alg; + const char *name; /* Name in -Z alg= */ + const char *suffix; /* file extension */ + const int defaultlevel; /* Default compression level */ +}; Maybe we'd also want to store the "magic number" of each compression library. Maybe there'd also be a common parsing of compression options. You're supporting a syntax like zlib,zstd:5, but zstd also supports long-range, checksum, and rsyncable modes (rsyncable is relevant to pg_dump, but not to libpq). I think your patch has an issue here. You have this: src/interfaces/libpq/fe-connect.c + pqGetc(&resp, conn); + index = resp; + if (index == (char)-1) + { + appendPQExpBuffer(&conn->errorMessage, + libpq_gettext( + "server is not supported requested compression algorithms%s\n"), + conn->compression); + goto error_return; + } + Assert(!conn->zstream); + conn->zstream = zpq_create(conn->compressors[index].impl, + conn->compressors[index].level, + (zpq_tx_func)pqsecure_write, (zpq_rx_func)pqsecure_read,conn, + &conn->inBuffer[conn->inCursor], conn->inEnd-conn->inCursor); This takes the "index" returned by the server and then accesses conn->compressors[index] without first checking if the index is out of range, so a malicious server could (at least) crash the client by returning index=666. I suggest that there should be an enum of algorithms, which is constant across all servers. They would be unconditionally included and not #ifdef depending on compilation options. That would affect the ZpqAlgorithm data structure, which would include an ID number similar to src/bin/pg_dump/compress_io.h:typedef enum...CompressionAlgorithm; The CompressionAck would send the ID rather than the "index". A protocol analyzer like wireshark could show "Compression: Zstd". You'd have to verify that the ID is supported (and not bogus). Right now, when I try to connect to an unpatched server, I get: psql: error: expected authentication request from server, but received v +/* + * Array with all supported compression algorithms. + */ +static ZpqAlgorithm const zpq_algorithms[] = +{ +#if HAVE_LIBZSTD + {zstd_name, zstd_create, zstd_read, zstd_write, zstd_free, zstd_error, zstd_buffered_tx, zstd_buffered_rx}, +#endif +#if HAVE_LIBZ + {zlib_name, zlib_create, zlib_read, zlib_write, zlib_free, zlib_error, zlib_buffered_tx, zlib_buffered_rx}, +#endif + {no_compression_name} +}; In config.sgml, it says that libpq_compression defaults to on (on the server side), but in libpq.sgml it says that it defaults to off (on the client side). Is that what's intended ? I would've thought the defaults would match, or that the server would enforce a default more conservative than the client's (the DBA should probably have to explicitly enable compression, and need to "opt-in" rather than "opt-out"). Maybe instead of a boolean, this should be a list of permitted compression algorithms. This allows the admin to set a "policy" rather than using the server's hard-coded preferences. This could be important to disable an algorithm at run-time if there's a vulnerability found, or performance problem, or buggy client, or for diagnostics, or performance testing. Actually, I think it may be important to allow the admin to disable not just an algorithm, but also its options. It should be possible for the server to allow "zstd" compression but not "zstd --long", or zstd:99 or arbitrarily large window sizes. This seems similar to SSL cipher strings. I think we'd want to be able to allow (or prohibit) at least alg:* (with options) or alg (with default options). Your patch documents "libpq_compression=auto" but that doesn't work: WARNING: none of specified algirthms auto is supported by client I guess you mean "any" which is what's implemented. I suggest to just remove that. I think maybe your patch should include a way to trivially change the client's compile-time default: "if (!conn->compression) conn->compression = DefaultCompression" Your patch warns if *none* of the specified algorithms are supported, but I wonder if it should warn if *any* of them are unsupported (like if someone writes libz instead of zlib, or zst/libzstd instead of zstd, which I guess about half of us will do). $ LD_LIBRARY_PATH=./src/interfaces/libpq PGCOMPRESSION=abc,def src/bin/psql/psql 'host=localhost port=1111' WARNING: none of the specified algorithms are supported by client: abc,def $ LD_LIBRARY_PATH=./src/interfaces/libpq PGCOMPRESSION=abc,zlib src/bin/psql/psql 'host=localhost port=1111' (no warning) The libpq_compression GUC can be set for a user, like ALTER ROLE .. SET ... Is there any utility in making it configurable by client address? Or by encryption? I'm thinking of a policy like "do not allow compression from LOCAL connection" or "do not allow compression on encrypted connection". Maybe this would be somehow integrated into pg_hba. But maybe it's not needed (a separate user could be used to allow/disallow compression). -- Justin
Attachment
pgsql-hackers by date: