Re: libpq compression - Mailing list pgsql-hackers

From Daniil Zakhlystov
Subject Re: libpq compression
Date
Msg-id BF36ABF2-DFD2-4981-9B28-50B90D866A4E@yandex-team.ru
Whole thread Raw
In response to Re: libpq compression  (Ian Zagorskikh <izagorskikh@cloudlinux.com>)
List pgsql-hackers
> On 21 Apr 2021, at 20:35, Ian Zagorskikh <izagorskikh@cloudlinux.com> wrote:
>
> Let me drop my humble 2 cents in this thread. At this moment I checked only
0001-Add-zlib-and-zstd-streaming-compressionpatch. With no order: 
>
> * No checks for failures. For example, value from malloc() is not checked for NULL and used as-is right after the
call.Also as a result possibly NULL values passed into ZSTD functions which are explicitly not NULL-tolerant and so
dereferencepointers without checks. 
>
> * Used memory management does not follow the schema used in the common module. Direct calls to std C malloc/free are
hardcoded. AFAIU this is not backend friendly. Looking at the code around I believe they should be wrapped to
ALLOC/FREElocal macroses that depend on FRONTEND define. As it's done for example. in src/common/hmac.c. 
>
> * If we're going to fix our code to be palloc/pfree friendly there's also a way to pass our custom allocator
callbacksinside ZSTD functions. By default ZSTD uses malloc/free but it can be overwritten by the caller in e.g.
ZSTD_createDStream_advanced(ZSTD_customMemcustomMem) versions of API . IMHO that would be good. If a 3rd party
componentallows us to inject  a custom memory allocator and we do have it why not use this feature? 
>
> * Most zs_foo() functions do not check ptr arguments, though some like zs_free() do. If we're speaking about a
"commonAPI" that can be used by a wide range of different modules around the project, such a relaxed way to treat input
argumentsIMHO can be an evil. Or at least add Assert(ptr) assertions so they can be catched up in debug mode. 
>
> * For zs_create() function which must be called first to create context for further operations, a compression method
ispassed as integer. This method is used inside zs_create() as index inside an array of compress implementation
descriptors.There are several problems: 
> 1) Method ID value is not checked for validity. By passing an invalid method ID we can easily get out of array
bounds.
> 2) Content of array depends on on configuration options e.g. HAVE_LIBZSTD, HAVE_LIBZ etc. So index inside this array
isnot persistent. For some config options combination method ID with value 0 may refer to ZSTD and for others to zlib. 
> 3) There's no defines/enums/etc in public z_stream.h header that would define which value should be passed to create
aspecific compressor. Users have to guess it or check the code. 
>
> * I have a feeling after reading comments for ZSTD compress/decompress functions that their return value is treated
ina wrong way handling only success path. Though need more checks/POCs on that. 
>
> In general, IMHO the idea of a generic compress/decompress API is very good and useful. But specific implementation
needssome code cleanup/refactoring. 

Hi,

thank you for the detailed review. Previously in the thread, Justin Pryzby mentioned the preliminary patch which will
becommon between the different compression patches. Basically, this is a quick prototype of that patch and it
definitelyneeds some additional effort. It would be also great to have a top-level overview of the
0001-Add-zlib-and-zstd-streaming-compressionpatch from Justin to make sure it can be potentially be used for the other
compressionpatches. 

Currently, I’m in the process of implementing the negotiation of asymmetric compression, but I can take a look at these
issuesafter I finish with it. 

> One little fix to 0001-Add-zlib-and-zstd-streaming-compression patch for configure.ac
>
> ================
> @@ -1455,6 +1456,7 @@ fi
>
>  if test "$with_lz4" = yes; then
>    AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])])
> +fi
> ================
>
> Otherwise autoconf generates a broken configure script.

Added your fix to the patch and rebased it onto the current master.

—
Daniil Zakhlystov
Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: multi-install PostgresNode fails with older postgres versions
Next
From: Tom Lane
Date:
Subject: Re: INT64_FORMAT in translatable strings