Re: libpq compression - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: libpq compression
Date
Msg-id 579f308e-c46c-2e79-3f90-9846450a3d71@enterprisedb.com
Whole thread Raw
In response to Re: libpq compression  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: libpq compression
List pgsql-hackers

On 12/22/20 7:31 PM, Andrey Borodin wrote:
> 
> 
>> On 22 Dec 2020, at 23:15, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 12/22/20 6:56 PM, Robert Haas wrote:
>>> On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov
>>> <usernamedt@yandex-team.ru> wrote:
>>>> When using bidirectional compression, Postgres resource usage correlates with the selected compression level. For example, here is the PostgreSQL application memory usage:
>>>>
>>>> No compression - 1.2 GiB
>>>>
>>>> ZSTD
>>>> zstd:1 - 1.4 GiB
>>>> zstd:7 - 4.0 GiB
>>>> zstd:13 - 17.7 GiB
>>>> zstd:19 - 56.3 GiB
>>>> zstd:20 - 109.8 GiB - did not succeed
>>>> zstd:21, zstd:22  > 140 GiB
>>>> Postgres process crashes (out of memory)
>>> Good grief. So, suppose we add compression and support zstd. Then, can
>>> an unprivileged user capable of connecting to the database negotiate
>>> for zstd level 1 and then choose to actually send data compressed at
>>> zstd level 22, crashing the server if it doesn't have a crapton of
>>> memory? Honestly, I wouldn't blame somebody for filing a CVE if we
>>> allowed that sort of thing to happen. I'm not sure what the solution
>>> is, but we can't leave a way for a malicious client to consume 140GB
>>> of memory on the server *per connection*. I assumed decompression
>>> memory was going to be measured in kB or MB, not GB. Honestly, even at
>>> say L7, if you've got max_connections=100 and a user who wants to make
>>> trouble, you have a really big problem.
>>> Perhaps I'm being too pessimistic here, but man that's a lot of memory.
>>
>> Maybe I'm just confused, but my assumption was this means there's a
>> memory leak somewhere - that we're not resetting/freeing some piece of
>> memory, or so. Why would zstd need so much memory? It seems like a
>> pretty serious disadvantage, so how could it become so popular?
> 
> AFAIK it's 700 clients. Does not seem like a super high price for a big traffic/latency reduction.
> 

I don't see any benchmark results in this thread that would allow me to 
draw that conclusion, and I find it hard to believe that 200MB/client is 
a sensible trade-off.

It assumes you have that much memory, and it may allow an easy DoS 
attack (although maybe it's not worse than e.g. generating a lot of I/O 
or running an expensive function). Maybe allowing the compression level 
/ decompression buffer size to be limited in postgresql.conf would be 
enough. Or maybe allow disabling such compression algorithms altogether.
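To make the "limit the decompression buffer size" idea concrete: libzstd itself exposes a knob for this (ZSTD_DCtx_setParameter with ZSTD_d_windowLogMax, which makes the decompressor reject frames whose window would exceed the cap instead of allocating for them). Since zstd is not in the Python standard library, here is a minimal sketch of the same bounded-memory idea using the stdlib zlib streaming API; the function name and the limits are made up for illustration:

```python
import zlib

def bounded_decompress(data, chunk_size=64 * 1024, max_total=1 << 20):
    """Decompress data in bounded steps: each call produces at most
    chunk_size bytes of output (via max_length), and the run aborts
    once the total output exceeds max_total - a decompression-bomb
    guard, so a small input can never force a huge allocation."""
    d = zlib.decompressobj()
    out = []
    total = 0
    buf = data
    while not d.eof:
        # max_length bounds how much output this single call may produce;
        # unconsumed input is kept in d.unconsumed_tail for the next round
        chunk = d.decompress(buf, chunk_size)
        total += len(chunk)
        if total > max_total:
            raise ValueError("decompressed size limit exceeded")
        out.append(chunk)
        buf = d.unconsumed_tail
        if not chunk and not buf:
            break  # truncated input stream
    return b"".join(out)
```

The point is only that the receiver, not the sender, decides the memory ceiling; a server-side cap like this (or windowLogMax for zstd) would make the negotiated level advisory rather than something a malicious client can exceed.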


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


