Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data
Date
Msg-id 20200625064541.GH130132@paquier.xyz
In response to Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
On Wed, Jun 24, 2020 at 04:59:21PM +0900, Michael Paquier wrote:
> FYI, I have begun looking at this report, and I am reviewing the
> proposed patch.

Okay.  First, I found the test case you added a bit hard to parse, so
I have refactored it with a CTE in charge of building the random
string, with the main query doing the compression/decompression and a
check making sure that the original string and the result match.  I
quite liked the logic with lpad() to prepend zeros if the computation
with random()*256 returned a result less than 16, as well as the use
of string_agg() to build the string.  I have also switched the second
arguments of the functions to just use 'key', for readability.  Using
a random string sounds good to me here.  It is always possible that
we finish with something less random, causing it to become
compressed, but I'll trust in the rule of chaos here.
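
For reference, here is a minimal sketch of what such a test could
look like, assuming pgcrypto is installed; the length, the key and
the option string are illustrative, and the final test may differ:

with random_string as (
  -- trunc(random() * 256) yields an integer in 0..255; lpad() prepends
  -- a zero when the hexadecimal value has a single digit (< 16).
  select decode(string_agg(lpad(to_hex(trunc(random() * 256)::int), 2, '0'),
                           ''), 'hex') as bytes
    from generate_series(1, 16384)
)
select bytes = pgp_sym_decrypt_bytea(
         pgp_sym_encrypt_bytea(bytes, 'key',
                               'compress-algo=1, compress-level=6'),
         'key') as match
  from random_string;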

Then, the fix you are proposing is simply to make sure that all the
input from the source stream is properly consumed, even after the
zlib stream has ended in this corner case, thanks to pullf_read(),
and that sounds good to me.  However, I had an idea slightly
different from yours, consisting of simply reading the contents of
the source before checking if any data is available in the
decompressed buffer (the check on buf_data before the goto restart
step).  That makes the fix a bit simpler, without changing the logic.

I am also attaching an extra script I used to validate this stuff,
based on the regression test of the patch, which has allowed me to
check the logic for random strings up to 33kB, for example like this
(this one took 8 mins on my laptop):
select count(test_crypto(len)) from generate_series(0, 33000) as len;
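
For readers without the attachment, test_crypto() could be sketched
as a round-trip helper along these lines (a hypothetical
reconstruction; the attached script is authoritative):

create or replace function test_crypto(len int) returns int as $$
declare
  bytes bytea;
begin
  -- Build a random string of the given length, as in the regression
  -- test; coalesce() covers the zero-length case.
  select coalesce(decode(string_agg(lpad(to_hex(trunc(random() * 256)::int),
                                         2, '0'), ''), 'hex'), '\x')
    into bytes
    from generate_series(1, len);
  -- Round-trip through pgcrypto and fail loudly on any mismatch.
  if bytes is distinct from
     pgp_sym_decrypt_bytea(
       pgp_sym_encrypt_bytea(bytes, 'key',
                             'compress-algo=1, compress-level=6'),
       'key')
  then
    raise exception 'mismatch for length %', len;
  end if;
  return len;
end;
$$ language plpgsql;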

The inputs and outputs perfectly matched in all my tests, with the
16kB string being the only one failing on HEAD in the range I have
tested, and the test passing with the patch, of course.  One or more
extra pairs of eyes are welcome, so please feel free to look at the
attached version.

Thanks,
--
Michael

