Re: Optimize partial TOAST decompression - Mailing list pgsql-hackers

From Paul Ramsey
Subject Re: Optimize partial TOAST decompression
Date
Msg-id CACowWR1U3guqBzqPL1rHWxYBpU-yUQzr+s07MPT9qBg8LGb+uA@mail.gmail.com
In response to Re: Optimize partial TOAST decompression  (Binguo Bao <djydewang@gmail.com>)
Responses Re: Optimize partial TOAST decompression  (Binguo Bao <djydewang@gmail.com>)
List pgsql-hackers
On Mon, Jul 1, 2019 at 6:46 AM Binguo Bao <djydewang@gmail.com> wrote:
> Andrey Borodin <x4mmm@yandex-team.ru> wrote on Sat, Jun 29, 2019 at 9:48 PM:
>> I've taken a look at the code.
>> I think we should extract a function for computing max_compressed_size
>> and put it somewhere alongside the pglz code, just in case someone
>> changes something about pglz, so that they do not forget about this
>> compression algorithm assumption.
>>
>> Also, I suggest just using 64-bit computation to avoid overflows. And I
>> think it is worth checking whether max_compressed_size covers the whole
>> data, and using the min of (max_compressed_size, uncompressed_data_size).
>>
>> Also, you declared needsize and max_compressed_size too far from their
>> use. But this will be solved by the function extraction anyway.
>>
> Thanks for the suggestion.
> I've extracted a function for computing max_compressed_size and put it into pg_lzcompress.c.

This looks good to me. A little commentary around why
pglz_maximum_compressed_size() returns a universally correct answer
(there's no way the compressed size can ever be larger than this
because...) would be nice for peasants like myself.
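
For what it's worth, here is a minimal sketch of the worst-case reasoning
as I understand it. It is not the code from the patch:
max_compressed_prefix_size() is a hypothetical name, and the sketch assumes
that pglz spends one control bit per item, that a literal byte costs 8 data
bits plus that control bit, and that a match tag always stands for more raw
bytes than it occupies. Under those assumptions, the most expensive way to
encode N raw bytes is N literals, or 9 bits per byte. A real implementation
may need a couple of bytes of slack for a match tag that straddles the cut
point; that is left out here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the patch's actual code): upper bound on how
 * many compressed bytes must be read to recover the first rawsize raw
 * bytes, assuming the worst case of all-literal output.
 */
static int32_t
max_compressed_prefix_size(int32_t rawsize, int32_t total_compressed_size)
{
    int64_t bound;

    assert(rawsize >= 0 && total_compressed_size >= 0);

    /*
     * Each literal costs 8 data bits + 1 control bit = 9 bits.
     * Round up to whole bytes. The 64-bit arithmetic keeps
     * rawsize * 9 from overflowing for large inputs.
     */
    bound = ((int64_t) rawsize * 9 + 7) / 8;

    /* Never ask for more than the compressed datum actually holds. */
    if (bound > total_compressed_size)
        bound = total_compressed_size;

    return (int32_t) bound;
}
```

So a request for the first 8 raw bytes needs at most 9 compressed bytes
(8 literals plus one control byte), and the cap means the result always
fits back into 32 bits.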

If you're looking to continue down this code line in your next patch,
the next TODO item is a little more involved: a user-land (à la
PG_DETOAST_DATUM) iterator API for access of TOAST datums would allow
the optimization of searching of large objects like JSONB types, and
so on, where the thing you are looking for is not at a known location
in the object. So, things like looking for a particular substring in a
string, or looking for a particular key in a JSONB. "Iterate until you
find the thing." would allow optimization of some code lines that
currently require full decompression of the objects.
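
To make the "iterate until you find the thing" idea concrete, here is a
rough sketch of the shape such an API might take. Everything in it is
hypothetical: ChunkIter, chunk_iter_init(), chunk_iter_next() and
find_byte() are made-up names, and the iterator walks a plain in-memory
buffer as a stand-in for a TOAST datum. In a real implementation,
chunk_iter_next() is where the next slice would be fetched and decompressed
on demand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical iterator that hands out a datum in fixed-size chunks. */
typedef struct ChunkIter
{
    const unsigned char *data;  /* stand-in for the stored datum */
    size_t      len;            /* total length of the datum */
    size_t      pos;            /* bytes handed out so far */
    size_t      chunk_size;     /* maximum bytes per chunk */
} ChunkIter;

static void
chunk_iter_init(ChunkIter *it, const void *data, size_t len, size_t chunk_size)
{
    assert(chunk_size > 0);
    it->data = (const unsigned char *) data;
    it->len = len;
    it->pos = 0;
    it->chunk_size = chunk_size;
}

/* Returns the size of the next chunk and points *out at it; 0 at the end. */
static size_t
chunk_iter_next(ChunkIter *it, const unsigned char **out)
{
    size_t n;

    if (it->pos >= it->len)
        return 0;
    n = it->len - it->pos;
    if (n > it->chunk_size)
        n = it->chunk_size;
    *out = it->data + it->pos;
    it->pos += n;
    return n;
}

/* Returns the offset of the first needle byte, or -1 if it is absent. */
static long
find_byte(ChunkIter *it, unsigned char needle)
{
    const unsigned char *chunk;
    size_t n;
    size_t base = 0;            /* offset of the current chunk's first byte */

    while ((n = chunk_iter_next(it, &chunk)) > 0)
    {
        const unsigned char *hit = memchr(chunk, needle, n);

        if (hit != NULL)
            return (long) (base + (size_t) (hit - chunk));  /* stop early */
        base += n;
    }
    return -1;
}
```

The point is that the caller stops pulling chunks as soon as it has its
answer, so the tail of the datum is never touched. The sketch searches for
a single byte to keep it short; a substring search would also have to
handle matches that span a chunk boundary.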

P.


