On 8/5/2007 6:30 PM, Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> (Incidentally, this means what I said earlier about uselessly trying to
>> compress objects below 256 bytes is even grosser than I realized. If you
>> have a single large object which, even after compression, will be over
>> the toast target, it will force *every* varlena to be considered for
>> compression even though they mostly can't be compressed. Considering a
>> varlena smaller than 256 bytes for compression only costs a useless
>> palloc, so it's not the end of the world, but still. It does seem kind of
>> strange that a tuple which otherwise wouldn't be toasted at all suddenly
>> gets all its fields compressed if you add one more field which ends up
>> being stored externally.)
>
> Yeah. It seems like we should modify the first and third loops so that
> if (after compression, if any) the largest attribute is *by itself*
> larger than the target threshold, we push it out to the toast table
> immediately, rather than continuing to compress other fields that might
> well not need to be touched.
I agree that the logic generally lacks sanity, and this one looks like a
good place to start.
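
To make the proposal concrete, here is a small standalone sketch of the
decision policy. It is only illustrative: TOAST_TARGET, ToyAttr, and the
sizes are made up, and the real change would of course go into the loops
in tuptoaster.c, not into code like this.

    #include <stdio.h>
    #include <stdbool.h>

    #define TOAST_TARGET 2048           /* stand-in for the real target */
    #define NATTRS 4

    typedef struct
    {
        int     rawsize;                /* uncompressed size */
        int     compsize;               /* assumed size after compression */
        bool    external;               /* pushed out to the toast table */
        bool    compressed;
    } ToyAttr;

    /* inline size of the tuple under the current storage choices */
    static int
    tuple_size(ToyAttr *a)
    {
        int     total = 0;

        for (int i = 0; i < NATTRS; i++)
        {
            if (a[i].external)
                continue;               /* ignore the small toast pointer */
            total += a[i].compressed ? a[i].compsize : a[i].rawsize;
        }
        return total;
    }

    int
    main(void)
    {
        ToyAttr a[NATTRS] = {
            {100, 90, false, false},
            {200, 180, false, false},
            {8000, 5000, false, false}, /* over the target even compressed */
            {150, 140, false, false},
        };

        while (tuple_size(a) > TOAST_TARGET)
        {
            int     biggest = -1;

            for (int i = 0; i < NATTRS; i++)
                if (!a[i].external && !a[i].compressed &&
                    (biggest < 0 || a[i].rawsize > a[biggest].rawsize))
                    biggest = i;
            if (biggest < 0)
                break;

            /*
             * The proposed short-circuit: if the attribute is larger
             * than the target all by itself, even after compression,
             * store it externally at once instead of compressing all
             * the other fields first.
             */
            if (a[biggest].compsize > TOAST_TARGET)
                a[biggest].external = true;
            else
                a[biggest].compressed = true;
        }

        printf("final inline size: %d\n", tuple_size(a));
        return 0;
    }

With these numbers the 8000-byte attribute goes external on the first
pass and the three small fields are never compressed at all, which is
exactly the behavior we want.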
Another optimization to think about would be to let the compressor abort
the attempt once the first X bytes had to be copied literally. Users can
disable compression on a per-column basis, but how many actually do so?
And if the first 100,000 bytes of a 10 MB attribute can't be compressed,
it is very likely that the input is already compressed. A toy sketch of
the idea follows below.
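
Here is a standalone toy version of that early abort; the run-length
scheme and the ABORT_AFTER threshold are invented for illustration, and
the real place for such a check would presumably be the inner loop of
pg_lzcompress.c.

    #include <stdio.h>

    #define ABORT_AFTER 4096    /* invented threshold: give up if the
                                 * first 4 KB produced no savings */

    /*
     * Toy run-length "compressor" with the proposed early abort.
     * Returns the compressed length, or -1 when the attempt is
     * abandoned because the input looks incompressible (and is
     * therefore probably compressed already).
     */
    static int
    toy_compress(const unsigned char *src, int srclen, unsigned char *dst)
    {
        int     si = 0, di = 0;

        while (si < srclen)
        {
            int     run = 1;

            while (si + run < srclen && run < 255 && src[si + run] == src[si])
                run++;

            dst[di++] = (unsigned char) run;    /* run length */
            dst[di++] = src[si];                /* repeated byte */
            si += run;

            /* early abort: no net savings in the first ABORT_AFTER bytes */
            if (si >= ABORT_AFTER && di >= si)
                return -1;
        }
        return (di < srclen) ? di : -1;         /* reject net expansion too */
    }

    int
    main(void)
    {
        static unsigned char zeros[8192], noise[8192], out[2 * 8192];

        for (int i = 0; i < (int) sizeof(noise); i++)
            noise[i] = (unsigned char) ((i * 2654435761u) >> 13);

        printf("zeros: %d\n", toy_compress(zeros, (int) sizeof(zeros), out));
        printf("noise: %d\n", toy_compress(noise, (int) sizeof(noise), out));
        return 0;
    }

Run against 8 KB of zeros it returns a small compressed length; run
against 8 KB of pseudo-random bytes it gives up halfway through instead
of grinding uselessly to the end of the input.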
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #