Re: TOAST vs arrays - Mailing list pgsql-hackers

From JanWieck@t-online.de (Jan Wieck)
Subject Re: TOAST vs arrays
Msg-id 200007181058.MAA10729@hot.jw.home
In response to TOAST vs arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TOAST vs arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> If I understand the fundamental design of TOAST correctly, it's not
> allowed to have multiple heap tuples containing pointers to the same
> moved-off TOAST item.  For example, if one tuple has such a pointer,
> and we copy it with INSERT ... SELECT, then the new tuple has to be
> constructed with its own copy of the moved-off item.  Without this
> you'd need reference counts and so forth for moved-off values.
>
> It looks like you have logic for all this in tuptoaster.c, but
> I see a flaw: the code doesn't look inside array fields to see if
> any of the array elements are pre-toasted values.  There could be
> a moved-off-item pointer inside an array, copied from some other
> place.
>
> Note the fact that arrays aren't yet considered toastable is
> no defense.  An array of a toastable data type is sufficient
> to create the risk.
   Yepp

> What do you want to do about this?  We could have heap_tuple_toast_attrs
> scan through all the elements of arrays of toastable types, but that
> strikes me as slow.  I'm thinking the best approach is for the array
> construction routines to refuse to insert toasted values into array
> objects in the first place --- instead, expand them before insertion.
> Then the whole array could be treated as a toastable object, but there
> are no references inside the array to worry about.
   I think the array construction routines are the right place to expand them.
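The rule being agreed on here can be sketched in a few lines. This is a toy, language-neutral simulation of the idea, not PostgreSQL's tuptoaster code: the toast-pointer representation, the in-memory "toast table", and `construct_array` below are all illustrative stand-ins.

```python
# Toy model of the rule discussed above: an array constructor must not
# store TOAST pointers inside the array, so any "moved-off" value is
# expanded (detoasted) before it is placed there.  The array as a whole
# then has no external references inside it and can itself be toasted.
# All names here are illustrative, not PostgreSQL's actual API.

TOAST_TABLE = {}  # oid -> full value; stands in for the toast relation


def toast_pointer(oid):
    """A stand-in for a moved-off datum: a reference, not the value."""
    return ("toast_ptr", oid)


def is_toasted(datum):
    return isinstance(datum, tuple) and datum[0] == "toast_ptr"


def detoast(datum):
    """Expand a toast pointer back into the plain value."""
    return TOAST_TABLE[datum[1]] if is_toasted(datum) else datum


def construct_array(elements):
    """Refuse to embed toast pointers: expand each element first."""
    return [detoast(e) for e in elements]


TOAST_TABLE[42] = "x" * 100  # a value that was moved off-line earlier
arr = construct_array(["small", toast_pointer(42)])
assert not any(is_toasted(e) for e in arr)  # no pointers inside the array
assert arr[1] == "x" * 100                  # the value itself was copied in
```

The point of the design is visible in the last two assertions: after construction there is nothing inside the array that would need reference counting when a tuple containing it is copied.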

> If we do that, should compressed-in-place array items be expanded back
> to full size before insertion in the array?  If we don't, we'd likely
> end up trying to compress already-compressed data, which is a waste of
> effort ... but OTOH it seems a shame to force the data back to full
> size unnecessarily.  Either way would work, I'm just not sure which
> is likely to be more efficient.
   I think it's not too bad to expand them and then let the toaster (eventually) compress the entire array again. Larger data usually yields better compression results. Given the actual speed of our compression code, I don't expect a performance penalty from it.
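The intuition that compressing the whole array beats keeping elements individually compressed can be illustrated with a quick measurement. Here zlib stands in for PostgreSQL's internal LZ compressor, and the data is made up; the point is only that redundancy *across* elements is invisible when each element is compressed separately.

```python
# Toy demonstration: compressing a whole array once usually beats
# compressing its elements one by one, because a single compression
# pass can exploit redundancy shared between elements.  zlib is a
# stand-in for PostgreSQL's internal LZ code; numbers are illustrative.
import zlib

# 50 similar elements, as an array of toastable values might contain
elems = [("row %d: " % i) + "payload " * 8 for i in range(50)]

# Strategy A: each element stays individually compressed (pre-toasted)
per_element = sum(len(zlib.compress(e.encode())) for e in elems)

# Strategy B: expand everything, then compress the array as one datum
whole_array = len(zlib.compress("".join(elems).encode()))

print("per-element:", per_element, "bytes; whole array:", whole_array, "bytes")
assert whole_array < per_element
```

On repetitive data like this, the one-shot compression is dramatically smaller, which supports expanding compressed-in-place items before insertion and letting the toaster handle the finished array.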
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



