Re: Composite Datums containing toasted fields are a bad idea(?) - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Composite Datums containing toasted fields are a bad idea(?)
Date
Msg-id 20140329183900.GB170273@tornado.leadboat.com
Whole thread Raw
In response to Composite Datums containing toasted fields are a bad idea(?)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Composite Datums containing toasted fields are a bad idea(?)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Interesting bug.

On Fri, Mar 28, 2014 at 04:34:41PM -0400, Tom Lane wrote:
> I think we might be better off to get rid of toast_flatten_tuple_attribute
> and instead insist that composite Datums never contain any external toast
> pointers in the first place.  That is, places that call heap_form_tuple
> to make a composite Datum (rather than a tuple that's due to be stored
> to disk) would be responsible for detoasting any fields with external
> values first.  We could make a wrapper routine for heap_form_tuple to
> take care of this rather than duplicating the code in lots of places.
> 
> >From a performance standpoint this is probably a small win.  In cases
> where a composite Datum is formed and ultimately saved to disk, it should
> be a win, since we'd have to detoast those fields anyway, and we can avoid
> the overhead of an extra disassembly and reassembly of the composite
> value.  If the composite value is never sent to disk, it's a bit harder
> to tell: we lose if the specific field value is never extracted from the
> composite, but on the other hand we win if it's extracted more than once.

Performance is the dominant issue here; the hacker-centric considerations you
outlined vanish next to it.  Adding a speculative detoast can regress by a
million-fold the performance of a query that passes around, without ever
dereferencing, toast pointers to max-size values.  Passing around a record
without consulting all fields is a credible thing to do, so I'd scarcely
consider taking the performance risk of more-aggressively detoasting every
composite.  That other cases win is little comfort.  Today's PostgreSQL
applications either suffer little enough to care or have already taken
measures to avoid duplicate detoasts.  Past discussions have examined general
ways to avoid repetitive detoast traffic; we'll have something good when it
can do that without imposing open-ended penalties on another usage pattern.

Making the array construction functions use toast_flatten_tuple_attribute()
carries a related performance risk, more elusive yet just as costly when
encountered.  That much risk seems tolerable for 9.4, though.  I won't worry
about performance regressions for range types; using a range to marshal huge
values you'll never detoast is a stretch.

> In any case, adding the extra syscache lookups involved in doing
> toast_flatten_tuple_attribute in lots more places isn't appealing.

True; alas.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: separate output dirs for test decoding pieces.
Next
From: steve k
Date:
Subject: Re: PQputCopyData dont signal error