Re: update on TOAST status' - Mailing list pgsql-hackers

From JanWieck@t-online.de (Jan Wieck)
Subject Re: update on TOAST status'
Date
Msg-id 200007070005.CAA20517@hot.jw.home
Whole thread Raw
In response to Re: update on TOAST status'  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: update on TOAST status'  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> -   VACUUM  doesn't  propagate  ANALYZE  to   TOAST   tables.
> >> Statistics for them are needless because the toast access
> >> is allways hardcoded indexed.
>
> > I don't think statistics are insignificant for TOASTed columns.
>
> He didn't say that!  I think what he meant is that there's no need for
> statistics associated with the TOAST table itself, and AFAICS that's OK.
>
> BTW, I have thought of a potential problem with indexes on toasted
> columns.  As I understand Jan's current thinking, the idea is
>
> 1. During storage of the tuple in the main table, any oversize fields
> get compressed/moved off.
>
> 2. The toasted item in the finished main tuple gets handed to the index
> routines to be stored in the index.
   Right.

> Now, storing the toasted item in the index tuple seems fine, but what
> I do not like here is the implication that all the comparisons needed
> to find where to *put* the index tuple are done using a pretoasted
> value.  That seems to imply dozens of redundant decompressions/fetches,
> another one for each key comparison we have to do.
   Dozens - right.
   I just did a little gdb session tracing a
       SELECT ... WHERE toasted = 'xxx'
   The table has 151 rows and an index on 'toasted'. It needed 6   fetches of the attribute. Better than good,  because
2^6  is   only 64, so btree did a perfect job. Anyhow, in the case of a   real TOASTed (read burned) value, it'd mean 6
indexscans  to   recreate   the   on   disk   stored   representation  plus  6   decompression loops to get the plain
oneto compare  against.   What the hell would an "IN (SELECT ...)" cause?
 

> Jan, do you have a way around this that I missed?
>
> One simple answer that might help for other scenarios too is to keep
> a small cache of the last few values that had to be untoasted.  Maybe
> we only need it for moved-off values --- it could be that decompression
> is fast enough that we should just do it over rather than trying to
> cache.
   I'm  still argueing that indexing huge values is a hint for a   misleading schema.  If  this  is  true,  propagating
toasted   attributes  into indices is a dead end street and I'd have to   change  the  heap-access<->toaster  interface
so  that   the   modified  (stored)  main tuple isn't visible to the following   code (that does the index inserts).
 
   What is the value of supporting index tuples >2K? Support  of   braindead schemas? I can live withoout it!


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




pgsql-hackers by date:

Previous
From: rsand@vgalleries.com (Richard Sand)
Date:
Subject: Re: Lessons learned on how to build 7.0.2 on AIX 4.x
Next
From: JanWieck@t-online.de (Jan Wieck)
Date:
Subject: Re: Memo on coding practices: strcmp() does not yield bool