Zeugswetter Andreas SB wrote:
>
> > > I don't like that --- seems it would put a definite crimp in the
> > > whole point of TOAST, which is not to have arbitrary limits on field
> > > sizes.
> >
> > If we can solve it, let's do so. If we cannot, let's restrict
> > it for 7.1.
>
> How are you doing the index toasting currently ? Is it on the same
> line as table toasting ? That is: toast some index column values if the key
> exceeds 2k ?
The current CVS is broken in that area. You'll notice as soon as you have many huge "text" values in an index,
updatethem, vacuum and continue to update.
The actual behaviour of the toaster is to toast each tuple until it has a delicious looking, brown and crispy
surface. The indicator for beeing delicious is that it shrank below MaxTupleSize/4 - that's a little less than 2K
ina default 8K blocksize setup.
It then sticks the new tuple into the HeapTuple's t_data pointer.
Index inserts are allways done after heap_insert() or heap_update(). At that time, the index tuples will
bebuilt from the values found in the now replaced heap tuple. And since the heap tuple found now is allways
smallerthan 2K, any combination of attributes out of it must be too (it's impossible to specify one and the
sameattribute multiple times in one index).
So the indices simply inherit the toasting result. If a value got compressed, the index will store the
compressedformat. If it got moved off, the index will hold the toast entry reference for it.
One of the biggest advantages is this: In the old system, an indexed column of 2K caused 2K be stored in the heap
plus 2K stored in the index. Plus all the 2K instances in upper index block range specs. Now, the heap and the
index will only hold references or compressed items.
Absolutely no problem for compressed items. All information to recreate the original value is in the Datum
itself.
For external stored ones, the reference tells the OIDs of the secondary relation and it's index (where to find
thedata of this entry), a unique identifier of the item (another OID) and some other info. So the
referencecontains all the information required to fetch the data just by looking at the reference. And since the
detoaster scans the secondary relation with a visibility of SnapShotAny, it'll succeed to find them even if
they'vebeen deleted long ago by another committed transaction. So index traversal will succeed on that in any
case.
What I didn't knew at the time of implementation is, that btree indices can keep such a reference in upper
levelblocks range specifications even after a vacuum successfully deleted the index tuple holding the reference
itself. That's the current pity.
Thus, if vacuum finally removed deleted tuples from the secondary relations (after the heap and index
have been vacuumed), the detoaster cannot find those entries, referenced by upper index blocks, any
more.
Maybe we could propagate key range changes into upper blocks at index_delete() time. Will look at the btree code
now.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #