Re: AW: update on TOAST status' - Mailing list pgsql-hackers
From | JanWieck@t-online.de (Jan Wieck) |
---|---|
Subject | Re: AW: update on TOAST status' |
Date | |
Msg-id | 200007121241.OAA23586@hot.jw.home Whole thread Raw |
In response to | AW: update on TOAST status' (Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>) |
Responses |
Re: AW: update on TOAST status'
|
List | pgsql-hackers |
Zeugswetter Andreas SB wrote: > > > > I don't like that --- seems it would put a definite crimp in the > > > whole point of TOAST, which is not to have arbitrary limits on field > > > sizes. > > > > If we can solve it, let's do so. If we cannot, let's restrict > > it for 7.1. > > How are you doing the index toasting currently ? Is it on the same > line as table toasting ? That is: toast some index column values if the key > exceeds 2k ? The current CVS is broken in that area. You'll notice as soon as you have many huge "text" values in an index, updatethem, vacuum and continue to update. The actual behaviour of the toaster is to toast each tuple until it has a delicious looking, brown and crispy surface. The indicator for beeing delicious is that it shrank below MaxTupleSize/4 - that's a little less than 2K ina default 8K blocksize setup. It then sticks the new tuple into the HeapTuple's t_data pointer. Index inserts are allways done after heap_insert() or heap_update(). At that time, the index tuples will bebuilt from the values found in the now replaced heap tuple. And since the heap tuple found now is allways smallerthan 2K, any combination of attributes out of it must be too (it's impossible to specify one and the sameattribute multiple times in one index). So the indices simply inherit the toasting result. If a value got compressed, the index will store the compressedformat. If it got moved off, the index will hold the toast entry reference for it. One of the biggest advantages is this: In the old system, an indexed column of 2K caused 2K be stored in the heap plus 2K stored in the index. Plus all the 2K instances in upper index block range specs. Now, the heap and the index will only hold references or compressed items. Absolutely no problem for compressed items. All information to recreate the original value is in the Datum itself. For external stored ones, the reference tells the OIDs of the secondary relation and it's index (where to find thedata of this entry), a unique identifier of the item (another OID) and some other info. So the referencecontains all the information required to fetch the data just by looking at the reference. And since the detoaster scans the secondary relation with a visibility of SnapShotAny, it'll succeed to find them even if they'vebeen deleted long ago by another committed transaction. So index traversal will succeed on that in any case. What I didn't knew at the time of implementation is, that btree indices can keep such a reference in upper levelblocks range specifications even after a vacuum successfully deleted the index tuple holding the reference itself. That's the current pity. Thus, if vacuum finally removed deleted tuples from the secondary relations (after the heap and index have been vacuumed), the detoaster cannot find those entries, referenced by upper index blocks, any more. Maybe we could propagate key range changes into upper blocks at index_delete() time. Will look at the btree code now. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
pgsql-hackers by date: