Re: Applying TOAST to CURRENT - Mailing list pgsql-hackers
From | JanWieck@t-online.de (Jan Wieck) |
---|---|
Subject | Re: Applying TOAST to CURRENT |
Date | |
Msg-id | 200005311611.SAA19110@hot.jw.home Whole thread Raw |
In response to | Re: Applying TOAST to CURRENT (Hannu Krosing <hannu@tm.ee>) |
List | pgsql-hackers |
Hannu Krosing wrote: > > I've made some good experiences with having the toaster > > trying to keep the main tuple size below 1/4 of MaxTupleSize > > (BLKSIZE - block header). > > Can't _that_ behaviour be made modifyable by some setting ? Good point. There is already a fine tuning option per table attribute, where someone can tell things like "forget about compression for this attribute" or "try keeping in main tuple and toast others first". Theres no utility command up to now to customize them, but an UPDATE pg_attribute does it already. Seems another value in pg_class, telling the toaster what max size to try, would be a good idea. > What is the priority of checks on indexed fetch? > > I mean if we do "SELECT * FROM ttable WHERE toasted LIKE 'ab%' " > > DO we first scan by index to 'ab%', then check if tuple is live and > after that to the LIKE comparison ? That's the current behaviour, and TOAST doesn't change it. There was discussion already about index tuple toasting. Indices have different size constraints and other featuresso they cannot share exactly the same toasting scheme as heap tuples. I'm still not sure if supporting indices on huge values is worth the efford. Many databases have some limit on the size of index entries, and noone seems to really care for that. > > If no external table exists, the toaster will try the <2K > > thing by compression only. If the resulting tuple fits into > > the 8K limit, it's OK. > > Would it not be faster/cleaner to check some configuration variable > than the existance of toest table ? The toast tables and indexes OID are stored in pg_class. An open Relation has reference to the pg_class row, so it's simply comparing that to INVALID_OID. No wasted time here. > Do our current (btree/hash) indexes support toast ? Not hard tested yet. At least, they don't support it if toasting would be required to make the index tuple fit, but the heap toaster is already happy with it. The tuple is modified in place at heap_insert(). So the later index_insert() will use the Datums found there to build the index tuples, either plain or toast reference, whatever the toaster left. > > If not, will they ? Not planned for 7.1. Maybe we can workout a solution for unlimited index entries after that. > > Hmmm - thinking about that it doesn't sound bad if we allways > > create a secondary relation at CREATE TABLE time, but NOT the > > index for it. And at VACUUM time we create the index if it > > doesn't exist AND there is external stored data. > > Is there a plan to migrate to some combined index/database table for > at least toast tables later ? No. But we plan a general overwriting storage manager, so that might not be an issue at all. > For at least toast tables it seems feasible to start using the > originally planned tuple-spanning mechanisms, unless we plan > migrating LOs to toast table at some point which would make index-less > tuple chaining a bad idea as it would make seeking on really large > LOs slow. I've never seen a complete proposal for tuple-spanning. The toaster breaks up the large Datum into chunks. Thereis a chunk number, so modifying the index to be a multi-attribute one would gain direct access to a chunk. That should make seeks reasonably fast. > > Another benefit would be, that reloads should be faster > > because with this technique, the toaster doesn't need to > > insert index tuples during the load. The indices are created > > later at VACUUM after reload. > > AFAIK reloads (from pg_dump at least) create indexes after LOAD'ing data Finally the toast table will have another relkind, so it'll not be accessible by normal SQL. The toaster acts onthese tables more hardwired like on system catalogs. It expects a fixed schema and uses direct heap access. Due to the different relkind, a dump wouldn't be able to delay the index creation. > But do we need it ? > > [...] > > You would need 6400 toast tables to consume 1% of the smallest currently > available (10GB) disk. > > If that is a concern this can probably be cured by good docs that say > in detail which datatypes cause toast tables an which don't. We plan to make ALL variable size builtin types toastable. So this list would name them all :-). But this 6400 = 1% really is the point. Let's forget about the 16K and create the toast table allways (as soonas the main table has toastable attributes). Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
pgsql-hackers by date: