Re: Applying TOAST to CURRENT - Mailing list pgsql-hackers

From JanWieck@t-online.de (Jan Wieck)
Subject Re: Applying TOAST to CURRENT
Date
Msg-id 200005310110.DAA14672@hot.jw.home
Whole thread Raw
In response to Re: Applying TOAST to CURRENT  (The Hermit Hacker <scrappy@hub.org>)
Responses Re: Applying TOAST to CURRENT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
The Hermit Hacker wrote:
>
> have to third this one ... I think it should be totally transparent to the
> admin/user ... just create it when the table is created, what's the worst
> case scenario?  it never gets used and you waste 16k of disk space?
>
   Not exactly.
   I've  made  some  good  experiences  with  having the toaster   trying to keep the main tuple size below 1/4 of
MaxTupleSize  (BLKSIZE  -  block  header).   Remember  that external stored   attributes are only fetched from the
secondary relation  if   really  needed  (when the result set is sent to the client or   if explicitly used in the
query).So in a usual case, where a   relatively  small  amount of the entire data is retrieved and   key attributes are
small,it's a win. With this  config  more   main tuples fit into one block, and if the attributes used in   the WHERE
clause aren't  stored  external,  the  result  set   (including  sort  and  group  actions)  can be collected with
fewerblock reads. Only those big  values,  that  the  client   really wanted, have to be fetched at send time.
 
   If  no  external  table  exists, the toaster will try the <2K   thing by compression only. If the resulting tuple
fits into   the  8K  limit,  it's OK. But if a secondary relation exists,   it'll store external to make the tuple <2K.
Thus, a 4K or 6K   tuple,  that  actually  fits  and would be stored in the main   table, will cause the toaster to
jumpin if we allways create   the secondary table.
 
   Hmmm - thinking about that it doesn't sound bad if we allways   create a secondary relation at CREATE TABLE time,
butNOT the   index  for  it.  And at VACUUM time we create the index if it   doesn't exist AND there is external stored
data.
   The table is prepared for external  storage  allways  and  we   avoid  the  risks  from  creating  tables  in
possiblylater   aborting transactions or due to concurrency  issues.  But  we   don't  waste  the  index space for
reallyallways-small-tuple   tables.
 
   Another benefit would  be,  that  reloads  should  be  faster   because  with  this  technique,  the  toaster
doesn'tneed to   insert index tuples during the load. The indices are  created   later at VACUUM after reload.
 
   The  toaster  needs  to  use sequential scans on the external   table until the next vacuum  run,  but  index  usage
allways   depends on vacuum so that's not a real issue from my PoV.
 
   At least a transparent compromise - isn't it?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




pgsql-hackers by date:

Previous
From: Thomas Lockhart
Date:
Subject: Re: Re: [SQL] aliases break my query
Next
From: "Hiroshi Inoue"
Date:
Subject: RE: Using BOOL in indexes