Re: Frequent Update Project: Design Overview of HOT - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Frequent Update Project: Design Overview of HOT
Date
Msg-id 1163145088.3071.37.camel@localhost.localdomain
Whole thread Raw
In response to Re: Frequent Update Project: Design Overview of HOT  (Hannu Krosing <hannu@skype.net>)
Responses Re: Frequent Update Project: Design Overview of HOT
List pgsql-hackers
Ühel kenal päeval, R, 2006-11-10 kell 09:06, kirjutas Hannu Krosing:
> Ühel kenal päeval, N, 2006-11-09 kell 18:28, kirjutas Tom Lane:
> > "Simon Riggs" <simon@2ndquadrant.com> writes:
> > > As more UPDATEs take place these tuple chains would grow, making
> > > locating the latest tuple take progressively longer.
> > 
> > This is the part that bothers me --- particularly the random-access
> > nature of the search.  I wonder whether you couldn't do something
> > involving an initial table fill-factor of less than 50%, and having
> > the first updated version living on the same heap page as its parent.
> > Only when the active chain length was more than one (which you
> > hypothesize is rare) would it actually be necessary to do a random
> > access into the overflow table.
> > 
> > More generally, do we need an overflow table at all, rather than having
> > these overflow tuples living in the same file as the root tuples?  As
> > long as there's a bit available to mark a tuple as being this special
> > not-separately-indexed type, you don't need a special location to know
> > what it is.  This might break down in the presence of seqscans though.
> 
> And why do you need to mark it as not-separately-indexed at all ?
> 
> We already cope with missing index pointers in VACUUM and I can't see
> any other reason to have it.

Ok, now I see it - we can't VACUUM a tuple, if next versions of it are
accessible by t_ctid chain only. That is vacuum must not free tuples,
which have t_ctid pointing to a tuple that has not-separately-indexed
bit set. This seems to make vacuum quite complicated, as it has to
examine c_tid chains to detect if it can free a tuple, and what's even
worse, it has to examine these chains backwards.

> What are the advantages of HOT over SITC (other than cool name) ?

still wondering this, is it just the abilty to span multiple pages ?

> Maybe just make HOT an extended SITC which can span pages.
> 
> In case of HOT together with reusing index tuples with DELETED bit set
> we don't actually need copyback, but the same index pointer will follow
> the head of live data automatically, maybe lagging only a small number
> of versions.

> > Actually, you omitted to mention the locking aspects of moving tuples
> > around --- exactly how are you going to make that work without breaking
> > concurrent scans?
> > 
> > > This allows the length of a typical tuple chain to be extremely short in
> > > practice. For a single connection issuing a stream of UPDATEs the chain
> > > length will no more than 1 at any time.
> > 
> > Only if there are no other transactions being held open, which makes
> > this claim a lot weaker.
> > 
> > > HOT can only work in cases where a tuple does not modify one of the
> > > columns defined in an index on the table, and when we do not alter the
> > > row length of the tuple.
> > 
> > Seems like "altering the row length" isn't the issue, it's just "is
> > there room on the page for the new version".  Again, a generous
> > fillfactor would give you more flexibility.
> 
> Maybe they hoped to take very light locks when new chaoin head is copied
> iver the old one in the same-length case.
> 
> >                 http://www.postgresql.org/about/donate
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

NOTICE: This communication contains privileged or other confidential
information. If you have received it in error, please advise the sender
by reply email and immediately delete the message and any attachments
without copying or disclosing the contents.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Frequent Update Project: Design Overview of HOT Updates
Next
From: Tom Lane
Date:
Subject: Re: beta3 CFLAGS issue on openbsd