Volunteer: Large Tuples / Tuple chaining - Mailing list pgsql-hackers

From Christof Petig
Subject Volunteer: Large Tuples / Tuple chaining
Date
Msg-id 3850299F.C86DEFD6@wtal.de
List pgsql-hackers
Hello,

I'll donate some (read: all freely available) of my spare time to
implementing tuple chaining. It looks like this feature is much wanted,
and it would be a pity to hold it back until after 7.0. Personally I
don't need it yet ... but I will definitely find a use for it once it is
available ;-) And it looks like a good starting point for hacking on pgsql.

I have already dived into the depths of pgsql's page and tuple structures,
and it looks like it is possible. But before I start coding I would like to
hear some more experienced opinions on how to implement it.

Have you already discussed the technical details of the implementation? How
can I catch up on that discussion? (Simply browse the mailing list archives?)

Here's an outline of how I imagine the work:

What is needed:
- lay out a tuple continuation structure
- put tuple into multiple chunks when pages are considered, reconcile
  when loaded from disk (how to continue a tuple - need a structure)
    How is a tuple (read: page item) addressed? ItemPointerData
    I imagine storing a continuation address as the last bytes of the
    tuple unless it fits into one page.
    I need to mark large tuples (how? just one flag in the tuple?)
    How to tell a last block of maximum possible size from a continued
    one (which carries a pointer to the next one at its end)? Or don't
    care: make the item continued and put the last 6(?) bytes into a new
    block
- note that the continued tuples are not referenced directly (vacuum?),
  mark them as used. I hope vacuum operates on a tuple basis and has no
  concept of pages
- I guess that the tuple pointer points into page memory. If multiple
  pages are concatenated for a tuple, these pages need not reside in
  memory, but the full tuple's memory must be allocated (from a memory
  similar to pages) (shared mem?)
- should be possible for memory-only pages: see PageGetPageSize, but
  od_pagesize is 16bit! Reuse another variable? Another type of page?
  (32bit od_pagesize)
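
To make the chunking idea above concrete, here is a rough sketch in C. It is
only an illustration under my own assumptions, not existing pgsql code: the
names SketchChunk, sketch_split and sketch_reassemble are hypothetical, the
64-byte payload stands in for whatever fits into a page item, and a plain
array index stands in for the continuation address (which in the real thing
would presumably be something like an ItemPointerData).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes here are hypothetical, chosen for illustration. */
#define SKETCH_CHUNK_PAYLOAD 64   /* assumed usable bytes per page item */
#define SKETCH_MAX_CHUNKS    16   /* assumed upper bound for this sketch */

typedef struct SketchChunk {
    uint8_t  continued;  /* the "one flag": 1 if another chunk follows */
    uint16_t len;        /* payload bytes used in this chunk */
    uint32_t next;       /* index of the next chunk, valid if continued */
    uint8_t  data[SKETCH_CHUNK_PAYLOAD];
} SketchChunk;

/*
 * Split a large tuple into a chain of chunks.
 * Returns the number of chunks used, or 0 if the tuple is empty
 * or does not fit into max_chunks chunks.
 */
size_t sketch_split(const uint8_t *src, size_t len,
                    SketchChunk *chunks, size_t max_chunks)
{
    size_t n = 0;
    size_t pos = 0;

    assert(src != NULL && chunks != NULL);
    if (len == 0)
        return 0;

    while (pos < len) {
        size_t take = len - pos;

        if (n == max_chunks)
            return 0;               /* out of chunks */
        if (take > SKETCH_CHUNK_PAYLOAD)
            take = SKETCH_CHUNK_PAYLOAD;

        memcpy(chunks[n].data, src + pos, take);
        chunks[n].len = (uint16_t) take;
        pos += take;

        /* Every chunk except the last one points to its successor. */
        chunks[n].continued = (uint8_t) (pos < len);
        chunks[n].next = chunks[n].continued ? (uint32_t) (n + 1) : 0;
        n++;
    }
    return n;
}

/*
 * Follow the chain starting at chunk `first` and copy the payloads
 * into one contiguous buffer. Returns the total tuple length, or 0
 * if the destination buffer is too small.
 */
size_t sketch_reassemble(const SketchChunk *chunks, size_t first,
                         uint8_t *dst, size_t cap)
{
    size_t total = 0;
    size_t i = first;

    assert(chunks != NULL && dst != NULL);
    for (;;) {
        const SketchChunk *c = &chunks[i];

        if (total + c->len > cap)
            return 0;               /* destination too small */
        memcpy(dst + total, c->data, c->len);
        total += c->len;
        if (!c->continued)
            break;
        i = c->next;
    }
    return total;
}
```

With these assumed sizes, a 150-byte tuple would be split into three chunks
(64 + 64 + 22 bytes), and only the first chunk would need to be referenced
from outside. The reassembled copy lives in separately allocated memory,
which is the point of the bullet about the tuple pointer above.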
 
Very fascinated by this large beast of ancient code to explore,
    Christof

PS: I think the documentation on page layout is far outdated (or points
into the future since it speaks about ItemContinuationData structures.)
Should I update it?
The table doesn't match actual structure components. At least I don't
understand what it's about. The source code mentions a different page
layout.

PPS: Do not pity me, I have ten+ years of coding experience in C.

PPPS: Could someone tell me in a few words what an access method is (a
tuple is an access method, log pages are another?)


