Thread: Reading deleted records - PageHeader v3

Reading deleted records - PageHeader v3

From
"Jonathan Bond-Caron"
Date:
Hi,

So first I’m a pgsql hacker newbie and I’ve been reading up on the storage
structure:
http://www.postgresql.org/docs/8.2/interactive/storage-page-layout.html

I’m trying to recover deleted records from a page file (postgresql 8.2),
i.e. base/dbId/20132

I am able to successfully read all the header data I need (PageHeaderData,
ItemIdData, HeapTupleHeaderData)
but I hit a wall when I try to start reading user data.

This has helped:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/postgres.h?rev=1.77;content-type=text%2Fplain

I’ve read and understood fairly well how varlena structures are stored
(plain, compressed, external/toast) but so far I can’t seem to read a plain
inline value.
I think part of my problem is I haven’t really understood what ‘Then make
sure you have the right alignment’ means.

My approach currently is:
After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes
(hoff) and try to read a 32 bit integer (first attribute).

I am expecting to get an integer value 1 but I get 512.

Am I doing this wrong?
Could someone point me to the pgsql code pieces I should be looking at?

If useful, this is the information I have before reading the ‘user data’:

object(PostgreSQL_HeapTupleHeaderData)#14 (7) {
  ["xmin"]=>
  string(5) "13824"
  ["xmax"]=>
  string(1) "0"
  ["cid"]=>
  string(1) "0"
  ["ctid"]=>
  object(PostgreSQL_ItemPointerData)#16 (2) {
    ["blockId"]=>
    string(1) "0"
    ["posId"]=>
    int(0)
  }
  ["infomask2"]=>
  int(0)
  ["infomask"]=>
  int(2)
  ["hoff"]=>
  int(4)
}
object(PostgreSQL_Attribute)#7 (6) {
  ["name"]=>
  string(7) "book_id"
  ["relid"]=>
  int(20132)
  ["len"]=>
  int(4)
  ["num"]=>
  int(1)
  ["ndims"]=>
  int(0)
  ["align"]=>
  string(1) "i"
}
array(1) {
  ["book_id"]=>
  int(512)
}

Re: Reading deleted records - PageHeader v3

From
Tom Lane
Date:
"Jonathan Bond-Caron" <jbondc@gmail.com> writes:
> I think part of my problem is I haven't really understood what 'Then make
> sure you have the right alignment' means. 

> My approach currently is:

> After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes
> (hoff) and try to read a 32 bit integer (first attribute).

No.  First you start at the tuple beginning plus the number of bytes
indicated by hoff (which should be at least 24).  The first field
will always be right there, because this position is always maximally
aligned.  For subsequent fields you have to advance to a multiple of
the alignment requirement of the datatype.  For example, assume the
table's first column is of type bool (1 byte) and the second column
is of type integer.  The bool will be at offset hoff, but the integer
will be at offset hoff + 4 ... it can't immediately follow the bool,
at offset hoff + 1, because that position isn't correctly aligned.
It has to start at the next offset that's a multiple of 4.
        regards, tom lane
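
The alignment rule described above can be sketched in a few lines of Python. This is only an illustration, not PostgreSQL code: the helper names `align_up` and `attribute_offsets` are hypothetical, the mapping from `typalign` codes to byte counts assumes a typical platform where double alignment is 8 bytes, and NULLs and variable-length (varlena) columns are ignored entirely.

```python
# Assumed mapping from pg_type.typalign codes to alignment in bytes.
# 'd' (double) is platform dependent; 8 is an assumption for this sketch.
ALIGN_BYTES = {"c": 1, "s": 2, "i": 4, "d": 8}


def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def attribute_offsets(hoff, columns):
    """Return the start offset of each column, relative to the tuple start.

    columns is a list of (length_in_bytes, typalign) pairs describing
    fixed-length, non-NULL attributes in table order.
    """
    offsets = []
    pos = hoff  # the first attribute starts exactly at hoff
    for length, typalign in columns:
        pos = align_up(pos, ALIGN_BYTES[typalign])  # skip padding bytes
        offsets.append(pos)
        pos += length
    return offsets


# The bool-then-integer example from the message, with hoff = 28:
# the bool sits at hoff, the integer at hoff + 4.
print(attribute_offsets(28, [(1, "c"), (4, "i")]))  # [28, 32]
```

With a single integer column, as in the original question, the list has one entry equal to hoff itself, so no padding is involved for the first attribute.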


Re: Reading deleted records - PageHeader v3

From
"Jonathan Bond-Caron"
Date:
On Sat Feb 6 01:20 AM, Tom Lane wrote:
> "Jonathan Bond-Caron" <jbondc@gmail.com> writes:
> > I think part of my problem is I haven't really understood what 'Then 
> > make sure you have the right alignment' means.
> 
> > My approach currently is:
> 
> > After reading HeapTupleHeaderData (23 bytes), I advance another 4 
> > bytes
> > (hoff) and try to read a 32 bit integer (first attribute).
> 
> No.  First you start at the tuple beginning plus the number of bytes 
> indicated by hoff (which should be at least 24).  

Thanks, much appreciated!

I was reading HeapTupleHeaderData as 23 bytes but it's 27 bytes in
access/htup.h?rev=1.87. 

The hoff now makes sense with a value of 28 bytes, and I can start to read
the user data.
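
For anyone following along, here is a rough sketch of how the 27-byte header and the first integer column might be decoded. It rests on several assumptions that should be checked against access/htup.h for the exact server version: the field order (xmin, cmin, xmax, a fourth 4-byte field, the 6-byte ctid, natts, infomask, hoff) is my reading of the 8.2 layout, the data is assumed to be little-endian, and the function names and the sample tuple are made up for illustration.

```python
import struct

# Assumed 8.2 layout, little-endian, no padding between fields:
# 4 x uint32 (xmin, cmin, xmax, cmax/xvac), 3 x uint16 (ctid block hi,
# ctid block lo, ctid position), int16 natts, uint16 infomask, uint8 hoff.
HEADER_FMT = "<4I3HhHB"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 27 bytes


def parse_header(tuple_bytes):
    """Decode the fixed part of the tuple header into a dict."""
    (xmin, cmin, xmax, field4, bi_hi, bi_lo, posid,
     natts, infomask, hoff) = struct.unpack_from(HEADER_FMT, tuple_bytes, 0)
    return {
        "xmin": xmin, "cmin": cmin, "xmax": xmax, "field4": field4,
        "ctid": ((bi_hi << 16) | bi_lo, posid),
        "natts": natts, "infomask": infomask, "hoff": hoff,
    }


def read_first_int4(tuple_bytes):
    """Read a 4-byte integer first attribute, which starts at offset hoff."""
    hoff = parse_header(tuple_bytes)["hoff"]
    return struct.unpack_from("<i", tuple_bytes, hoff)[0]


# Synthetic tuple (not real page data): header with hoff = 28, one padding
# byte to reach offset 28, then the integer value 1.
raw = struct.pack(HEADER_FMT, 13824, 0, 0, 0, 0, 0, 1, 1, 0, 28)
raw += b"\x00" + struct.pack("<i", 1)

print(HEADER_SIZE)           # 27
print(read_first_int4(raw))  # 1
```

The point of the sketch is that the data offset comes from hoff measured from the start of the tuple, not from the header size plus hoff.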