Thread: Reading deleted records - PageHeader v3
Hi,

So first, I’m a pgsql hacker newbie and I’ve been reading up on the storage structure:
http://www.postgresql.org/docs/8.2/interactive/storage-page-layout.html

I’m trying to recover deleted records from a page file (PostgreSQL 8.2), i.e. base/dbId/20132

I am able to successfully read all the header data I need (PageHeaderData, ItemIdData, HeapTupleHeaderData)
but I hit a wall when I try to start reading user data.

This has helped:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/postgres.h?rev=1.77;content-type=text%2Fplain

I’ve read and understood fairly well how varlena structures are stored (plain, compressed, external/toast) but so far I can’t seem to read a plain inline value.
I think part of my problem is I haven’t really understood what ‘Then make sure you have the right alignment’ means.

My approach currently is:
After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes (hoff) and try to read a 32 bit integer (first attribute).

I am expecting to get an integer value 1 but I get 512.

Am I doing this wrong?
Could someone point me to the pgsql code pieces I should be looking at?

If useful, this is the information I have before reading the ‘user data’:

object(PostgreSQL_HeapTupleHeaderData)#14 (7) {
  ["xmin"]=>
  string(5) "13824"
  ["xmax"]=>
  string(1) "0"
  ["cid"]=>
  string(1) "0"
  ["ctid"]=>
  object(PostgreSQL_ItemPointerData)#16 (2) {
    ["blockId"]=>
    string(1) "0"
    ["posId"]=>
    int(0)
  }
  ["infomask2"]=>
  int(0)
  ["infomask"]=>
  int(2)
  ["hoff"]=>
  int(4)
}
object(PostgreSQL_Attribute)#7 (6) {
  ["name"]=>
  string(7) "book_id"
  ["relid"]=>
  int(20132)
  ["len"]=>
  int(4)
  ["num"]=>
  int(1)
  ["ndims"]=>
  int(0)
  ["align"]=>
  string(1) "i"
}
array(1) {
  ["book_id"]=>
  int(512)
}
"Jonathan Bond-Caron" <jbondc@gmail.com> writes: > I think part of my problem is I haven't really understood what 'Then make > sure you have the right alignment' means. > My approach currently is: > After reading HeapTupleHeaderData (23 bytes), I advance another 4 bytes > (hoff) and try to read a 32 bit integer (first attribute). No. First you start at the tuple beginning plus the number of bytes indicated by hoff (which should be at least 24). The first field will always be right there, because this position is always maximally aligned. For subsequent fields you have to advance to a multiple of the alignment requirement of the datatype. For example, assume the table's first column is of type bool (1 byte) and the second column is of type integer. The bool will be at offset hoff, but the integer will be at offset hoff + 4 ... it can't immediately follow the bool, at offset hoff + 1, because that position isn't correctly aligned. It has to start at the next offset that's a multiple of 4. regards, tom lane

On Sat Feb 6 01:20 AM, Tom Lane wrote:
> "Jonathan Bond-Caron" <jbondc@gmail.com> writes:
> > I think part of my problem is I haven't really understood what 'Then
> > make sure you have the right alignment' means.
>
> > My approach currently is:
>
> > After reading HeapTupleHeaderData (23 bytes), I advance another 4
> > bytes (hoff) and try to read a 32 bit integer (first attribute).
>
> No. First you start at the tuple beginning plus the number of bytes
> indicated by hoff (which should be at least 24).

Thanks, much appreciated!

I was reading HeapTupleHeaderData as 23 bytes, but it's 27 bytes in access/htup.h?rev=1.87.

The hoff now makes sense with a value of 28 bytes, and I can start to read the user data.
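
For anyone following the same path, here is a minimal sketch of that final step: take hoff from the tuple header and read the first attribute at that offset. Everything in it is an assumption made for illustration: that t_hoff is the last byte (offset 26) of the 27-byte header mentioned above, that the file was written on a little-endian machine, and that the first column is a non-null 4-byte integer. The function name `read_first_int32` is hypothetical, and the tuple is synthetic rather than taken from a real page file.

```python
import struct

# Assumption: in the 27-byte header layout, t_hoff is the final byte.
T_HOFF_OFFSET = 26


def read_first_int32(tuple_bytes):
    """Return (hoff, value) for a tuple whose first column is an int4."""
    hoff = tuple_bytes[T_HOFF_OFFSET]
    # User data starts at tuple start + hoff; "<i" assumes little-endian.
    (value,) = struct.unpack_from("<i", tuple_bytes, hoff)
    return hoff, value


# Synthetic tuple: a zeroed header padded to 28 bytes with t_hoff = 28,
# followed by the integer 1 as the first attribute.
header = bytearray(28)
header[T_HOFF_OFFSET] = 28
fake_tuple = bytes(header) + struct.pack("<i", 1)

print(read_first_int32(fake_tuple))
```

On the synthetic tuple this prints `(28, 1)`. Real tuples need the null bitmap and infomask checked before the data can be trusted.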