Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAJ3gD9cer=Frzff6B2nE1b2w9Orif+o90vKdBRsZUvz9f1YzwA@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Robert Haas <robertmhaas@gmail.com>)
Re: POC: Cleaning up orphaned files using undo logs  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Thu, 9 May 2019 at 12:04, Dilip Kumar <dilipbalaut@gmail.com> wrote:

> Patches can be applied on top of undo branch [1] commit:
> (cb777466d008e656f03771cf16ec7ef9d6f2778b)
>
> [1] https://github.com/EnterpriseDB/zheap/tree/undo

Below are some review points for 0009-undo-page-consistency-checker.patch :


+ /* Calculate the size of the partial record. */
+ partial_rec_size = UndoRecordHeaderSize(phdr->uur_info) +
+ phdr->tuple_len + phdr->payload_len -
+ phdr->record_offset;

There is already an UndoPagePartialRecSize() function which calculates
the size of partial record, which seems to do the same as above. If
this is same, you can omit the above code, and instead down below
where you increment next_record, you can do "next_record +=
UndoPagePartialRecSize()".

Also, I see an extra  sizeof(uint16) added in
UndoPagePartialRecSize(). Not sure which one is correct, and which one
wrong, unless I am wrong in assuming that the above calculation and
the function definition do the same thing.

------------------

+ * We just want to mask the cid in the undo record header.  So
+ * only if the partial record in the current page include the undo
+ * record header then we need to mask the cid bytes in this page.
+ * Otherwise, directly jump to the next record.
Here, I think you mean : "So only if the partial record in the current
page includes the *cid* bytes", rather than "includes the undo record
header"
May be we can say :
We just want to mask the cid. So do the partial record masking only if
the current page includes the cid bytes from the partial record
header.

----------------

+ if (phdr->record_offset < (cid_offset + sizeof(CommandId)))
+ {
+    char    *cid_data;
+    Size mask_size;
+
+    mask_size = Min(cid_offset - phdr->record_offset,
+    sizeof(CommandId));
+
+    cid_data = next_record + cid_offset - phdr->record_offset;
+    memset(&cid_data, MASK_MARKER, mask_size);
+
Here, if record_offset lies *between* cid start and cid end, then
cid_offset - phdr->record_offset will be negative, and so will be
mask_size. Probably abs() should do the work.

Also, an Assert(cid_data + mask_size <= page_end) would be nice. I
know cid position of a partial record cannot go beyond the page
boundary, but it's better to have this Assert to do sanity check.

+ * Process the undo record of the page and mask their cid filed.
filed => field

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



pgsql-hackers by date:

Previous
From: Antonin Houska
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Next
From: Tomas Vondra
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)