Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAA4eK1L2Vqw90qE5_rpgX+uDpgfmf_KqPsYeZS65fFDzdQVpeA@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Thu, Jul 11, 2019 at 9:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jul 11, 2019 at 12:38 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > I don't like the fact that undoaccess.c has a new global,
> > undo_compression_info.  I haven't read the code thoroughly, but do we
> > really need that?  I think it's never modified (so it could just be
> > declared const),
>
> Actually, this will get modified otherwise across undo record
> insertion how we will know what was the values of the common fields in
> the first record of the page.  Another option could be that every time
> we insert the record, read the value from the first complete undo
> record on the page but that will be costly because for every new
> insertion we need to read the first undo record of the page.
>

This information won't be shared across transactions, so can't we keep
it in top transaction's state?   It seems to me that will be better
than to maintain it as a global state.

Few more comments on this patch:
1.
PrepareUndoInsert()
{
..
+ if (logswitched)
+ {
..
+ }
+ else
+ {
..
+ resize = true;
..
+ }
+
..
+
+ do
+ {
+ bufidx = UndoGetBufferSlot(context, rnode, cur_blk, rbm);
..
+ rbm = RBM_ZERO;
+ cur_blk++;
+ } while (cur_size < size);
+
+ /*
+ * Set/overwrite compression info if required and also exclude the common
+ * fields from the undo record if possible.
+ */
+ if (UndoSetCommonInfo(compression_info, urec, urecptr,
+   context->prepared_undo_buffers[prepared_undo->undo_buffer_idx[0]].buf))
+ resize = true;
+
+ if (resize)
+ size = UndoRecordExpectedSize(urec);

I see that in some cases where resize is possible are checked before
buffer allocation and some are after.  Isn't it better to do all these
checks before buffer allocation?  Also, isn't it better to even
compute changed size before buffer allocation as that might sometimes
help in lesser buffer allocations?

Can you find a better way to write
:context->prepared_undo_buffers[prepared_undo->undo_buffer_idx[0]].buf)?
 It makes the line too long and difficult to understand.  Check for
similar instances in the patch and if possible, change them as well.

2.
+InsertPreparedUndo(UndoRecordInsertContext *context)
{
..
/*
+ * Try to insert the record into the current page. If it
+ * doesn't succeed then recall the routine with the next page.
+ */
+ InsertUndoData(&ucontext, page, starting_byte);
+ if (ucontext.stage == UNDO_PACK_STAGE_DONE)
+ {
+ MarkBufferDirty(buffer);
+ break;
+ }
+ MarkBufferDirty(buffer);
..
}

Can't we call MarkBufferDirty(buffer) just before 'if' check?  That
will avoid calling it twice.

3.
+ * Later, during insert phase we will write actual records into thse buffers.
+ */
+struct PreparedUndoBuffer

/thse/these

4.
+ /*
+ * If we are writing first undo record for the page the we can set the
+ * compression so that subsequent records from the same transaction can
+ * avoid including common information in the undo records.
+ */
+ if (first_complete_undo)

/page the we/page then we

5.
PrepareUndoInsert()
{
..
After
+ * allocation We'll only advance by as many bytes as we turn out to need.
+ */
+ UndoRecordSetInfo(urec);

Change the beginning of comment as: "After allocation, we'll .."

6.
PrepareUndoInsert()
{
..
* TODO:  instead of storing this in the transaction header we can
+ * have separate undo log switch header and store it there.
+ */
+ prevlogurp =
+ MakeUndoRecPtr(UndoRecPtrGetLogNo(prevlog_insert_urp),
+    (UndoRecPtrGetOffset(prevlog_insert_urp) - prevlen));
+

I don't think this TODO is valid anymore because now the patch has a
separate log-switch header.

7.
/*
+ * If undo log is switched then set the logswitch flag and also reset the
+ * compression info because we can use same compression info for the new
+ * undo log.
+ */
+ if (UndoRecPtrIsValid(prevlog_xact_start))

/can/can't

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: partition routing layering in nodeModifyTable.c
Next
From: Alexander Korotkov
Date:
Subject: Re: SQL/JSON path issues/questions