Home > mailing lists

Re: Undo logs - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Undo logs
Date	November 17, 2018 11:42:01
Msg-id	CAA4eK1+_R8Mgr0un=wr-v6rre3e_suQvQaWwit451TFY3-qXOw@mail.gmail.com Whole thread
In response to	Re: Undo logs (Dilip Kumar <dilipbalaut@gmail.com>)
Responses	Re: Undo logs
List	pgsql-hackers

Tree view

On Fri, Nov 16, 2018 at 9:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 12:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> Updated patch (merged latest code from the zheap main branch [1]).
>

Review comments:
-------------------------------
1.
UndoRecordPrepareTransInfo()
{
..
+ /*
+ * The absence of previous transaction's undo indicate that this backend
+ * is preparing its first undo in which case we have nothing to update.
+ */
+ if (!UndoRecPtrIsValid(prev_xact_urp))
+ return;
..
}

It is expected that the caller of UndoRecPtrIsValid should have
discard lock, but I don't see that how this the call from this place
ensures the same?

2.
UndoRecordPrepareTransInfo()
{
..
/*
+ * The absence of previous transaction's undo indicate that this backend
+ * is preparing its first undo in which case we have nothing to update.
+ */
+ if (!UndoRecPtrIsValid(prev_xact_urp))
+ return;
+
+ /*
+ * Acquire the discard lock before accessing the undo record so that
+ * discard worker doesn't remove the record while we are in process of
+ * reading it.
+ */
+ LWLockAcquire(&log->discard_lock, LW_SHARED);
+
+ if (!UndoRecordIsValid(log, prev_xact_urp))
+ return;
..
}

I don't understand this logic where you are checking the same
information with and without a lock, is there any reason for same?  It
seems we don't need the first call to  UndoRecPtrIsValid is not
required.

3.
UndoRecordPrepareTransInfo()
{
..
+ while (true)
+ {
+ bufidx = InsertFindBufferSlot(rnode, cur_blk,
+   RBM_NORMAL,
+   log->meta.persistence);
+ prev_txn_info.prev_txn_undo_buffers[index] = bufidx;
+ buffer = undo_buffer[bufidx].buf;
+ page = BufferGetPage(buffer);
+ index++;
+
+ if (UnpackUndoRecord(&prev_txn_info.uur, page, starting_byte,
+ &already_decoded, true))
+ break;
+
+ starting_byte = UndoLogBlockHeaderSize;
+ cur_blk++;
+ }

Can you write some commentary on what this code is doing?

There is no need to use index++; as a separate statement, you can do
it while assigning the buffer in that index.

4.
+UndoRecordPrepareTransInfo(UndoRecPtr urecptr, bool log_switched)
+{
+ UndoRecPtr prev_xact_urp;

I think you can simply name this variable as xact_urp.  All this and
related prev_* terminology used for variables seems confusing to me. I
understand that you are trying to update the last transactions undo
record information, but you can explain that via comments.  Keeping
such information as part of variable names not only makes their length
longer, but is also confusing.

5.
/*
+ * Structure to hold the previous transaction's undo update information.
+ */
+typedef struct PreviousTxnUndoRecord
+{
+ UndoRecPtr prev_urecptr; /* prev txn's starting urecptr */
+ int prev_txn_undo_buffers[MAX_BUFFER_PER_UNDO];
+ UnpackedUndoRecord uur; /* prev txn's first undo record. */
+} PreviousTxnInfo;
+
+static PreviousTxnInfo prev_txn_info;

Due to reasons mentioned in point-4, lets name the structure and it's
variables as below:

typedef struct XactUndoRecordInfo
{
UndoRecPtr start_urecptr; /* prev txn's starting urecptr */
int idx_undo_buffers[MAX_BUFFER_PER_UNDO];
UnpackedUndoRecord first_uur; /* prev txn's first undo record. */
} XactUndoRecordInfo;

static XactUndoRecordInfo xact_ur_info;

6.
+static int
+InsertFindBufferSlot(RelFileNode rnode,

The name of this function is not clear, can we change it to
UndoGetBufferSlot or UndoGetBuffer?

7.
+UndoRecordAllocateMulti(UnpackedUndoRecord *undorecords, int nrecords,
+ UndoPersistence upersistence, TransactionId txid)
{
..
+ /*
+ * If this is the first undo record for this transaction then set the
+ * uur_next to the SpecialUndoRecPtr.  This is the indication to allocate
+ * the space for the transaction header and the valid value of the uur_next
+ * will be updated while preparing the first undo record of the next
+ * transaction.
+ */
+ first_rec_in_recovery = InRecovery && IsTransactionFirstRec(txid);
..
}

I think it will be better if we move this comment few lines down:
+ if (need_start_undo && i == 0)
+ {
+ urec->uur_next = SpecialUndoRecPtr;

BTW, is the only reason set a special value (SpecialUndoRecPtr) for
uur_next is for allocating transaction header? If so, can't we
directly set the corresponding flag (UREC_INFO_TRANSACTION) in
uur_info and then remove it from UndoRecordSetInfo?

I think it would have been better if there is one central location to
set uur_info, but as that is becoming tricky,
we should not try to add some special stuff to make it possible.

8.
+UndoRecordExpectedSize(UnpackedUndoRecord *uur)
+{
+ Size size;
+
+ /* Fixme : Temporary hack to allow zheap to set some value for uur_info. */
+ /* if (uur->uur_info == 0) */
+ UndoRecordSetInfo(uur);

Can't we move UndoRecordSetInfo in it's caller
(UndoRecordAllocateMulti)?  It seems another caller of this function
doesn't expect this.  If we do that way, then we can have an Assert
for non-zero uur_info in UndoRecordExpectedSize.

9.
bool
+InsertUndoRecord(UnpackedUndoRecord *uur, Page page,
+ int starting_byte, int *already_written, bool header_only)
+{
+ char   *writeptr = (char *) page + starting_byte;
+ char   *endptr = (char *) page + BLCKSZ;
+ int my_bytes_written = *already_written;
+
+ if (uur->uur_info == 0)
+ UndoRecordSetInfo(uur);

Do we really need UndoRecordSetInfo here?  If not, then just add an
assert for non-zero uur_info?

10
UndoRecordAllocateMulti()
{
..
else
+ {
+ /*
+ * It is okay to initialize these variables as these are used only
+ * with the first record of transaction.
+ */
+ urec->uur_next = InvalidUndoRecPtr;
+ urec->uur_xidepoch = 0;
+ urec->uur_dbid = 0;
+ urec->uur_progress = 0;
+ }
+
+
+ /* calculate the size of the undo record. */
+ size += UndoRecordExpectedSize(urec);
+ }

Remove one extra line before comment "calculate the size of ..".

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Alvaro Herrera
Date: 17 November 2018, 11:16:53
Subject: Re: New function pg_stat_statements_reset_query() to resetstatistics of a specific query

From: Hakan Kocaman
Date: 17 November 2018, 11:47:43
Subject: Re: zheap: a new storage format for PostgreSQL

Re: Undo logs - Mailing list pgsql-hackers

Previous

Next