Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Date
Msg-id 561A2AF4-E745-4043-B5FC-9537AEE55701@yandex-team.ru
Whole thread Raw
In response to Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers

> 1 янв. 2021 г., в 23:05, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):
>
> I've found this thread in CF looking for something to review.

We discussed patches with Noah offlist. I'm resending summary to list.

There are two patches:
1. slru-truncate-modulo-v5.patch
2. slru-truncate-t-insurance-v4.patch

It would be a bit easier to review if patches were versioned together (v5 both), because 2nd patch applies on top of
1st.Also 2nd patch have a problem if applying with git (in async.c). 

First patch fixes a bug with possible SLRU truncation around wrapping point too early.
Basic idea of the patch is "If we want to delete a range we must be eligible to delete it's beginning and ending".
So to test if page is deletable it tests that first and last xids(or other SLRU's unit) are of no interest. To test if
asegment is deletable it tests if first and last pages can be deleted. 
Patch adds test in unusual manner: they are implemented as assert functions. Tests are fast, they are only checking
basicand edge cases. But tests will not be run if Postgres is build without asserts. 
I'm a little suspicious of implementation of *PagePrecedes(int page1, int page2) functions. Consider following example
fromthe patch: 

static bool
CommitTsPagePrecedes(int page1, int page2)
{
    TransactionId xid1;
    TransactionId xid2;

    xid1 = ((TransactionId) page1) * COMMIT_TS_XACTS_PER_PAGE;
    xid1 += FirstNormalTransactionId + 1;
    xid2 = ((TransactionId) page2) * COMMIT_TS_XACTS_PER_PAGE;
    xid2 += FirstNormalTransactionId + 1;

    return (TransactionIdPrecedes(xid1, xid2) &&
            TransactionIdPrecedes(xid1, xid2 + COMMIT_TS_XACTS_PER_PAGE - 1));
}

We are adding FirstNormalTransactionId to xids to avoid them being special xids.
We add COMMIT_TS_XACTS_PER_PAGE - 1 to xid2 to shift it to the end of the page. But due to += FirstNormalTransactionId
weshift slightly behind page boundaries and risk that xid2 + COMMIT_TS_XACTS_PER_PAGE - 1 can become
FrozenTransactionId(FirstNormalTransactionId - 1). Thus we add +1 to all values in scope. While the logic is correct,
codingis difficult. Maybe we could just use 
    page1_first_normal_xid = ((TransactionId) page1) * CLOG_XACTS_PER_PAGE + FirstNormalTransactionId;
    page2_first_normal_xid = ((TransactionId) page2) * CLOG_XACTS_PER_PAGE + FirstNormalTransactionId;
    page2_last_xid = ((TransactionId) page2) * CLOG_XACTS_PER_PAGE + CLOG_XACTS_PER_PAGE - 1;
But I'm not insisting on this.


Following comment is not correct for 1Kb and 4Kb pages
+ * At every supported BLCKSZ, (1 << 31) % COMMIT_TS_XACTS_PER_PAGE == 128.

All above notes are not blocking the fix, I just wanted to let committer know about this. I think that it's very
importantto have this bug fixed. 

Second patch converts binary *PagePreceeds() functions to *PageDiff()s and adds logic to avoid deleting pages in
suspiciouscases. This logic depends on the scale of returned by diff value: it expects that overflow happens between
INT_MIN\INT_MAX.This it prevents page deletion if page_diff <= INT_MIN / 2 (too far from current cleaning point; and in
normalcases, of cause).  

It must be comparison here, not equality test.
-   ctl->PagePrecedes(segpage, trunc->earliestExistingPage))
+   ctl->PageDiff(segpage, trunc->earliestExistingPage))

This
int diff_max = ((QUEUE_MAX_PAGE + 1) / 2) - 1,
seems to be functional equivalent of
int diff_max = ((QUEUE_MAX_PAGE - 1) / 2),

What I like about the patch is that it removes all described above trickery around + FirstNormalTransactionId + 1.

AFAICS the overall purpose of the 2nd patch is to help corrupted by other bugs clusters avoid deleting SLRU segments.
I'm a little bit afraid that this kind of patch can hide bugs (while potentially saving some users data). Besides this
patchseems like a useful precaution. Maybe we could emit scary warnings if SLRU segments do not stack into continuous
range?


Thanks!

Best regards, Andrey Borodin.


pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Parallel Inserts in CREATE TABLE AS
Next
From: "matsumura.ryo@fujitsu.com"
Date:
Subject: RE: libpq debug log