Re: BUG #18212: Functions txid_status() and pg_xact_status() return invalid status of the specified transaction - Mailing list pgsql-bugs

From Kyotaro Horiguchi
Subject Re: BUG #18212: Functions txid_status() and pg_xact_status() return invalid status of the specified transaction
Date
Msg-id 20231128.142038.1207053663419830524.horikyota.ntt@gmail.com
In response to Re: BUG #18212: Functions txid_status() and pg_xact_status() return invalid status of the specified transaction  (Karina Litskevich <litskevichkarina@gmail.com>)
Responses Re: BUG #18212: Functions txid_status() and pg_xact_status() return invalid status of the specified transaction  (Karina Litskevich <litskevichkarina@gmail.com>)
List pgsql-bugs
Good catch!

At Fri, 24 Nov 2023 14:38:23 +0300, Karina Litskevich <litskevichkarina@gmail.com> wrote in 
> one epoch older, it's far in the past. For newer ids, it's transaction id
> part without an epoch is compared to oldestClogXid, but it's a modulo-2^32
> comparison, so only 2^31 xids before oldestClogXid are considered preceding
> it. Thus, all full transaction ids between (next full transaction id - 2^32)
> and (oldestClogXid - 2^31) are mistakenly considered to be in the recent
> past.

I'm not entirely sure about the specific range where the error occurs,
but I do believe that the second and third lines of the problematic
comparison are invalid. There seems to be confusion between the epoch
boundary of full XIDs and the wraparound boundary of 32-bit XIDs.
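
To make the two boundaries concrete, here is a minimal standalone sketch. It is not the code from the tree or from the patch: the names xid32, xid32_precedes and fxid_precedes are hypothetical, and it only assumes the usual signed-difference rule for comparing 32-bit XIDs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit XID; not PostgreSQL's own typedef. */
typedef uint32_t xid32;

/*
 * Modulo-2^32 comparison: a precedes b when the signed 32-bit difference
 * (a - b) is negative.  Only the 2^31 values "behind" b count as preceding
 * it; everything else is treated as being in b's future.
 */
static bool
xid32_precedes(xid32 a, xid32 b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}

/*
 * Full XIDs (epoch in the high 32 bits, xid in the low 32 bits) never wrap
 * in practice, so a plain 64-bit comparison is enough.
 */
static bool
fxid_precedes(uint64_t a, uint64_t b)
{
    return a < b;
}
```

For example, take a full XID with epoch 0 and xid 1000000000, and an oldest XID of 100 that belongs to epoch 1. As full XIDs the former clearly precedes the latter, but xid32_precedes(1000000000, 100) is false, because the epoch-0 value lies more than 2^31 XIDs behind. The 32-bit test alone cannot tell "one epoch older" from "slightly newer".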

> In the attached patch I suggest calculating an epoch for oldestClogXid
> assuming it's not much older than next full transaction id, make a full
> transaction id for oldestClogXid, and then just compare it to the given
> full transaction id.

I considered bringing this down to a comparison of 32-bit XIDs, but
couldn't come up with a clean method. Therefore, using full XID seems
to be the right approach. However, it seems like there is an error in
the XID comparison condition. There are cases where oldest_xid and
now_epoch_next_xid can have the same value. If we skip running
txid_current() in the repro in your previous mail and directly
execute txid_status(3), it leads to an assertion failure.
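
As a rough illustration of the full-XID approach, the sketch below widens a 32-bit oldest XID using the next full XID. It is not the attached patch: the names widen_oldest_xid, xid32 and fxid64 are hypothetical, and it assumes the oldest XID is never more than one wraparound (2^32 XIDs) behind the next full XID. Note that the sanity check uses <= so that the equal-value case is accepted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 32-bit XID and a 64-bit full XID. */
typedef uint32_t xid32;
typedef uint64_t fxid64;

/*
 * Widen oldest_xid to a full XID.  Assumption: oldest_xid is at most one
 * wraparound behind next_fxid, so it is either in the same epoch as
 * next_fxid or in the previous one.
 */
static fxid64
widen_oldest_xid(xid32 oldest_xid, fxid64 next_fxid)
{
    uint32_t next_epoch = (uint32_t) (next_fxid >> 32);
    xid32    next_xid = (xid32) next_fxid;
    uint32_t epoch = next_epoch;
    fxid64   oldest_fxid;

    /*
     * A numerically larger 32-bit value cannot belong to the current epoch,
     * so it must come from the previous one.  (In a valid state this never
     * happens while next_epoch is 0.)
     */
    if (oldest_xid > next_xid)
        epoch--;

    oldest_fxid = ((fxid64) epoch << 32) | oldest_xid;

    /* Equality is legitimate: the oldest XID may be the next XID itself. */
    assert(oldest_fxid <= next_fxid);

    return oldest_fxid;
}
```

With this, widen_oldest_xid(3, (1ULL << 32) | 3) returns the next full XID unchanged instead of tripping the check, and the result can be compared directly with the full XID passed to the function.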

Also, I feel the comments could be more straightforward and simple,
like this:

> Convert oldest_xid into a full XID to compare with the given
> XID. Although it's guaranteed that the oldest and newest XIDs
> are within the XID wraparound distance, they may have different
> epochs.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


