Thread: Re: Fix for consume_xids advancing XIDs incorrectly
Hi, Thank you for the report. On Mon, Oct 14, 2024 at 6:51 PM Yushi Ogiwara <btogiwarayuushi@oss.nttdata.com> wrote: > > Hi, > > I found that the consume_xids function incorrectly advances XIDs as > shown: > > postgres=# select txid_current(); > txid_current > -------------- > 746 > (1 row) > > postgres=# select consume_xids('100'); > consume_xids > -------------- > 847 > (1 row) > > In the example, the consume_xids function consumes 100 XIDs when XID = > 746, so the desired outcome from consume_xids should be 746 + 100 = 846, > which differs from the actual outcome, 847. > > Behavior inside a transaction block: > > postgres=# select txid_current(); > txid_current > -------------- > 1410 > (1 row) > > postgres=# begin; > BEGIN > postgres=*# select consume_xids('100'); > consume_xids > -------------- > 1511 > (1 row) > postgres=*# select consume_xids('100'); > consume_xids > -------------- > 1521 > (1 row) > > Here, the first call inside the transaction block consumes 100+1 XIDs > (incorrect), while the second call consumes exactly 100 XIDs (as > expected) > > Summary: > > The function performs incorrectly when: > - Outside of a transaction block > - The first call inside a transaction block > But works correctly when: > - After the second call inside a transaction block > > The issue arises because consume_xids does not correctly count the > consumed XIDs when it calls the GetTopTransactionId function, which > allocates a new XID when none has been assigned. I agree with your analysis. I have one comment on the patch: - (void) GetTopTransactionId(); + if(!FullTransactionIdIsValid(GetTopFullTransactionIdIfAny())) + { + (void) GetTopTransactionId(); + consumed++; + } If we count this case as consumed too, I think we need to set the returned value of GetTopTranasctionId() to lastxid. Because otherwise, the return value when passing 1 to the function would be the latest XID at the time but not the last XID consumed by the function. Passing 1 to this function is very unrealistic case, though. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Oct 15, 2024 at 10:06 PM Yushi Ogiwara <btogiwarayuushi@oss.nttdata.com> wrote: > > Hi, > > Thank you for your comment. > > Regarding the first patch, I believe it works correctly when > consume_xids(1) is called. This is because the lastxid variable in the > consume_xids_common function is initialized as lastxid = > ReadNextFullTransactionId(), where the ReadNextFullTransactionId > function returns the (current XID) + 1. But it's possible that concurrent transactions consume XIDs in meanwhile, no? > > Separately, I found that consume_xids(0) does not behave as expected. > Below is an example: > > postgres=# select txid_current(); > txid_current > -------------- > 45496 > (1 row) > > postgres=# select consume_xids(0); > consume_xids > -------------- > 45497 > (1 row) > > postgres=# select consume_xids(0); > consume_xids > -------------- > 45497 > (1 row) > > In the example, the argument to consume_xids is 0, meaning it should not > consume any XIDs. However, the first invocation of consume_xids(0) looks > like unexpectedly consuming 1 XID though it's not consuming actually. > This happens because consume_xids(0) returns the value from > ReadNextFullTransactionId. > > I have updated the patch (skip.diff, attached to this e-mail) to address > this issue. Now, when consume_xids(0) is called, it returns > ReadNextFullTransactionId().value - 1, ensuring no XID is consumed as > shown below: > > postgres=# select txid_current(); > txid_current > -------------- > 45498 > (1 row) > > postgres=# select consume_xids(0); > consume_xids > -------------- > 45498 > (1 row) Hmm, I think if we expect this function to return the last XID that the function actually consumed, calling consume_xids with 0 should raise an error instead. Even if it returns ReadNextFullTransactionId().value - 1 as you proposed, other concurrent transactions might consume XIDs between txid_current() and consume_xids(0), resulting in consume_xids() appearing to have consumed XIDs. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On 2024/10/18 14:57, Yushi Ogiwara wrote: > In conclusion, I agree that: > > * Update lastxid with GetTopTransactionId(). > * consume_xids with 0 should raise an error. > > I have attached a new patch that incorporates your suggestions. One concern in this discussion is that the return value of this function isn't entirely clear. To address this, how about adding a comment at the beginning of consume_xids() like: "Returns the last XID assigned by consume_xids()"? * the cache overflows, but beyond that, we don't keep track of the * consumed XIDs. */ - (void) GetTopTransactionId(); + if(!FullTransactionIdIsValid(GetTopFullTransactionIdIfAny())) + { + lastxid = GetTopTransactionId(); + consumed++; + } How about appending the following to the end of the first paragraph in the above source comments? If no top-level XID is assigned, a new one is obtained, and the consumed XID counter is incremented. if (xids_left > 2000 && consumed - last_reported_at < REPORT_INTERVAL && MyProc->subxidStatus.overflowed) { int64 consumed_by_shortcut = consume_xids_shortcut(); if (consumed_by_shortcut > 0) { consumed += consumed_by_shortcut; continue; } } By the way, this isn't directly related to the proposed patch, but while reading the xid_wraparound code, I noticed that consume_xids_common() could potentially return an unexpected XID if consume_xids_shortcut() returns a value greater than 2000. Based on the current logic, consume_xids_common() should always return a value less than 2000, so the issue I'm concerned about shouldn't occur. Still, would it be worth adding an assertion to ensure that consume_xids_common() never returns a value greater than 2000? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Tue, Oct 22, 2024 at 11:47 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2024/10/18 14:57, Yushi Ogiwara wrote: > > In conclusion, I agree that: > > > > * Update lastxid with GetTopTransactionId(). > > * consume_xids with 0 should raise an error. > > > > I have attached a new patch that incorporates your suggestions. > > One concern in this discussion is that the return value of this function isn't > entirely clear. To address this, how about adding a comment at the beginning of > consume_xids() like: "Returns the last XID assigned by consume_xids()"? Agreed. That value is what I expected this function to return. > > > * the cache overflows, but beyond that, we don't keep track of the > * consumed XIDs. > */ > - (void) GetTopTransactionId(); > + if(!FullTransactionIdIsValid(GetTopFullTransactionIdIfAny())) > + { > + lastxid = GetTopTransactionId(); > + consumed++; > + } > > How about appending the following to the end of the first paragraph in > the above source comments? > > If no top-level XID is assigned, a new one is obtained, > and the consumed XID counter is incremented. Agreed. > > > > if (xids_left > 2000 && > consumed - last_reported_at < REPORT_INTERVAL && > MyProc->subxidStatus.overflowed) > { > int64 consumed_by_shortcut = consume_xids_shortcut(); > > if (consumed_by_shortcut > 0) > { > consumed += consumed_by_shortcut; > continue; > } > } > > By the way, this isn't directly related to the proposed patch, but while reading > the xid_wraparound code, I noticed that consume_xids_common() could potentially > return an unexpected XID if consume_xids_shortcut() returns a value greater > than 2000. Based on the current logic, consume_xids_common() should always return > a value less than 2000, so the issue I'm concerned about shouldn't occur. Good point. Yes, the function doesn't return a value greater than 2000 as long as we do "distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - rem);". But it's true with <= 16KB block sizes. > Still, > would it be worth adding an assertion to ensure that consume_xids_common() never > returns a value greater than 2000? While adding an assertion makes sense to me, another idea is to set last_xid even in the shortcut path. That way, it works even with 32KB block size. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On 2024/10/24 5:23, Masahiko Sawada wrote: >> if (xids_left > 2000 && >> consumed - last_reported_at < REPORT_INTERVAL && >> MyProc->subxidStatus.overflowed) >> { >> int64 consumed_by_shortcut = consume_xids_shortcut(); >> >> if (consumed_by_shortcut > 0) >> { >> consumed += consumed_by_shortcut; >> continue; >> } >> } >> >> By the way, this isn't directly related to the proposed patch, but while reading >> the xid_wraparound code, I noticed that consume_xids_common() could potentially >> return an unexpected XID if consume_xids_shortcut() returns a value greater >> than 2000. Based on the current logic, consume_xids_common() should always return >> a value less than 2000, so the issue I'm concerned about shouldn't occur. > > Good point. Yes, the function doesn't return a value greater than 2000 > as long as we do "distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - > rem);". But it's true with <= 16KB block sizes. > >> Still, >> would it be worth adding an assertion to ensure that consume_xids_common() never >> returns a value greater than 2000? > > While adding an assertion makes sense to me, another idea is to set > last_xid even in the shortcut path. That way, it works even with 32KB > block size. Agreed on making xid_wraparound compatible with a 32k block size. As you pointed out, with 32k blocks, XidSkip() can return values greater than 2000. So, the first step is to adjust the following if-condition by increasing "2000" to a higher value. Since XidSkip() can return up to 3276 (based on COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10) with 32k blocks), we could, for instance, update "2000" to "4000." if (xids_left > 2000 && consumed - last_reported_at < REPORT_INTERVAL && MyProc->subxidStatus.overflowed) To ensure lastxid is set even in the shortcut case, what about modifying XidSkip() so it returns "distance - 1" instead of "distance"? This change would make consume_xids_common() run at least one more loop cycle, triggering GetNewTransactionId() and setting lastxid correctly. consumed = XidSkip(nextXid); if (consumed > 0) TransamVariables->nextXid.value += (uint64) consumed; BTW, the code snippet above in consume_xids_shortcut() could potentially set the next XID to a value below FirstNormalTransactionId? If yes, we should account for FirstNormalFullTransactionId when increasing the next XID, similar to how FullTransactionIdAdvance() handles it. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Thu, Oct 24, 2024 at 7:55 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2024/10/24 5:23, Masahiko Sawada wrote: > >> if (xids_left > 2000 && > >> consumed - last_reported_at < REPORT_INTERVAL && > >> MyProc->subxidStatus.overflowed) > >> { > >> int64 consumed_by_shortcut = consume_xids_shortcut(); > >> > >> if (consumed_by_shortcut > 0) > >> { > >> consumed += consumed_by_shortcut; > >> continue; > >> } > >> } > >> > >> By the way, this isn't directly related to the proposed patch, but while reading > >> the xid_wraparound code, I noticed that consume_xids_common() could potentially > >> return an unexpected XID if consume_xids_shortcut() returns a value greater > >> than 2000. Based on the current logic, consume_xids_common() should always return > >> a value less than 2000, so the issue I'm concerned about shouldn't occur. > > > > Good point. Yes, the function doesn't return a value greater than 2000 > > as long as we do "distance = Min(distance, COMMIT_TS_XACTS_PER_PAGE - > > rem);". But it's true with <= 16KB block sizes. > > > >> Still, > >> would it be worth adding an assertion to ensure that consume_xids_common() never > >> returns a value greater than 2000? > > > > While adding an assertion makes sense to me, another idea is to set > > last_xid even in the shortcut path. That way, it works even with 32KB > > block size. > > Agreed on making xid_wraparound compatible with a 32k block size. As you pointed out, > with 32k blocks, XidSkip() can return values greater than 2000. So, the first step is > to adjust the following if-condition by increasing "2000" to a higher value. > Since XidSkip() can return up to 3276 (based on COMMIT_TS_XACTS_PER_PAGE (BLCKSZ / 10) with 32k blocks), > we could, for instance, update "2000" to "4000." > > if (xids_left > 2000 && > consumed - last_reported_at < REPORT_INTERVAL && > MyProc->subxidStatus.overflowed) > > > To ensure lastxid is set even in the shortcut case, what about modifying XidSkip() > so it returns "distance - 1" instead of "distance"? This change would make > consume_xids_common() run at least one more loop cycle, > triggering GetNewTransactionId() and setting lastxid correctly. Increasing "2000" to "4000" makes sense to me. I think that with this change we don't necessarily need to change XidSkip() to return "distance - 1'. What do you think? > > > consumed = XidSkip(nextXid); > if (consumed > 0) > TransamVariables->nextXid.value += (uint64) consumed; > > BTW, the code snippet above in consume_xids_shortcut() could potentially set > the next XID to a value below FirstNormalTransactionId? If yes, we should account for > FirstNormalFullTransactionId when increasing the next XID, similar to > how FullTransactionIdAdvance() handles it. Good catch. I agree with you. We need to do something similar to what FullTransactionIdAdvance() does so that it does not appear as a special 32-bit XID. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
>> >> >> consumed = XidSkip(nextXid); >> if (consumed > 0) >> TransamVariables->nextXid.value += (uint64) consumed; >> >> BTW, the code snippet above in consume_xids_shortcut() could >> potentially set >> the next XID to a value below FirstNormalTransactionId? If yes, we >> should account for >> FirstNormalFullTransactionId when increasing the next XID, similar to >> how FullTransactionIdAdvance() handles it. > > Good catch. I agree with you. We need to do something similar to what > FullTransactionIdAdvance() does so that it does not appear as a > special 32-bit XID. > > Regards, I think this is not the case since XidSkip returns min(UINT32_MAX - 5 - low, *), which prevents the wrap-around of nextXid. Regards, Yushi Ogiwara
On 2024/10/29 18:00, Yushi Ogiwara wrote: >>> >>> >>> consumed = XidSkip(nextXid); >>> if (consumed > 0) >>> TransamVariables->nextXid.value += (uint64) consumed; >>> >>> BTW, the code snippet above in consume_xids_shortcut() could potentially set >>> the next XID to a value below FirstNormalTransactionId? If yes, we should account for >>> FirstNormalFullTransactionId when increasing the next XID, similar to >>> how FullTransactionIdAdvance() handles it. >> >> Good catch. I agree with you. We need to do something similar to what >> FullTransactionIdAdvance() does so that it does not appear as a >> special 32-bit XID. >> >> Regards, > > I think this is not the case since XidSkip returns min(UINT32_MAX - 5 - low, *), which prevents the wrap-around of nextXid. Yes, you're right. Thanks for the correction! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Wed, Oct 30, 2024 at 12:00 AM Yushi Ogiwara <btogiwarayuushi@oss.nttdata.com> wrote: > > I made a new patch (skip_xid_correctly.diff) that incorporates the > points we discussed: > > 1. Fix the issue that consume_xids consumes nxids+1 XIDs. > 2. Update lastxid when calling GetTopFullTransactionId() to support > nxids==1 case. > 3. Forbid consume_xids when nxids==0. > 4. Add comments explaining the return values of consume_xids and > consume_xids_until, and the rationale for incrementing consumed++ when > GetTopFullTransactionId() is called. > > Also, I made another patch (support_blksz_32k.diff) that supports the > block size == 32K case. > Thank you for making patches! Here are review comments. skip_xid_correctly.diff: - if (nxids < 0) + if (nxids <= 0) elog(ERROR, "invalid nxids argument: %lld", (long long) nxids); - - if (nxids == 0) - lastxid = ReadNextFullTransactionId(); else lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids); After calling elog(ERROR) we don't return here, so the "else" is not necessary. That is, we can rewrite it to: if (nxids <= 0) elog(ERROR, "invalid nxids argument: %lld", (long long) nxids); lastxid = consume_xids_common(InvalidFullTransactionId, (uint64) nxids); --- + * + * If no top-level XID is assigned, a new one is obtained, + * and the consumed XID counter is incremented. I'm not sure this comment is appropriate since what we do here is obvious and the comment doesn't explain why we do that. IMO we don't need to update these comments. support_blksz_32k.diff: - if (xids_left > 2000 && + if (xids_left > COMMIT_TS_XACTS_PER_PAGE && I think it's better to just replace 2000 with 4000 and explain why 4000 is sufficient. Probably we can define '#define XID_SHORTCUT_THRESHOLD 4000" and put an assertion to XidSkip() or consume_xids_common() to check if the number of consumed XIDs doesn't exceed XID_SHORTCUT_THRESHOLD. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Wed, 30 Oct 2024 at 12:01, Yushi Ogiwara <btogiwarayuushi@oss.nttdata.com> wrote: > > I made a new patch (skip_xid_correctly.diff) that incorporates the > points we discussed: > > 1. Fix the issue that consume_xids consumes nxids+1 XIDs. > 2. Update lastxid when calling GetTopFullTransactionId() to support > nxids==1 case. > 3. Forbid consume_xids when nxids==0. > 4. Add comments explaining the return values of consume_xids and > consume_xids_until, and the rationale for incrementing consumed++ when > GetTopFullTransactionId() is called. > > Also, I made another patch (support_blksz_32k.diff) that supports the > block size == 32K case. > > Best, > Yushi Ogiwara > Hi! There is review comments that need to be addressed in [1] Patch status now is waiting on author [1] https://www.postgresql.org/message-id/CAD21AoCthHcSQ5zeeivNpiz7HMi_FPG-dtwDDNYUx2oKG36bCQ%40mail.gmail.com -- Best regards, Kirill Reshke