Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date
Msg-id CAJcOf-dVeKS2YJaS_BVdF7ghQ=xFj2d68TzGjYCvzo8cM0FtDQ@mail.gmail.com
Whole thread Raw
In response to RE: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  ("Pengchengliu" <pengchengliu@tju.edu.cn>)
Responses Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Thu, May 20, 2021 at 11:18 AM Pengchengliu <pengchengliu@tju.edu.cn> wrote:
>
> Hi Greg,
>    Thanks a lot for you explanation and your fix.
>
>    I think your fix can resolve the core dump issue. As with your fix, parallel process reset Transaction Xmin from
ActiveSnapshot.
>    But it will change Transaction snapshot for all parallel  scenarios. I don't know whether it bring in other
issue.
>    For test only, I think it is enough.
>
>    So is there anybody can explain what's exactly difference between ActiveSnapshot and TransactionSnapshot in
parallelwork process.
 
>    Then maybe we can find a better solution and try to fix it really.
>

I am proposing the attached patch to fix this issue (patches for both
PG13.2 and latest PG14 source are provided).

Perhaps this will trigger others who better know the intricacies of
snapshot handling, and know the reasons and history behind why the
transaction_snapshot and active_snapshot are currently passed
separately to parallel workers, to speak up here and discuss the issue
further, and check my fix (and note, I haven't attempted to modify
README.parallel in the patch until I get further clarification on this
issue).
Perhaps someone can explain the purpose of calling
GetTransactionSnapshot() AGAIN in InitializeParallelDSM() and how is
this consistent with the current ActiveSnapshot?
AFAICS, that doesn't seem correct, and that's why the patch removes it.

I've rebuilt Postgres with the patch applied and run the regression
tests, with and without "force_parallel_mode=regress", and all tests
pass.
So if the fix is wrong, no tests currently exist that detect issues with it.

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set
Next
From: Nitin Jadhav
Date:
Subject: Re: Query about time zone patterns in to_char