RE: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump - Mailing list pgsql-hackers

From Pengchengliu
Subject RE: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date
Msg-id 000801d74d16$150494c0$3f0dbe40$@tju.edu.cn
In response to Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
Hi Greg,
   Thanks a lot for your explanation and your fix.

   I think your fix can resolve the core dump issue, since with your fix the parallel worker process resets
TransactionXmin from the ActiveSnapshot.
   But it will change the transaction snapshot for all parallel scenarios. I don't know whether it will bring in other
issues.
   For testing purposes only, I think it is enough.

   So, can anybody explain exactly what the difference is between ActiveSnapshot and TransactionSnapshot in a parallel
worker process?
   Then maybe we can find a better solution and fix it properly.

Thanks
Pengcheng

-----Original Message-----
From: Greg Nancarrow <gregn4422@gmail.com>
Sent: May 18, 2021 17:15
To: Pengchengliu <pengchengliu@tju.edu.cn>
Cc: Andres Freund <andres@anarazel.de>; PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

On Tue, May 18, 2021 at 11:27 AM Pengchengliu <pengchengliu@tju.edu.cn> wrote:
>
> Hi Greg,
>
>    Actually I am very confused about ActiveSnapshot and TransactionSnapshot. I don't know why the main process sends
>    ActiveSnapshot and TransactionSnapshot separately. And what is the exact difference between them?
>    If you know, could you explain it to me? It would be much appreciated.

In the context of a parallel worker, I am a little confused too, so I can't explain it either.
It is not really explained in the file
"src\backend\access\transam\README.parallel"; it only mentions the following as part of the state that needs to be
copied to each worker:

 - The transaction snapshot.
 - The active snapshot, which might be different from the transaction snapshot.

So they might be different, but exactly when and why?

When I debugged a typical parallel-SELECT case, I found that prior to plan execution, GetTransactionSnapshot() was
called and its return value was stored in both the QueryDesc and the estate (es_snapshot), which was then pushed on the
ActiveSnapshot stack. So by the time InitializeParallelDSM() was called, the (top) ActiveSnapshot was the last snapshot
returned from GetTransactionSnapshot().
So why InitializeParallelDSM() calls GetTransactionSnapshot() again is not clear to me (because isn't the
ActiveSnapshot then a potentially earlier snapshot? - which it shouldn't be, AFAIK. And also, it's then different from
the non-parallel case).
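
The sequence described above can be sketched with a small toy model. This is not PostgreSQL code: the class ToySnapshotManager, its fields, and the concurrent_activity() helper are hypothetical names invented for illustration, and the model assumes READ COMMITTED-like behavior in which every call to get the transaction snapshot produces a fresh one. Under those assumptions it shows how a second call made during DSM setup could return a newer snapshot than the one already sitting on top of the active snapshot stack.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Snapshot:
    xmin: int  # oldest transaction ID still considered running
    xmax: int  # first transaction ID not yet assigned


@dataclass
class ToySnapshotManager:
    """Hypothetical stand-in for a backend's snapshot state (illustration only)."""
    oldest_running_xid: int = 100
    next_xid: int = 105
    active_stack: List[Snapshot] = field(default_factory=list)

    def get_transaction_snapshot(self) -> Snapshot:
        # Assumption: READ COMMITTED-like behavior, a fresh snapshot per call.
        return Snapshot(self.oldest_running_xid, self.next_xid)

    def push_active_snapshot(self, snap: Snapshot) -> None:
        self.active_stack.append(snap)

    def get_active_snapshot(self) -> Snapshot:
        return self.active_stack[-1]

    def concurrent_activity(self, finished: int, started: int) -> None:
        # Other sessions finish and start transactions in the meantime.
        self.oldest_running_xid += finished
        self.next_xid += started


def simulate_leader() -> Tuple[Snapshot, Snapshot]:
    mgr = ToySnapshotManager()

    # Before plan execution: take a snapshot and push it on the active stack.
    query_snapshot = mgr.get_transaction_snapshot()
    mgr.push_active_snapshot(query_snapshot)

    # Time passes; other sessions make progress.
    mgr.concurrent_activity(finished=3, started=4)

    # Parallel DSM setup: the transaction snapshot is requested again,
    # while the active snapshot is read from the top of the stack.
    transaction_snapshot = mgr.get_transaction_snapshot()
    active_snapshot = mgr.get_active_snapshot()
    return transaction_snapshot, active_snapshot


if __name__ == "__main__":
    tsnap, asnap = simulate_leader()
    print("transaction snapshot:", tsnap)
    print("active snapshot:     ", asnap)
```

In this toy run the active snapshot has the smaller xmin, which corresponds to the "potentially earlier snapshot" concern raised above. Whether the real server behaves this way in every isolation level is exactly the open question in this thread.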

>    Until we understand them exactly, I think we should not change the TransactionSnapshot to the ActiveSnapshot in
>    the main process. If we do, the main process should send the ActiveSnapshot only.

I think it would be worth you trying my suggested change (if you have a development environment, which I assume you
have). Sure, IF it was deemed a proper solution, you'd only send the one snapshot, and adjust accordingly in
ParallelWorkerMain(), but we need not worry about that in order to test it.


Regards,
Greg Nancarrow
Fujitsu Australia


