Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date
Msg-id CAH2-Wz=oT=PePbGjS2k5p2SivA80J3S865tvCKHPEGOUJPp-+A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
List pgsql-hackers
On Fri, Jan 12, 2018 at 6:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jan 12, 2018 at 8:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> 1.
>> + if (!IsBootstrapProcessingMode() && !indexInfo->ii_Concurrent)
>>   {
>> - snapshot = RegisterSnapshot(GetTransactionSnapshot());
>> - OldestXmin = InvalidTransactionId; /* not used */
>> + OldestXmin = GetOldestXmin(heapRelation, true);
>>
>> I think leader and workers should have the same idea of oldestXmin for
>> the purpose of deciding the visibility of tuples.  I think this is
>> ensured in all form of parallel query as we do share the snapshot,
>> however, same doesn't seem to be true for Parallel Index builds.
>
> Hmm.  Does it break anything if they use different snapshots?  In the
> case of a query that would be disastrous because then you might get
> inconsistent results, but if the snapshot is only being used to
> determine what is and is not dead then I'm not sure it makes much
> difference ... unless the different snapshots will create confusion of
> some other sort.

I think that this is fine. GetOldestXmin() is only used when we have a
ShareLock on the heap relation, and the snapshot is SnapshotAny. We're
only talking about the difference between HEAPTUPLE_DEAD and
HEAPTUPLE_RECENTLY_DEAD here. Indexing a heap tuple when that wasn't
strictly necessary by the time you got to it is normal.

However, it's not okay that GetOldestXmin()'s second argument is true
in the patch, rather than PROCARRAY_FLAGS_VACUUM. That's due to bitrot
that was not caught during some previous rebase (commit af4b1a08
changed the signature). Will fix.

You've given me a lot more to work through in your most recent mail,
Robert. I will probably get the next revision to you on Monday.
Doesn't seem like there is much point in posting what I've done so
far.

-- 
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Mark Rofail
Date:
Subject: Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Next
From: legrand legrand
Date:
Subject: Re: PoC: custom signal handler for extensions