Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date
Msg-id CAH2-Wzk=y1KVZT-2J9GgqTvBFNo8fc8ecStYDSR=avstJ6R9ZA@mail.gmail.com
In response to Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Sun, Jan 14, 2018 at 8:25 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Jan 14, 2018 at 1:43 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Sat, Jan 13, 2018 at 4:32 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Yeah, but this would mean that now with parallel create index, it is
>>> possible that some tuples from the transaction would end up in index
>>> and others won't.
>>
>> You mean some tuples from some past transaction that deleted a bunch
>> of tuples and committed, but not before someone acquired a still-held
>> snapshot that didn't see the deleter's transaction as committed yet?
>>
>
> I think I am talking about something different.  Let me try to explain
> in some more detail.  Consider a transaction T-1 has deleted two
> tuples from tab-1, first on page-1 and second on page-2 and committed.
> There is a parallel transaction T-2 which has an open snapshot/query
> due to which oldestXmin will be smaller than T-1.   Now, in another
> session, we started parallel Create Index on tab-1 which has launched
> one worker.  The worker decided to scan page-1 and will found that the
> deleted tuple on page-1 is Recently Dead, so will include it in Index.
> In the meantime transaction, T-2 got committed/aborted which allows
> oldestXmin to be greater than the value of transaction T-1 and now
> leader decides to scan the page-2 with freshly computed oldestXmin and
> found that the tuple on that page is Dead and has decided not to
> include it in the index.  So, this leads to a situation where some
> tuples deleted by the transaction will end up in index whereas others
> won't.  Note that I am not arguing that there is any fundamental
> problem with this, but just want to highlight that such a case doesn't
> seem to exist with Create Index.

I must not have done a good job of explaining myself ("You mean some
tuples from some past transaction..."), because this is exactly what I
meant, and exactly how I understood your original remarks from
Saturday.

In summary, while I do agree that this is different to what we see
with serial index builds, I still don't think that this is a concern
for us.
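
To make the scenario concrete, here is a toy sketch. It is not
PostgreSQL code: DeletedTuple, classify, indexed_pages, and the XID
values are all made up for illustration, and XID wraparound is
ignored. It assumes only the rule discussed above, namely that a tuple
deleted by a committed transaction counts as Dead once the deleter's
XID precedes oldestXmin, and as Recently Dead (and so still indexed)
otherwise.

```python
from dataclasses import dataclass


@dataclass
class DeletedTuple:
    page: int
    xmax: int  # XID of the committed deleting transaction (T-1); hypothetical value


def classify(tup: DeletedTuple, oldest_xmin: int) -> str:
    """Toy visibility check: DEAD only if no snapshot can still see the tuple."""
    return "DEAD" if tup.xmax < oldest_xmin else "RECENTLY_DEAD"


def indexed_pages(scans):
    """scans: (tuple, oldestXmin used by whichever participant scanned it)."""
    return [t.page for t, xmin in scans if classify(t, xmin) == "RECENTLY_DEAD"]


T1 = 100  # assumed XID of the deleting transaction
page1 = DeletedTuple(page=1, xmax=T1)
page2 = DeletedTuple(page=2, xmax=T1)

# Serial build: a single oldestXmin (held back by T-2) covers the whole scan.
serial = indexed_pages([(page1, 90), (page2, 90)])

# Parallel build: the worker scans page-1 while T-2 is still open
# (oldestXmin = 90); the leader scans page-2 after T-2 has ended
# (oldestXmin = 110).
parallel = indexed_pages([(page1, 90), (page2, 110)])

print(serial)    # [1, 2]
print(parallel)  # [1]
```

In the parallel case only one of T-1's two deleted tuples ends up in
the index, which is the difference from a serial build that Amit
describes.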

-- 
Peter Geoghegan

