Re: Making all nbtree entries unique by having heap TIDs participatein comparisons - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Making all nbtree entries unique by having heap TIDs participatein comparisons
Date
Msg-id d66ba339-86eb-fb95-a123-92aec00816a7@iki.fi
Whole thread Raw
In response to Re: Making all nbtree entries unique by having heap TIDs participatein comparisons  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On 07/03/2019 15:41, Heikki Linnakangas wrote:
> On 07/03/2019 14:54, Peter Geoghegan wrote:
>> On Wed, Mar 6, 2019 at 10:15 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> After staring at the first patch for bit longer, a few things started to
>>> bother me:
>>>
>>> * The new struct is called BTScanInsert, but it's used for searches,
>>> too. It makes sense when you read the README, which explains the
>>> difference between "search scan keys" and "insertion scan keys", but now
>>> that we have a separate struct for this, perhaps we call insertion scan
>>> keys with a less confusing name. I don't know what to suggest, though.
>>> "Positioning key"?
>>
>> I think that insertion scan key is fine. It's been called that for
>> almost twenty years. It's not like it's an intuitive concept that
>> could be conveyed easily if only we came up with a new, pithy name.
> 
> Yeah. It's been like that forever, but I must confess I hadn't paid any
> attention to it, until now. I had not understood that text in the README
> explaining the difference between search and insertion scan keys, before
> looking at this patch. Not sure I ever read it with any thought. Now
> that I understand it, I don't like the "insertion scan key" name.
> 
>> BTW, I would like to hear what you think of the idea of minusinfkey
>> (and the !minusinfkey optimization) specifically.
> 
> I don't understand it :-(. I guess that's valuable feedback on its own.
> I'll spend more time reading the code around that, but meanwhile, if you
> can think of a simpler way to explain it in the comments, that'd be good.
> 
>>> The new BTInsertStateData struct also
>>> holds the current buffer we're considering to insert to, and a
>>> 'bounds_valid' flag to indicate if the saved bounds are valid for the
>>> current buffer. That way, it's more straightforward to clear the
>>> 'bounds_valid' flag whenever we move right.
>>
>> I'm not sure that that's an improvement. Moving right should be very
>> rare with my patch. gcov shows that we never move right here anymore
>> with the regression tests, or within _bt_check_unique() -- not once.
> 
> I haven't given performance much thought, really. I don't expect this
> method to be any slower, but the point of the refactoring is to make the
> code easier to understand.
> 
>>> /*
>>>     * Do the insertion. First move right to find the correct page to
>>>     * insert to, if necessary. If we're inserting to a non-unique index,
>>>     * _bt_search() already did this when it checked if a move to the
>>>     * right was required for leaf page.  Insertion scankey's scantid
>>>     * would have been filled out at the time. On a unique index, the
>>>     * current buffer is the first buffer containing duplicates, however,
>>>     * so we may need to move right to the correct location for this
>>>     * tuple.
>>>     */
>>> if (checkingunique || itup_key->heapkeyspace)
>>>           _bt_findinsertpage(rel, &insertstate, stack, heapRel);
>>>
>>> newitemoff = _bt_binsrch_insert(rel, &insertstate);
>>>
>>> _bt_insertonpg(rel, insertstate.buf, InvalidBuffer, stack, itup,
>>> newitemoff, false);
>>>
>>> Does this make sense?
>>
>> I guess you're saying this because you noticed that the for (;;) loop
>> in _bt_findinsertloc() doesn't do that much in many cases, because of
>> the fastpath.
> 
> The idea is that _bt_findinsertpage() would not need to know whether the
> unique checks were performed or not. I'd like to encapsulate all the
> information about the "insert position we're considering" in the
> BTInsertStateData struct. Passing 'checkingunique' as a separate
> argument violates that, because when it's set, the key means something
> slightly different.
> 
> Hmm. Actually, with patch #2, _bt_findinsertloc() could look at whether
> 'scantid' is set, instead of 'checkingunique'. That would seem better.
> If it looks like 'checkingunique', it's making the assumption that if
> unique checks were not performed, then we are already positioned on the
> correct page, according to the heap TID. But looking at 'scantid' seems
> like a more direct way of getting the same information. And then we
> won't need to pass the 'checkingunique' flag as an "out-of-band" argument.
> 
> So I'm specifically suggesting that we replace this, in _bt_findinsertloc:
> 
>         if (!checkingunique && itup_key->heapkeyspace)
>             break;
> 
> With this:
> 
>         if (itup_key->scantid)
>             break;
> 
> And remove the 'checkingunique' argument from _bt_findinsertloc.

Ah, scratch that. By the time we call _bt_findinsertloc(), scantid has 
already been restored, even if it was not set originally when we did 
_bt_search.

My dislike here is that passing 'checkingunique' as a separate argument 
acts like a "modifier", changing slightly the meaning of the insertion 
scan key. If it's not set, we know we're positioned on the correct page. 
Otherwise, we might not be. And it's a pretty indirect way of saying 
that, as it also depends 'heapkeyspace'. Perhaps add a flag to 
BTInsertStateData, to indicate the same thing more explicitly. Something 
like "bool is_final_insertion_page; /* when set, no need to move right */".

- Heikki


pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: pg_basebackup against older server versions
Next
From: Fabien COELHO
Date:
Subject: Re: Re: [HACKERS] proposal: schema variables