Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Anastasia Lubennikova |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | 56F0271D.5010004@postgrespro.ru Whole thread Raw |
In response to | Re: WIP: Covering + unique indexes. (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: WIP: Covering + unique indexes.
Re: WIP: Covering + unique indexes. |
List | pgsql-hackers |
19.03.2016 08:00, Peter Geoghegan: > On Fri, Mar 18, 2016 at 5:15 AM, David Steele <david@pgmasters.net> wrote: >> It looks like this patch should be marked "needs review" and I have done so. > Uh, no it shouldn't. I've posted an extensive review on the original > design thread. See CF entry: > > https://commitfest.postgresql.org/9/433/ > > Marked "Waiting on Author". Thanks to David, I've missed these letters at first. I'll answer here. > * You truncate (remove suffix attributes -- the "included" attributes) > within _bt_insertonpg(): > > - right_item = CopyIndexTuple(item); > + indnatts = IndexRelationGetNumberOfAttributes(rel); > + indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel); > + > + if (indnatts != indnkeyatts) > + { > + right_item = index_reform_tuple(rel, item, indnatts, indnkeyatts); > + right_item_sz = IndexTupleDSize(*right_item); > + right_item_sz = MAXALIGN(right_item_sz); > + } > + else > + right_item = CopyIndexTuple(item); > ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY); > > I suggest that you do this within _bt_insert_parent(), instead, iff > the original target page is know to be a leaf page. That's where it > needs to happen for conventional suffix truncation, which has special > considerations when determining which attributes are safe to truncate > (or even which byte in the first distinguishing attribute it is okay > to truncate past) I agree that _bt_insertonpg() is not right for truncation. Furthermore, I've noticed that all internal keys are solely the copies of "High keys" from the leaf pages. Which is pretty logical. Therefore, if we have already truncated the tuple, when it became a High key, we do not need the same truncation within _bt_insert_parent() or any other function. So the only thing to worry about is the HighKey truncation. I rewrote the code. Now only _bt_split cares about truncation. It's a bit more complicated to add it into index creation algorithm. There's a trick with a "high key". /* * We copy the last item on the page into the new page, and then * rearrange the old page so that the 'last item' becomes its high key * rather than a true data item. There had better be at least two * items on the page already, else the page would be empty of useful * data. */ /* * Move 'last' into the high key position on opage */ To be consistent with other steps of algorithm ( all high keys must be truncated tuples), I had to update this high key on place: delete the old one, and insert truncated high key. The very same logic I use to truncate posting list of a compressed tuple in the "btree_compression" patch. [1] I hope, both patches will be accepted, and then I'll thoroughly merge them . > * I think the comparison logic may have a bug. > > Does this work with amcheck? Maybe it works with bt_index_check(), but > not bt_index_parent_check()? I think that you need to make sure that > _bt_compare() knows about this, too. That's because it isn't good > enough to let a truncated internal IndexTuple compare equal to a > scankey when non-truncated attributes are equal. It is a very important issue. But I don't think it's a bug there. I've read amcheck sources thoroughly and found that the problem appears at "invariant_key_less_than_equal_nontarget_offset() static bool invariant_key_less_than_equal_nontarget_offset(BtreeCheckState *state, Page nontarget, ScanKey key, OffsetNumber upperbound) { int16 natts = state->rel->rd_rel->relnatts; int32 cmp; cmp = _bt_compare(state->rel, natts, key, nontarget, upperbound); return cmp <= 0; } It uses scankey, made with _bt_mkscankey() which uses only key attributes, but calls _bt_compare with wrong keysz. If we wiil use nkeyatts = state->rel->rd_index->relnatts; instead of natts, all the checks would be passed successfully. Same for invariant_key_greater_than_equal_offset() and invariant_key_less_than_equal_nontarget_offset(). In my view, it's the correct way to fix this problem, because the caller is responsible for passing proper arguments to the function. Of course I will add a check into bt_compare, but I'd rather make it an assertion (see the patch attached). I'll add a flag to distinguish regular and truncated tuples, but it will not be used in this patch. Please, comment, if I've missed something. As you've already mentioned, neither high keys, nor tuples on internal pages are using "itup->t_tid.ip_posid", so I'll take one bit of it. It will definitely require changes in the future works on suffix truncation or something like that, but IMHO for now it's enough. Do you have any objections or comments? [1] https://commitfest.postgresql.org/9/494/ -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: