Thread: Pathological performance when inserting many NULLs into a unique index
I thought of a case that results in pathological performance due to a behavior of my nbtree patch series:

regression=# create table uniquenulls(nully int4, constraint pp unique(nully));
CREATE TABLE
Time: 10.694 ms
regression=# insert into uniquenulls select i from generate_series(1, 1e6) i;
INSERT 0 1000000
Time: 1356.025 ms (00:01.356)
regression=# insert into uniquenulls select null from generate_series(1, 1e6) i;
INSERT 0 1000000
Time: 270834.196 ms (04:30.834)

The issue here is that the duration of the second INSERT statement is wildly excessive, because _bt_stepright() needs to step right many times for each tuple inserted. I would expect the second insert to take approximately as long as the first, but it takes ~200x longer here. It could take much longer still if we pushed this example further. What we see here is a limited case in which the O(n^2) performance that "getting tired" used to prevent can occur. Clearly that's totally unacceptable. Mea culpa.

Sure enough, the problem goes away when the index isn't a unique index (i.e. in the UNIQUE_CHECK_NO case):

regression=# alter table uniquenulls drop constraint pp;
ALTER TABLE
Time: 28.968 ms
regression=# create index on uniquenulls (nully);
CREATE INDEX
Time: 1159.958 ms (00:01.160)
regression=# insert into uniquenulls select null from generate_series(1, 1e6) i;
INSERT 0 1000000
Time: 1155.708 ms (00:01.156)

Tentatively, I think that the fix here is to not set "itup_key->scantid = NULL" when a NULL value is involved (i.e. don't set "itup_key->scantid = NULL" when we see IndexTupleHasNulls(itup) within _bt_doinsert()). We may also want to avoid calling _bt_check_unique() entirely in this situation. That way, the performance should be the same as the UNIQUE_CHECK_NO case: we descend to the appropriate leaf page directly, and then we're done.

We don't have to find the appropriate leaf page by groveling through indefinitely many existing leaf pages that are full of NULL values, because we know that there cannot ever be a unique violation for us to detect.

I'll create an open item for this, and begin work on a patch tomorrow.

--
Peter Geoghegan
Re: Pathological performance when inserting many NULLs into a unique index
From: Peter Geoghegan
On Wed, Apr 17, 2019 at 7:37 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Tentatively, I think that the fix here is to not "itup_key->scantid =
> NULL" when a NULL value is involved (i.e. don't "itup_key->scantid =
> NULL" when we see IndexTupleHasNulls(itup) within _bt_doinsert()). We
> may also want to avoid calling _bt_check_unique() entirely in this
> situation.
>
> I'll create an open item for this, and begin work on a patch tomorrow.

I came up with the attached patch, which effectively treats a unique index insertion as if the index was not unique at all in the case where the new tuple is IndexTupleHasNulls() within _bt_doinsert(). This is correct because inserting a new tuple with a NULL value can neither contribute to somebody else's duplicate violation, nor cause a duplicate violation error of its own. I need to test this some more, though I am fairly confident that I have the right idea.

We didn't have this problem before my v12 work due to the "getting tired" strategy, but that's not the full story. We also didn't (and don't) move right within _bt_check_unique() when IndexTupleHasNulls(itup), because _bt_isequal() treats NULL values as not equal to any value, including even NULL -- this meant that there was no useless search to the right within _bt_check_unique(). Actually, the overall effect of how _bt_isequal() is used is that _bt_check_unique() does *no* useful work at all with a NULL scankey.

It's far simpler to remove responsibility for new tuples/scankeys with NULL values from _bt_check_unique() entirely, which is what I propose to do with this patch. The patch actually ends up removing quite a lot more code than it adds, because it removes _bt_isequal() outright. I always thought that nbtinsert.c dealt with NULL values in unique indexes at the wrong level, and I'm glad to see _bt_isequal() go. Vadim's accompanying 1997 comment about "not using _bt_compare in real comparison" seems confused to me.

The new _bt_check_unique() may still need to compare the scankey to some existing, adjacent tuple with a NULL value, but _bt_compare() is perfectly capable of recognizing that that adjacent tuple shouldn't be considered equal. IOW, _bt_compare() is merely incapable of behaving as if "NULL != NULL", which is a bad reason for keeping _bt_isequal() around.

--
Peter Geoghegan
Attachment
Re: Pathological performance when inserting many NULLs into a unique index
From: Michael Paquier
On Wed, Apr 17, 2019 at 07:37:17PM -0700, Peter Geoghegan wrote:
> I'll create an open item for this, and begin work on a patch tomorrow.

Adding an open item is appropriate -- nice that you found this issue. Testing on my laptop with v11, the non-NULL case takes 5s and the NULL case 6s. On HEAD, the non-NULL case takes 6s, and the NULL case takes... I just gave up and cancelled it :)

--
Michael
Attachment
Re: Pathological performance when inserting many NULLs into a unique index
From: Peter Geoghegan
On Thu, Apr 18, 2019 at 5:46 PM Michael Paquier <michael@paquier.xyz> wrote:
> Adding an open item is adapted, nice you found this issue. Testing on
> my laptop with v11, the non-NULL case takes 5s and the NULL case 6s.
> On HEAD, the non-NULL case takes 6s, and the NULL case takes... I
> just gave up and cancelled it :)

This was one of those things that occurred to me out of the blue.

It just occurred to me that the final patch will need to be more careful about non-key attributes in INCLUDE indexes. It's not okay for it to avoid calling _bt_check_unique() just because a non-key attribute was NULL. It should only do that when a key attribute is NULL.

--
Peter Geoghegan
Re: Pathological performance when inserting many NULLs into a unique index
From: Peter Geoghegan
On Thu, Apr 18, 2019 at 8:13 PM Peter Geoghegan <pg@bowt.ie> wrote:
> It just occurred to me that the final patch will need to be more
> careful about non-key attributes in INCLUDE indexes. It's not okay for
> it to avoid calling _bt_check_unique() just because a non-key
> attribute was NULL. It should only do that when a key attribute is
> NULL.

Attached revision does it that way, specifically by adding a new field to the insertion scankey struct (BTScanInsertData). The field is filled in when initializing an insertion scankey in the usual way. I was able to reuse alignment padding for the new field, so there is no additional space overhead on Linux x86-64.

This is a bit more complicated than v1, but there is still a net reduction in overall complexity (and in LOC). I'm going to commit this early next week, barring any objections, and assuming I don't think of any more problems between now and then.

--
Peter Geoghegan
Attachment
Re: Pathological performance when inserting many NULLs into a unique index
From: Peter Geoghegan
On Fri, Apr 19, 2019 at 6:34 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached revision does it that way, specifically by adding a new field
> to the insertion scankey struct (BTScanInsertData).

Pushed.

--
Peter Geoghegan