Re: Index Skip Scan - Mailing list pgsql-hackers
From | Dmitry Dolgov |
---|---|
Subject | Re: Index Skip Scan |
Date | |
Msg-id | 20191103164547.oqcvno6467eikizg@localhost Whole thread Raw |
In response to | Re: Index Skip Scan (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Index Skip Scan
|
List | pgsql-hackers |
> On Wed, Sep 25, 2019 at 2:33 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote: > v27-0001-Index-skip-scan.patch > > Some random thoughts on this: Thanks a lot for the commentaries! > * Is _bt_scankey_within_page() really doing the right thing within empty pages? > > It looks like you're accidentally using the high key when the leaf > page is empty with forward scans (assuming that the leaf page isn't > rightmost). You'll need to think about empty pages for both forward > and backward direction scans there. Yes, you're right, that's an issue I need to fix. > Actually, using the high key in some cases may be desirable, once the > details are worked out -- the high key is actually very helpful with > low cardinality indexes. If you populate an index with retail > insertions (i.e. don't just do a CREATE INDEX after the table is > populated), and use low cardinality data in the indexed columns then > you'll see this effect. Can you please elaborate a bit more? I see that using high key will help a forward index scans to access the minimum number of leaf pages, but I'm not following how is it connected to the _bt_scankey_within_page? Or is this commentary related in general to the whole implementation? > * The extra scankeys that you are using in most of the new nbtsearch.c > code is an insertion scankey -- not a search style scankey. I think > that you should try to be a bit clearer on that distinction in > comments. It is already confusing now, but at least there is only zero > or one insertion scankeys per scan (for the initial positioning). > > * There are two _bt_skip() prototypes in nbtree.h (actually, there is > a btskip() and a _bt_skip()). I understand that the former is a public > wrapper of the latter, but I find the naming a little bit confusing. > Maybe rename _bt_skip() to something that is a little bit more > suggestive of its purpose. > > * Suggest running pgindent on the patch. Sure, I'll incorporate mentioned improvements into the next patch version (hopefully soon). > And now some more: > > * I'm confused about this code in _bt_skip(): > Yeah, it shouldn't be there, but rather before _bt_next, that expects unlocked buffer. Will fix. > * It also seems a bit odd that you assume that the scan is > "scan->xs_want_itup", but then check that condition many times. Why > bother? > > * Similarly, why bother using _bt_drop_lock_and_maybe_pin() at all, > rather than just unlocking the buffer directly? We'll only drop the > pin for a scan that is "!scan->xs_want_itup", which is never the case > within _bt_skip(). > > I think that the macros and stuff that manage pins and buffer locks in > nbtsearch.c is kind of a disaster anyway [1]. Maybe there is some > value in trying to be consistent with existing nbtsearch.c code in > ways that aren't strictly necessary. Yep, I've seen this thread, but tried to be consistent with the surrounding core style. Probably it indeed doesn't make sense. > * Not sure why you need this code after throwing an error: > > > else > > { > > elog(ERROR, "Could not read closest index tuples: %d", offnum); > > pfree(so->skipScanKey); > > so->skipScanKey = NULL; > > return false; > > } Unfortunately this is just a leftover from a previous version. Sorry for that, will get rid of it.
pgsql-hackers by date: