Re: Avoid full GIN index scan when possible - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Avoid full GIN index scan when possible
Date
Msg-id CAPpHfdusQSm6Nx+hipY+ArU0Udr8ZYivWvoCx2AT2Yvujd=WPA@mail.gmail.com
Whole thread Raw
In response to Re: Avoid full GIN index scan when possible  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Avoid full GIN index scan when possible
List pgsql-hackers
Hi!

Thank you for feedback!

On Sat, Jan 11, 2020 at 3:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Sat, Jan 11, 2020 at 1:10 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> >
> > On Fri, Jan 10, 2020 at 7:36 PM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > >
> > > On Fri, Jan 10, 2020 at 6:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>
> > >> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > >> > So, I think v10 is a version of patch, which can be committed after
> > >> > some cleanup.  And we can try doing better nulls handling in a separate
> > >> > patch.
> > >>
> > >> The cfbot reports that this doesn't pass regression testing.
> > >> I haven't looked into why not.
> > >
> > >
> > > Thank you for noticing.  I'll take care of it.
> >
> >
> > In the v10 I've fixed a bug with nulls handling, but it appears that
> > test contained wrong expected result.  I've modified this test so that
> > it directly compares sequential scan results with bitmap indexscan
> > results.
>
> Thanks a lot for working on that!  I'm not very familiar with gin
> internals so additional eyes are definitely needed here but I agree
> that this approach is simpler and cleaner.  I didn't find any problem
> with the modified logic, the patch applies cleanly, compiles without
> warning and all regression tests pass, so it all seems good to me.
>
> Here are a few comments:
>
> - In keyGetItem(), it seems that some comments would need to be
> updated wrt. the new excludeOnly flag.  I'm thinking of:
>
>      * Ok, we now know that there are no matches < minItem.
>      *
>      * If minItem is lossy, it means that there were no exact items on the
>      * page among requiredEntries, because lossy pointers sort after exact
>      * items. However, there might be exact items for the same page among
>      * additionalEntries, so we mustn't advance past them.
>
> and
>
>         /*
>          * Normally, none of the items in additionalEntries can have a curItem
>          * larger than minItem. But if minItem is a lossy page, then there
>          * might be exact items on the same page among additionalEntries.
>          */        if (ginCompareItemPointers(&entry->curItem, &minItem) < 0)
>         {
>             Assert(ItemPointerIsLossyPage(&minItem) || key->nrequired == 0);
>             minItem = entry->curItem;
>         }

Sure, thank you for pointing.  I'm working on improving comments.
I'll provide updated patch soon.

> While at it, IIUC only excludeOnly key can have nrequired == 0 (if
> that's the case, this could be explicitly said in startScanKey
> relevant comment), so it'd be more consistent to also use excludeOnly
> rather than nrequired in this assert?

Make sense.  I'll adjust the assert as well as comment.

> - the pg_trgm regression tests check for the number of rows returned
> with the new "excludeOnly" permutations, but only with an indexscan,
> should we make sure that the same results are returned with a seq
> scan?

Yes, I recently fixed similar issue in gin regression test.  I'll
adjust pg_trgm test as well.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)
Next
From: Tom Lane
Date:
Subject: Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)