Re: GIN improvements part2: fast scan - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part2: fast scan
Date
Msg-id CAPpHfdvMvEGXXUO0hpj=SU1tbVTpg1cP6kXjkyDqZpOD_9o7zQ@mail.gmail.com
Whole thread Raw
In response to Re: GIN improvements part2: fast scan  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: GIN improvements part2: fast scan  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Thu, Feb 6, 2014 at 2:21 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 02/05/2014 12:42 PM, Alexander Korotkov wrote:
Attached patch is "light" version of fast scan. It does extra consistent
function calls only on startScanKey, no extra calls during scan of the
index.
It finds subset of rarest entries absence of which guarantee false
consistent function result.

Sounds good. Since the extra consistent calls are only performed at startup, it's unlikely to cause any noticeable performance regression, even when it's not helping.


I've run real-life tests based of postgresql.org logs (33318 queries). Here
is a table with summary time of running whole test case.

=# select method, sum(time) from test_result group by method order by
method;
        method        |       sum
---------------------+------------------
  fast-scan-11        | 126916.111999999
  fast-scan-light     |       137321.211
  heikki              |       466751.693
  heikki-true-ternary | 454113.623999997
  master              |       446976.288
(6 rows)

where 'heikki' is gin-ternary-logic binary-heap
preconsistent-only-on-new-page.patch and 'heikki-true-ternary' is version
with my catalog changes promoting ternary consistent function to opclass.

Looks good.


We can see fast-scan-light gives almost same performance benefit on "&"
queries. However, I test only CPU effect, not IO effect.
Any thoughts?

This test doesn't handle lossy-pages correctly:

                /*
                 * Check if all items marked as scanEntryUseForSkip is not present in
                 * minItem. If so, we can correct advancePast.
                 */
                if (ginCompareItemPointers(&minItem, &minItemSkip) < 0)
                {
                        advancePast = minItemSkip;
                        advancePast.ip_posid--;
                        continue;
                }


If minItemSkip is a lossy page, we skip all exact items on the same page. That's wrong, here's a test case to demonstrate:

CREATE TABLE foo (ts tsvector);
INSERT INTO foo SELECT to_tsvector('foo bar'||g) from generate_series(1, 1000000) g;
CREATE INDEX i_foo ON foo USING gin (ts);

set work_mem='64'; -- to force lossy pages
-- should return 111111
select count(*) from foo where ts @@ to_tsquery('foo & bar4:*');

After some fiddling (including fixing the above), I ended up with the attached version of your patch. I still left out the catalog changes, again not because I don't think we should have them, but to split this into smaller patches that can be reviewed separately. I extended the "both ways" shim function to work with more than one "maybe" input. Of course, that is O(n^2), where n is the number of "maybe" arguments, so that won't scale, but it's OK for testing purposes. And many if not most real world queries, too.

I had a very hard time understanding what it really means for an entry to be in the scanEntryUseForSkip array. What does it mean to "use" an entry for skipping?

So I refactored that logic, to hopefully make it more clear. In essence, the entries are divided into two groups, required and other items. If none of the entries in the required set are true, then we know that the overall qual cannot match. See the comments for a more detailed explanation. I'm not wedded to the term "required" here; one might think that *all* the entries in the required set must be TRUE for a match, while it's actually that at least one of them must be TRUE.

In keyGetItem, we fetch the minimum item among all the entries in the requiredEntries set. That's the next item we're going to return, regardless of any items in otherEntries set. But before calling the consistent function, we advance the other entries up to that point, so that we know whether to pass TRUE or FALSE to the consistent function. IOW, otherEntries only provide extra information for the consistent function to decide if we have a match or not - they don't affect which items we return to the caller.

I think this is pretty close to committable state now. I'm slightly disappointed that we can't do the skipping in more scenarios. For example, in "foo & bar", we can currently skip "bar" entry up to the next "foo", but we cannot skip "foo" up to the next "bar". But this is fairly simple, and since we sort the entries by estimated frequency, should already cover all the pathological "rare & frequent" type queries. So that's OK.

Sorry for my scanty explanation. Your version of patch looks good for me. I've rerun my testcase:

=# select method, sum(time) from test_result group by method order by method;
         method         |       sum        
------------------------+------------------
 fast-scan-11           | 126916.111999999
 fast-scan-light        |       137321.211
 fast-scan-light-heikki | 138168.028000001
 heikki                 |       466751.693
 heikki-tru-ternary     | 454113.623999997
 master                 |       446976.288
 tri-state              |       444725.515
(7 rows)

Difference is very small. For me, it looks ready for commit.

------
With best regards,
Alexander Korotkov.

pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Retain dynamic shared memory segments for postmaster lifetime
Next
From: Heikki Linnakangas
Date:
Subject: Small GIN optimizations (after 9.4)