Re: GIN improvements part2: fast scan - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part2: fast scan
Date
Msg-id CAPpHfdtSuSoZS4SwYQu4Rc8NOwDHoX3DsLGaxJF9HZ6gW0TdVA@mail.gmail.com
Whole thread Raw
In response to Re: GIN improvements part2: fast scan  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: GIN improvements part2: fast scan  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Wed, Feb 5, 2014 at 1:23 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, Feb 3, 2014 at 6:31 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, Jan 27, 2014 at 7:30 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, Jan 27, 2014 at 2:32 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
Every single change you did in fast scan seems to be reasonable, but testing shows that something went wrong. Simple test with 3 words of different selectivities. 

After applying your patches:
 
# select count(*) from fts_test where fti @@ plainto_tsquery('english', 'gin index select');
 count 
───────
   627
(1 row)

Time: 21,252 ms

In original fast-scan:

# select count(*) from fts_test where fti @@ plainto_tsquery('english', 'gin index select');
 count 
───────
   627
(1 row)

Time: 3,382 ms

I'm trying to get deeper into it.

I had two guesses about why it's become so slower than in my original fast-scan:
1) Not using native consistent function
2) Not sorting entries

I attach two patches which rollback these two features (sorry for awful quality of second). Native consistent function accelerates thing significantly, as expected. Tt seems that sorting entries have almost no effect. However it's still not as fast as initial fast-scan:

# select count(*) from fts_test where fti @@ plainto_tsquery('english', 'gin index select');
 count 
───────
   627
(1 row)

Time: 5,381 ms

Tomas, could you rerun your tests with first and both these patches applied against patches by Heikki?

I found my patch "0005-Ternary-consistent-implementation.patch" to be completely wrong. It introduces ternary consistent function to opclass, but don't uses it, because I forgot to include ginlogic.c change into patch. So, it shouldn't make any impact on performance. However, testing results with that patch significantly differs. That makes me very uneasy. Can we now reproduce exact same?

Right version of these two patches in one against current head is attached. I've rerun tests with it, results are /mnt/sas-raid10/gin-testing/queries/9.4-fast-scan-10. Could you rerun postprocessing including graph drawing?

Sometimes test cases are not what we expect. For example:

=# explain SELECT id FROM messages WHERE body_tsvector @@ to_tsquery('english','(5alpha1-initdb''d)');
                                               QUERY PLAN                                                
─────────────────────────────────────────────────────────────────────────────────────────────────────────
 Bitmap Heap Scan on messages  (cost=84.00..88.01 rows=1 width=4)
   Recheck Cond: (body_tsvector @@ '''5alpha1-initdb'' & ''5alpha1'' & ''initdb'' & ''d'''::tsquery)
   ->  Bitmap Index Scan on messages_body_tsvector_idx  (cost=0.00..84.00 rows=1 width=0)
         Index Cond: (body_tsvector @@ '''5alpha1-initdb'' & ''5alpha1'' & ''initdb'' & ''d'''::tsquery)
 Planning time: 0.257 ms
(5 rows)

5alpha1-initdb'd is 3 gin entries with different frequencies.

Also, these patches are not intended to change relevance ordering speed. When number of results are high, most of time is relevance calculating and sorting. I propose to remove ORDER BY clause from test cases to see scan speed more clear.

I've dump of postgresql.org search queries from Magnus. We can add them to our test case.

It turns out that version 10 contained bug in ternary consistent function implementation for tsvector. Fixed in attached version.

Attached patch is "light" version of fast scan. It does extra consistent function calls only on startScanKey, no extra calls during scan of the index.
It finds subset of rarest entries absence of which guarantee false consistent function result.

I've run real-life tests based of postgresql.org logs (33318 queries). Here is a table with summary time of running whole test case.

=# select method, sum(time) from test_result group by method order by method;
       method        |       sum        
---------------------+------------------
 fast-scan-11        | 126916.111999999
 fast-scan-light     |       137321.211
 heikki              |       466751.693
 heikki-true-ternary | 454113.623999997
 master              |       446976.288
(6 rows)

where 'heikki' is gin-ternary-logic binary-heap preconsistent-only-on-new-page.patch and 'heikki-true-ternary' is version with my catalog changes promoting ternary consistent function to opclass.

We can see fast-scan-light gives almost same performance benefit on "&" queries. However, I test only CPU effect, not IO effect.
Any thoughts?

------
With best regards,
Alexander Korotkov.

Attachment

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Next
From: Alexander Korotkov
Date:
Subject: Re: Fix picksplit with nan values