Re: Patch for fast gin cache performance improvement - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Patch for fast gin cache performance improvement
Msg-id 006801cebab8$b6352c90$229f85b0$@lab.ntt.co.jp
In response to Patch for fast gin cache performance improvement  (Ian Link <ian@ilink.io>)
List pgsql-hackers
Hi Ian,

> This patch contains a performance improvement for the fast gin cache. As you
> may know, the performance of the fast gin cache decreases with its size.
> Currently, the size of the fast gin cache is tied to work_mem. The size of
> work_mem can often be quite high. The large size of work_mem is inappropriate
> for the fast gin cache size. Therefore, we created a separate cache size
> called gin_fast_limit. This global variable controls the size of the fast gin
> cache, independently of work_mem. Currently, the default gin_fast_limit is set
> to 128kB. However, that value could need tweaking. 64kB may work better, but
> it's hard to say with only my single machine to test on.

> On my machine, this patch results in a nice speed up. Our test queries improve
> from about 0.9 ms to 0.030 ms. Please feel free to use the test case yourself:
> it should be attached. I can look into additional test cases (tsvectors) if
> anyone is interested.

> In addition to the global limit, we have provided a per-index limit:
> fast_cache_size. This per-index limit begins at -1, which means that it is
> disabled. If the user does not specify a per-index limit, the index will
> simply use the global limit.

I had a look over this patch.  I think this patch is interesting and very
useful.  Here are my review points:

1. Patch applies cleanly.
2. make, make install, and make check all succeed.
3. I ran a performance evaluation using your test queries with gin_fast_limit
(or fast_cache_size) set to 64kB and 128kB, and saw that both values achieved
performance gains over gin_fast_limit = '256MB'.  64kB worked better than 128kB:
with 64kB, the test queries improved from 1.057 ms to 0.075 ms.  Great!
4. In my understanding, a small value of gin_fast_limit/fast_cache_size
increases GIN search performance but decreases GIN update performance.  Am I
right?  If so, I think the tradeoff should be noted in the documentation.
5. The following sections in Chapter 57. GIN Indexes need to be updated:
   * 57.3.1. GIN Fast Update Technique
   * 57.4. GIN Tips and Tricks
6. I would like to see the results for the additional test cases (tsvectors).
7. The commented-out elog() code should be removed.
8. I think there are no issues in this patch.  However, I have one question: how
does this patch work in the case where gin_fast_limit/fast_cache_size = 0?  In
this case, in my understanding, this patch inserts new entries into the pending
list temporarily and immediately moves them to the main GIN data structure using
ginInsertCleanup().  Am I right?  If so, that is obviously inefficient.
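
To make points 4 and 8 concrete, here is a toy model in C of how I understand
the limit to behave.  It is only a sketch and not code from the patch: the
function names effective_limit_kb() and count_cleanups() are hypothetical, and
the rule "clean up once the pending list exceeds the limit" is my assumption
about the trigger condition.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick the limit that applies to one index.
 * A per-index value of -1 means "not set", so the global limit is used.
 */
static int
effective_limit_kb(int per_index_kb, int global_kb)
{
    assert(per_index_kb >= -1 && global_kb >= 0);
    return (per_index_kb == -1) ? global_kb : per_index_kb;
}

/*
 * Toy model: insert n_entries entries of entry_kb each into a pending list
 * and count how many cleanups are triggered.  Assumption: a cleanup runs as
 * soon as the pending list grows beyond limit_kb, and it empties the list.
 */
static size_t
count_cleanups(size_t n_entries, int entry_kb, int limit_kb)
{
    size_t cleanups = 0;
    int    pending_kb = 0;

    for (size_t i = 0; i < n_entries; i++)
    {
        pending_kb += entry_kb;
        if (pending_kb > limit_kb)
        {
            cleanups++;         /* stands in for a ginInsertCleanup() call */
            pending_kb = 0;
        }
    }
    return cleanups;
}
```

In this model, count_cleanups(100, 1, 0) reports one cleanup per inserted
entry, which is the inefficiency I am asking about in point 8, while a larger
limit trades fewer cleanups for a longer pending list to scan at search time.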

Sorry for the delay.

Best regards,
Etsuro Fujita




