Re: contrib/cache_scan (Re: What's needed for cache-only table scan?) - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: contrib/cache_scan (Re: What's needed for cache-only table scan?)
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F8F897B8@BPXM15GP.gisp.nec.co.jp
In response to Re: contrib/cache_scan (Re: What's needed for cache-only table scan?)  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: contrib/cache_scan (Re: What's needed for cache-only table scan?)  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
Thanks for your efforts!
>                    Head      patched    Diff
> Select - 500K      772ms     2659ms     -200%
> Insert - 400K      3429ms    1948ms     43% (I am not sure how it improved in this case)
> delete - 200K      2066ms    3978ms     -92%
> update - 200K      3915ms    5899ms     -50%
>
> This patch shows very well how the custom scan can be used, but the
> patch as it stands has some performance problems that need to be
> investigated.
>
> I attached the test script file used for the performance test.
>
First of all, it seems to me that the data set in your test case is
small enough to be held entirely in memory: roughly, 500K records of
200 bytes each consume about 100MB. Your configuration allocates 512MB
of shared_buffers, and about 3GB of OS-level page cache is available.
(Note that Linux adaptively uses free memory as disk cache.)
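The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes exactly 200 bytes per record and ignores tuple headers, page overhead and indexes, so the real on-disk size would be somewhat larger. The variable names are illustrative only.

```python
# Rough working-set estimate for the test case (illustrative names only).
# Assumption: 200 bytes per record, no tuple/page overhead counted.
MIB = 1024 * 1024

rows = 500_000
bytes_per_row = 200

dataset_bytes = rows * bytes_per_row      # raw payload size in bytes
shared_buffers_bytes = 512 * MIB          # shared_buffers = 512MB
page_cache_bytes = 3 * 1024 * MIB         # about 3GB of OS page cache

fits_in_shared_buffers = dataset_bytes <= shared_buffers_bytes

print(f"data set: {dataset_bytes / 1_000_000:.0f} MB "
      f"({dataset_bytes / MIB:.1f} MiB)")
print(f"fits in shared_buffers: {fits_in_shared_buffers}")
```

Even with a generous allowance for overhead, the whole table stays well below shared_buffers, so the scans in this test never have to wait for disk.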

This cache is designed to hide the latency of disk accesses, so this
test case does not match its intended use.
(Also, the primary purpose of this module is to demonstrate using
heap_page_prune_hook to hook vacuuming, so simple code was preferred
over a more complicated implementation with better performance.)

I could reproduce the overall trend: a scan without the cache is
faster than a cached scan when the buffers are in memory. This
probably comes from the cost of walking down the T-tree index by
ctid on every reference.
The performance penalty on UPDATE and DELETE likely comes from the
per-row trigger invocation.
I could observe a small performance gain on INSERT.
That looks strange to me, too. :-(
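For reference, here is a small sketch of how the Diff column in the quoted table can be recomputed from the timings. The formula (head - patched) / head is an assumption on my side; the original message does not state how Diff was calculated, and the function name is made up for illustration.

```python
# Recompute the relative difference between Head and patched timings.
# Assumption: Diff = (head - patched) / head, expressed in percent;
# a negative value means the patched build is slower.
def diff_percent(head_ms: float, patched_ms: float) -> float:
    return (head_ms - patched_ms) / head_ms * 100.0

# Timings in milliseconds, taken from the quoted table.
timings = {
    "select 500K": (772, 2659),
    "insert 400K": (3429, 1948),
    "delete 200K": (2066, 3978),
    "update 200K": (3915, 5899),
}

for name, (head_ms, patched_ms) in timings.items():
    print(f"{name}: {diff_percent(head_ms, patched_ms):.1f}%")
```

Under this assumed formula the four rows come out at about -244.4%, 43.2%, -92.5% and -50.7%, which matches the reported figures except for SELECT, where the table says -200%.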

On the other hand, the discussion around the custom-plan interface
affects this module because it uses that API as its foundation.
Please give me a few days to rebase the cache_scan module onto
the newer custom-plan interface that I submitted just a moment
ago.

Also, is it really necessary to tune performance in this example
module for heap_page_prune_hook?
Even though I have a few ideas to improve the cache performance,
like inserting multiple rows at once or copying local chunks instead
of walking down the T-tree, I'm not sure whether that is productive
in the current v9.4 timeframe. ;-(

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Haribabu Kommi
> Sent: Wednesday, March 12, 2014 1:14 PM
> To: Kohei KaiGai
> Cc: Kaigai Kouhei(海外 浩平); Tom Lane; PgHacker; Robert Haas
> Subject: Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only
> table scan?)
>
> On Thu, Mar 6, 2014 at 10:15 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> > 2014-03-06 18:17 GMT+09:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
> >> I will update you later regarding the performance test results.
> >>
>
> I ran the performance test on the cache scan patch and below are the readings.
>
> Configuration:
>
> Shared_buffers - 512MB
> cache_scan.num_blocks - 600
> checkpoint_segments - 255
>
> Machine:
> OS - centos - 6.4
> CPU - 4 core 2.5 GHZ
> Memory - 4GB
>
>                    Head      patched    Diff
> Select - 500K      772ms     2659ms     -200%
> Insert - 400K      3429ms    1948ms     43% (I am not sure how it improved in this case)
> delete - 200K      2066ms    3978ms     -92%
> update - 200K      3915ms    5899ms     -50%
>
> This patch shows very well how the custom scan can be used, but the
> patch as it stands has some performance problems that need to be
> investigated.
>
> I attached the test script file used for the performance test.
>
> Regards,
> Hari Babu
> Fujitsu Australia


