Re: Cache relation sizes? - Mailing list pgsql-hackers

From: Thomas Munro
Subject: Re: Cache relation sizes?
Date:
Msg-id: CA+hUKG+d-9sETQaGfBGbGBOAPS-GjDns_vSMYhDuRW=VsYrzZw@mail.gmail.com
In response to: Re: Cache relation sizes?  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List: pgsql-hackers
On Tue, Dec 31, 2019 at 4:43 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I still believe that one shared memory element for every
> non-mapped relation is not only too complex but also too much, as
> Andres (and implicitly I) wrote. I feel that just one flag for
> all would work fine, but partitioned flags (that is, relations or
> files that correspond to the same hash value share one flag) can
> reduce the shared memory elements to a fixed small number.
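
(To make sure I'm picturing the same thing, here is roughly what I imagine
such hash-partitioned flags looking like.  Purely illustrative: the names,
and the use of a generation counter rather than a plain flag, are mine and
not from any posted patch.)

/* Fixed, small shared-memory array of generation counters.  Every
 * relation whose relfilenode hashes to the same slot shares one counter,
 * so the shared footprint stays constant regardless of how many
 * relations exist.  The counters would be set up with
 * pg_atomic_init_u32() at shmem creation time (not shown). */
#include "postgres.h"
#include "port/atomics.h"
#include "storage/relfilenode.h"

#define REL_SIZE_PARTITIONS 64

typedef struct RelSizeInvalArray
{
    pg_atomic_uint32 generation[REL_SIZE_PARTITIONS];
} RelSizeInvalArray;

static RelSizeInvalArray *rel_size_inval;   /* points into shared memory */

static inline uint32
rel_size_partition(const RelFileNode *rnode)
{
    /* any cheap hash will do; the relfilenode OID is enough to illustrate */
    return rnode->relNode % REL_SIZE_PARTITIONS;
}

/* Anything that changes a relation's size (extend, truncate, drop) bumps
 * the counter for that relation's partition. */
static inline void
rel_size_invalidate(const RelFileNode *rnode)
{
    pg_atomic_fetch_add_u32(&rel_size_inval->generation[rel_size_partition(rnode)], 1);
}

/* A backend trusts its locally cached nblocks only while the generation
 * it saw when it cached the value still matches the shared counter. */
static inline bool
rel_size_cache_valid(const RelFileNode *rnode, uint32 cached_generation)
{
    return pg_atomic_read_u32(&rel_size_inval->generation[rel_size_partition(rnode)]) ==
        cached_generation;
}

The cost is false sharing: bumping one partition's counter also throws away
the cached sizes of every other relation that happens to hash to that slot.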

There is one potentially interesting case that doesn't require any
kind of shared cache invalidation AFAICS.  XLogReadBufferExtended()
calls smgrnblocks() for every buffer access, even if the buffer is
already in our buffer pool.  I tried to make yet another quick
experiment-grade patch to cache the size[1], this time for use in
recovery only.
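
The guts of it look roughly like the sketch below (a minimal sketch only,
not the actual code at [1]: smgr_cached_nblocks is a hypothetical per-fork
field added to SMgrRelationData, initialised to InvalidBlockNumber in
smgropen()).  The trick is that during recovery only the startup process
changes relation sizes, so no cross-backend invalidation is needed.

/* smgrnblocks() in src/backend/storage/smgr/smgr.c, with a cache that is
 * only trusted while in recovery. */
BlockNumber
smgrnblocks(SMgrRelation reln, ForkNumber forknum)
{
    BlockNumber result;

    /* Serve repeat calls from the cache and skip md.c's lseek(SEEK_END). */
    if (InRecovery && reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
        return reln->smgr_cached_nblocks[forknum];

    result = smgrsw[reln->smgr_which].smgr_nblocks(reln, forknum);

    if (InRecovery)
        reln->smgr_cached_nblocks[forknum] = result;

    return result;
}

A real version also has to keep the cached value in step in smgrextend()
and smgrtruncate(), which is where the bugs most likely live.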

initdb -D pgdata
postgres -D pgdata -c checkpoint_timeout=60min

In another shell:
pgbench -i -s100 postgres
pgbench -M prepared -T60 postgres
killall -9 postgres
mv pgdata pgdata-save

Master branch:

cp -r pgdata-save pgdata
strace -c -f postgres -D pgdata
[... wait for "redo done", then hit ^C ...]
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
...
 18.61   22.492286          26    849396           lseek
  6.95    8.404369          30    277134           pwrite64
  6.63    8.009679          28    277892           pread64
  0.50    0.604037          39     15169           sync_file_range
...

Patched:

rm -fr pgdata
cp -r pgdata-save pgdata
strace -c -f ~/install/bin/postgres -D pgdata
[... wait for "redo done", then hit ^C ...]
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
...
 16.33    8.097631          29    277134           pwrite64
 15.56    7.715052          27    277892           pread64
  1.13    0.559648          39     14137           sync_file_range
...
  0.00    0.001505          25        59           lseek

> Note: I'm still not sure how much lseek impacts performance.

It doesn't seem great that we are effectively making a system call for
most WAL records we replay, but, sadly, in this case the patch didn't
make any measurable difference when run without strace on this Linux
VM.  I suspect there is some workload and stack where it would make a
difference (cf. the read() on the postmaster pipe that we used to do
for every WAL record, which has since been removed), but this is just
something I noticed in passing while working on something else, so I
haven't investigated much.
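
For context on where all those lseeks come from: smgrnblocks() bottoms
out in md.c, which discovers a segment's size by asking the kernel for
the file length, roughly like this (a simplified standalone illustration,
not the actual md.c code):

#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Seek to the end of the segment file and convert bytes to blocks.  This
 * is the lseek(SEEK_END) that shows up ~850k times in the unpatched trace
 * above, essentially once per smgrnblocks() call during replay.  Error
 * handling omitted. */
static unsigned int
segment_nblocks(int fd)
{
    off_t len = lseek(fd, 0, SEEK_END);

    return (unsigned int) (len / BLCKSZ);   /* any partial block at EOF is ignored */
}

The cache just avoids asking the kernel the same question over and over,
which is why the count collapses from ~850k calls to 59 in the patched run.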

[1] https://github.com/postgres/postgres/compare/master...macdice:cache-nblocks
(just a test, unfinished, probably has bugs)


