Re: Cache relation sizes? - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Cache relation sizes? |
Date | |
Msg-id | CA+hUKG+d-9sETQaGfBGbGBOAPS-GjDns_vSMYhDuRW=VsYrzZw@mail.gmail.com |
In response to | Re: Cache relation sizes? (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses | Re: Cache relation sizes? |
List | pgsql-hackers |
On Tue, Dec 31, 2019 at 4:43 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I still believe that one shared memory element for every
> non-mapped relation is not only too-complex but also too-much, as
> Andres (and implicitly I) wrote. I feel that just one flag for
> all works fine but partitioned flags (that is, relations or files
> corresponds to the same hash value share one flag) can reduce the
> shared memory elements to a fixed small number.

There is one potentially interesting case that doesn't require any
kind of shared cache invalidation AFAICS. XLogReadBufferExtended()
calls smgrnblocks() for every buffer access, even if the buffer is
already in our buffer pool. I tried to make yet another quick
experiment-grade patch to cache the size[1], this time for use in
recovery only.

initdb -D pgdata
postgres -D pgdata -c checkpoint_timeout=60min

In another shell:
pgbench -i -s100 postgres
pgbench -M prepared -T60 postgres
killall -9 postgres
mv pgdata pgdata-save

Master branch:

cp -r pgdata-save pgdata
strace -c -f postgres -D pgdata
[... wait for "redo done", then hit ^C ...]

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
...
 18.61   22.492286          26    849396           lseek
  6.95    8.404369          30    277134           pwrite64
  6.63    8.009679          28    277892           pread64
  0.50    0.604037          39     15169           sync_file_range
...

Patched:

rm -fr pgdata
cp -r pgdata-save pgdata
strace -c -f ~/install/bin/postgres -D pgdata
[... wait for "redo done", then hit ^C ...]

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
...
 16.33    8.097631          29    277134           pwrite64
 15.56    7.715052          27    277892           pread64
  1.13    0.559648          39     14137           sync_file_range
...
  0.00    0.001505          25        59           lseek

> Note: I'm still not sure how much lseek impacts performance.

It doesn't seem great that we are effectively making system calls for
most WAL records we replay, but, sadly, in this case the patch didn't
really make any measurable difference when run without strace on this
Linux VM. I suspect there is some workload and stack where it would
make a difference (cf. the read(postmaster pipe) call for every WAL
record that was removed), but this is just something I noticed in
passing while working on something else, so I haven't investigated
much.

[1] https://github.com/postgres/postgres/compare/master...macdice:cache-nblocks
(just a test, unfinished, probably has bugs)
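To make the idea concrete, here is a minimal sketch of what a recovery-only
size cache could look like. This is not the branch at [1]; the
"cached_nblocks" field, the InRecovery guard, and the refresh points are my
own assumptions, added purely to illustrate why no shared invalidation is
needed when only the startup process is replaying WAL.

/*
 * Hedged sketch, not the actual patch: remember the storage manager's
 * answer in the SMgrRelation so repeated smgrnblocks() calls during
 * replay avoid an lseek(SEEK_END) per call.  "cached_nblocks" is a
 * hypothetical array added to SMgrRelationData, initialised to
 * InvalidBlockNumber when the SMgrRelation is opened and refreshed by
 * smgrextend()/smgrtruncate().  Limiting the cache to recovery means
 * relation sizes only change under the startup process's control, so a
 * backend-local cache cannot go stale behind our back.
 */
BlockNumber
smgrnblocks(SMgrRelation reln, ForkNumber forknum)
{
    BlockNumber result;

    /* Serve from the local cache while in recovery, if we have a value. */
    if (InRecovery && reln->cached_nblocks[forknum] != InvalidBlockNumber)
        return reln->cached_nblocks[forknum];

    /* Otherwise ask the storage manager (md.c does the lseek here). */
    result = smgrsw[reln->smgr_which].smgr_nblocks(reln, forknum);

    if (InRecovery)
        reln->cached_nblocks[forknum] = result;

    return result;
}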