Re: Cache relation sizes? - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Cache relation sizes?
Msg-id CA+hUKGKEW7-9pq+s2_4Q-Fcpr9cc7_0b3pkno5qzPKC3y2nOPA@mail.gmail.com
In response to Re: Cache relation sizes?  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Cache relation sizes?  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Fri, Jul 31, 2020 at 2:36 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> There's still the matter of crazy numbers of lseeks in regular
> backends; looking at all processes while running the above test, I get
> 1,469,060 (9.18 per pgbench transaction) without -M prepared, and
> 193,722 with -M prepared (1.21 per pgbench transaction).  Fixing that
> with this approach will require bullet-proof shared invalidation, but
> I think it's doable, in later work.

I couldn't help hacking on this a bit.  Perhaps instead of
bullet-proof general shared invalidation, we should have a way for
localised bits of code to state that they are ok with a "relaxed"
value.  Then they should explain the theory for why that is safe in
each case based on arguments about memory barrier pairs, but leave all
other callers making the system call so that we don't torpedo the
whole project by making it too hard.  For now, the main cases I'm
interested in are the ones that show up all the time as the dominant
system call in various workloads:

(1) Sequential scan relation-size probe.  This should be OK with a
relaxed value.  You can't miss the invalidation for a truncation,
because the read barrier in your lock acquisition on the relation
pairs with the write barrier in the exclusive lock release of any
session that truncated, and you can't miss any relation
extension that your snapshot can see, because the release of the
extension lock pairs with the lock involved in snapshot acquisition.

(2) Planner relation-size probe, which should be OK with a relaxed
value.  Similar arguments give you a fresh enough view, I think.

Or maybe there is a theory along these lines that already covers every
use of smgrnblocks(), so a separate mode isn't required... I don't
know!

The attached sketch-quality patch shows some of the required elements
working, though I ran out of steam trying to figure out how to thread
this thing through the right API layers, so for now it always asks for
a relaxed value in table_block_relation_size().
