Re: Cache relation sizes? - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: Cache relation sizes?
Date
Msg-id d37adb76-caef-35d7-c7b6-797b6cc10b5c@postgrespro.ru
Whole thread Raw
In response to Re: Cache relation sizes?  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Cache relation sizes?  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers

On 16.11.2020 10:11, Thomas Munro wrote:
> On Tue, Aug 4, 2020 at 2:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>> On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik
>> <k.knizhnik@postgrespro.ru> wrote:
>>> This shared relation cache can easily store relation size as well.
>>> In addition it will solve a lot of other problems:
>>> - noticeable overhead of local relcache warming
>>> - large memory consumption in case of larger number of relations
>>> O(max_connections*n_relations)
>>> - sophisticated invalidation protocol and related performance issues
>>> Certainly access to shared cache requires extra synchronization.But DDL
>>> operations are relatively rare.
>>> So in most cases we will have only shared locks. May be overhead of
>>> locking will not be too large?
>> Yeah, I would be very happy if we get a high performance shared
>> sys/rel/plan/... caches in the future, and separately, having the
>> relation size available in shmem is something that has come up in
>> discussions about other topics too (tree-based buffer mapping,
>> multi-relation data files, ...).  ...
> After recent discussions about the limitations of relying on SEEK_END
> in a nearby thread[1], I decided to try to prototype a system for
> tracking relation sizes properly in shared memory.  Earlier in this
> thread I was talking about invalidation schemes for backend-local
> caches, because I only cared about performance.  In contrast, this new
> system has SMgrRelation objects that point to SMgrSharedRelation
> objects (better names welcome) that live in a pool in shared memory,
> so that all backends agree on the size.  The scheme is described in
> the commit message and comments.  The short version is that smgr.c
> tracks the "authoritative" size of any relation that has recently been
> extended or truncated, until it has been fsync'd.  By authoritative, I
> mean that there may be dirty buffers in that range in our buffer pool,
> even if the filesystem has vaporised the allocation of disk blocks and
> shrunk the file.
>
> That is, it's not really a "cache".  It's also not like a shared
> catalog, which Konstantin was talking about... it's more like the pool
> of inodes in a kernel's memory.  It holds all currently dirty SRs
> (SMgrSharedRelations), plus as many clean ones as it can fit, with
> some kind of reclamation scheme, much like buffers.  Here, "dirty"
> means the size changed.

I will look at your implementation more precisely latter.
But now I just to ask one question: is it reasonable to spent 
substantial amount of efforts trying to solve
the problem of redundant calculations of relation size, which seems to 
be just part of more general problem: shared relation cache?

It is well known problem: private caches of system catalog can cause 
unacceptable memory consumption for large number of backends.
If there are thousands relations in the database (which is not so rare 
case, especially taken in account ORMs and temporary tables)
then size of catalog cache can be several megabytes. If it is multiplies 
by thousand backends, then we get gigabytes of memory used just for
private catalog caches. Thanks to Andres optimizations of taking 
snapshots, now Postgres can normally handle thousands of connections.
So this extremely inefficient use of memory for private catalog caches 
becomes more and more critical.
Also private caches requires sophisticated and error prone invalidation 
mechanism.

If we will have such shared cache, then keeping shared relation size 
seems to be trivial task, isn't it?
So may be we before "diving" into the problem of maintaining shared pool 
of dirtied "inodes" we can better discuss and conerntrate on solving 
more generic problem?





pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: a.pervushina@postgrespro.ru
Date:
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed