RE: Global shared meta cache - Mailing list pgsql-hackers

From Ideriha, Takeshi
Subject RE: Global shared meta cache
Msg-id 4E72940DA2BF16479384A86D54D0988A7DB2CABC@G01JPEXMBKW04
In response to RE: Global shared meta cache  ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Responses RE: Global shared meta cache
List pgsql-hackers
>From: Ideriha, Takeshi [mailto:ideriha.takeshi@jp.fujitsu.com]
>Do you have any thoughts?
>
Hi, I have updated my idea and hope to get some feedback.

[TL;DR]
The basic idea consists of the following four points:
A. Users can choose which databases have their caches (relation cache and catalog cache) placed in shared memory, and how much memory each may use.
B. Caches of committed data are in shared memory. Caches of uncommitted data are in local memory.
C. Caches in shared memory carry xid information (xmin, xmax).
D. Caches that have not been used recently are evicted from shared memory.


[A]
Regarding point A, I can imagine that some databases have lots of client connections while others don't.
So I introduced a new parameter in postgresql.conf, "shared_meta_cache",
which is disabled by default and requires a server restart to enable.
e.g. shared_meta_cache = 'db1:500MB, db2:100MB'
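
To make the proposed value format concrete, here is a minimal sketch of how such a string could be split into per-database budgets. It is only an illustration in Python: the function name parse_shared_meta_cache, the accepted units, and the error handling are assumptions, and a real implementation would live in the server's C code.

```python
import re

# Assumed units, following PostgreSQL's usual 1024-based memory units.
_UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_shared_meta_cache(value):
    """Parse a string such as 'db1:500MB, db2:100MB' into {dbname: bytes}.

    Hypothetical helper: an empty string means the feature is disabled
    and yields an empty mapping.
    """
    budgets = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        match = re.fullmatch(r"(\w+)\s*:\s*(\d+)\s*(kB|MB|GB)", item)
        if match is None:
            raise ValueError(f"invalid shared_meta_cache entry: {item!r}")
        name, amount, unit = match.groups()
        if name in budgets:
            raise ValueError(f"database listed twice: {name!r}")
        budgets[name] = int(amount) * _UNITS[unit]
    return budgets
```

Databases that are not listed would simply keep using the existing per-backend caches.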

Some catcaches, like pg_database, are shared across all databases,
so such shared catcaches are allocated in a dedicated space within shared memory.
This space can be controlled by the "shared_meta_global_catcache" parameter, which is named after the global directory.
But I want this parameter to be hidden in postgresql.conf to keep things simple for users. It's too detailed.

[B & C]
Regarding B & C, the motivation is that we don't want other backends to see uncommitted tables.
The search order is local memory -> shared memory -> disk.
Each backend process searches the cache in shared memory based on its own snapshot and the xid of each cache entry.
When a cache entry is not found in shared memory, an entry with xmin is created in shared memory (but not in local memory).
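
The lookup path can be sketched roughly as follows. This is a simplified Python model, not PostgreSQL code: the names SharedEntry, Snapshot, lookup and load_from_disk are hypothetical, and the visibility check ignores aborted transactions and subtransactions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedEntry:
    value: str
    xmin: int                   # xid that created this cache version
    xmax: Optional[int] = None  # xid that superseded it, if any


@dataclass
class Snapshot:
    xmax: int               # xids >= xmax started after the snapshot
    in_progress: frozenset  # xids still running when it was taken

    def sees(self, xid):
        # Simplification: every xid that is not in progress is
        # assumed to have committed.
        return xid < self.xmax and xid not in self.in_progress


def visible(entry, snap):
    """An entry is visible if its creator is seen and its deleter is not."""
    if not snap.sees(entry.xmin):
        return False
    return entry.xmax is None or not snap.sees(entry.xmax)


def lookup(key, snap, local, shared, load_from_disk):
    # 1. Local memory holds this backend's uncommitted definitions.
    if key in local:
        return local[key]
    # 2. Shared memory holds committed versions tagged with xmin/xmax.
    for entry in shared.get(key, []):
        if visible(entry, snap):
            return entry.value
    # 3. Disk: load the tuple and cache it in shared memory only.
    value, xmin = load_from_disk(key)
    shared.setdefault(key, []).append(SharedEntry(value, xmin))
    return value
```

Here load_from_disk stands in for a catalog scan that returns the visible tuple together with its xmin.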

When a cache definition is changed by DDL, a new cache entry is created in local memory, and thus subsequent commands
refer to the local cache if needed.
 
When the transaction is committed, the local cache is cleared and the shared cache is updated. This update is done by
adding xmax to the old cache entry and also creating a new one with xmin. The idea behind adding a new one is that a
newly created cache entry (new table or altered table) is likely to be used in subsequent transactions.
At this point maybe we can make use of the current invalidation mechanism,
even though invalidation messages are not sent to other backends.
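
The commit step could look roughly like this. Again this is only a Python sketch under assumptions: SharedEntry and commit_local_changes are hypothetical names, the shared cache is modelled as a dict of version lists, and the locking that real shared memory would need is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedEntry:
    value: str
    xmin: int                   # xid that created this cache version
    xmax: Optional[int] = None  # xid that superseded it, if any


def commit_local_changes(xid, local, shared):
    """Publish the local (uncommitted) entries of transaction xid.

    Each live shared version gets xmax = xid, a new version with
    xmin = xid is appended, and the local cache is cleared.
    """
    for key, value in local.items():
        versions = shared.setdefault(key, [])
        for entry in versions:
            if entry.xmax is None:
                entry.xmax = xid  # old definition ends with this commit
        versions.append(SharedEntry(value, xmin=xid))
    local.clear()
```

Backends holding older snapshots still find the old version through its xmin/xmax range, while new transactions pick up the freshly added one.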

[D]
As for D, I'm thinking of benchmarking with a simple LRU. If the performance is bad, I will switch to another
algorithm such as Clock.
We don't care about eviction of the local cache because its lifetime is limited to a transaction, and I don't want
it to bloat.
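
For reference, a simple LRU bounded by a per-database byte budget can be sketched like this. It is a Python illustration only: the class name LRUCache and the explicit size argument are assumptions, and nothing here reflects how the shared memory area would actually be managed.

```python
from collections import OrderedDict


class LRUCache:
    """Least-recently-used cache limited by a byte budget."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> (value, size), oldest first

    def __contains__(self, key):
        return key in self._entries

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][0]

    def put(self, key, value, size):
        if size > self.capacity_bytes:
            raise ValueError("entry larger than the whole cache")
        if key in self._entries:
            # Replacing an entry: release its old size first.
            self.used_bytes -= self._entries.pop(key)[1]
        while self.used_bytes + size > self.capacity_bytes:
            # Evict from the least recently used end.
            _, (_, old_size) = self._entries.popitem(last=False)
            self.used_bytes -= old_size
        self._entries[key] = (value, size)
        self.used_bytes += size
```

A Clock variant would replace the reordering in get() with a per-entry reference bit, which avoids touching the shared list on every hit.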

best regards,
Takeshi Ideriha



