RE: Global shared meta cache - Mailing list pgsql-hackers
From | Ideriha, Takeshi
Subject | RE: Global shared meta cache
Date |
Msg-id | 4E72940DA2BF16479384A86D54D0988A7DB56638@G01JPEXMBKW04
In response to | RE: Global shared meta cache ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Responses | RE: Global shared meta cache
List | pgsql-hackers |
>From: Ideriha, Takeshi [mailto:ideriha.takeshi@jp.fujitsu.com]
>[TL; DR]
>The basic idea is the following 4 points:
>A. The user can choose which databases put their caches (relation and catalog)
>   on shared memory, and how much memory is used
>B. Caches of committed data are on shared memory; caches of uncommitted data
>   are on local memory
>C. Caches on shared memory have xid information (xmin, xmax)
>D. Evict caches that have not been used recently from shared memory

I have updated my thoughts about B and C for the CatCache. I would be very
happy if you could give some comments.

>[B & C]
>Regarding B & C, the motivation is that we don't want other backends to see
>uncommitted tables.
>The search order is local memory -> shared memory -> disk.
>A local process searches the cache in shared memory based on its own snapshot
>and the xid of the cache.
>When a cache is not found in shared memory, a cache with xmin is created in
>shared memory (but not in the local one).
>
>When a cache definition is changed by DDL, a new cache is created locally, and
>thus subsequent commands refer to the local cache if needed.
>When it is committed, the local cache is cleared and the shared cache is
>updated. This update is done by adding xmax to the old cache and also making a
>new one with xmin. The idea behind adding a new one is that a newly created
>cache (new table or altered table) is likely to be used in the next
>transactions. At this point maybe we can make use of the current invalidation
>mechanism, even though the invalidation message is not sent to other backends.

My current thoughts:
- Each catcache entry has a (maybe partial) HeapTupleHeader.
- Every catcache entry is put on shared memory and there is no local catcache,
  but catcache entries for aborted tuples are not put on shared memory.
- A hash table exists per kind of CatCache.
- These hash tables exist for each database and are shared,
  e.g. there is a hash table for pg_class of a DB.

Why I am leaning toward not using a local cache:
- At commit time you would need to copy the local cache to the global cache,
  which would delay the response time.
- Even if an uncommitted catcache entry is on shared memory, other transactions
  cannot see it: in my idea entries have xid information, and visibility is
  checked by comparing the xmin/xmax of the entry against the snapshot.

OK, so if we put the catcache on shared memory, we need to check its
visibility. But if we used exactly the same visibility check mechanism as for
heap tuples, it would take many more steps than the current local catcache
search. The current visibility check consists of a snapshot check and a
commit/abort check. So I am thinking of only putting in-progress or committed
entries into the shared cache. This saves the time spent checking the catcache
status (commit/abort) while searching the cache. Apart from that, I am going to
use the current visibility check mechanism, minus the commit/abort check (in
other words, minus the clog lookup).

This is how it works:
- When creating a catcache entry, copy the heap tuple with its HeapTupleHeader.
- When an update/delete command for a catalog tuple is finished, update the
  xmax of the corresponding cache entry.
- If there is a cache entry whose xmin is an aborted xid, delete the entry.
- If there is a cache entry whose xmax is an aborted xid, reset its xmax
  information.
- At commit time, there is no action on the shared cache.

Pending items:
- thoughts about the shared relcache
- a "vacuum" process for the shared cache

Regards,
Ideriha Takeshi
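
P.S. To make the xmin/xmax idea a bit more concrete, below is a rough standalone
sketch. It is not actual PostgreSQL code: the struct and function names
(SharedCatCTup, XidInSnapshot, and so on) are made up for illustration, and the
snapshot logic is heavily simplified (no command ids, no subtransactions). It only
shows the shape of a shared catcache entry carrying xid information, the visibility
check that needs no clog lookup because only in-progress or committed entries are
kept, and the maintenance actions from the list above.

/* Standalone sketch only -- simplified stand-ins for the PostgreSQL types. */
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId    ((TransactionId) 0)
#define TransactionIdIsValid(x) ((x) != InvalidTransactionId)

/* Simplified MVCC snapshot: xids >= xmax, or listed in xip[], are in progress. */
typedef struct SnapshotData
{
    TransactionId   xmin;   /* all xids below this were finished */
    TransactionId   xmax;   /* all xids at/above this had not started */
    TransactionId  *xip;    /* xids in progress when the snapshot was taken */
    int             xcnt;
} SnapshotData;

/* Shared catcache entry: cached catalog tuple plus xid information (B & C). */
typedef struct SharedCatCTup
{
    TransactionId   cc_xmin;    /* xid that created this cached version */
    TransactionId   cc_xmax;    /* xid that invalidated it, or Invalid */
    /* ... hash links and the copied tuple (with HeapTupleHeader) follow ... */
} SharedCatCTup;

/* true if xid was still in progress (or in the future) at snapshot time */
static bool
XidInSnapshot(TransactionId xid, const SnapshotData *snap)
{
    if (xid >= snap->xmax)
        return true;
    if (xid < snap->xmin)
        return false;
    for (int i = 0; i < snap->xcnt; i++)
        if (snap->xip[i] == xid)
            return true;
    return false;
}

/*
 * Visibility of a shared entry under the caller's snapshot.  There is no
 * commit/abort (clog) check: entries touched by aborted xids are cleaned up
 * eagerly, so every xid stored here is our own, in progress, or committed.
 */
static bool
SharedCatCTupIsVisible(const SharedCatCTup *ct, const SnapshotData *snap,
                       TransactionId my_xid)
{
    /* the creating xid must be our own or committed before our snapshot */
    if (ct->cc_xmin != my_xid && XidInSnapshot(ct->cc_xmin, snap))
        return false;

    /* the deleting xid, if set, must be neither our own nor already committed */
    if (TransactionIdIsValid(ct->cc_xmax))
    {
        if (ct->cc_xmax == my_xid)
            return false;
        if (!XidInSnapshot(ct->cc_xmax, snap))
            return false;
    }
    return true;
}

/* when a catcache entry is created: remember the creating xid, no deleter yet */
static void
SharedCatCTupInit(SharedCatCTup *ct, TransactionId creator_xid)
{
    ct->cc_xmin = creator_xid;
    ct->cc_xmax = InvalidTransactionId;
}

/* when an update/delete of the catalog tuple has finished: stamp xmax */
static void
SharedCatCTupMarkDeleted(SharedCatCTup *ct, TransactionId deleter_xid)
{
    ct->cc_xmax = deleter_xid;
}

/*
 * At abort, for each entry touched by the aborted xid: drop it if the aborted
 * xid created it, revive it if the aborted xid deleted it.  Returns true when
 * the caller should remove the entry from the shared hash table.
 * At commit time nothing needs to be done for the shared cache.
 */
static bool
SharedCatCTupAtAbort(SharedCatCTup *ct, TransactionId aborted_xid)
{
    if (ct->cc_xmin == aborted_xid)
        return true;                            /* delete the cache entry */
    if (ct->cc_xmax == aborted_xid)
        ct->cc_xmax = InvalidTransactionId;     /* clear the xmax information */
    return false;
}

The point of the eager cleanup in SharedCatCTupAtAbort is exactly what lets
SharedCatCTupIsVisible skip the commit/abort status check while searching.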