Re: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From           | Kyotaro Horiguchi
Subject        | Re: Protect syscache from bloating with negative cache entries
Date           |
Msg-id         | 20190701.160259.247597303.horikyota.ntt@gmail.com
In response to | Re: Protect syscache from bloating with negative cache entries (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses      | Re: Protect syscache from bloating with negative cache entries
List           | pgsql-hackers
Hello,

my_gripe> But, still fluctuates by around 5%..
my_gripe>
my_gripe> If this level of degradation is still not acceptable, that
my_gripe> means that nothing can be inserted in the code path and the new
my_gripe> code path should be isolated from existing code by using indirect
my_gripe> calls.

Finally, after some struggling, I think I have managed to measure the impact on performance precisely and reliably. Starting from "make distclean" before every build, then removing everything in $TARGET before installation, makes the numbers stable enough. (I don't think this should be necessary, but I didn't investigate the cause..)

I measured time per call by directly calling SearchSysCache3() many times. It showed that the patch causes around 0.1 microseconds of degradation per call. (The function overall took about 6.9 microseconds on average.)

Next, I counted how many times SearchSysCache is called during planning, using as an example a query on a partitioned table with 3000 columns and 1000 partitions:

  explain analyze select sum(c0000) from test.p;

The planner made 6020608 syscache calls while planning, and the overall planning time was 8641 ms. (Execution time was 48 ms.) 6020608 times 0.1 us is 602 ms of degradation, so roughly an estimated -7% in planning time.

The degradation was caused by really only two successive instructions, "ADD / conditional MOVE (CMOVE)". This fact leads to the conclusion that the existing code path, as is, doesn't have room for any additional code. So I sought room for at least one branch and found some (on gcc 7.3.1/CentOS 7/x64). Interestingly, de-inlining SearchCatCacheInternal gave a performance gain of about 3%, and further inlining of CatalogCacheComputeHashValue() gave another gain of about 3%. That gain left me room to add a branch in SearchCatCacheInternal. I also tried indirect calls, but the degradation overwhelmed the gain, so I chose branching rather than indirect calls. I didn't investigate why that happens. The following is the result.
The binaries are built from the same configuration using -O2.

binary:
  master      : master HEAD.
  patched_off : patched, but pruning disabled (catalog_cache_prune_min_age = -1).
  patched_on  : patched with pruning enabled. ("300s" for bench 1, "1s" for bench 2, "0" for bench 3)

bench:
  1: catcachebench(1); fetching STATRELATTINH 3000 * 1000 times, generating new cache entries (massive cache creation). Pruning doesn't happen while running this.
  2: catcachebench(2); 60000 rounds of cache access on 1000 STATRELATTINH entries (frequent cache reference). Pruning doesn't happen while running this.
  3: catcachebench(3); fetching 1000 (tbls) * 3000 (cols) STATRELATTINH entries, with the catcache clock advancing at an interval of 100 (tbls) * 3000 (cols) accesses so that pruning happens. While running catcachebench(3) once, pruning happens 28 times; most of the time 202202 entries are removed, and the total number of entries is limited to 524289. (The system table has 3000 * 1001 = 3003000 tuples.)

iter: number of iterations. Time (ms) and stddev are calculated over the iterations.

 binary      | bench | iter  | time ms  | stddev
-------------+-------+-------+----------+--------
 master      |     1 |    10 |  8150.30 |  12.96
 master      |     2 |    10 |  4002.88 |  16.18
 master      |     3 |    10 |  9065.06 |  11.46
-------------+-------+-------+----------+--------
 patched_off |     1 |    10 |  8090.95 |   9.95
 patched_off |     2 |    10 |  3984.67 |  12.33
 patched_off |     3 |    10 |  9050.46 |   4.64
-------------+-------+-------+----------+--------
 patched_on  |     1 |    10 |  8158.95 |   6.29
 patched_on  |     2 |    10 |  4023.72 |  10.41
 patched_on  |     3 |    10 | 16532.66 |  18.39

patched_off is slightly faster than master. patched_on is generally a bit slower. Even though patched_on/3 seems to take too long, the extra time comes from increased catalog table access in exchange for the memory saving. (That is, it is expected behavior.) I ran it several times and most runs showed the same tendency.
As a side effect, once the branch is added, the shared syscache being discussed in a neighbour thread can be inserted at the same point without further impact on the existing code path.

===
The benchmark script is used as follows:

- Create many (3000, as an example) tables in the "test" schema. I created a partitioned table with 3000 children.
- Give the tables many columns, 1000 in my case.
- Run the following commands:

  =# select catcachebench(0); -- warm up systables.
  =# set catalog_cache_prune_min_age = any; -- as required
  =# select catcachebench(n); -- 3 >= n >= 1, the number of "bench" above.

The result above was taken with the following query:

  =# select 'patched_on', '3', count(a), avg(a)::numeric(10,2), stddev(a)::numeric(10,2)
       from (select catcachebench(3) from generate_series(1, 10)) as a(a);

====
The attached patches are:

0001-Adjust-inlining-of-some-functions.patch:
  Changes the inlining property of two functions, SearchCatCacheInternal and CatalogCacheComputeHashValue.

0002-Benchmark-extension-and-required-core-change.patch:
  Micro benchmark of SearchSysCache3() and core-side tweaks, which are outside this patch set from the viewpoint of functionality. Works with 0001 but not with 0004 or later; 0003 adjusts that.

0003-Adjust-catcachebench-for-later-patches.patch:
  Adjustment of 0002, the benchmark for 0004, the body of this patch set. Breaks code consistency until 0004 is applied.

0004-Catcache-pruning-feature.patch:
  The feature patch. It intentionally leaves the indentation of an existing code block in SearchCatCacheInternal unchanged to keep the patch small; the indentation is adjusted in the next 0005 patch.

0005-Adjust-indentation-of-SearchCatCacheInternal.patch:
  Adjusts the indentation of 0004.

0001+4+5 is the final shape of the patch set; 0002+3 is only for benchmarking.

regards.
-- Kyotaro Horiguchi NTT Open Source Software Center From 927ce9035e13240378c7c332610bdde9377c2d7b Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Fri, 28 Jun 2019 16:29:52 +0900 Subject: [PATCH 1/5] Adjust inlining of some functions SearchCatCacheInternal code path is quite short and hot so that it doesn't accept additional cycles in the function. But changing inline attribute of SearchCatCacheInternal and CatalogCacheComputeHashValue makes SearchCatCacheN faster by about 6%. This makes room for an extra branch to be the door to other implementations of catcache. --- src/backend/utils/cache/catcache.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c index 00def27881..8fc067ce31 100644 --- a/src/backend/utils/cache/catcache.c +++ b/src/backend/utils/cache/catcache.c @@ -63,10 +63,10 @@ /* Cache management header --- pointer is NULL until created */ static CatCacheHeader *CacheHdr = NULL; -static inline HeapTuple SearchCatCacheInternal(CatCache *cache, - int nkeys, - Datum v1, Datum v2, - Datum v3, Datum v4); +static HeapTuple SearchCatCacheInternal(CatCache *cache, + int nkeys, + Datum v1, Datum v2, + Datum v3, Datum v4); static pg_noinline HeapTuple SearchCatCacheMiss(CatCache *cache, int nkeys, @@ -75,8 +75,9 @@ static pg_noinline HeapTuple SearchCatCacheMiss(CatCache *cache, Datum v1, Datum v2, Datum v3, Datum v4); -static uint32 CatalogCacheComputeHashValue(CatCache *cache, int nkeys, - Datum v1, Datum v2, Datum v3, Datum v4); +static inline uint32 CatalogCacheComputeHashValue(CatCache *cache, int nkeys, + Datum v1, Datum v2, + Datum v3, Datum v4); static uint32 CatalogCacheComputeTupleHashValue(CatCache *cache, int nkeys, HeapTuple tuple); static inline bool CatalogCacheCompareTuple(const CatCache *cache, int nkeys, @@ -266,7 +267,7 @@ GetCCHashEqFuncs(Oid keytype, CCHashFN *hashfunc, RegProcedure *eqfunc, CCFastEq 
* * Compute the hash value associated with a given set of lookup keys */ -static uint32 +static inline uint32 CatalogCacheComputeHashValue(CatCache *cache, int nkeys, Datum v1, Datum v2, Datum v3, Datum v4) { @@ -1194,7 +1195,7 @@ SearchCatCache4(CatCache *cache, /* * Work-horse for SearchCatCache/SearchCatCacheN. */ -static inline HeapTuple +static HeapTuple SearchCatCacheInternal(CatCache *cache, int nkeys, Datum v1, -- 2.16.3 From f0f5833ddfd0aac934cc7b5ded93541810c486d3 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Fri, 28 Jun 2019 17:03:07 +0900 Subject: [PATCH 2/5] Benchmark extension and required core change Micro benchmark extension for SearchSysCache and required core-side code. --- contrib/catcachebench/Makefile | 17 ++ contrib/catcachebench/catcachebench--0.0.sql | 9 + contrib/catcachebench/catcachebench.c | 281 +++++++++++++++++++++++++++ contrib/catcachebench/catcachebench.control | 6 + src/backend/utils/cache/catcache.c | 13 ++ 5 files changed, 326 insertions(+) create mode 100644 contrib/catcachebench/Makefile create mode 100644 contrib/catcachebench/catcachebench--0.0.sql create mode 100644 contrib/catcachebench/catcachebench.c create mode 100644 contrib/catcachebench/catcachebench.control diff --git a/contrib/catcachebench/Makefile b/contrib/catcachebench/Makefile new file mode 100644 index 0000000000..0478818b25 --- /dev/null +++ b/contrib/catcachebench/Makefile @@ -0,0 +1,17 @@ +MODULE_big = catcachebench +OBJS = catcachebench.o + +EXTENSION = catcachebench +DATA = catcachebench--0.0.sql +PGFILEDESC = "catcachebench - benchmark for catcache pruning feature" + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/catcachebench +top_builddir = ../.. 
+include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/catcachebench/catcachebench--0.0.sql b/contrib/catcachebench/catcachebench--0.0.sql new file mode 100644 index 0000000000..e091baaaa7 --- /dev/null +++ b/contrib/catcachebench/catcachebench--0.0.sql @@ -0,0 +1,9 @@ +/* contrib/catcachebench/catcachebench--0.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION catcachebench" to load this file. \quit + +CREATE FUNCTION catcachebench(IN type int) +RETURNS double precision +AS 'MODULE_PATHNAME', 'catcachebench' +LANGUAGE C STRICT VOLATILE; diff --git a/contrib/catcachebench/catcachebench.c b/contrib/catcachebench/catcachebench.c new file mode 100644 index 0000000000..0cebbbde4f --- /dev/null +++ b/contrib/catcachebench/catcachebench.c @@ -0,0 +1,281 @@ +/* + * catcachebench: test code for cache pruning feature + */ +#include "postgres.h" +#include "catalog/pg_type.h" +#include "catalog/pg_statistic.h" +#include "executor/spi.h" +#include "libpq/pqsignal.h" +#include "utils/catcache.h" +#include "utils/syscache.h" +#include "utils/timestamp.h" + +Oid tableoids[10000]; +int ntables = 0; +int16 attnums[1000]; +int natts = 0; + +PG_MODULE_MAGIC; + +double catcachebench1(void); +double catcachebench2(void); +double catcachebench3(void); +void collectinfo(void); +void catcachewarmup(void); + +PG_FUNCTION_INFO_V1(catcachebench); + +Datum +catcachebench(PG_FUNCTION_ARGS) +{ + int testtype = PG_GETARG_INT32(0); + double ms; + extern bool _catcache_shrink_buckets; + + collectinfo(); + + /* flush the catalog -- safe? don't mind. 
*/ + + _catcache_shrink_buckets = true; + CatalogCacheFlushCatalog(StatisticRelationId); + _catcache_shrink_buckets = false; + + switch (testtype) + { + case 0: + catcachewarmup(); /* prewarm of syscatalog */ + PG_RETURN_NULL(); + case 1: + ms = catcachebench1(); break; + case 2: + ms = catcachebench2(); break; + case 3: + ms = catcachebench3(); break; + default: + elog(ERROR, "Invalid test type: %d", testtype); + } + + PG_RETURN_DATUM(Float8GetDatum(ms)); +} + +/* + * fetch all attribute entires of all tables. + */ +double +catcachebench1(void) +{ + int t, a; + instr_time start, + duration; + + PG_SETMASK(&BlockSig); + INSTR_TIME_SET_CURRENT(start); + for (t = 0 ; t < ntables ; t++) + { + for (a = 0 ; a < natts ; a++) + { + HeapTuple tup; + + tup = SearchSysCache3(STATRELATTINH, + ObjectIdGetDatum(tableoids[t]), + Int16GetDatum(attnums[a]), + BoolGetDatum(false)); + /* should be null, but.. */ + if (HeapTupleIsValid(tup)) + ReleaseSysCache(tup); + } + } + INSTR_TIME_SET_CURRENT(duration); + INSTR_TIME_SUBTRACT(duration, start); + PG_SETMASK(&UnBlockSig); + + return INSTR_TIME_GET_MILLISEC(duration); +}; + +/* + * fetch all attribute entires of a table 6000 times. + */ +double +catcachebench2(void) +{ + const int clock_step = 100; + int t, a; + instr_time start, + duration; + + PG_SETMASK(&BlockSig); + INSTR_TIME_SET_CURRENT(start); + for (t = 0 ; t < 60000 ; t++) + { + int ct = clock_step; + + /* + * catcacheclock is updated by transaction timestamp, so needs to + * be updated by other means for this test to work. Here I choosed + * to update the clock every 100 times of table scans. + */ + if (--ct < 0) + { + // We don't have it yet. + //SetCatCacheClock(GetCurrentTimestamp()); + GetCurrentTimestamp(); + ct = clock_step; + } + for (a = 0 ; a < natts ; a++) + { + HeapTuple tup; + + tup = SearchSysCache3(STATRELATTINH, + ObjectIdGetDatum(tableoids[0]), + Int16GetDatum(attnums[a]), + BoolGetDatum(false)); + /* should be null, but.. 
*/ + if (HeapTupleIsValid(tup)) + ReleaseSysCache(tup); + } + } + INSTR_TIME_SET_CURRENT(duration); + INSTR_TIME_SUBTRACT(duration, start); + PG_SETMASK(&UnBlockSig); + + return INSTR_TIME_GET_MILLISEC(duration); +}; + +/* + * fetch all attribute entires of all tables twice with having expiration + * happen. + */ +double +catcachebench3(void) +{ + const int clock_step = 100; + int i, t, a; + instr_time start, + duration; + + PG_SETMASK(&BlockSig); + INSTR_TIME_SET_CURRENT(start); + for (i = 0 ; i < 2 ; i++) + { + int ct = clock_step; + + for (t = 0 ; t < ntables ; t++) + { + /* + * catcacheclock is updated by transaction timestamp, so needs to + * be updated by other means for this test to work. Here I choosed + * to update the clock every 100 tables scan. + */ + if (--ct < 0) + { + // We don't have it yet. + //SetCatCacheClock(GetCurrentTimestamp()); + GetCurrentTimestamp(); + ct = clock_step; + } + for (a = 0 ; a < natts ; a++) + { + HeapTuple tup; + + tup = SearchSysCache3(STATRELATTINH, + ObjectIdGetDatum(tableoids[t]), + Int16GetDatum(attnums[a]), + BoolGetDatum(false)); + /* should be null, but.. */ + if (HeapTupleIsValid(tup)) + ReleaseSysCache(tup); + } + } + } + INSTR_TIME_SET_CURRENT(duration); + INSTR_TIME_SUBTRACT(duration, start); + PG_SETMASK(&UnBlockSig); + + return INSTR_TIME_GET_MILLISEC(duration); +}; + +void +catcachewarmup(void) +{ + int t, a; + + /* load up catalog tables */ + for (t = 0 ; t < ntables ; t++) + { + for (a = 0 ; a < natts ; a++) + { + HeapTuple tup; + + tup = SearchSysCache3(STATRELATTINH, + ObjectIdGetDatum(tableoids[t]), + Int16GetDatum(attnums[a]), + BoolGetDatum(false)); + /* should be null, but.. 
*/ + if (HeapTupleIsValid(tup)) + ReleaseSysCache(tup); + } + } +} + +void +collectinfo(void) +{ + int ret; + Datum values[10000]; + bool nulls[10000]; + Oid types0[] = {OIDOID}; + int i; + + ntables = 0; + natts = 0; + + SPI_connect(); + /* collect target tables */ + ret = SPI_execute("select oid from pg_class where relnamespace = (select oid from pg_namespace where nspname = \'test\')", + true, 0); + if (ret != SPI_OK_SELECT) + elog(ERROR, "Failed 1"); + if (SPI_processed == 0) + elog(ERROR, "no relation found in schema \"test\""); + if (SPI_processed > 10000) + elog(ERROR, "too many relation found in schema \"test\""); + + for (i = 0 ; i < SPI_processed ; i++) + { + heap_deform_tuple(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, + values, nulls); + if (nulls[0]) + elog(ERROR, "Failed 2"); + + tableoids[ntables++] = DatumGetObjectId(values[0]); + } + SPI_finish(); + elog(DEBUG1, "%d tables found", ntables); + + values[0] = ObjectIdGetDatum(tableoids[0]); + nulls[0] = false; + SPI_connect(); + ret = SPI_execute_with_args("select attnum from pg_attribute where attrelid = (select oid from pg_class where oid =$1)", + 1, types0, values, NULL, true, 0); + if (SPI_processed == 0) + elog(ERROR, "no attribute found in table %d", tableoids[0]); + if (SPI_processed > 10000) + elog(ERROR, "too many relation found in table %d", tableoids[0]); + + /* collect target attributes. 
assuming all tables have the same attnums */ + for (i = 0 ; i < SPI_processed ; i++) + { + int16 attnum; + + heap_deform_tuple(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, + values, nulls); + if (nulls[0]) + elog(ERROR, "Failed 3"); + attnum = DatumGetInt16(values[0]); + + if (attnum > 0) + attnums[natts++] = attnum; + } + SPI_finish(); + elog(DEBUG1, "%d attributes found", natts); +} diff --git a/contrib/catcachebench/catcachebench.control b/contrib/catcachebench/catcachebench.control new file mode 100644 index 0000000000..3fc9d2e420 --- /dev/null +++ b/contrib/catcachebench/catcachebench.control @@ -0,0 +1,6 @@ +# catcachebench + +comment = 'benchmark for catcache pruning' +default_version = '0.0' +module_pathname = '$libdir/catcachebench' +relocatable = true diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c index 8fc067ce31..98427b67cd 100644 --- a/src/backend/utils/cache/catcache.c +++ b/src/backend/utils/cache/catcache.c @@ -716,6 +716,9 @@ ResetCatalogCaches(void) * rather than relying on the relcache to keep a tupdesc for us. Of course * this assumes the tupdesc of a cachable system table will not change...) 
*/ +/* CODE FOR catcachebench: REMOVE ME AFTER USE */ +bool _catcache_shrink_buckets = false; +/* END: CODE FOR catcachebench*/ void CatalogCacheFlushCatalog(Oid catId) { @@ -735,6 +738,16 @@ CatalogCacheFlushCatalog(Oid catId) /* Tell inval.c to call syscache callbacks for this cache */ CallSyscacheCallbacks(cache->id, 0); + + /* CODE FOR catcachebench: REMOVE ME AFTER USE */ + if (_catcache_shrink_buckets) + { + cache->cc_nbuckets = 128; + pfree(cache->cc_bucket); + cache->cc_bucket = palloc0(128 * sizeof(dlist_head)); + elog(LOG, "Catcache reset"); + } + /* END: CODE FOR catcachebench*/ } } -- 2.16.3 From fcd0273933f3f53979749ba2e315043f1a6f6f31 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Mon, 1 Jul 2019 15:08:11 +0900 Subject: [PATCH 3/5] Adjust catcachebench for later patches Make the benchmark use SetCatCacheClock, which is being introduced by the next patch. This temprarily breaks consistency until the next patch is applied. --- contrib/catcachebench/catcachebench.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/contrib/catcachebench/catcachebench.c b/contrib/catcachebench/catcachebench.c index 0cebbbde4f..63a7400463 100644 --- a/contrib/catcachebench/catcachebench.c +++ b/contrib/catcachebench/catcachebench.c @@ -116,9 +116,7 @@ catcachebench2(void) */ if (--ct < 0) { - // We don't have it yet. - //SetCatCacheClock(GetCurrentTimestamp()); - GetCurrentTimestamp(); + SetCatCacheClock(GetCurrentTimestamp()); ct = clock_step; } for (a = 0 ; a < natts ; a++) @@ -168,9 +166,7 @@ catcachebench3(void) */ if (--ct < 0) { - // We don't have it yet. 
- //SetCatCacheClock(GetCurrentTimestamp()); - GetCurrentTimestamp(); + SetCatCacheClock(GetCurrentTimestamp()); ct = clock_step; } for (a = 0 ; a < natts ; a++) -- 2.16.3 From 57129a5a001b7729f8888579bc11e47cf7192801 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Mon, 1 Jul 2019 11:31:54 +0900 Subject: [PATCH 4/5] Catcache pruning feature. Currently we don't have a mechanism to limit the memory amount for syscache. Syscache bloat often causes process die by OOM killer or other problems. This patch lets old syscache entries removed to eventually limit the amount of cache. This patch intentionally unchanges indentation of an existing code block in SearchCatCacheInternal for the patch size to be smaller. It is adjusted in the next patch. --- src/backend/utils/cache/catcache.c | 186 +++++++++++++++++++++++++++++++++++++ src/backend/utils/misc/guc.c | 12 +++ src/include/utils/catcache.h | 17 ++++ 3 files changed, 215 insertions(+) diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c index 98427b67cd..b552ae960c 100644 --- a/src/backend/utils/cache/catcache.c +++ b/src/backend/utils/cache/catcache.c @@ -60,9 +60,18 @@ #define CACHE_elog(...) #endif +/* + * GUC variable to define the minimum age of entries that will be considered + * to be evicted in seconds. -1 to disable the feature. + */ +int catalog_cache_prune_min_age = -1; + /* Cache management header --- pointer is NULL until created */ static CatCacheHeader *CacheHdr = NULL; +/* Clock for the last accessed time of a catcache entry. */ +TimestampTz catcacheclock = 0; + static HeapTuple SearchCatCacheInternal(CatCache *cache, int nkeys, Datum v1, Datum v2, @@ -864,9 +873,107 @@ InitCatCache(int id, */ MemoryContextSwitchTo(oldcxt); + /* initialize catcache reference clock if haven't done yet */ + if (catcacheclock == 0) + catcacheclock = GetCurrentTimestamp(); + + /* + * This cache doesn't contain a tuple older than the current time. 
Prevent + * the first pruning from happening too early. + */ + cp->cc_oldest_ts = catcacheclock; + return cp; } +/* + * CatCacheCleanupOldEntries - Remove infrequently-used entries + * + * Catcache entries happen to be left unused for a long time for several + * reasons. Remove such entries to prevent catcache from bloating. It is based + * on the similar algorithm with buffer eviction. Entries that are accessed + * several times in a certain period live longer than those that have had less + * access in the same duration. + */ +static bool +CatCacheCleanupOldEntries(CatCache *cp) +{ + int nremoved = 0; + int i; + long oldest_ts = catcacheclock; + long age; + int us; + + /* Return immediately if disabled */ + if (catalog_cache_prune_min_age < 0) + return false; + + /* Don't scan the hash when we know we don't have prunable entries */ + TimestampDifference(cp->cc_oldest_ts, catcacheclock, &age, &us); + if (age < catalog_cache_prune_min_age) + return false; + + /* Scan over the whole hash to find entries to remove */ + for (i = 0 ; i < cp->cc_nbuckets ; i++) + { + dlist_mutable_iter iter; + + dlist_foreach_modify(iter, &cp->cc_bucket[i]) + { + CatCTup *ct = dlist_container(CatCTup, cache_elem, iter.cur); + + /* Don't remove referenced entries */ + if (ct->refcount == 0 && + (ct->c_list == NULL || ct->c_list->refcount == 0)) + { + /* + * Calculate the duration from the time from the last access + * to the "current" time. catcacheclock is updated + * per-statement basis and additionaly udpated periodically + * during a long running query. + */ + TimestampDifference(ct->lastaccess, catcacheclock, &age, &us); + + if (age > catalog_cache_prune_min_age) + { + /* + * Entries that are not accessed after the last pruning + * are removed in that seconds, and their lives are + * prolonged according to how many times they are accessed + * up to three times of the duration. We don't try shrink + * buckets since pruning effectively caps catcache + * expansion in the long term. 
+ */ + if (ct->naccess > 2) + ct->naccess = 1; + else if (ct->naccess > 0) + ct->naccess--; + else + { + CatCacheRemoveCTup(cp, ct); + nremoved++; + + /* don't update oldest_ts by removed entry */ + continue; + } + } + } + + /* update oldest timestamp if the entry remains alive */ + if (ct->lastaccess < oldest_ts) + oldest_ts = ct->lastaccess; + } + } + + cp->cc_oldest_ts = oldest_ts; + + if (nremoved > 0) + elog(DEBUG1, "pruning catalog cache id=%d for %s: removed %d / %d", + cp->id, cp->cc_relname, nremoved, cp->cc_ntup + nremoved); + + return nremoved > 0; +} + /* * Enlarge a catcache, doubling the number of buckets. */ @@ -880,6 +987,10 @@ RehashCatCache(CatCache *cp) elog(DEBUG1, "rehashing catalog cache id %d for %s; %d tups, %d buckets", cp->id, cp->cc_relname, cp->cc_ntup, cp->cc_nbuckets); + /* try removing old entries before expanding the hash */ + if (CatCacheCleanupOldEntries(cp)) + return; + /* Allocate a new, larger, hash table. */ newnbuckets = cp->cc_nbuckets * 2; newbucket = (dlist_head *) MemoryContextAllocZero(CacheMemoryContext, newnbuckets * sizeof(dlist_head)); @@ -1257,6 +1368,14 @@ SearchCatCacheInternal(CatCache *cache, * dlist within the loop, because we don't continue the loop afterwards. */ bucket = &cache->cc_bucket[hashIndex]; + + /* + * Even though this branch leads to duplicate of a bit much code, we want + * as less branches as possible here to keep fastest when pruning is + * disabled. Don't try to move this branch to foreach to save lines. + */ + if (likely(catalog_cache_prune_min_age < 0)) + { dlist_foreach(iter, bucket) { ct = dlist_container(CatCTup, cache_elem, iter.cur); @@ -1309,6 +1428,71 @@ SearchCatCacheInternal(CatCache *cache, return NULL; } } + } + else + { + /* + * We manage the age of each entries for pruning in this branch. 
+ */ + dlist_foreach(iter, bucket) + { + /* The following section is the same with the if() block */ + ct = dlist_container(CatCTup, cache_elem, iter.cur); + + if (ct->dead) + continue; + + if (ct->hash_value != hashValue) + continue; + + if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments)) + continue; + + dlist_move_head(bucket, &ct->cache_elem); + + /* + * Prolong life of this entry. Since we want run as less + * instructions as possible and want the branch be stable for + * performance reasons, we don't give a strict cap on the + * counter. All numbers above 1 will be regarded as 2 in + * CatCacheCleanupOldEntries(). + */ + ct->naccess++; + if (unlikely(ct->naccess == 0)) + ct->naccess = 2; + ct->lastaccess = catcacheclock; + + /* Following part is also the same with if() block above */ + if (!ct->negative) + { + ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner); + ct->refcount++; + ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, + &ct->tuple); + + CACHE_elog(DEBUG2, "SearchCatCache(%s): found in bucket %d", + cache->cc_relname, hashIndex); + +#ifdef CATCACHE_STATS + cache->cc_hits++; +#endif + + + return &ct->tuple; + } + else + { + CACHE_elog(DEBUG2, "SearchCatCache(%s): found neg entry in bucket %d", + cache->cc_relname, hashIndex); + +#ifdef CATCACHE_STATS + cache->cc_neg_hits++; +#endif + + return NULL; + } + } + } return SearchCatCacheMiss(cache, nkeys, hashValue, hashIndex, v1, v2, v3, v4); } @@ -1902,6 +2086,8 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments, ct->dead = false; ct->negative = negative; ct->hash_value = hashValue; + ct->naccess = 0; + ct->lastaccess = catcacheclock; dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem); diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 92c4fee8f8..c2a4caa44b 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -82,6 +82,7 @@ #include "tsearch/ts_cache.h" #include "utils/builtins.h" #include 
"utils/bytea.h" +#include "utils/catcache.h" #include "utils/guc_tables.h" #include "utils/float.h" #include "utils/memutils.h" @@ -2252,6 +2253,17 @@ static struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"catalog_cache_prune_min_age", PGC_USERSET, RESOURCES_MEM, + gettext_noop("System catalog cache entries that live unused for longer than this seconds are considered forremoval."), + gettext_noop("The value of -1 turns off pruning."), + GUC_UNIT_S + }, + &catalog_cache_prune_min_age, + 300, -1, INT_MAX, + NULL, NULL, NULL + }, + /* * We use the hopefully-safely-small value of 100kB as the compiled-in * default for max_stack_depth. InitializeGUCOptions will increase it if diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h index ff1fabaca1..ad962fb096 100644 --- a/src/include/utils/catcache.h +++ b/src/include/utils/catcache.h @@ -22,6 +22,7 @@ #include "access/htup.h" #include "access/skey.h" +#include "datatype/timestamp.h" #include "lib/ilist.h" #include "utils/relcache.h" @@ -61,6 +62,7 @@ typedef struct catcache slist_node cc_next; /* list link */ ScanKeyData cc_skey[CATCACHE_MAXKEYS]; /* precomputed key info for heap * scans */ + TimestampTz cc_oldest_ts; /* timestamp of the oldest tuple in the hash */ /* * Keep these at the end, so that compiling catcache.c with CATCACHE_STATS @@ -119,6 +121,8 @@ typedef struct catctup bool dead; /* dead but not yet removed? */ bool negative; /* negative cache entry? */ HeapTupleData tuple; /* tuple management header */ + unsigned int naccess; /* # of access to this entry */ + TimestampTz lastaccess; /* timestamp of the last usage */ /* * The tuple may also be a member of at most one CatCList. (If a single @@ -189,6 +193,19 @@ typedef struct catcacheheader /* this extern duplicates utils/memutils.h... 
*/ extern PGDLLIMPORT MemoryContext CacheMemoryContext; +/* for guc.c, not PGDLLPMPORT'ed */ +extern int catalog_cache_prune_min_age; + +/* source clock for access timestamp of catcache entries */ +extern TimestampTz catcacheclock; + +/* SetCatCacheClock - set catcache timestamp source clodk */ +static inline void +SetCatCacheClock(TimestampTz ts) +{ + catcacheclock = ts; +} + extern void CreateCacheMemoryContext(void); extern CatCache *InitCatCache(int id, Oid reloid, Oid indexoid, -- 2.16.3 From 772d7fda030e8990fbd84ec44f07120d73682256 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Mon, 1 Jul 2019 14:11:08 +0900 Subject: [PATCH 5/5] Adjust indentation of SearchCatCacheInternal The previous patch leaves indentation of an existing code block in SearchCatCacheInternal unchanged for diff size to be small. This adjusts the indentation. --- src/backend/utils/cache/catcache.c | 82 +++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 41 deletions(-) diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c index b552ae960c..60d0fd28a8 100644 --- a/src/backend/utils/cache/catcache.c +++ b/src/backend/utils/cache/catcache.c @@ -1376,59 +1376,59 @@ SearchCatCacheInternal(CatCache *cache, */ if (likely(catalog_cache_prune_min_age < 0)) { - dlist_foreach(iter, bucket) - { - ct = dlist_container(CatCTup, cache_elem, iter.cur); - - if (ct->dead) - continue; /* ignore dead entries */ - - if (ct->hash_value != hashValue) - continue; /* quickly skip entry if wrong hash val */ - - if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments)) - continue; - - /* - * We found a match in the cache. Move it to the front of the list - * for its hashbucket, in order to speed subsequent searches. (The - * most frequently accessed elements in any hashbucket will tend to be - * near the front of the hashbucket's list.) 
- */ - dlist_move_head(bucket, &ct->cache_elem); - - /* - * If it's a positive entry, bump its refcount and return it. If it's - * negative, we can report failure to the caller. - */ - if (!ct->negative) + dlist_foreach(iter, bucket) { - ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner); - ct->refcount++; - ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, &ct->tuple); + ct = dlist_container(CatCTup, cache_elem, iter.cur); - CACHE_elog(DEBUG2, "SearchCatCache(%s): found in bucket %d", - cache->cc_relname, hashIndex); + if (ct->dead) + continue; /* ignore dead entries */ + + if (ct->hash_value != hashValue) + continue; /* quickly skip entry if wrong hash val */ + + if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments)) + continue; + + /* + * We found a match in the cache. Move it to the front of the + * list for its hashbucket, in order to speed subsequent searches. + * (The most frequently accessed elements in any hashbucket will + * tend to be near the front of the hashbucket's list.) + */ + dlist_move_head(bucket, &ct->cache_elem); + + /* + * If it's a positive entry, bump its refcount and return it. If + * it's negative, we can report failure to the caller. + */ + if (!ct->negative) + { + ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner); + ct->refcount++; + ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, &ct->tuple); + + CACHE_elog(DEBUG2, "SearchCatCache(%s): found in bucket %d", + cache->cc_relname, hashIndex); #ifdef CATCACHE_STATS - cache->cc_hits++; + cache->cc_hits++; #endif - return &ct->tuple; - } - else - { - CACHE_elog(DEBUG2, "SearchCatCache(%s): found neg entry in bucket %d", - cache->cc_relname, hashIndex); + return &ct->tuple; + } + else + { + CACHE_elog(DEBUG2, "SearchCatCache(%s): found neg entry in bucket %d", + cache->cc_relname, hashIndex); #ifdef CATCACHE_STATS - cache->cc_neg_hits++; + cache->cc_neg_hits++; #endif - return NULL; + return NULL; + } } } - } else { /* -- 2.16.3