Re: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Protect syscache from bloating with negative cache entries
Date
Msg-id 20180307.172545.231972323.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Protect syscache from bloating with negative cache entries  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Protect syscache from bloating with negative cache entries  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Oops! The previous ptach contained garbage printing in debugging
output.

The attached is the new version without the garbage. Addition to
it, I changed my mind to use DEBUG1 for the debug message since
the frequency is quite low.

No changes in the following cited previous mail.

At Wed, 07 Mar 2018 16:19:23 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180307.161923.178158050.horiguchi.kyotaro@lab.ntt.co.jp>
==================
Hello.

Thank you for the discussion, and sorry for being late to come.

At Thu, 1 Mar 2018 12:26:30 -0800, Andres Freund <andres@anarazel.de> wrote in
<20180301202630.2s6untij2x5hpksn@alap3.anarazel.de>
> Hi,
> 
> On 2018-03-01 15:19:26 -0500, Robert Haas wrote:
> > On Thu, Mar 1, 2018 at 3:01 PM, Andres Freund <andres@anarazel.de> wrote:
> > > On 2018-03-01 14:49:26 -0500, Robert Haas wrote:
> > >> On Thu, Mar 1, 2018 at 2:29 PM, Andres Freund <andres@anarazel.de> wrote:
> > >> > Right. Which might be very painful latency wise. And with poolers it's
> > >> > pretty easy to get into situations like that, without the app
> > >> > influencing it.
> > >>
> > >> Really?  I'm not sure I believe that.  You're talking perhaps a few
> > >> milliseconds - maybe less - of additional latency on a connection
> > >> that's been idle for many minutes.
> > >
> > > I've seen latency increases in second+ ranges due to empty cat/sys/rel
> > > caches.
> > 
> > How is that even possible unless the system is grossly overloaded?
> 
> You just need to have catalog contents out of cache and statements
> touching a few relations, functions, etc. Indexscan + heap fetch
> latencies do add up quite quickly if done sequentially.
> 
> 
> > > I don't think that'd quite address my concern. I just don't think that
> > > the granularity (drop all entries older than xxx sec at the next resize)
> > > is right. For one I don't want to drop stuff if the cache size isn't a
> > > problem for the current memory budget. For another, I'm not convinced
> > > that dropping entries from the current "generation" at resize won't end
> > > up throwing away too much.
> > 
> > I think that a fixed memory budget for the syscache is an idea that
> > was tried many years ago and basically failed, because it's very easy
> > to end up with terrible eviction patterns -- e.g. if you are accessing
> > 11 relations in round-robin fashion with a 10-relation cache, your
> > cache nets you a 0% hit rate but takes a lot more maintenance than
> > having no cache at all.  The time-based approach lets the cache grow
> > with no fixed upper limit without allowing unused entries to stick
> > around forever.
> 
> I definitely think we want a time based component to this, I just want
> to not prune at all if we're below a certain size.
> 
> 
> > > If we'd a guc 'syscache_memory_target' and we'd only start pruning if
> > > above it, I'd be much happier.
> > 
> > It does seem reasonable to skip pruning altogether if the cache is
> > below some threshold size.
> 
> Cool. There might be some issues making that check performant enough,
> but I don't have a good intuition on it.

So..

- Now it gets two new GUC variables named syscache_prune_min_age
  and syscache_memory_target. The former is the replacement of
  the previous magic number 600 and defaults to the same
  number. The latter prevens syscache pruning until exceeding the
  size and defaults to 0, means that pruning is always
  considered.  Documentation for the two variables are also
  added.

- Revised the pointed mysterious comment for
  CatcacheCleanupOldEntries and some comments are added.

- Fixed the name of the variables for CATCACHE_STATS to be more
  descriptive, and added some comments for the code.

The catcache entries accessed within the current transaction
won't be pruned so theoretically a long transaction can bloat
catcache. But I believe it is quite rare, or at least this saves
the most other cases.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 975e7e82d4eeb7d7b7ecf981141a8924297c46ef Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 26 Dec 2017 17:43:09 +0900
Subject: [PATCH] Remove entries that haven't been used for a certain time

Catcache entries can be left alone for several reasons. It is not
desirable that they eat up memory. With this patch, This adds
consideration of removal of entries that haven't been used for a
certain time before enlarging the hash array.
---
 doc/src/sgml/config.sgml                      |  38 +++++++
 src/backend/access/transam/xact.c             |   3 +
 src/backend/utils/cache/catcache.c            | 152 +++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c                  |  23 ++++
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/include/utils/catcache.h                  |  19 ++++
 6 files changed, 233 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 259a2d83b4..782b506984 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1556,6 +1556,44 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-syscache-memory-target" xreflabel="syscache_memory_target">
+      <term><varname>syscache_memory_target</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>syscache_memory_target</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the maximum amount of memory to which syscache is expanded
+        without pruning. The value defaults to 0, indicating that pruning is
+        always considered. After exceeding this size, syscache pruning is
+        considered according to
+        <xref linkend="guc-syscache-prune-min-age"/>. If you need to keep
+        certain amount of syscache entries with intermittent usage, try
+        increase this setting.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="guc-syscache-prune-min-age" xreflabel="syscache_prune_min_age">
+      <term><varname>syscache_prune_min_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>syscache_prune_min_age</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the minimum amount of unused time in seconds at which a
+        syscache entry is considered to be removed. -1 indicates that syscache
+        pruning is disabled at all. The value defaults to 600 seconds
+        (<literal>10 minutes</literal>). The syscache entries that are not
+        used for the duration can be removed to prevent syscache bloat. This
+        behavior is suppressed until the size of syscache exceeds
+        <xref linkend="guc-syscache-memory-target"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
       <term><varname>max_stack_depth</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index dbaaf8e005..86d76917bb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -733,6 +733,9 @@ void
 SetCurrentStatementStartTimestamp(void)
 {
     stmtStartTimestamp = GetCurrentTimestamp();
+
+    /* Set this timestamp as aproximated current time */
+    SetCatCacheClock(stmtStartTimestamp);
 }
 
 /*
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 5ddbf6eab1..e4a9ab8789 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -71,9 +71,23 @@
 #define CACHE6_elog(a,b,c,d,e,f,g)
 #endif
 
+/*
+ * GUC variable to define the minimum size of hash to cosider entry eviction.
+ * Let the name be the same with the guc variable name, not using 'catcache'.
+ */
+int syscache_memory_target = 0;
+
+/* GUC variable to define the minimum age of entries that will be cosidered
+ * to be evicted in seconds. Ditto for the name.
+ */
+int syscache_prune_min_age = 600;
+
 /* Cache management header --- pointer is NULL until created */
 static CatCacheHeader *CacheHdr = NULL;
 
+/* Timestamp used for any operation on caches. */
+TimestampTz    catcacheclock = 0;
+
 static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
                        int nkeys,
                        Datum v1, Datum v2,
@@ -866,9 +880,130 @@ InitCatCache(int id,
      */
     MemoryContextSwitchTo(oldcxt);
 
+    /* initilize catcache reference clock if haven't done yet */
+    if (catcacheclock == 0)
+        catcacheclock = GetCurrentTimestamp();
+
     return cp;
 }
 
+/*
+ * CatCacheCleanupOldEntries - Remove infrequently-used entries
+ *
+ * Catcache entries can be left alone for several reasons. We remove them if
+ * they not accessed for a certain time to prevent catcache from bloating. The
+ * eviction is performed with the similar algorithm with buffer eviction using
+ * access counter. Entries that are accessed several times can live longer
+ * than those that have had no access in the same duration.
+ */
+static bool
+CatCacheCleanupOldEntries(CatCache *cp)
+{
+    int            i;
+    int            nremoved = 0;
+    size_t        hash_size;
+#ifdef CATCACHE_STATS
+    /* These variables are only for debugging purpose */
+    int            ntotal = 0;
+    /*
+     * nth element in nentries stores the number of cache entries that have
+     * lived unaccessed for corresponding multiple in ageclass of
+     * syscache_prune_min_age. The index of nremoved_entry is the value of the
+     * clock-sweep counter, which takes from 0 up to 2.
+     */
+    double        ageclass[] = {0.05, 0.1, 1.0, 2.0, 3.0, 0.0};
+    int            nentries[] = {0, 0, 0, 0, 0, 0};
+    int            nremoved_entry[3] = {0, 0, 0};
+    int            j;
+#endif
+
+    /* Return immediately if no pruning is wanted */
+    if (syscache_prune_min_age < 0)
+        return false;
+
+    /*
+     * Return without pruning if the size of the hash is below the target.
+     * Since the area for bucket array is dominant, consider only it.
+     */
+    hash_size = cp->cc_nbuckets * sizeof(dlist_head);
+    if (hash_size < syscache_memory_target * 1024)
+        return false;
+    
+    /* Search the whole hash for entries to remove */
+    for (i = 0; i < cp->cc_nbuckets; i++)
+    {
+        dlist_mutable_iter iter;
+
+        dlist_foreach_modify(iter, &cp->cc_bucket[i])
+        {
+            CatCTup    *ct = dlist_container(CatCTup, cache_elem, iter.cur);
+            long entry_age;
+            int us;
+
+
+            /*
+             * Calculate the duration from the time of the last access to the
+             * "current" time. Since catcacheclock is not advanced within a
+             * transaction, the entries that are accessed within the current
+             * transaction won't be pruned.
+             */
+            TimestampDifference(ct->lastaccess, catcacheclock, &entry_age, &us);
+
+#ifdef CATCACHE_STATS
+            /* count catcache entries for each age class */
+            ntotal++;
+            for (j = 0 ;
+                 ageclass[j] != 0.0 &&
+                     entry_age > syscache_prune_min_age * ageclass[j] ;
+                 j++);
+            if (ageclass[j] == 0.0) j--;
+            nentries[j]++;
+#endif
+
+            /*
+             * Try to remove entries older than syscache_prune_min_age
+             * seconds.  Entries that are not accessed after last pruning are
+             * removed in that seconds, and that has been accessed several
+             * times are removed after leaving alone for up to three times of
+             * the duration. We don't try shrink buckets since pruning
+             * effectively caps catcache expansion in the long term.
+             */
+            if (entry_age > syscache_prune_min_age)
+            {
+#ifdef CATCACHE_STATS
+                Assert (ct->naccess >= 0 && ct->naccess <= 2);
+                nremoved_entry[ct->naccess]++;
+#endif
+                if (ct->naccess > 0)
+                    ct->naccess--;
+                else
+                {
+                    if (!ct->c_list || ct->c_list->refcount == 0)
+                    {
+                        CatCacheRemoveCTup(cp, ct);
+                        nremoved++;
+                    }
+                }
+            }
+        }
+    }
+
+#ifdef CATCACHE_STATS
+    ereport(DEBUG1,
+            (errmsg ("removed %d/%d, age(-%.0fs:%d, -%.0fs:%d, *-%.0fs:%d, -%.0fs:%d, -%.0fs:%d) naccessed(0:%d, 1:%d,
2:%d)",
+                     nremoved, ntotal,
+                     ageclass[0] * syscache_prune_min_age, nentries[0],
+                     ageclass[1] * syscache_prune_min_age, nentries[1],
+                     ageclass[2] * syscache_prune_min_age, nentries[2],
+                     ageclass[3] * syscache_prune_min_age, nentries[3],
+                     ageclass[4] * syscache_prune_min_age, nentries[4],
+                     nremoved_entry[0], nremoved_entry[1], nremoved_entry[2]),
+             errhidestmt(true)));
+#endif
+
+    return nremoved > 0;
+}
+
 /*
  * Enlarge a catcache, doubling the number of buckets.
  */
@@ -1282,6 +1417,11 @@ SearchCatCacheInternal(CatCache *cache,
          */
         dlist_move_head(bucket, &ct->cache_elem);
 
+        /* Update access information for pruning */
+        if (ct->naccess < 2)
+            ct->naccess++;
+        ct->lastaccess = catcacheclock;
+
         /*
          * If it's a positive entry, bump its refcount and return it. If it's
          * negative, we can report failure to the caller.
@@ -1813,7 +1953,6 @@ ReleaseCatCacheList(CatCList *list)
         CatCacheRemoveCList(list->my_cache, list);
 }
 
-
 /*
  * CatalogCacheCreateEntry
  *        Create a new CatCTup entry, copying the given HeapTuple and other
@@ -1906,6 +2045,8 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->dead = false;
     ct->negative = negative;
     ct->hash_value = hashValue;
+    ct->naccess = 0;
+    ct->lastaccess = catcacheclock;
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
 
@@ -1913,10 +2054,13 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     CacheHdr->ch_ntup++;
 
     /*
-     * If the hash table has become too full, enlarge the buckets array. Quite
-     * arbitrarily, we enlarge when fill factor > 2.
+     * If the hash table has become too full, try cleanup by removing
+     * infrequently used entries to make a room for the new entry. If it
+     * failed, enlarge the bucket array instead.  Quite arbitrarily, we try
+     * this when fill factor > 2.
      */
-    if (cache->cc_ntup > cache->cc_nbuckets * 2)
+    if (cache->cc_ntup > cache->cc_nbuckets * 2 &&
+        !CatCacheCleanupOldEntries(cache))
         RehashCatCache(cache);
 
     return ct;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1db7845d5a..33abe04efe 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -78,6 +78,7 @@
 #include "tsearch/ts_cache.h"
 #include "utils/builtins.h"
 #include "utils/bytea.h"
+#include "utils/catcache.h"
 #include "utils/guc_tables.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
@@ -1971,6 +1972,28 @@ static struct config_int ConfigureNamesInt[] =
         NULL, NULL, NULL
     },
 
+    {
+        {"syscache_memory_target", PGC_USERSET, RESOURCES_MEM,
+            gettext_noop("Sets the minimum syscache size to keep."),
+            gettext_noop("Syscache is not pruned before exceeding this size."),
+            GUC_UNIT_KB
+        },
+        &syscache_memory_target,
+        0, 0, INT_MAX,
+        NULL, NULL, NULL
+    },
+
+    {
+        {"syscache_prune_min_age", PGC_USERSET, RESOURCES_MEM,
+            gettext_noop("Sets the minimum duration of an unused syscache entry to remove."),
+            gettext_noop("Syscache entries that live unused for longer than this seconds are considered to be
removed."),
+            GUC_UNIT_S
+        },
+        &syscache_prune_min_age,
+        600, -1, INT_MAX,
+        NULL, NULL, NULL
+    },
+
     /*
      * We use the hopefully-safely-small value of 100kB as the compiled-in
      * default for max_stack_depth.  InitializeGUCOptions will increase it if
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39272925fb..5a5729a88f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -124,6 +124,8 @@
 #work_mem = 4MB                # min 64kB
 #maintenance_work_mem = 64MB        # min 1MB
 #autovacuum_work_mem = -1        # min 1MB, or -1 to use maintenance_work_mem
+#syscache_memory_target = 0kB    # in kB. zero disables the feature
+#syscache_prune_min_age = 600s    # -1 disables the feature
 #max_stack_depth = 2MB            # min 100kB
 #dynamic_shared_memory_type = posix    # the default is the first option
                     # supported by the operating system:
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index 7b22f9c7bc..c3c4d65998 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -22,6 +22,7 @@
 
 #include "access/htup.h"
 #include "access/skey.h"
+#include "datatype/timestamp.h"
 #include "lib/ilist.h"
 #include "utils/relcache.h"
 
@@ -119,6 +120,8 @@ typedef struct catctup
     bool        dead;            /* dead but not yet removed? */
     bool        negative;        /* negative cache entry? */
     HeapTupleData tuple;        /* tuple management header */
+    int            naccess;        /* # of access to this entry, up to 2  */
+    TimestampTz    lastaccess;        /* approx. timestamp of the last usage */
 
     /*
      * The tuple may also be a member of at most one CatCList.  (If a single
@@ -189,6 +192,22 @@ typedef struct catcacheheader
 /* this extern duplicates utils/memutils.h... */
 extern PGDLLIMPORT MemoryContext CacheMemoryContext;
 
+/* for guc.c, not PGDLLPMPORT'ed */
+extern int syscache_prune_min_age;
+extern int syscache_memory_target;
+
+/* to use as access timestamp of catcache entries */
+extern TimestampTz catcacheclock;
+
+/*
+ * SetCatCacheClock - set timestamp for catcache access record
+ */
+static inline void
+SetCatCacheClock(TimestampTz ts)
+{
+    catcacheclock = ts;
+}
+
 extern void CreateCacheMemoryContext(void);
 
 extern CatCache *InitCatCache(int id, Oid reloid, Oid indexoid,
-- 
2.16.2


pgsql-hackers by date:

Previous
From: "Tsunakawa, Takayuki"
Date:
Subject: RE: Temporary tables prevent autovacuum, leading to XID wraparound
Next
From: Arthur Zakirov
Date:
Subject: Re: [PROPOSAL] Shared Ispell dictionaries