Re: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Protect syscache from bloating with negative cache entries
Date
Msg-id 20180301185455.ybyw5yajodjezdoa@alap3.anarazel.de
Whole thread Raw
In response to Re: Protect syscache from bloating with negative cache entries  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Protect syscache from bloating with negative cache entries  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2017-12-26 18:19:16 +0900, Kyotaro HORIGUCHI wrote:
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -733,6 +733,9 @@ void
>  SetCurrentStatementStartTimestamp(void)
>  {
>      stmtStartTimestamp = GetCurrentTimestamp();
> +
> +    /* Set this time stamp as aproximated current time */
> +    SetCatCacheClock(stmtStartTimestamp);
>  }

Hm.


> + * Remove entries that haven't been accessed for a certain time.
> + *
> + * Sometimes catcache entries are left unremoved for several reasons.

I'm unconvinced that that's ok for positive entries, entirely regardless
of this patch.


> We
> + * cannot allow them to eat up the usable memory and still it is better to
> + * remove entries that are no longer accessed from the perspective of memory
> + * performance ratio. Unfortunately we cannot predict that but we can assume
> + * that entries that are not accessed for long time no longer contribute to
> + * performance.
> + */

This needs polish.


> +static bool
> +CatCacheCleanupOldEntries(CatCache *cp)
> +{
> +    int            i;
> +    int            nremoved = 0;
> +#ifdef CATCACHE_STATS
> +    int            ntotal = 0;
> +    int            tm[] = {30, 60, 600, 1200, 1800, 0};
> +    int            cn[6] = {0, 0, 0, 0, 0};
> +    int            cage[3] = {0, 0, 0};
> +#endif

This doesn't look nice, the names descriptive enough to be self evident,
and there's no comments what these random arrays mean. And some specify
lenght (and have differing number of elements!) and others don't.


> +    /* Move all entries from old hash table to new. */
> +    for (i = 0; i < cp->cc_nbuckets; i++)
> +    {
> +        dlist_mutable_iter iter;
> +
> +        dlist_foreach_modify(iter, &cp->cc_bucket[i])
> +        {
> +            CatCTup    *ct = dlist_container(CatCTup, cache_elem, iter.cur);
> +            long s;
> +            int us;
> +
> +
> +            TimestampDifference(ct->lastaccess, catcacheclock, &s, &us);
> +
> +#ifdef CATCACHE_STATS
> +            {
> +                int j;
> +
> +                ntotal++;
> +                for (j = 0 ; tm[j] != 0 && s > tm[j] ; j++);
> +                if (tm[j] == 0) j--;
> +                cn[j]++;
> +            }
> +#endif

What?


> +            /*
> +             * Remove entries older than 600 seconds but not recently used.
> +             * Entries that are not accessed after creation are removed in 600
> +             * seconds, and that has been used several times are removed after
> +             * 30 minumtes ignorance. We don't try shrink buckets since they
> +             * are not the major part of syscache bloat and they are expected
> +             * to be filled shortly again.
> +             */
> +            if (s > 600)
> +            {

So this is hardcoded, without any sort of cache pressure logic? Doesn't
that mean we'll often *severely* degrade performance if a backend is
idle for a while?


Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)
Next
From: Alvaro Herrera
Date:
Subject: Re: [PoC PATCH] Parallel dump to /dev/null