Re: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Protect syscache from bloating with negative cache entries
Date
Msg-id CA+TgmoaQVtw=D8sDe78NwrOAPmJFjsR6XWtQ29C=fquoBvhCVw@mail.gmail.com
In response to Re: Protect syscache from bloating with negative cache entries  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Protect syscache from bloating with negative cache entries  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Jan 17, 2019 at 2:48 PM Bruce Momjian <bruce@momjian.us> wrote:
> Well, I think everyone agrees there are workloads that cause undesired
> cache bloat.  What we have not found is a solution that doesn't cause
> code complexity or undesired overhead, or one that >1% of users will
> know how to use.
>
> Unfortunately, because we have not found something we are happy with, we
> have done nothing.  I agree LRU can be expensive.  What if we do some
> kind of clock sweep and expiration like we do for shared buffers?  I
> think the trick is figuring out how frequently to do the sweep.  What if
> we mark entries as unused every 10 queries, mark them as used on first
> use, and delete cache entries that have not been used in the past 10
> queries?

I still think wall-clock time is a perfectly reasonable heuristic.
Say every 5 or 10 minutes you walk through the cache.  Anything that
hasn't been touched since the last scan you throw away.  If you do
this, you MIGHT flush an entry that you're just about to need again,
but (1) it's not very likely, because if an entry hasn't been touched
in many minutes, the chances that it's about to be needed again are
low; (2) even if it does happen, it probably won't cost much, because
*occasionally* reloading a cache entry unnecessarily is cheap; the big
problem is reloading the same entries over and over again, which can
easily happen with a fixed-size limit on the cache; and (3) if
somebody does have a workload where they touch the same object every
11 minutes, we can give them a GUC to control the timeout between
cache sweeps, and it's really not that hard to understand how to set
it.  And most people won't need to.
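
To make the heuristic concrete, here is a minimal Python sketch of a
time-based sweep.  It is only an illustration, not PostgreSQL's
catcache code: the class name SweptCache, the sweep_interval parameter
(standing in for the proposed GUC), the injectable clock, and the
choice to trigger the sweep from inside lookup() are all assumptions
made for the example.

```python
import time


class SweptCache:
    """Toy cache that drops entries not touched since the previous sweep."""

    def __init__(self, sweep_interval=600.0, clock=time.monotonic):
        # sweep_interval is in seconds; it plays the role of the
        # hypothetical timeout GUC mentioned in the message.
        self.sweep_interval = sweep_interval
        self.clock = clock
        self.entries = {}     # key -> value (None can model a negative entry)
        self.touched = set()  # keys accessed since the last sweep
        self.last_sweep = clock()

    def lookup(self, key, load):
        """Return the cached value, calling load(key) on a miss."""
        self.maybe_sweep()
        if key not in self.entries:
            self.entries[key] = load(key)
        self.touched.add(key)
        return self.entries[key]

    def maybe_sweep(self):
        """Evict untouched entries if the interval has elapsed.

        Returns the number of entries evicted (0 if no sweep ran).
        """
        now = self.clock()
        if now - self.last_sweep < self.sweep_interval:
            return 0
        stale = [k for k in self.entries if k not in self.touched]
        for k in stale:
            del self.entries[k]
        self.touched.clear()
        self.last_sweep = now
        return len(stale)
```

In this sketch an entry survives for between one and two sweep
intervals after its last access, and nothing is evicted while the
working set keeps being touched, which is the property that avoids
the repeated-reload problem of a fixed-size limit.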

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

