On 2019-01-18 15:57:17 -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Thu, Jan 17, 2019 at 2:48 PM Bruce Momjian <bruce@momjian.us> wrote:
> >> Unfortunately, because we have not found something we are happy with, we
> >> have done nothing. I agree LRU can be expensive. What if we do some
> >> kind of clock sweep and expiration like we do for shared buffers? I
> >> think the trick is figuring how frequently to do the sweep. What if we
> >> mark entries as unused every 10 queries, mark them as used on first use,
> >> and delete cache entries that have not be used in the past 10 queries.
>
> > I still think wall-clock time is a perfectly reasonable heuristic.
>
> The easy implementations of that involve putting gettimeofday() calls
> into hot code paths, which would be a Bad Thing. But maybe we could
> do this only at transaction or statement start, and piggyback on the
> gettimeofday() calls that already happen at those times.
My proposal for this was to attach a 'generation' to cache entries. Upon
access, cache entries are marked as belonging to the current
generation. Whenever existing memory isn't sufficient for further cache
entries, and also on a less frequent schedule triggered by a timer, the
cache generation is increased and the new generation's "creation time" is
measured. Then generations older than a certain threshold are
purged, and if there are any, the entries of the purged generations are
removed from the caches using a sequential scan through the cache.
This outline achieves:
- no additional time measurements in hot code paths
- no need for a sequential scan of the entire cache when no generations
are too old
- both size and time limits can be implemented reasonably cheaply
- overhead when feature disabled should be close to zero
Greetings,
Andres Freund