Re: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Protect syscache from bloating with negative cache entries
Date
Msg-id 20190118.163929.229869562.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: Protect syscache from bloating with negative cache entries  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
Responses Re: Protect syscache from bloating with negative cache entries  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Hello.

At Fri, 18 Jan 2019 11:46:03 +1300, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote in
<4e62e6b7-0ffb-54ae-3757-5583fcca38c0@archidevsys.co.nz>
> On 18/01/2019 08:48, Bruce Momjian wrote:
> > On Thu, Jan 17, 2019 at 11:33:35AM -0500, Robert Haas wrote:
> >> The flaw in your thinking, as it seems to me, is that in your concern
> >> for "the likelihood that cache flushes will simply remove entries
> >> we'll soon have to rebuild," you're apparently unwilling to consider
> >> the possibility of workloads where cache flushes will remove entries
> >> we *won't* soon have to rebuild.  Every time that issue gets raised,
> >> you seem to blow it off as if it were not a thing that really happens.
> >> I can't make sense of that position.  Is it really so hard to imagine
> >> a connection pooler that switches the same connection back and forth
> >> between two applications with different working sets?  Or a system
> >> that keeps persistent connections open even when they are idle?  Do
> >> you really believe that a connection that has not accessed a cache
> >> entry in 10 minutes still derives more benefit from that cache entry
> >> than it would from freeing up some memory?
> > Well, I think everyone agrees there are workloads that cause undesired
> > cache bloat.  What we have not found is a solution that doesn't cause
> > code complexity or undesired overhead, or one that >1% of users will
> > know how to use.
> >
> > Unfortunately, because we have not found something we are happy with,
> > we have done nothing.  I agree LRU can be expensive.  What if we do some
> > kind of clock sweep and expiration like we do for shared buffers?  I

So, the patch doesn't use LRU but a kind of clock-sweep method.
When the current hash fills up and resizing (doubling) it would
push the size past the threshold, it tries to trim away entries
that have been left unused for a duration corresponding to their
usage count. This is not a hard limit but seems to be a good
compromise.
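
A rough sketch of that behavior follows. It is not the patch's
code: the flat array standing in for the hash, the names
(ToyEntry, toy_prune, toy_should_grow) and the rule of dropping
one usage count per sweep are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the patch's code. */
typedef struct ToyEntry
{
    bool     in_use;        /* slot holds a live entry */
    unsigned naccess;       /* usage count, decremented by the sweep */
    long     last_access;   /* time of the most recent hit */
} ToyEntry;

/*
 * Sweep all entries once.  An entry idle for longer than min_age loses one
 * usage count per sweep and is removed once the count is already zero, so
 * entries that were hit more often survive more sweeps.
 * Returns the number of entries removed.
 */
static size_t
toy_prune(ToyEntry *entries, size_t n, long now, long min_age)
{
    size_t removed = 0;

    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        ToyEntry *e = &entries[i];

        if (!e->in_use || now - e->last_access <= min_age)
            continue;           /* empty slot, or used recently enough */
        if (e->naccess > 0)
            e->naccess--;       /* second chance */
        else
        {
            e->in_use = false;  /* idle and out of chances: drop it */
            removed++;
        }
    }
    return removed;
}

/*
 * Called when the table is full.  Returns true if the caller should still
 * double it; false if pruning made room.  Pruning is attempted only when
 * doubling would exceed prune_threshold, so this is a soft limit.
 */
static bool
toy_should_grow(ToyEntry *entries, size_t n, size_t prune_threshold,
                long now, long min_age)
{
    if (n * 2 > prune_threshold && toy_prune(entries, n, now, min_age) > 0)
        return false;
    return true;
}
```

If the sweep frees nothing, the table is still doubled, which is
why the threshold acts as a soft limit rather than a hard one.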

> > think the trick is figuring how frequently to do the sweep.  What if
> > we mark entries as unused every 10 queries, mark them as used on first
> > use, and delete cache entries that have not been used in the past 10
> > queries.

As above, it tries pruning at every resize. So this adds
complexity to the frequent paths only by setting the last-accessed
time and incrementing the access counter. It scans the whole hash
at resize time, but that doesn't add much compared to the resizing
itself.
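
To show how small the per-lookup cost is, here is a minimal
sketch of the bookkeeping done on a cache hit. The names
(SketchEntry, sketch_touch) are hypothetical, and capping the
counter at SKETCH_MAX_USAGE is an assumption, not something
stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the cap value is an assumption for illustration. */
enum { SKETCH_MAX_USAGE = 2 };

typedef struct SketchEntry
{
    unsigned naccess;       /* saturating access counter */
    long     last_access;   /* time of the most recent hit */
} SketchEntry;

/* Hot path: one store and one bounded increment per cache hit. */
static void
sketch_touch(SketchEntry *e, long now)
{
    assert(e != NULL);
    e->last_access = now;
    if (e->naccess < SKETCH_MAX_USAGE)
        e->naccess++;
}
```

Nothing here walks the hash; the full scan happens only when a
resize is being considered.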

> If you take that approach, then this number should be configurable. 
> What if I had 12 common queries I used in rotation?

This basically has two knobs: the minimum hash size at which
pruning starts, and the idle time before unused entries are
reaped, per catcache.

> The ARM3 processor cache logic was to simply eject an entry at random,
> as obviously Acorn felt that the silicon required to have a more
> sophisticated algorithm would reduce the cache size too much!
>
> I upgraded my Acorn Archimedes that had an 8MHz bus, from an 8MHz ARM2
> to a 25MHz ARM3. That is a clock rate improvement of about 3 times. 
> However BASIC programs ran about 7 times faster, which I put down to
> the ARM3 having a cache.
>
> Obviously for Postgres this is not directly relevant, but I think it
> suggests that it may be worth considering replacing cache items at
> random.  As there are no pathological corner cases, and the logic is
> very simple.

Memory was more expensive than nowadays by.. about 10^3 times?  An
obvious advantage of random reaping is requiring less silicon. I
think we don't need to be so stingy, but perhaps clock-sweep is
the most we can afford.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


