Re: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: Protect syscache from bloating with negative cache entries
Msg-id: d3b291ff-d993-78d1-8d28-61bcf72793d6@2ndquadrant.com
In response to: Re: Protect syscache from bloating with negative cache entries (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses: RE: Protect syscache from bloating with negative cache entries
           Re: Protect syscache from bloating with negative cache entries
List: pgsql-hackers

On 2/12/19 12:35 PM, Kyotaro HORIGUCHI wrote:
> Thank you for testing and the commits, Tomas.
>
> At Sat, 9 Feb 2019 19:09:59 +0100, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote in
> <74386116-0bc5-84f2-e614-0cff19aca2de@2ndquadrant.com>
>> On 2/7/19 1:18 PM, Kyotaro HORIGUCHI wrote:
>>> At Thu, 07 Feb 2019 15:24:18 +0900 (Tokyo Standard Time), Kyotaro
>>> HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
>>> <20190207.152418.139132570.horiguchi.kyotaro@lab.ntt.co.jp>
>>
>> I've done a bunch of benchmarks on v13, and I don't see any serious
>> regression either. Each test creates a number of tables (100, 1k, 10k,
>> 100k and 1M) and then runs SELECT queries on them. The tables are
>> accessed randomly - with either uniform or exponential distribution.
>> For each combination there are 5 runs, 60 seconds each (see the
>> attached shell scripts, it should be pretty obvious).
>>
>> I've done the tests on two different machines - a small one (i5 with
>> 8GB of RAM) and a large one (e5-2620v4 with 64GB RAM), but the
>> behavior is almost exactly the same (with the exception of 1M tables,
>> which does not fit into RAM on the smaller one).
>>
>> On the xeon, the results (throughput compared to master) look like
>> this:
>>
>>    uniform          100     1000    10000   100000  1000000
>>    ------------------------------------------------------------
>>    v13           105.04%  100.28%  102.96%  102.11%  101.54%
>>    v13 (nodata)   97.05%   98.30%   97.42%   96.60%  107.55%
>>
>>    exponential      100     1000    10000   100000  1000000
>>    ------------------------------------------------------------
>>    v13           100.04%  103.48%  101.70%   98.56%  103.20%
>>    v13 (nodata)   97.12%   98.43%   98.86%   98.48%  104.94%
>>
>> The "nodata" case means the tables were empty (so no files created),
>> while in the other case each table contained 1 row.
>>
>> Per the results it's mostly break even, and in some cases there is
>> actually a measurable improvement.
>
> Great! I guess it comes from the reduced size of the hash?

Not sure about that. I haven't actually verified that it reduces the
cache size at all - I was measuring the overhead of the extra work. And
I don't think the syscache actually shrunk significantly, because the
throughput was quite high (~15-30k tps, IIRC), so pretty much everything
was touched within the default 600 seconds.

>> That being said, the question is whether the patch actually reduces
>> memory usage in a useful way - that's not something this benchmark
>> validates. I plan to modify the tests to make the pgbench script
>> time-dependent (i.e. to pick a subset of tables depending on time).
>
> Thank you.
>
>> A couple of things I've happened to notice during a quick review:
>>
>> 1) The sgml docs in 0002 talk about "syscache_memory_target" and
>> "syscache_prune_min_age", but those options were renamed to just
>> "cache_memory_target" and "cache_prune_min_age".
>
> I'm at a loss as to what to call the syscache for users. I think it is
> "catalog cache". The most basic component is called catcache, which is
> covered by the syscache layer; neither of them is revealed to users,
> and it is shown to users as "catalog cache".
>
> Do "catalog_cache_prune_min_age", "catalog_cache_memory_target", (if
> it exists) "catalog_cache_entry_limit" and "catalog_cache_prune_ratio"
> make sense?

I think "catalog_cache" sounds about right, although my point was simply
that there's a discrepancy between the sgml docs and the code.

>> 2) "cache_entry_limit" is not mentioned in sgml docs at all, and it's
>> defined three times in guc.c for some reason.
>
> It is just a PoC, added to show how it looks. (The multiple instances
> must be a result of a convulsion of my fingers...) I think this is not
> useful unless it can be specified on a per-relation or per-cache
> basis. I'll remove the GUC and add reloptions for the purpose. (But it
> won't work for pg_class and pg_attribute for now.)

OK, although I'd just keep it as simple as possible. TBH I can't really
imagine users tuning limits for individual caches in any meaningful way.

>> 3) I don't see why to define PRUNE_BY_AGE and PRUNE_BY_NUMBER, instead
>> of just using two bool variables prune_by_age and prune_by_number
>> doing the same thing.
>
> Agreed. It's a kind of memory-stinginess which is useless there.
>
>> 4) I'm not entirely sure about using stmtStartTimestamp. Doesn't that
>> pretty much mean long-running statements will set the lastaccess to a
>> very old timestamp? Also, it means that long-running statements (like
>> a PL function accessing a bunch of tables) won't do any eviction at
>> all, no? AFAICS we'll set the timestamp only once, at the very
>> beginning.
>>
>> I wonder whether we should use some other timestamp source (like a
>> timestamp updated regularly from a timer, or something like that).
>
> I didn't consider planning that happens within a function. If 5min is
> the default for catalog_cache_prune_min_age, 10% of it (30s) seems
> enough, and gettimeofday() at such intervals wouldn't affect
> foreground jobs. I'd choose catalog_c_p_m_age/10 rather than the fixed
> value of 30s, with 1s as the minimum.

Actually, I see CatCacheCleanupOldEntries contains this comment:

    /*
     * Calculate the duration from the time of the last access to the
     * "current" time. Since catcacheclock is not advanced within a
     * transaction, the entries that are accessed within the current
     * transaction won't be pruned.
     */

which I think is pretty much what I've been saying ... But the question
is whether we need to do something about it.
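
For illustration, the check being discussed boils down to something like
the following - a minimal standalone sketch, not the actual patch code
(catcacheclock, lastaccess and catalog_cache_prune_min_age are names used
in the thread; the stub type and helper are assumptions made to keep it
self-contained):

    #include <stdbool.h>
    #include <stdint.h>

    typedef int64_t TimestampTz;        /* microseconds, as in PostgreSQL */

    typedef struct CatCTupStub
    {
        TimestampTz lastaccess;         /* stamped on every cache hit */
    } CatCTupStub;

    static TimestampTz catcacheclock;   /* advanced at statement start only */
    static int  catalog_cache_prune_min_age = 300;  /* seconds, GUC default */

    static bool
    entry_is_prunable(const CatCTupStub *ct)
    {
        TimestampTz age = catcacheclock - ct->lastaccess;

        /*
         * Entries accessed after the clock last advanced have age <= 0, so
         * they are never pruned - the property the quoted comment describes,
         * and the reason a long-running statement would evict nothing.
         */
        return age > (TimestampTz) catalog_cache_prune_min_age * 1000000;
    }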

> I observed a significant degradation by setting up the timer at every
> statement start. The patch does the following to get rid of the
> degradation.
>
> (1) Every statement updates the catcache timestamp, as it currently
>     does. (SetCatCacheClock)
>
> (2) The timestamp is also updated periodically using a timer,
>     separately from (1). The timer starts if not yet running at the
>     time of (1). (SetCatCacheClock, UpdateCatCacheClock)
>
> (3) Statement end and transaction end don't stop the timer, to avoid
>     the overhead of setting up a timer.
>
> (4) But it stops on error. I chose not to change the behavior in
>     PostgresMain that kills all timers on error.
>
> (5) Also, changing the GUC catalog_cache_prune_min_age kills the
>     timer, in order to reflect the change quickly, especially when it
>     is shortened.

Interesting. What was the frequency of the timer / how often was it
executed? Can you share the code somehow?
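
For reference, the scheme in (1)-(5) might look roughly like this as a
standalone program. This is only a sketch: setitimer()/SIGALRM stand in
for the backend's timeout machinery, set_catcache_clock and
update_catcache_clock mirror the names used above, and everything else is
made up. The interval follows the catalog_c_p_m_age/10 rule with the 1s
floor mentioned earlier:

    #include <signal.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>
    #include <unistd.h>

    static volatile sig_atomic_t catcache_clock;    /* coarse clock, seconds */
    static bool timer_running = false;
    static int  catalog_cache_prune_min_age = 300;  /* seconds, GUC default */

    /* (2) periodic update from the timer; time() is async-signal-safe */
    static void
    update_catcache_clock(int signo)
    {
        (void) signo;
        catcache_clock = (sig_atomic_t) time(NULL);
    }

    /* (1) called at every statement start */
    static void
    set_catcache_clock(void)
    {
        catcache_clock = (sig_atomic_t) time(NULL);

        /* (2)+(3): start the timer once; it keeps running across statements */
        if (!timer_running)
        {
            struct itimerval iv = {{0, 0}, {0, 0}};
            int     interval = catalog_cache_prune_min_age / 10;  /* 30s default */

            if (interval < 1)
                interval = 1;       /* 1s minimum, as suggested above */
            iv.it_interval.tv_sec = interval;
            iv.it_value.tv_sec = interval;
            signal(SIGALRM, update_catcache_clock);
            setitimer(ITIMER_REAL, &iv, NULL);
            timer_running = true;
        }
    }

    /* (4)/(5): stop the timer on error, or when the GUC changes */
    static void
    stop_catcache_timer(void)
    {
        struct itimerval off = {{0, 0}, {0, 0}};

        setitimer(ITIMER_REAL, &off, NULL);
        timer_running = false;
    }

    int
    main(void)
    {
        set_catcache_clock();

        /* simulate one long-running statement; the clock still advances */
        for (int i = 0; i < 3; i++)
        {
            printf("catcache clock: %ld\n", (long) catcache_clock);
            sleep(35);
        }
        stop_catcache_timer();
        return 0;
    }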

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services