Home > mailing lists

RE: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

From	Tsunakawa, Takayuki
Subject	RE: Protect syscache from bloating with negative cache entries
Date	February 18, 2019 23:43:42
Msg-id	0A3221C70F24FB45833433255569204D1FB9D37E@G01JPEXMBYT05 Whole thread Raw
In response to	RE: Protect syscache from bloating with negative cache entries ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
Responses	RE: Protect syscache from bloating with negative cache entries Re: Protect syscache from bloating with negative cache entries
List	pgsql-hackers

Tree view

Hi Horiguchi-san,

I've looked through your patches.  This is the first part of my review results.  Let me post the rest after another
worktoday.
 

BTW, how about merging 0003 and 0005, and separating and deferring 0004 in another thread?  That may help to relieve
othercommunity members by making this patch set not so large and complex.
 



[Bottleneck investigation]
Ideriha-san and I are trying to find the bottleneck.  My first try shows there's little overhead.  Here's what I did:

<postgresql.conf>
shared_buffers = 1GB
catalog_cache_prune_min_age = -1
catalog_cache_max_size = 10MB

<benchmark>
$ pgbench -i -s 10
$ pg_ctl stop and then start
$ cache all data in shared buffers by running pg_prewarm on branches, tellers, accounts, and their indexes
$ pgbench --select-only -c 1 -T 60

<result>
master : 8612 tps
patched: 8553 tps (-0.7%)

There's little (0.7%) performance overhead with:
* one additional dlist_move_tail() in every catcache access
* memory usage accounting in operations other than catcache access (relevant catcache entries should be cached in the
firstpgbench transaction)
 

I'll check other patterns to find out how big overhead there is.


[Source code review]
Below are my findings on the patch set v15:

(1) patch 0001
All right.


(2) patch 0002
@@ -87,6 +87,7 @@ typedef struct MemoryContextData
     const char *name;            /* context name (just for debugging) */
     const char *ident;            /* context ID if any (just for debugging) */
     MemoryContextCallback *reset_cbs;    /* list of reset/delete callbacks */
+    uint64        consumption;    /* accumulates consumed memory size */
 } MemoryContextData;

Size is more appropriate as a data type than uint64 because other places use Size for memory size variables.

How about "usedspace" instead of "consumption"?  Because that aligns better with the naming used for
MemoryContextCounters'smember variables, totalspace and freespace.
 


(3) patch 0002
+        context->consumption += chunk_size;
(and similar sites)

The used space should include the size of the context-type-specific chunk header, so that the count is closer to the
actualmemory size seen by the user.
 

Here, let's make consensus on what the used space represents.  Is it either of the following?

a) The total space allocated from OS.  i.e., the sum of the malloc()ed regions for a given memory context.
b) The total space of all chunks, including their headers, of a given memory context.

a) is better because that's the actual memory usage from the DBA's standpoint.  But a) cannot be used because
CacheMemoryContextis used for various things.  So we have to compromise on b).  Is this OK?
 

One possible future improvement is to use a separate memory context exclusively for the catcache, which is a child of
CacheMemoryContext. That way, we can adopt a).
 



(4) patch 0002
@@ -614,6 +614,9 @@ AllocSetReset(MemoryContext context)
+    set->header.consumption = 0;

This can be put in MemoryContextResetOnly() instead of context-type-specific reset functions.


Regards
Takayuki Tsunakawa

pgsql-hackers by date:

From: Tom Lane
Date: 18 February 2019, 23:42:32
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction

From: Tom Lane
Date: 18 February 2019, 23:50:34
Subject: Re: Delay locking partitions during query execution

RE: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers

Previous

Next