RE: Protect syscache from bloating with negative cache entries - Mailing list pgsql-hackers
From: Tsunakawa, Takayuki
Subject: RE: Protect syscache from bloating with negative cache entries
Msg-id: 0A3221C70F24FB45833433255569204D1FB9D796@G01JPEXMBYT05
In response to: RE: Protect syscache from bloating with negative cache entries ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
List: pgsql-hackers
Hi Horiguchi-san,

This is the rest of my review comments.

(5) patch 0003

 	CatcacheClockTimeoutPending = 0;
+
+	/* Update timetamp then set up the next timeout */
+

false is better than 0, to follow other *Pending variables.
timetamp -> timestamp

(6) patch 0003

GetCatCacheClock() is not used now. Why don't we add it when the need arises?

(7) patch 0003

Why don't we remove the catcache timer (Setup/UpdateCatCacheClockTimer), unless we need it by all means? That simplifies the code. Long-running queries can be thought of as follows:

* A single lengthy SQL statement, e.g. SELECT for reporting/analytics, COPY for data loading, and UPDATE/DELETE for batch processing, should only require a small number of catalog entries during its query analysis/planning. It won't suffer from cache eviction during query execution.

* We do not have to evict cache entries while executing a long-running stored procedure, because its constituent SQL statements may access the same tables. If the stored procedure accesses so many tables that you are worried about catcache memory overuse, then catalog_cache_max_size can be used. Another natural idea would be to update the cache clock when SPI executes each SQL statement.

(8) patch 0003

+	uint64		base_size;
+	uint64		base_size = MemoryContextGetConsumption(CacheMemoryContext);

This may as well be Size, not uint64.

(9) patch 0003

@@ -1940,7 +2208,7 @@ CatCacheFreeKeys(TupleDesc tupdesc, int nkeys, int *attnos, Datum *keys)
 /*
  * Helper routine that copies the keys in the srckeys array into the dstkeys
  * one, guaranteeing that the datums are fully allocated in the current memory
- * context.
+ * context.  Returns allocated memory size.
  */
 static void
 CatCacheCopyKeys(TupleDesc tupdesc, int nkeys, int *attnos,
@@ -1976,7 +2244,6 @@ CatCacheCopyKeys(TupleDesc tupdesc, int nkeys, int *attnos,
 							 att->attbyval, att->attlen);
 	}
-
 }

This change seems to be no longer necessary thanks to the memory accounting.
(10) patch 0004

How about separating this into another thread, so that the rest of the patch set becomes easier to review and commit?

Regarding the design, I'm inclined to avoid having each backend write the file. To simplify the code, I think we can take advantage of the fortunate situation -- the number of backends and catcaches is fixed at server startup. My rough sketch is:

* Allocate an array of statistics entries in shared memory, whose element is (pid or backend id, catcache id or name, hits, misses, ...). The number of array elements is MaxBackends * number of catcaches (some dozens).
* Each backend updates its own entry in shared memory during query execution.
* The stats collector periodically scans the array and writes it to the stats file.

(11) patch 0005

+dlist_head	cc_lru_list = {0};
+Size		global_size = 0;

It is better to put these in CatCacheHeader. That way, backends that do not access the catcache (archiver, stats collector, etc.) do not have to waste memory on these global variables.

(12) patch 0005

+	else if (catalog_cache_max_size > 0 &&
+			 global_size > catalog_cache_max_size * 1024)
 		CatCacheCleanupOldEntries(cache);

On the second line, catalog_cache_max_size should be cast to Size to avoid overflow.

(13) patch 0005

+			gettext_noop("Sets the maximum size of catcache in kilobytes."),

catcache -> catalog cache

(14) patch 0005

+	CatCache   *owner;			/* owner catcache */

CatCTup already has a my_cache member.

(15) patch 0005

	if (nremoved > 0)
		elog(DEBUG1, "pruning catalog cache id=%d for %s: removed %d / %d",
			 cp->id, cp->cc_relname, nremoved, nelems_before);

In the prune-by-size case, this elog doesn't log very meaningful data. How about dividing this function into two, one for prune-by-age and another for prune-by-size? I suppose that would make the functions easier to understand.

Regards
Takayuki Tsunakawa