Re: [COMMITTERS] pgsql: Fix an O(N^2) problem in foreign key references. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [COMMITTERS] pgsql: Fix an O(N^2) problem in foreign key references.
Date
Msg-id 16223.1443204470@sss.pgh.pa.us
Whole thread Raw
In response to Re: [COMMITTERS] pgsql: Fix an O(N^2) problem in foreign key references.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Would it make sense to remove the only the few oldest entries, instead
>> of all of them?  As is, I think this causes a storm of reloads every
>> once in a while, if the number of FKs in the system is large enough.
>> Maybe on a cache hit we could push the entry to the head of the list,
>> and then remove N entries from the back of the list when the threshold
>> is reached.

> (FWIW, it might take less code to put the recently-used entries at the
> back of the list.  Then the loop in InvalidateConstraintCacheCallBack
> could just invalidate/delete entries if either they're targets, or
> the current list length exceeds the threshold.)

Specifically, we could change it as per attached.  But this adds some
cycles to the mainline usage of fetching a valid cache entry, so I'd
want to see some evidence that there are use-cases it helps before
applying it.

            regards, tom lane

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 018cb99..54376ed 100644
*** a/src/backend/utils/adt/ri_triggers.c
--- b/src/backend/utils/adt/ri_triggers.c
*************** ri_LoadConstraintInfo(Oid constraintOid)
*** 2814,2820 ****
          ri_InitHashTables();

      /*
!      * Find or create a hash entry.  If we find a valid one, just return it.
       */
      riinfo = (RI_ConstraintInfo *) hash_search(ri_constraint_cache,
                                                 (void *) &constraintOid,
--- 2814,2821 ----
          ri_InitHashTables();

      /*
!      * Find or create a hash entry.  If we find a valid one, just return it,
!      * after pushing it to the end of the list to mark it recently used.
       */
      riinfo = (RI_ConstraintInfo *) hash_search(ri_constraint_cache,
                                                 (void *) &constraintOid,
*************** ri_LoadConstraintInfo(Oid constraintOid)
*** 2822,2828 ****
--- 2823,2833 ----
      if (!found)
          riinfo->valid = false;
      else if (riinfo->valid)
+     {
+         dlist_delete(&riinfo->valid_link);
+         dlist_push_tail(&ri_constraint_cache_valid_list, &riinfo->valid_link);
          return riinfo;
+     }

      /*
       * Fetch the pg_constraint row so we can fill in the entry.
*************** ri_LoadConstraintInfo(Oid constraintOid)
*** 2931,2936 ****
--- 2936,2942 ----
      /*
       * For efficient processing of invalidation messages below, we keep a
       * doubly-linked list, and a count, of all currently valid entries.
+      * The most recently used entries are at the *end* of the list.
       */
      dlist_push_tail(&ri_constraint_cache_valid_list, &riinfo->valid_link);
      ri_constraint_cache_valid_count++;
*************** ri_LoadConstraintInfo(Oid constraintOid)
*** 2948,2953 ****
--- 2954,2965 ----
   * Invalidate any ri_constraint_cache entry associated with the syscache
   * entry with the specified hash value, or all entries if hashvalue == 0.
   *
+  * We also invalidate entries if there are too many valid entries.  This rule
+  * avoids O(N^2) behavior in situations where a session touches many foreign
+  * keys and also does many ALTER TABLEs, such as a restore from pg_dump.  When
+  * we do kill entries for that reason, they'll be the least recently used
+  * ones, since they're at the front of the list.
+  *
   * Note: at the time a cache invalidation message is processed there may be
   * active references to the cache.  Because of this we never remove entries
   * from the cache, but only mark them invalid, which is harmless to active
*************** InvalidateConstraintCacheCallBack(Datum
*** 2961,2981 ****

      Assert(ri_constraint_cache != NULL);

-     /*
-      * If the list of currently valid entries gets excessively large, we mark
-      * them all invalid so we can empty the list.  This arrangement avoids
-      * O(N^2) behavior in situations where a session touches many foreign keys
-      * and also does many ALTER TABLEs, such as a restore from pg_dump.
-      */
-     if (ri_constraint_cache_valid_count > 1000)
-         hashvalue = 0;            /* pretend it's a cache reset */
-
      dlist_foreach_modify(iter, &ri_constraint_cache_valid_list)
      {
          RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
                                                      valid_link, iter.cur);

!         if (hashvalue == 0 || riinfo->oidHashValue == hashvalue)
          {
              riinfo->valid = false;
              /* Remove invalidated entries from the list, too */
--- 2973,2985 ----

      Assert(ri_constraint_cache != NULL);

      dlist_foreach_modify(iter, &ri_constraint_cache_valid_list)
      {
          RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
                                                      valid_link, iter.cur);

!         if (hashvalue == 0 || riinfo->oidHashValue == hashvalue ||
!             ri_constraint_cache_valid_count > 1000)
          {
              riinfo->valid = false;
              /* Remove invalidated entries from the list, too */

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [COMMITTERS] pgsql: Fix an O(N^2) problem in foreign key references.
Next
From: Stefan Kaltenbrunner
Date:
Subject: upcoming infrastructure changes/OS upgrades on *.postgresql.org