
From Tom Lane
Subject Proposed patch for operator lookup caching
Msg-id 8740.1196129626@sss.pgh.pa.us
List pgsql-patches
Since Simon seems intent on hacking something in there, here is a patch
that I think is actually sane for improving operator lookup speed.
This patch caches all lookups, exact or ambiguous (since even the exact
ones require multiple cache searches in common cases); and behaves sanely
in the presence of search_path, pg_operator, or pg_cast changes.

I see about a 45% speedup (2110 vs 1445 tps) on Guillaume Smet's test case.
On straight pgbench --- which has no ambiguous operators, plus it's not
read-only --- it's hard to measure any consistent speedup, but I can say
that it's not slower.  Some other test cases would be nice.

I went through the code that's being bypassed in some detail, to see what
dependencies were being skipped over.  I think that as long as we assume
that no *existing* type changes its domain base type, typtype, array
status, type category, or preferred-type status, we don't need to flush
the cache on pg_type changes.  This is a good thing since pg_type changes
frequently (eg, at temp table create or drop).

The only case that I believe to be unhandled is that the cache doesn't pay
attention to ALTER TABLE ... INHERIT / NO INHERIT events.  This means it
is theoretically possible to return the wrong operator if an operator
takes a complex type as input and the calling situation involves another
complex type whose inheritance relationship to that one changes.  That's
sufficiently far out of the normal case that I'm not very worried about it
(in fact, we probably have bugs in that area even without this patch,
since for instance cached plans don't respond to such changes either).
We could plug the hole by forcing a system-wide cache reset during ALTER
TABLE ... INHERIT / NO INHERIT, if anyone insists.

I'm not entirely happy about applying a patch like this so late in
the beta cycle, but I'd much rather do this than any of the
less-than-half-baked ideas that have been floated in the discussion
so far.

            regards, tom lane

Index: src/backend/catalog/namespace.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/namespace.c,v
retrieving revision 1.102
diff -c -r1.102 namespace.c
*** src/backend/catalog/namespace.c    25 Nov 2007 02:09:46 -0000    1.102
--- src/backend/catalog/namespace.c    27 Nov 2007 02:07:01 -0000
***************
*** 3007,3012 ****
--- 3007,3046 ----
  }

  /*
+  * Fetch the active search path into a caller-allocated array of OIDs.
+  * Returns the number of path entries.  (If this is more than sarray_len,
+  * then the data didn't fit and is not all stored.)
+  *
+  * The returned list always includes the implicitly-prepended namespaces,
+  * but never includes the temp namespace.  (This is suitable for existing
+  * users, which would want to ignore the temp namespace anyway.)  This
+  * definition allows us to not worry about initializing the temp namespace.
+  */
+ int
+ fetch_search_path_array(Oid *sarray, int sarray_len)
+ {
+     int            count = 0;
+     ListCell   *l;
+
+     recomputeNamespacePath();
+
+     foreach(l, activeSearchPath)
+     {
+         Oid            namespaceId = lfirst_oid(l);
+
+         if (namespaceId == myTempNamespace)
+             continue;            /* do not include temp namespace */
+
+         if (count < sarray_len)
+             sarray[count] = namespaceId;
+         count++;
+     }
+
+     return count;
+ }
+
+
+ /*
   * Export the FooIsVisible functions as SQL-callable functions.
   */

Index: src/backend/parser/parse_oper.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/parse_oper.c,v
retrieving revision 1.98
diff -c -r1.98 parse_oper.c
*** src/backend/parser/parse_oper.c    22 Nov 2007 19:40:25 -0000    1.98
--- src/backend/parser/parse_oper.c    27 Nov 2007 02:07:01 -0000
***************
*** 24,34 ****
--- 24,70 ----
  #include "parser/parse_oper.h"
  #include "parser/parse_type.h"
  #include "utils/builtins.h"
+ #include "utils/hsearch.h"
+ #include "utils/inval.h"
  #include "utils/lsyscache.h"
  #include "utils/syscache.h"
  #include "utils/typcache.h"


+ /*
+  * The lookup key for the operator lookaside hash table.  Unused bits must be
+  * zeroes to ensure hashing works consistently --- in particular, oprname
+  * must be zero-padded and any unused entries in search_path must be zero.
+  *
+  * search_path contains the actual search_path with which the entry was
+  * derived (minus temp namespace if any), or else the single specified
+  * schema OID if we are looking up an explicitly-qualified operator name.
+  *
+  * search_path has to be fixed-length since the hashtable code insists on
+  * fixed-size keys.  If your search path is longer than that, we just punt
+  * and don't cache anything.
+  */
+
+ /* If your search_path is longer than this, sucks to be you ... */
+ #define MAX_CACHED_PATH_LEN        16
+
+ typedef struct OprCacheKey
+ {
+     char        oprname[NAMEDATALEN];
+     Oid            left_arg;        /* Left input OID, or 0 if prefix op */
+     Oid            right_arg;        /* Right input OID, or 0 if postfix op */
+     Oid            search_path[MAX_CACHED_PATH_LEN];
+ } OprCacheKey;
+
+ typedef struct OprCacheEntry
+ {
+     /* the hash lookup key MUST BE FIRST */
+     OprCacheKey    key;
+
+     Oid            opr_oid;        /* OID of the resolved operator */
+ } OprCacheEntry;
+
+
  static Oid    binary_oper_exact(List *opname, Oid arg1, Oid arg2);
  static FuncDetailCode oper_select_candidate(int nargs,
                        Oid *input_typeids,
***************
*** 42,47 ****
--- 78,88 ----
  static Expr *make_op_expr(ParseState *pstate, Operator op,
               Node *ltree, Node *rtree,
               Oid ltypeId, Oid rtypeId);
+ static bool make_oper_cache_key(OprCacheKey *key, List *opname,
+                                 Oid ltypeId, Oid rtypeId);
+ static Oid    find_oper_cache_entry(OprCacheKey *key);
+ static void make_oper_cache_entry(OprCacheKey *key, Oid opr_oid);
+ static void InvalidateOprCacheCallBack(Datum arg, Oid relid);


  /*
***************
*** 496,505 ****
--- 537,565 ----
       bool noError, int location)
  {
      Oid            operOid;
+     OprCacheKey    key;
+     bool        key_ok;
      FuncDetailCode fdresult = FUNCDETAIL_NOTFOUND;
      HeapTuple    tup = NULL;

      /*
+      * Try to find the mapping in the lookaside cache.
+      */
+     key_ok = make_oper_cache_key(&key, opname, ltypeId, rtypeId);
+     if (key_ok)
+     {
+         operOid = find_oper_cache_entry(&key);
+         if (OidIsValid(operOid))
+         {
+             tup = SearchSysCache(OPEROID,
+                                  ObjectIdGetDatum(operOid),
+                                  0, 0, 0);
+             if (HeapTupleIsValid(tup))
+                 return (Operator) tup;
+         }
+     }
+
+     /*
       * First try for an "exact" match.
       */
      operOid = binary_oper_exact(opname, ltypeId, rtypeId);
***************
*** 537,543 ****
                               ObjectIdGetDatum(operOid),
                               0, 0, 0);

!     if (!HeapTupleIsValid(tup) && !noError)
          op_error(pstate, opname, 'b', ltypeId, rtypeId, fdresult, location);

      return (Operator) tup;
--- 597,608 ----
                               ObjectIdGetDatum(operOid),
                               0, 0, 0);

!     if (HeapTupleIsValid(tup))
!     {
!         if (key_ok)
!             make_oper_cache_entry(&key, operOid);
!     }
!     else if (!noError)
          op_error(pstate, opname, 'b', ltypeId, rtypeId, fdresult, location);

      return (Operator) tup;
***************
*** 622,631 ****
--- 687,715 ----
  right_oper(ParseState *pstate, List *op, Oid arg, bool noError, int location)
  {
      Oid            operOid;
+     OprCacheKey    key;
+     bool        key_ok;
      FuncDetailCode fdresult = FUNCDETAIL_NOTFOUND;
      HeapTuple    tup = NULL;

      /*
+      * Try to find the mapping in the lookaside cache.
+      */
+     key_ok = make_oper_cache_key(&key, op, arg, InvalidOid);
+     if (key_ok)
+     {
+         operOid = find_oper_cache_entry(&key);
+         if (OidIsValid(operOid))
+         {
+             tup = SearchSysCache(OPEROID,
+                                  ObjectIdGetDatum(operOid),
+                                  0, 0, 0);
+             if (HeapTupleIsValid(tup))
+                 return (Operator) tup;
+         }
+     }
+
+     /*
       * First try for an "exact" match.
       */
      operOid = OpernameGetOprid(op, arg, InvalidOid);
***************
*** 655,661 ****
                               ObjectIdGetDatum(operOid),
                               0, 0, 0);

!     if (!HeapTupleIsValid(tup) && !noError)
          op_error(pstate, op, 'r', arg, InvalidOid, fdresult, location);

      return (Operator) tup;
--- 739,750 ----
                               ObjectIdGetDatum(operOid),
                               0, 0, 0);

!     if (HeapTupleIsValid(tup))
!     {
!         if (key_ok)
!             make_oper_cache_entry(&key, operOid);
!     }
!     else if (!noError)
          op_error(pstate, op, 'r', arg, InvalidOid, fdresult, location);

      return (Operator) tup;
***************
*** 680,689 ****
--- 769,797 ----
  left_oper(ParseState *pstate, List *op, Oid arg, bool noError, int location)
  {
      Oid            operOid;
+     OprCacheKey    key;
+     bool        key_ok;
      FuncDetailCode fdresult = FUNCDETAIL_NOTFOUND;
      HeapTuple    tup = NULL;

      /*
+      * Try to find the mapping in the lookaside cache.
+      */
+     key_ok = make_oper_cache_key(&key, op, InvalidOid, arg);
+     if (key_ok)
+     {
+         operOid = find_oper_cache_entry(&key);
+         if (OidIsValid(operOid))
+         {
+             tup = SearchSysCache(OPEROID,
+                                  ObjectIdGetDatum(operOid),
+                                  0, 0, 0);
+             if (HeapTupleIsValid(tup))
+                 return (Operator) tup;
+         }
+     }
+
+     /*
       * First try for an "exact" match.
       */
      operOid = OpernameGetOprid(op, InvalidOid, arg);
***************
*** 725,731 ****
                               ObjectIdGetDatum(operOid),
                               0, 0, 0);

!     if (!HeapTupleIsValid(tup) && !noError)
          op_error(pstate, op, 'l', InvalidOid, arg, fdresult, location);

      return (Operator) tup;
--- 833,844 ----
                               ObjectIdGetDatum(operOid),
                               0, 0, 0);

!     if (HeapTupleIsValid(tup))
!     {
!         if (key_ok)
!             make_oper_cache_entry(&key, operOid);
!     }
!     else if (!noError)
          op_error(pstate, op, 'l', InvalidOid, arg, fdresult, location);

      return (Operator) tup;
***************
*** 1018,1020 ****
--- 1131,1290 ----

      return (Expr *) result;
  }
+
+
+ /*
+  * Lookaside cache to speed operator lookup.  Possibly this should be in
+  * a separate module under utils/cache/ ?
+  *
+  * The idea here is that the mapping from operator name and given argument
+  * types is constant for a given search path (or single specified schema OID)
+  * so long as the contents of pg_operator and pg_cast don't change.  And that
+  * mapping is pretty expensive to compute, especially for ambiguous operators;
+  * this is mainly because there are a *lot* of instances of popular operator
+  * names such as "=", and we have to check each one to see which is the
+  * best match.  So once we have identified the correct mapping, we save it
+  * in a cache that need only be flushed on pg_operator or pg_cast change.
+  * (pg_cast must be considered because changes in the set of implicit casts
+  * affect the set of applicable operators for any given input datatype.)
+  *
+  * XXX in principle, ALTER TABLE ... INHERIT could affect the mapping as
+  * well, but we disregard that since there's no convenient way to find out
+  * about it, and it seems a pretty far-fetched corner-case anyway.
+  *
+  * Note: at some point it might be worth doing a similar cache for function
+  * lookups.  However, the potential gain is a lot less since (a) function
+  * names are generally not overloaded as heavily as operator names, and
+  * (b) we'd have to flush on pg_proc updates, which are probably a good
+  * deal more common than pg_operator updates.
+  */
+
+ /* The operator cache hashtable */
+ static HTAB *OprCacheHash = NULL;
+
+
+ /*
+  * make_oper_cache_key
+  *        Fill the lookup key struct given operator name and arg types.
+  *
+  * Returns TRUE if successful, FALSE if the search_path overflowed
+  * (hence no caching is possible).
+  */
+ static bool
+ make_oper_cache_key(OprCacheKey *key, List *opname, Oid ltypeId, Oid rtypeId)
+ {
+     char       *schemaname;
+     char       *opername;
+
+     /* deconstruct the name list */
+     DeconstructQualifiedName(opname, &schemaname, &opername);
+
+     /* ensure zero-fill for stable hashing */
+     MemSet(key, 0, sizeof(OprCacheKey));
+
+     /* save operator name and input types into key */
+     strlcpy(key->oprname, opername, NAMEDATALEN);
+     key->left_arg = ltypeId;
+     key->right_arg = rtypeId;
+
+     if (schemaname)
+     {
+         /* search only in exact schema given */
+         key->search_path[0] = LookupExplicitNamespace(schemaname);
+     }
+     else
+     {
+         /* get the active search path */
+         if (fetch_search_path_array(key->search_path,
+                                     MAX_CACHED_PATH_LEN) > MAX_CACHED_PATH_LEN)
+             return false;        /* oops, didn't fit */
+     }
+
+     return true;
+ }
+
+ /*
+  * find_oper_cache_entry
+  *
+  * Look for a cache entry matching the given key.  If found, return the
+  * contained operator OID, else return InvalidOid.
+  */
+ static Oid
+ find_oper_cache_entry(OprCacheKey *key)
+ {
+     OprCacheEntry *oprentry;
+
+     if (OprCacheHash == NULL)
+     {
+         /* First time through: initialize the hash table */
+         HASHCTL        ctl;
+
+         if (!CacheMemoryContext)
+             CreateCacheMemoryContext();
+
+         MemSet(&ctl, 0, sizeof(ctl));
+         ctl.keysize = sizeof(OprCacheKey);
+         ctl.entrysize = sizeof(OprCacheEntry);
+         ctl.hash = tag_hash;
+         OprCacheHash = hash_create("Operator lookup cache", 256,
+                                     &ctl, HASH_ELEM | HASH_FUNCTION);
+
+         /* Arrange to flush cache on pg_operator and pg_cast changes */
+         CacheRegisterSyscacheCallback(OPERNAMENSP,
+                                       InvalidateOprCacheCallBack,
+                                       (Datum) 0);
+         CacheRegisterSyscacheCallback(CASTSOURCETARGET,
+                                       InvalidateOprCacheCallBack,
+                                       (Datum) 0);
+     }
+
+     /* Look for an existing entry */
+     oprentry = (OprCacheEntry *) hash_search(OprCacheHash,
+                                              (void *) key,
+                                              HASH_FIND, NULL);
+     if (oprentry == NULL)
+         return InvalidOid;
+
+     return oprentry->opr_oid;
+ }
+
+ /*
+  * make_oper_cache_entry
+  *
+  * Insert a cache entry for the given key.
+  */
+ static void
+ make_oper_cache_entry(OprCacheKey *key, Oid opr_oid)
+ {
+     OprCacheEntry *oprentry;
+
+     Assert(OprCacheHash != NULL);
+
+     oprentry = (OprCacheEntry *) hash_search(OprCacheHash,
+                                              (void *) key,
+                                              HASH_ENTER, NULL);
+     oprentry->opr_oid = opr_oid;
+ }
+
+ /*
+  * Callback for pg_operator and pg_cast inval events
+  */
+ static void
+ InvalidateOprCacheCallBack(Datum arg, Oid relid)
+ {
+     HASH_SEQ_STATUS status;
+     OprCacheEntry *hentry;
+
+     Assert(OprCacheHash != NULL);
+
+     /* Currently we just flush all entries; hard to be smarter ... */
+     hash_seq_init(&status, OprCacheHash);
+
+     while ((hentry = (OprCacheEntry *) hash_seq_search(&status)) != NULL)
+     {
+         if (hash_search(OprCacheHash,
+                         (void *) &hentry->key,
+                         HASH_REMOVE, NULL) == NULL)
+             elog(ERROR, "hash table corrupted");
+     }
+ }
Index: src/include/catalog/namespace.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/catalog/namespace.h,v
retrieving revision 1.51
diff -c -r1.51 namespace.h
*** src/include/catalog/namespace.h    15 Nov 2007 22:25:17 -0000    1.51
--- src/include/catalog/namespace.h    27 Nov 2007 02:07:01 -0000
***************
*** 115,119 ****
--- 115,120 ----
  extern char *namespace_search_path;

  extern List *fetch_search_path(bool includeImplicit);
+ extern int    fetch_search_path_array(Oid *sarray, int sarray_len);

  #endif   /* NAMESPACE_H */
