Re: SearchCatCacheList()/SearchSysCacheList() is O(n) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: SearchCatCacheList()/SearchSysCacheList() is O(n)
Date
Msg-id 20210512213743.fbvwt23vl3qqokmw@alap3.anarazel.de
Whole thread Raw
In response to Re: SearchCatCacheList()/SearchSysCacheList() is O(n)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: SearchCatCacheList()/SearchSysCacheList() is O(n)
List pgsql-hackers
Hi,

On 2021-05-12 17:26:28 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > The problem is that SearchCatCacheList() is not actually a hash table -
> > there are no buckets, in contrast to SearchCatCacheList().
>
> Uh, what did you mean to compare to there?

Oops, copy-and-paste failure. I was trying to reference
SearchCatCacheInternal().


> > Tom, any chance you remember if this was an oversight, or whether you
> > just considered this to be OK, given the likely numbers of objects?
>
> I'm pretty sure I didn't think the lists would get large enough to be
> a problem.  I'm not quite sure which list is a problem here, actually,
> seeing that the functions all have distinct names.

It's not an individual "result" list that's the issue. In my example
they're all exactly one element long. The problem is that CatCache->list
has one element for each cached SearchCatCacheList() result, and that
for every SearchCatCacheList() we linearly search through CatCache->list
to find a match.

This is the profile:

-   88.21%  postgres  postgres            [.] SearchCatCacheList
   - 88.21% SearchCatCacheList
      - 88.21% SearchSysCacheList
           FuncnameGetCandidates
           func_get_detail
           ParseFuncOrColumn
           transformFuncCall
+    0.65%  postgres  postgres            [.] AllocSetAlloc

IOW, a single SearchCatCacheInternal() is O(N). With a fairly small
constant, but obviously that's still not great for a cache.


> I have a vague recollection that some callers depend on the lists
> being ordered by key.  I'm also fairly sure that most callers need
> to look at all entries of whichever list they've requested.  So
> I have doubts that "use a hash table" is really going to be a
> productive idea.

I was only thinking of a hashtable to identify the relvant CatCList, not
within a CatCList or anything. The cache search code between
SearchCatCacheInternal() and SearchCatCacheList() is pretty similar, but
crucially the former uses buckets, the latter doesn't:

static inline HeapTuple
SearchCatCacheInternal(CatCache *cache,
                       int nkeys,
                       Datum v1,
                       Datum v2,
                       Datum v3,
                       Datum v4)
...

    /*
     * find the hash bucket in which to look for the tuple
     */
    hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
    hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

    /*
     * scan the hash bucket until we find a match or exhaust our tuples
     *
     * Note: it's okay to use dlist_foreach here, even though we modify the
     * dlist within the loop, because we don't continue the loop afterwards.
     */
    bucket = &cache->cc_bucket[hashIndex];
    dlist_foreach(iter, bucket)
        {
...

SearchCatCacheList(CatCache *cache,
                   int nkeys,
                   Datum v1,
                   Datum v2,
                   Datum v3)
...
    /*
     * compute a hash value of the given keys for faster search.  We don't
     * presently divide the CatCList items into buckets, but this still lets
     * us skip non-matching items quickly most of the time.
     */
    lHashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);

    /*
     * scan the items until we find a match or exhaust our list
     *
     * Note: it's okay to use dlist_foreach here, even though we modify the
     * dlist within the loop, because we don't continue the loop afterwards.
     */
    dlist_foreach(iter, &cache->cc_lists)
    {



Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: SearchCatCacheList()/SearchSysCacheList() is O(n)
Next
From: Tom Lane
Date:
Subject: Re: SearchCatCacheList()/SearchSysCacheList() is O(n)