Re: type cache cleanup improvements - Mailing list pgsql-hackers

From Pavel Borisov
Subject Re: type cache cleanup improvements
Date
Msg-id CALT9ZEFHxez3Y61nsV4tNK+=HdBqj4OeZQk0UOwecGWqFWnjbQ@mail.gmail.com
Whole thread Raw
In response to Re: type cache cleanup improvements  (Teodor Sigaev <teodor@sigaev.ru>)
List pgsql-hackers
Hi, Alexander!

On Tue, 20 Aug 2024 at 23:01, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, Aug 5, 2024 at 4:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I've revised the patchset.  First of all, I've re-ordered the patches.
>
> 0001-0002 (former 0002-0003)
> Comprises hash_search_with_hash_value() function and its application
> to avoid full hash iteration in InvalidateAttoptCacheCallback() and
> TypeCacheTypCallback().  I think this is quite straightforward
> optimization without negative side effects.  I've revised comments,
> commit message and did some code beautification.  I'm going to push
> this if no objections.
>
> 0003 (former 0001)
> I've revised this patch.  I think main concerns expressed in the
> thread about this path is that we don't have invalidation mechanism
> for relid => typid map.  Finally due to oid wraparound same relids
> could get reused.  That could lead to invalid entries in the map about
> existing relids and typeids.  This is rather messy, but I don't think
> this could cause a material bug.  The maps items are used only for
> cache invalidation.  Extra invalidation doesn't cause a bug.  If type
> with same relid will be cached, then correspoding map item will be
> overridden, so no missing invalidation.  However, I see the following
> reasons for keeping consistent state of relid => typid map.
>
> 1) As the main use-case for this optimization is flood of temporary
> tables, it would be nice not let relid => typid map bloat in this
> case.  I see that TypeCacheHash would get bloated, because its entries
> are never deleted.  However, I would prefer to not get this situation
> even worse.
> 2) In future we may find some more use-cases for relid => typid map
> besides cache invalidation.  Keeping that in consistent state could be
> advantage then.
>
> In the attached patch, I'm keeping relid => typid map when
> corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or
> TCFLAGS_OPERATOR_FLAGS, or tupdesc.  Thus, when temporary table gets
> deleted, we would invalidate the map item.
>
> It will be also nice to get rid of iteration over all the cached
> domain types in TypeCacheRelCallback().  However, this typically
> shouldn't be a problem since domain types are less tended to bloat.
> Domain types are created manually, unlike composite types which are
> automatically created for every temporary table.  We will probably
> need to optimize this in future, but I don't feel this to be necessary
> in present patch.
>
> I think the revised 0003 requires review.

The rebased remaining patch is attached.
I've looked at patch v8.

1.
In function check_insert_rel_type_cache() the block:

+#ifdef USE_ASSERT_CHECKING
+
+       /*
+        * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+        * entry if it should exist.
+        */
+       if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+               typentry->tupDesc == NULL)
+       {
+               bool            found;
+
+               (void) hash_search(RelIdToTypeIdCacheHash,
+                                                  &typentry->typrelid,
+                                                  HASH_FIND, &found);
+               Assert(found);
+       }
+#endif

As I understand it does HASH_FIND after the same value just inserted by HASH_ENT
ER above under the same if condition:

if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+               typentry->tupDesc == NULL) 

Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in comment in a quoted block, but if condition is the same.
 
2.
For function check_delete_rel_type_cache():
I'd modify the block:
+#ifdef USE_ASSERT_CHECKING
+
+       /*
+        * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+        * entry if it should exist.
+        */
+       if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+               (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+               typentry->tupDesc != NULL)
+       {
+               bool            found;
+
+               (void) hash_search(RelIdToTypeIdCacheHash,
+                                                  &typentry->typrelid,
+                                                  HASH_FIND, &found);
+               Assert(found);
+       }
+#endif

as:
+
+       /*
+        * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+        * entry if it should exist.
+        */
+ else
+ #ifdef USE_ASSERT_CHECKING
+               bool            found;
+
+               (void) hash_search(RelIdToTypeIdCacheHash,
+                                                  &typentry->typrelid,
+                                                  HASH_FIND, &found);
+               Assert(found);
+#endif
+}

3. I think check_delete_rel_type_cache and check_insert_rel_type_cache are better to be renamed to be more clear, though I don't have exact proposals yet,
4. I haven't looked into comments, though I'd recommend oid -> OID replacement in the comments.

Thank you for working on this patchset!

Regards,
Pavel Borisov
Supabase

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Requiring LLVM 14+ in PostgreSQL 18
Next
From: Peter Eisentraut
Date:
Subject: Re: Cirrus CI for macOS branches 16 and 15 broken