Thread: type cache cleanup improvements

type cache cleanup improvements

From

Teodor Sigaev

Date:

29 February 2024, 10:26:07

Hi!

I'd like to suggest two independent patches to improve performance of type cache 
cleanup. I found a case where type cache cleanup was a reason for low 
performance. In short, customer makes 100 thousand temporary tables in one 
transaction.

1 mapRelType.patch
   It just adds a local map between relation and its type as it was suggested in 
comment above TypeCacheRelCallback(). Unfortunately, using syscache here was 
impossible because this call back could be called outside transaction and it 
makes impossible catalog lookups.

2 hash_seq_init_with_hash_value.patch
  TypeCacheTypCallback() loop over type hash  to find entry with given hash 
value. Here there are two problems: 1) there isn't  interface to dynahash to 
search entry with given hash value and 2) hash value calculation algorithm is 
differ from system cache. But coming hashvalue is came from system cache. Patch 
is addressed to both issues. It suggests hash_seq_init_with_hash_value() call 
which inits hash sequential scan over the single bucket which could contain 
entry with given hash value, and hash_seq_search() will iterate only over such 
entries. Anf patch changes hash algorithm to match syscache. Actually, patch 
makes small refactoring of dynahash, it makes common function hash_do_lookup() 
which does initial lookup in hash.

Some artificial performance test is in attachment, command to test is 'time psql 
< custom_types_and_array.sql', here I show only last rollback time and total 
execution time:
1) master 92d2ab7554f92b841ea71bcc72eaa8ab11aae662
Time: 33353,288 ms (00:33,353)
psql < custom_types_and_array.sql  0,82s user 0,71s system 1% cpu 1:28,36 total

2) mapRelType.patch
Time: 7455,581 ms (00:07,456)
psql < custom_types_and_array.sql  1,39s user 1,19s system 6% cpu 41,220 total

3) hash_seq_init_with_hash_value.patch
Time: 24975,886 ms (00:24,976)
psql < custom_types_and_array.sql  1,33s user 1,25s system 3% cpu 1:19,77 total

4) both
Time: 89,446 ms
psql < custom_types_and_array.sql  0,72s user 0,52s system 10% cpu 12,137 total


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

Re: type cache cleanup improvements

From

Aleksander Alekseev

Date:

04 March 2024, 13:01:53

Hi Teodor,

> I'd like to suggest two independent patches to improve performance of type cache
> cleanup. I found a case where type cache cleanup was a reason for low
> performance. In short, customer makes 100 thousand temporary tables in one
> transaction.
>
> 1 mapRelType.patch
>    It just adds a local map between relation and its type as it was suggested in
> comment above TypeCacheRelCallback(). Unfortunately, using syscache here was
> impossible because this call back could be called outside transaction and it
> makes impossible catalog lookups.
>
> 2 hash_seq_init_with_hash_value.patch
>   TypeCacheTypCallback() loop over type hash  to find entry with given hash
> value. Here there are two problems: 1) there isn't  interface to dynahash to
> search entry with given hash value and 2) hash value calculation algorithm is
> differ from system cache. But coming hashvalue is came from system cache. Patch
> is addressed to both issues. It suggests hash_seq_init_with_hash_value() call
> which inits hash sequential scan over the single bucket which could contain
> entry with given hash value, and hash_seq_search() will iterate only over such
> entries. Anf patch changes hash algorithm to match syscache. Actually, patch
> makes small refactoring of dynahash, it makes common function hash_do_lookup()
> which does initial lookup in hash.
>
> Some artificial performance test is in attachment, command to test is 'time psql
> < custom_types_and_array.sql', here I show only last rollback time and total
> execution time:
> 1) master 92d2ab7554f92b841ea71bcc72eaa8ab11aae662
> Time: 33353,288 ms (00:33,353)
> psql < custom_types_and_array.sql  0,82s user 0,71s system 1% cpu 1:28,36 total
>
> 2) mapRelType.patch
> Time: 7455,581 ms (00:07,456)
> psql < custom_types_and_array.sql  1,39s user 1,19s system 6% cpu 41,220 total
>
> 3) hash_seq_init_with_hash_value.patch
> Time: 24975,886 ms (00:24,976)
> psql < custom_types_and_array.sql  1,33s user 1,25s system 3% cpu 1:19,77 total
>
> 4) both
> Time: 89,446 ms
> psql < custom_types_and_array.sql  0,72s user 0,52s system 10% cpu 12,137 total

These changes look very promising. Unfortunately the proposed patches
conflict with each other regardless the order of applying:

```
error: patch failed: src/backend/utils/cache/typcache.c:356
error: src/backend/utils/cache/typcache.c: patch does not apply
```

So it's difficult to confirm case 4, not to mention the fact that we
are unable to test the patches on cfbot.

Could you please rebase the patches against the recent master branch
(in any order) and submit the result of `git format-patch` ?

-- 
Best regards,
Aleksander Alekseev

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

05 March 2024, 09:51:37

Hi!

Thank you for interesting in it!

> These changes look very promising. Unfortunately the proposed patches
> conflict with each other regardless the order of applying:
> 
> ```
> error: patch failed: src/backend/utils/cache/typcache.c:356
> error: src/backend/utils/cache/typcache.c: patch does not apply
> ```
Try increase -F option of patch.

Anyway, union of both patches in attachment


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

mapRelType_AND_hash_seq_init_with_hash_value.patch

Re: type cache cleanup improvements

From

Aleksander Alekseev

Date:

05 March 2024, 12:31:45

Hi,

> Thank you for interesting in it!
>
> > These changes look very promising. Unfortunately the proposed patches
> > conflict with each other regardless the order of applying:
> >
> > ```
> > error: patch failed: src/backend/utils/cache/typcache.c:356
> > error: src/backend/utils/cache/typcache.c: patch does not apply
> > ```
> Try increase -F option of patch.
>
> Anyway, union of both patches in attachment

Thanks for the quick update.

I tested the patch on an Intel MacBook. A release build was used with
my typical configuration, TWIMC see single-install-meson.sh [1]. The
speedup I got on the provided benchmark is about 150 times. cfbot
seems to be happy with the patch.

I would like to tweak the patch a little bit - change some comments,
add some Asserts, etc. Don't you mind?

[1]: https://github.com/afiskon/pgscripts/

-- 
Best regards,
Aleksander Alekseev

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

05 March 2024, 14:16:30


> I would like to tweak the patch a little bit - change some comments,
> add some Asserts, etc. Don't you mind?
You are welcome!
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: type cache cleanup improvements

From

Aleksander Alekseev

Date:

08 March 2024, 15:31:45

Hi,

> > I would like to tweak the patch a little bit - change some comments,
> > add some Asserts, etc. Don't you mind?
> You are welcome!

Thanks. PFA the updated patch with some tweaks by me. I added the
commit message as well.

One thing that I couldn't immediately figure out is why 0 hash value
is treated as a magic invalid value in TypeCacheTypCallback():

```
-       hash_seq_init(&status, TypeCacheHash);
+       if (hashvalue == 0)
+               hash_seq_init(&status, TypeCacheHash);
+       else
+               hash_seq_init_with_hash_value(&status, TypeCacheHash,
hashvalue);
```

Is there anything that prevents the actual hash value from being zero?
I don't think so, but maybe I missed something.

If zero is indeed an invalid hash value I would like to reference the
corresponding code. If zero is a correct hash value we should either
change this by adding something like `if(!hash) hash++` or use an
additional boolean argument here.

-- 
Best regards,
Aleksander Alekseev

Attachment

v3-0001-Improve-performance-of-type-cache-cleanup.patch

Re: type cache cleanup improvements

From

Tom Lane

Date:

08 March 2024, 15:35:39

Aleksander Alekseev <aleksander@timescale.com> writes:
> One thing that I couldn't immediately figure out is why 0 hash value
> is treated as a magic invalid value in TypeCacheTypCallback():

I've not read this patch, but IIRC in some places we have a convention
that hash value zero is passed for an sinval reset event (that is,
"flush all cache entries").

            regards, tom lane

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

08 March 2024, 20:27:07

Yep, exacly. One time from 2^32 we reset whole cache instead of one (or several) 
entry with hash value = 0.

On 08.03.2024 18:35, Tom Lane wrote:
> Aleksander Alekseev <aleksander@timescale.com> writes:
>> One thing that I couldn't immediately figure out is why 0 hash value
>> is treated as a magic invalid value in TypeCacheTypCallback():
> 
> I've not read this patch, but IIRC in some places we have a convention
> that hash value zero is passed for an sinval reset event (that is,
> "flush all cache entries").
> 
>             regards, tom lane
> 
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: type cache cleanup improvements

From

Aleksander Alekseev

Date:

11 March 2024, 12:24:57

Hi,

> Yep, exacly. One time from 2^32 we reset whole cache instead of one (or several)
> entry with hash value = 0.

Got it. Here is an updated patch where I added a corresponding comment.

Now the patch LGTM. I'm going to change its status to RfC unless
anyone wants to review it too.

-- 
Best regards,
Aleksander Alekseev

Attachment

v4-0001-Improve-performance-of-type-cache-cleanup.patch

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

12 March 2024, 15:55:41

> Got it. Here is an updated patch where I added a corresponding comment.
Thank you!

Playing around I found one more place which could easily modified with 
hash_seq_init_with_hash_value() call.
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

attoptcache-v1.patch

Re: type cache cleanup improvements

From

Michael Paquier

Date:

13 March 2024, 05:47:55

On Tue, Mar 12, 2024 at 06:55:41PM +0300, Teodor Sigaev wrote:
> Playing around I found one more place which could easily modified with
> hash_seq_init_with_hash_value() call.

I think that this patch should be split for clarity, as there are a
few things that are independently useful.  I guess something like
that:
- Introduction of hash_initial_lookup(), that simplifies 3 places of
dynahash.c where the same code is used.  The routine should be
inlined.
- The split in hash_seq_search to force a different type of search is
weird, complicating the dynahash interface by hiding what seems like a
search mode.  Rather than hasHashvalue that's hidden in the middle of
HASH_SEQ_STATUS, could it be better to have an entirely different API
for the search?  That should be a patch on its own, as well.
- The typcache changes.
--
Michael

Attachment

signature.asc

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

13 March 2024, 13:40:38

> I think that this patch should be split for clarity, as there are a
> few things that are independently useful.  I guess something like
> that:
Done, all patches should be applied consequentially.

 > - The typcache changes.
01-map_rel_to_type.v5.patch adds map relation to its type

> - Introduction of hash_initial_lookup(), that simplifies 3 places of
> dynahash.c where the same code is used.  The routine should be
> inlined.
> - The split in hash_seq_search to force a different type of search is
> weird, complicating the dynahash interface by hiding what seems like a
> search mode.  Rather than hasHashvalue that's hidden in the middle of
> HASH_SEQ_STATUS, could it be better to have an entirely different API
> for the search?  That should be a patch on its own, as well.

02-hash_seq_init_with_hash_value.v5.patch - introduces a 
hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as 
inline, but I suppose, modern compilers are smart enough to inline it automatically.

Using separate interface for scanning hash with hash value will make scan code 
more ugly in case when we need to use special value of hash value as it is done 
in cache's scans. Look, instead of this simplified code:
    if (hashvalue == 0)
        hash_seq_init(&status, TypeCacheHash);
    else
        hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
    while ((typentry = hash_seq_search(&status)) != NULL) {
       ...
    }
we will need to code something like that:
    if (hashvalue == 0)
    {
        hash_seq_init(&status, TypeCacheHash);

        while ((typentry = hash_seq_search(&status)) != NULL) {
               ...
        }
    }
    else
    {
        hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
        while ((typentry = hash_seq_search_with_hash_value(&status)) != NULL) {
               ...
        }
    }
Or I didn't understand you.

I thought about integrate check inside existing loop in  hash_seq_search() :
+ rerun:
     if ((curElem = status->curEntry) != NULL)
     {
         /* Continuing scan of curBucket... */
         status->curEntry = curElem->link;
         if (status->curEntry == NULL)   /* end of this bucket */
+    {
+        if (status->hasHashvalue)
+        hash_seq_term(status);        
    
             ++status->curBucket;
+    }
+    else if (status->hasHashvalue && status->hashvalue !=
+                curElem->hashvalue)
+        goto rerun;
         return (void *) ELEMENTKEY(curElem);
     }

But for me it looks weird and adds some checks which will takes some CPU time.


03-att_with_hash_value.v5.patch - adds usage of previous patch.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

Re: type cache cleanup improvements

From

Michael Paquier

Date:

14 March 2024, 07:42:10

On Wed, Mar 13, 2024 at 04:40:38PM +0300, Teodor Sigaev wrote:
> Done, all patches should be applied consequentially.

One thing that first pops out to me is that we can do the refactor of
hash_initial_lookup() as an independent piece, without the extra paths
introduced.  But rather than returning the bucket hash and have the
bucket number as an in/out argument of hash_initial_lookup(), there is
an argument for reversing them: hash_search_with_hash_value() does not
care about the bucket number.

> 02-hash_seq_init_with_hash_value.v5.patch - introduces a
> hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as
> inline, but I suppose, modern compilers are smart enough to inline it
> automatically.

Likely so, though that does not hurt to show the intention to the
reader.

So I would like to suggest the attached patch for this first piece.
What do you think?

It may also be an idea to use `git format-patch` when generating a
series of patches.  That makes for easier reviews.
--
Michael

Attachment

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

14 March 2024, 13:27:43

> One thing that first pops out to me is that we can do the refactor of
> hash_initial_lookup() as an independent piece, without the extra paths
> introduced.  But rather than returning the bucket hash and have the
> bucket number as an in/out argument of hash_initial_lookup(), there is
> an argument for reversing them: hash_search_with_hash_value() does not
> care about the bucket number.
Ok, no problem

> 
>> 02-hash_seq_init_with_hash_value.v5.patch - introduces a
>> hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as
>> inline, but I suppose, modern compilers are smart enough to inline it
>> automatically.
> 
> Likely so, though that does not hurt to show the intention to the
> reader.
Agree

> 
> So I would like to suggest the attached patch for this first piece.
> What do you think?
I have not any objections

> 
> It may also be an idea to use `git format-patch` when generating a
> series of patches.  That makes for easier reviews.
Thanks, will try

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: type cache cleanup improvements

From

Michael Paquier

Date:

14 March 2024, 22:58:24

On Thu, Mar 14, 2024 at 04:27:43PM +0300, Teodor Sigaev wrote:
>> So I would like to suggest the attached patch for this first piece.
>> What do you think?
>
> I have not any objections

Okay, I've applied this piece for now.  Not sure I'll have much room
to look at the rest.
--
Michael

Attachment

signature.asc

Re: type cache cleanup improvements

From

Teodor Sigaev

Date:

15 March 2024, 10:57:13

> Okay, I've applied this piece for now.  Not sure I'll have much room
> to look at the rest.

Thank you very much!

Rest of patches, rebased.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

Re: type cache cleanup improvements

From

Жарков Роман

Date:

29 March 2024, 04:49:55

> Rest of patches, rebased.

Hello,
I read and tested only the first patch so far. Creation of temp
tables and rollback in your script work 3-4 times faster with
0001-type-cache.patch on my windows laptop.

In the patch I found a copy of the comment "If it's domain over
composite, reset flags...". Can we move the reset flags operation
and its comment into the invalidateCompositeTypeCacheEntry()
function? This simplify the TypeCacheRelCallback() func, but
adds two more IF statements when we need to clean up a cache
entry for a specific relation. (diff attached).
--
Roman Zharkov

Attachment

mapRelType-v2.patch

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

03 April 2024, 06:07:27

On 3/15/24 17:57, Teodor Sigaev wrote:
>> Okay, I've applied this piece for now.  Not sure I'll have much room
>> to look at the rest.
> 
> Thank you very much!
I have spent some time reviewing this feature. I think we can discuss 
and apply it step-by-step. So, the 0001-* patch is at this moment.
The feature addresses the issue of TypCache being bloated by intensive 
usage of non-standard types and domains. It adds significant overhead 
during relcache invalidation by thoroughly scanning this hash table.
IMO, this feature will be handy soon, as we already see some patches 
where TypCache is intensively used for storing composite types—for 
example, look into solutions proposed in [1].
One of my main concerns with this feature is the possibility of lost 
entries, which could be mistakenly used by relations with the same oid 
in the future. This seems particularly possible in cases with multiple 
temporary tables. The author has attempted to address this by replacing 
the typrelid and type_id fields in the mapRelType on each call of 
lookup_type_cache. However, I believe we could further improve this by 
removing the entry from mapRelType on invalidation, thus avoiding this 
potential issue.
While reviewing the patch, I made some minor changes (see attachment) 
that you're free to adopt or reject. However, it's crucial that the 
patch includes a detailed explanation, not just a single sentence, to 
ensure everyone understands the changes.
Upon closer inspection, I noticed that the current implementation only 
invalidates the cache entry. While this is acceptable for standard 
types, it may not be sufficient to maintain numerous custom types (as in 
the example in the initial letter) or in cases where whole-row vars are 
heavily used. In such scenarios, removing the entry and reducing the 
hash table's size might be more efficient.
In toto, the 0001-* patch looks good, and I would be glad to see it in 
the core.

[1] 
https://www.postgresql.org/message-id/flat/CAKcux6ktu-8tefLWtQuuZBYFaZA83vUzuRd7c1YHC-yEWyYFpg%40mail.gmail.com

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachment

0001-minor_improvements.diff

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

05 August 2024, 01:16:07

Hi!

On Wed, Apr 3, 2024 at 9:07 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 3/15/24 17:57, Teodor Sigaev wrote:
> >> Okay, I've applied this piece for now.  Not sure I'll have much room
> >> to look at the rest.
> >
> > Thank you very much!
> I have spent some time reviewing this feature. I think we can discuss
> and apply it step-by-step. So, the 0001-* patch is at this moment.
> The feature addresses the issue of TypCache being bloated by intensive
> usage of non-standard types and domains. It adds significant overhead
> during relcache invalidation by thoroughly scanning this hash table.
> IMO, this feature will be handy soon, as we already see some patches
> where TypCache is intensively used for storing composite types—for
> example, look into solutions proposed in [1].
> One of my main concerns with this feature is the possibility of lost
> entries, which could be mistakenly used by relations with the same oid
> in the future. This seems particularly possible in cases with multiple
> temporary tables. The author has attempted to address this by replacing
> the typrelid and type_id fields in the mapRelType on each call of
> lookup_type_cache. However, I believe we could further improve this by
> removing the entry from mapRelType on invalidation, thus avoiding this
> potential issue.
> While reviewing the patch, I made some minor changes (see attachment)
> that you're free to adopt or reject. However, it's crucial that the
> patch includes a detailed explanation, not just a single sentence, to
> ensure everyone understands the changes.
> Upon closer inspection, I noticed that the current implementation only
> invalidates the cache entry. While this is acceptable for standard
> types, it may not be sufficient to maintain numerous custom types (as in
> the example in the initial letter) or in cases where whole-row vars are
> heavily used. In such scenarios, removing the entry and reducing the
> hash table's size might be more efficient.
> In toto, the 0001-* patch looks good, and I would be glad to see it in
> the core.

I've revised the patchset.  First of all, I've re-ordered the patches.

0001-0002 (former 0002-0003)
Comprises hash_search_with_hash_value() function and its application
to avoid full hash iteration in InvalidateAttoptCacheCallback() and
TypeCacheTypCallback().  I think this is quite straightforward
optimization without negative side effects.  I've revised comments,
commit message and did some code beautification.  I'm going to push
this if no objections.

0003 (former 0001)
I've revised this patch.  I think main concerns expressed in the
thread about this path is that we don't have invalidation mechanism
for relid => typid map.  Finally due to oid wraparound same relids
could get reused.  That could lead to invalid entries in the map about
existing relids and typeids.  This is rather messy, but I don't think
this could cause a material bug.  The maps items are used only for
cache invalidation.  Extra invalidation doesn't cause a bug.  If type
with same relid will be cached, then correspoding map item will be
overridden, so no missing invalidation.  However, I see the following
reasons for keeping consistent state of relid => typid map.

1) As the main use-case for this optimization is flood of temporary
tables, it would be nice not let relid => typid map bloat in this
case.  I see that TypeCacheHash would get bloated, because its entries
are never deleted.  However, I would prefer to not get this situation
even worse.
2) In future we may find some more use-cases for relid => typid map
besides cache invalidation.  Keeping that in consistent state could be
advantage then.

In the attached patch, I'm keeping relid => typid map when
corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or
TCFLAGS_OPERATOR_FLAGS, or tupdesc.  Thus, when temporary table gets
deleted, we would invalidate the map item.

It will be also nice to get rid of iteration over all the cached
domain types in TypeCacheRelCallback().  However, this typically
shouldn't be a problem since domain types are less tended to bloat.
Domain types are created manually, unlike composite types which are
automatically created for every temporary table.  We will probably
need to optimize this in future, but I don't feel this to be necessary
in present patch.

I think the revised 0003 requires review.

------
Regards,
Alexander Korotkov
Supabase

Attachment

Re: type cache cleanup improvements

From

Pavel Borisov

Date:

21 August 2024, 16:28:13

Hi, Alexander!

On Tue, 20 Aug 2024 at 23:01, Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Mon, Aug 5, 2024 at 4:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I've revised the patchset. First of all, I've re-ordered the patches.
>
> 0001-0002 (former 0002-0003)
> Comprises hash_search_with_hash_value() function and its application
> to avoid full hash iteration in InvalidateAttoptCacheCallback() and
> TypeCacheTypCallback(). I think this is quite straightforward
> optimization without negative side effects. I've revised comments,
> commit message and did some code beautification. I'm going to push
> this if no objections.
>
> 0003 (former 0001)
> I've revised this patch. I think main concerns expressed in the
> thread about this path is that we don't have invalidation mechanism
> for relid => typid map. Finally due to oid wraparound same relids
> could get reused. That could lead to invalid entries in the map about
> existing relids and typeids. This is rather messy, but I don't think
> this could cause a material bug. The maps items are used only for
> cache invalidation. Extra invalidation doesn't cause a bug. If type
> with same relid will be cached, then correspoding map item will be
> overridden, so no missing invalidation. However, I see the following
> reasons for keeping consistent state of relid => typid map.
>
> 1) As the main use-case for this optimization is flood of temporary
> tables, it would be nice not let relid => typid map bloat in this
> case. I see that TypeCacheHash would get bloated, because its entries
> are never deleted. However, I would prefer to not get this situation
> even worse.
> 2) In future we may find some more use-cases for relid => typid map
> besides cache invalidation. Keeping that in consistent state could be
> advantage then.
>
> In the attached patch, I'm keeping relid => typid map when
> corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or
> TCFLAGS_OPERATOR_FLAGS, or tupdesc. Thus, when temporary table gets
> deleted, we would invalidate the map item.
>
> It will be also nice to get rid of iteration over all the cached
> domain types in TypeCacheRelCallback(). However, this typically
> shouldn't be a problem since domain types are less tended to bloat.
> Domain types are created manually, unlike composite types which are
> automatically created for every temporary table. We will probably
> need to optimize this in future, but I don't feel this to be necessary
> in present patch.
>
> I think the revised 0003 requires review.

The rebased remaining patch is attached.

I've looked at patch v8.

In function check_insert_rel_type_cache() the block:

+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif

As I understand it does HASH_FIND after the same value just inserted by HASH_ENT

ER above under the same if condition:

if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&

+ typentry->tupDesc == NULL)

Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in comment in a quoted block, but if condition is the same.

For function check_delete_rel_type_cache():

I'd modify the block:

+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif

as:

+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */

+ else

+ #ifdef USE_ASSERT_CHECKING

+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);

+#endif

3. I think check_delete_rel_type_cache and check_insert_rel_type_cache are better to be renamed to be more clear, though I don't have exact proposals yet,

4. I haven't looked into comments, though I'd recommend oid -> OID replacement in the comments.

Thank you for working on this patchset!

Regards,

Pavel Borisov

Supabase

Re: type cache cleanup improvements

From

Pavel Borisov

Date:

22 August 2024, 13:02:21

Hi, Alexander!

On Wed, 21 Aug 2024 at 19:29, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Hi, Pavel!

On Wed, Aug 21, 2024 at 4:28 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> I've looked at patch v8.
>
> 1.
> In function check_insert_rel_type_cache() the block:
>
> +#ifdef USE_ASSERT_CHECKING
> +
> + /*
> + * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
> + * entry if it should exist.
> + */
> + if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
> + typentry->tupDesc == NULL)
> + {
> + bool found;
> +
> + (void) hash_search(RelIdToTypeIdCacheHash,
> + &typentry->typrelid,
> + HASH_FIND, &found);
> + Assert(found);
> + }
> +#endif
>
> As I understand it does HASH_FIND after the same value just inserted by HASH_ENT
> ER above under the same if condition:
>
> if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
> + typentry->tupDesc == NULL)
>
> Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in comment in a quoted block, but if condition is the same.

Yep, these are remains from one of my previous attempt. No sense to
check for HASH_FIND right after HASH_ENTER. Removed.

> 2.
> For function check_delete_rel_type_cache():
> I'd modify the block:
> +#ifdef USE_ASSERT_CHECKING
> +
> + /*
> + * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
> + * entry if it should exist.
> + */
> + if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
> + (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
> + typentry->tupDesc != NULL)
> + {
> + bool found;
> +
> + (void) hash_search(RelIdToTypeIdCacheHash,
> + &typentry->typrelid,
> + HASH_FIND, &found);
> + Assert(found);
> + }
> +#endif
>
> as:
> +
> + /*
> + * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
> + * entry if it should exist.
> + */
> + else
> +{
> + #ifdef USE_ASSERT_CHECKING
> + bool found;
> +
> + (void) hash_search(RelIdToTypeIdCacheHash,
> + &typentry->typrelid,
> + HASH_FIND, &found);
> + Assert(found);
> +#endif
> +}

Changed in the way you proposed, except I put the comment inside the
#ifdef. I this it's easier to understand this way.

> 3. I think check_delete_rel_type_cache and check_insert_rel_type_cache are better to be renamed to be more clear, though I don't have exact proposals yet,

Renamed to delete_rel_type_cache_if_needed and
insert_rel_type_cache_if_needed. I've checked that

> 4. I haven't looked into comments, though I'd recommend oid -> OID replacement in the comments.

I've changed oid -> OID in the comments and in the commit message.

> Thank you for working on this patchset!

Thank you for review!

Looked at v9:

Patch looks good to me. I'd only suggest comments changes:

"The map from relation's OID to the corresponding composite type OID" -> "The mapping of relation's OID to the corresponding composite type OID"

"We're keeping the map entry when corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. That is we're keeping map entry if the entry has something to clear." -> "We're keeping the map entry when the corresponding typentry has something to clear i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc."

"Invalidate particular TypeCacheEntry on Relcache inval callback" - remove extra tabs before. Maybe also add empty line above.

"Typically shouldn't be a problem" -> "Typically this shouldn't affect performance"

"Relid = 0, so we need" -> "Relid is invalid. By convention we need"

"if cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag" -> "if we cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag previously"

"+/*

+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,

+ * or tupDesc if needed." - remove one "if needed"

Regards,

Pavel Borisov

Supabase

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

22 August 2024, 15:02:42

On 21/8/2024 17:28, Alexander Korotkov wrote:
> 
> I've changed oid -> OID in the comments and in the commit message.
I passed through the patch again: no objections and +1 to the changes of 
comments proposed by Pavel.

-- 
regards, Andrei Lepikhov

Re: type cache cleanup improvements

From

Alexander Lakhin

Date:

25 August 2024, 22:00:00

Hello Alexander,

22.08.2024 19:52, Alexander Korotkov wrotd:
> If no objections, I'm planning to push this after reverting PARTITION
> SPLIT/MERGE.
>

Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build.
trilobite failed it:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07

and I'm observing the same locally:
...
#5  0x00005636d37555f8 in ExceptionalCondition (conditionName=0x5636d39b1940 "found",
     fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66
#6  0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
#7  0x00005636d3754063 in InvalidateCompositeTypeCacheEntry (typentry=0x5636d41d5d10) at typcache.c:2355
#8  0x00005636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at typcache.c:2441
...

(gdb) f 6
#6  0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
3077                    Assert(found);
(gdb) p found
$1 = false

(This Assert is introduced by c14d4acb8.)

Best regards,
Alexander

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

25 August 2024, 22:21:19

On Sun, Aug 25, 2024 at 10:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> 22.08.2024 19:52, Alexander Korotkov wrotd:
> > If no objections, I'm planning to push this after reverting PARTITION
> > SPLIT/MERGE.
> >
>
> Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build.
> trilobite failed it:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07
>
> and I'm observing the same locally:
> ...
> #5  0x00005636d37555f8 in ExceptionalCondition (conditionName=0x5636d39b1940 "found",
>      fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66
> #6  0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
> #7  0x00005636d3754063 in InvalidateCompositeTypeCacheEntry (typentry=0x5636d41d5d10) at typcache.c:2355
> #8  0x00005636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at typcache.c:2441
> ...
>
> (gdb) f 6
> #6  0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
> 3077                    Assert(found);
> (gdb) p found
> $1 = false
>
> (This Assert is introduced by c14d4acb8.)

Thank you for noticing.  I'm checking this.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

26 August 2024, 00:22:03

On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Sun, Aug 25, 2024 at 10:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> > 22.08.2024 19:52, Alexander Korotkov wrotd:
> > > If no objections, I'm planning to push this after reverting PARTITION
> > > SPLIT/MERGE.
> > >
> >
> > Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build.
> > trilobite failed it:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07
> >
> > and I'm observing the same locally:
> > ...
> > #5  0x00005636d37555f8 in ExceptionalCondition (conditionName=0x5636d39b1940 "found",
> >      fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66
> > #6  0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
> > #7  0x00005636d3754063 in InvalidateCompositeTypeCacheEntry (typentry=0x5636d41d5d10) at typcache.c:2355
> > #8  0x00005636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at typcache.c:2441
> > ...
> >
> > (gdb) f 6
> > #6  0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
> > 3077                    Assert(found);
> > (gdb) p found
> > $1 = false
> >
> > (This Assert is introduced by c14d4acb8.)
>
> Thank you for noticing.  I'm checking this.

I didn't take into account that TypeCacheEntry could be invalidated
while lookup_type_cache() does syscache lookups.  When I realized that
I was curious on how does it currently work.  It appears that type
cache invalidation mostly only clears the flags while values are
remaining in place and still available for lookup_type_cache() caller.
TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee
to survive only because we don't do any syscache lookups for composite
data types later in lookup_type_cache().  I'm becoming less fan of how
this works...  I think these aspects needs to be at least documented
in details.

Regarding c14d4acb8, it appears to require redesign.  I'm going to revert it.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

26 August 2024, 09:37:50

On 25/8/2024 23:22, Alexander Korotkov wrote:
> On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov
>>> (This Assert is introduced by c14d4acb8.)
>>
>> Thank you for noticing.  I'm checking this.
> 
> I didn't take into account that TypeCacheEntry could be invalidated
> while lookup_type_cache() does syscache lookups.  When I realized that
> I was curious on how does it currently work.  It appears that type
> cache invalidation mostly only clears the flags while values are
> remaining in place and still available for lookup_type_cache() caller.
> TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee
> to survive only because we don't do any syscache lookups for composite
> data types later in lookup_type_cache().  I'm becoming less fan of how
> this works...  I think these aspects needs to be at least documented
> in details.
> 
> Regarding c14d4acb8, it appears to require redesign.  I'm going to revert it.
Sorry, but I don't understand your point.
Let's refocus on the problem at hand. The issue arose when the 
TypeCacheTypCallback and the TypeCacheRelCallback were executed in 
sequence within InvalidateSystemCachesExtended.
The first callback cleaned the flags TCFLAGS_HAVE_PG_TYPE_DATA and 
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. But the call of the second callback 
checks the typentry->tupDesc and, because it wasn't NULL, attempted to 
remove this record a second time.
I think there is no case for redesign, but we have a mess in 
insertion/deletion conditions.

-- 
regards, Andrei Lepikhov

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

26 August 2024, 11:26:26

On Mon, Aug 26, 2024 at 9:37 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 25/8/2024 23:22, Alexander Korotkov wrote:
> > On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov
> >>> (This Assert is introduced by c14d4acb8.)
> >>
> >> Thank you for noticing.  I'm checking this.
> >
> > I didn't take into account that TypeCacheEntry could be invalidated
> > while lookup_type_cache() does syscache lookups.  When I realized that
> > I was curious on how does it currently work.  It appears that type
> > cache invalidation mostly only clears the flags while values are
> > remaining in place and still available for lookup_type_cache() caller.
> > TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee
> > to survive only because we don't do any syscache lookups for composite
> > data types later in lookup_type_cache().  I'm becoming less fan of how
> > this works...  I think these aspects needs to be at least documented
> > in details.
> >
> > Regarding c14d4acb8, it appears to require redesign.  I'm going to revert it.
> Sorry, but I don't understand your point.
> Let's refocus on the problem at hand. The issue arose when the
> TypeCacheTypCallback and the TypeCacheRelCallback were executed in
> sequence within InvalidateSystemCachesExtended.
> The first callback cleaned the flags TCFLAGS_HAVE_PG_TYPE_DATA and
> TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. But the call of the second callback
> checks the typentry->tupDesc and, because it wasn't NULL, attempted to
> remove this record a second time.
> I think there is no case for redesign, but we have a mess in
> insertion/deletion conditions.

Yes, it's possible to repair the current approach.  But we need to do
this correct, not just "not failing with current usages".  Then we
need to call insert_rel_type_cache_if_needed() not just when we set
TCFLAGS_HAVE_PG_TYPE_DATA flag, but every time we set any of
TCFLAGS_OPERATOR_FLAGS or tupDesc.  That's a lot of places, not as
simple and elegant as it was planned.  This is why I wonder if there
is a better approach.

Secondly, I'm not terribly happy with current state of type cache.
The caller of lookup_type_cache() might get already invalidated data.
This probably OK, because caller probably hold locks on dependent
objects to guarantee that relevant properties of type actually
persists.  At very least this should be documented, but it doesn't
seem so.  Setting of tupdesc is sensitive to its order of execution.
That feels quite fragile to me, and not documented either.  I think
this area needs improvements before we push additional functionality
there.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

29 August 2024, 12:01:56

On Mon, Aug 26, 2024 at 11:26 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
>
> On Mon, Aug 26, 2024 at 9:37 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > On 25/8/2024 23:22, Alexander Korotkov wrote:
> > > On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov
> > >>> (This Assert is introduced by c14d4acb8.)
> > >>
> > >> Thank you for noticing.  I'm checking this.
> > >
> > > I didn't take into account that TypeCacheEntry could be invalidated
> > > while lookup_type_cache() does syscache lookups.  When I realized that
> > > I was curious on how does it currently work.  It appears that type
> > > cache invalidation mostly only clears the flags while values are
> > > remaining in place and still available for lookup_type_cache() caller.
> > > TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee
> > > to survive only because we don't do any syscache lookups for composite
> > > data types later in lookup_type_cache().  I'm becoming less fan of how
> > > this works...  I think these aspects needs to be at least documented
> > > in details.
> > >
> > > Regarding c14d4acb8, it appears to require redesign.  I'm going to revert it.
> > Sorry, but I don't understand your point.
> > Let's refocus on the problem at hand. The issue arose when the
> > TypeCacheTypCallback and the TypeCacheRelCallback were executed in
> > sequence within InvalidateSystemCachesExtended.
> > The first callback cleaned the flags TCFLAGS_HAVE_PG_TYPE_DATA and
> > TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. But the call of the second callback
> > checks the typentry->tupDesc and, because it wasn't NULL, attempted to
> > remove this record a second time.
> > I think there is no case for redesign, but we have a mess in
> > insertion/deletion conditions.
>
> Yes, it's possible to repair the current approach.  But we need to do
> this correct, not just "not failing with current usages".  Then we
> need to call insert_rel_type_cache_if_needed() not just when we set
> TCFLAGS_HAVE_PG_TYPE_DATA flag, but every time we set any of
> TCFLAGS_OPERATOR_FLAGS or tupDesc.  That's a lot of places, not as
> simple and elegant as it was planned.  This is why I wonder if there
> is a better approach.
>
> Secondly, I'm not terribly happy with current state of type cache.
> The caller of lookup_type_cache() might get already invalidated data.
> This probably OK, because caller probably hold locks on dependent
> objects to guarantee that relevant properties of type actually
> persists.  At very least this should be documented, but it doesn't
> seem so.  Setting of tupdesc is sensitive to its order of execution.
> That feels quite fragile to me, and not documented either.  I think
> this area needs improvements before we push additional functionality
> there.

I see fdd965d074 added a proper handling for concurrent invalidation
for relation cache. If a concurrent invalidation occurs, we retry
building a relation descriptor.  Thus, we end up with returning of a
valid relation descriptor to caller.  I wonder if we can take the same
approach to type cache.  That would make the whole type cache more
consistent and less fragile.  Also, this patch will be simpler.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

31 August 2024, 22:33:23

On 29/8/2024 11:01, Alexander Korotkov wrote:
> On Mon, Aug 26, 2024 at 11:26 AM Alexander Korotkov
>> Secondly, I'm not terribly happy with current state of type cache.
>> The caller of lookup_type_cache() might get already invalidated data.
>> This probably OK, because caller probably hold locks on dependent
>> objects to guarantee that relevant properties of type actually
>> persists.  At very least this should be documented, but it doesn't
>> seem so.  Setting of tupdesc is sensitive to its order of execution.
>> That feels quite fragile to me, and not documented either.  I think
>> this area needs improvements before we push additional functionality
>> there.
> 
> I see fdd965d074 added a proper handling for concurrent invalidation
> for relation cache. If a concurrent invalidation occurs, we retry
> building a relation descriptor.  Thus, we end up with returning of a
> valid relation descriptor to caller.  I wonder if we can take the same
> approach to type cache.  That would make the whole type cache more
> consistent and less fragile.  Also, this patch will be simpler.
I think I understand the solution from the commit fdd965d074.
Just for the record, you mentioned invalidation inside the 
lookup_type_cache above. Passing through the code, I found the only 
place for such a case - the call of the GetDefaultOpClass, which 
triggers the opening of the relation pg_opclass, which can cause an 
AcceptInvalidationMessages call. Did you mean this case, or does a wider 
field of cases exist here?

-- 
regards, Andrei Lepikhov

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

18 September 2024, 17:10:08

On 13/9/2024 01:38, Alexander Korotkov wrote:
> I've tried to implement handling of concurrent invalidation similar to
> commit fdd965d074.  However that appears to be more difficult that I
> thought, because for some datatypes like arrays, ranges etc we might
> need fill the element type and reference it.  So, I decided to
> continue with the current approach but borrowing some ideas from
> fdd965d074.  The revised patchset attached.
Let me rephrase the issue in more straightforward terms to ensure we are 
all clear on the problem:
The critical problem of the typcache lookup on not-yet-locked data is 
that it can lead to an inconsistent state of the TypEntry, potentially 
causing disruptions in the DBMS's operations, correct?
Let's exemplify this statement. By filling typentry's lt_opr, eq_opr, 
and gt_opr fields, we access the AMOPSTRATEGY cache. One operation can 
successfully fetch data from the cache, but another can miss data and 
touch the catalogue table, causing invalidations. In this case, we can 
get an inconsistent set of operators. Do I understand the problem 
statement correctly?

If this view is correct, your derived approach should work fine if all 
necessary callbacks are registered. I see that at least AMOPSTRATEGY and 
PROCOID were missed at the moment of the typcache initialization.

-- 
regards, Andrei Lepikhov

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

23 September 2024, 13:53:47

On Wed, Sep 18, 2024 at 5:10 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 13/9/2024 01:38, Alexander Korotkov wrote:
> > I've tried to implement handling of concurrent invalidation similar to
> > commit fdd965d074.  However that appears to be more difficult that I
> > thought, because for some datatypes like arrays, ranges etc we might
> > need fill the element type and reference it.  So, I decided to
> > continue with the current approach but borrowing some ideas from
> > fdd965d074.  The revised patchset attached.
> Let me rephrase the issue in more straightforward terms to ensure we are
> all clear on the problem:
> The critical problem of the typcache lookup on not-yet-locked data is
> that it can lead to an inconsistent state of the TypEntry, potentially
> causing disruptions in the DBMS's operations, correct?
> Let's exemplify this statement. By filling typentry's lt_opr, eq_opr,
> and gt_opr fields, we access the AMOPSTRATEGY cache. One operation can
> successfully fetch data from the cache, but another can miss data and
> touch the catalogue table, causing invalidations. In this case, we can
> get an inconsistent set of operators. Do I understand the problem
> statement correctly?

Actually, I didn't research much if there is a material problem.  So,
I didn't try to concurrently delete some operator class members
concurrently to lookup_type_cache().  There are probably some bugs,
but they likely have low impact in practice, given that type/opclass
changes are very rare.

Yet I was concentrated on why do lookup_type_cache() returns
TypeCacheEntry filled with whatever caller asked given there could be
concurrent invalidations.

So, my approach was to
1) Document how we currently handle concurrent invalidations.
2) Maintain RelIdToTypeIdCacheHash correctly with concurrent invalidations.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Artur Zakirov

Date:

10 October 2024, 18:54:01

Hi all,

On Fri, 13 Sept 2024 at 01:38, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> 0001 - adds comment about concurrent invalidation handling
> 0002 - revised c14d4acb8.  Now we track type oids, whose
> TypeCacheEntry's filing is in-progress.  Add entry to
> RelIdToTypeIdCacheHash at the end of lookup_type_cache() or on the
> transaction abort.  During invalidation don't assert
> RelIdToTypeIdCacheHash to be here if TypeCacheEntry is in-progress.

Thank you Alexander for the patch. I reviewed and tested it.

I used Teodor's script to check the performance. On my laptop on
master ROLLBACK runs 11496.219 ms. With patch ROLLBACK runs 378.990
ms.

It seems to me that there are couple of possible issues in the patch:

In `lookup_type_cache()` `in_progress_list` is allocated using
`CacheMemoryContext`, on the other hand it seems there might be a case
when `CacheMemoryContext` is not created yet. It is created below in
the code if it doesn't exist:

    /* Also make sure CacheMemoryContext exists */
    if (!CacheMemoryContext)
        CreateCacheMemoryContext();

It is probably a very rare case, but it might be better to allocate
`in_progress_list` after that line, or move creation of
`CacheMemoryContext` higher.

Within `insert_rel_type_cache_if_needed()` and
`delete_rel_type_cache_if_needed()` there is an if condition:

    if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
        (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
        typentry->tupDesc != NULL)

Based on the logic of the rest of the code does it make sense to use
TCFLAGS_DOMAIN_BASE_IS_COMPOSITE instead of TCFLAGS_OPERATOR_FLAGS?
Otherwise the condition doesn't look logical.

-- 
Kind regards,
Artur

Re: type cache cleanup improvements

From

jian he

Date:

15 October 2024, 10:34:41

On Sun, Oct 13, 2024 at 8:09 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>

hi. Alexander.
I don't fully understand all of it. but I did some tests anyway.

static void
cleanup_in_progress_typentries(void)
{
    int            i;
    if (in_progress_list_len > 1)
        elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
    for (i = 0; i < in_progress_list_len; i++)
    {
        TypeCacheEntry *typentry;
        typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
                                                  &in_progress_list[i],
                                                  HASH_FIND, NULL);
        insert_rel_type_cache_if_needed(typentry);
    }
    in_progress_list_len = 0;
}

the regress still passed.
I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
So we lack tests for larger in_progress_list_len values or i missed something?



    /* Call check_delete_rel_type_cache() if we actually cleared something */
    if (hadTupDescOrOpclass)
        delete_rel_type_cache_if_needed(typentry);

        /*
         * Call check_delete_rel_type_cache() if we cleaned
         * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
         */
        if (hadPgTypeData)
            delete_rel_type_cache_if_needed(typentry);

check_delete_rel_type_cache don't exist, so these comments are wrong?

Re: type cache cleanup improvements

From

Artur Zakirov

Date:

15 October 2024, 11:45:23

On Tue, 15 Oct 2024 at 10:09, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >     /* Call check_delete_rel_type_cache() if we actually cleared something */
> >     if (hadTupDescOrOpclass)
> >         delete_rel_type_cache_if_needed(typentry);
> >
> >         /*
> >          * Call check_delete_rel_type_cache() if we cleaned
> >          * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
> >          */
> >         if (hadPgTypeData)
> >             delete_rel_type_cache_if_needed(typentry);
> >
> > check_delete_rel_type_cache don't exist, so these comments are wrong?
>
> Yep, they didn't get updated.  Fixed in the attached patchset.

Thank you Alexander for the fixes. The last version of the patch looks
good to me.

> I'm not sure I get the point.  This check ensures that type entry has
> something to be cleared.  In this case we need to keep
> RelIdToTypeIdCacheHash entry to find this item on invalidation
> message.  I'm not sure how TCFLAGS_DOMAIN_BASE_IS_COMPOSITE is
> relevant here, because it's valid only for TYPTYPE_DOMAIN while this
> patch deals with TYPTYPE_COMPOSITE.

Regarding this discussion earlier, I assumed that TYPTYPE_DOMAIN also
needs to be handled by `insert_rel_type_cache_if_needed()`. And it
seems that handling of TYPTYPE_DOMAIN will remain the same as before.

-- 
Kind regards,
Artur

Re: type cache cleanup improvements

From

jian he

Date:

15 October 2024, 12:50:15

On Tue, Oct 15, 2024 at 4:09 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Hi, Jian!
>
> Thank you for your review.
>
> On Tue, Oct 15, 2024 at 10:34 AM jian he <jian.universality@gmail.com> wrote:
> > I don't fully understand all of it. but I did some tests anyway.
> >
> > static void
> > cleanup_in_progress_typentries(void)
> > {
> >     int            i;
> >     if (in_progress_list_len > 1)
> >         elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
> >     for (i = 0; i < in_progress_list_len; i++)
> >     {
> >         TypeCacheEntry *typentry;
> >         typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
> >                                                   &in_progress_list[i],
> >                                                   HASH_FIND, NULL);
> >         insert_rel_type_cache_if_needed(typentry);
> >     }
> >     in_progress_list_len = 0;
> > }
> >
> > the regress still passed.
> > I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
> > So we lack tests for larger in_progress_list_len values or i missed something?
>
> Try to run test suite with -DCLOBBER_CACHE_ALWAYS.
>

build from source, DCLOBBER_CACHE_ALWAYS takes a very long time.
So I gave up.


in lookup_type_cache, we unconditionally do
in_progress_list_len++;
in_progress_list_len--;

"static int    in_progress_list_len;"
means in_progress_list_len value change is confined in
src/backend/utils/cache/typcache.c.

based on above information, i am still  confused with
cleanup_in_progress_typentries, in_progress_list_len
is there any simple sql example to demo
cleanup_in_progress_typentries,  in_progress_list_len> 0.

Re: type cache cleanup improvements

From

Artur Zakirov

Date:

15 October 2024, 16:16:07

On Tue, 15 Oct 2024 at 11:50, jian he <jian.universality@gmail.com> wrote:
> based on above information, i am still  confused with
> cleanup_in_progress_typentries, in_progress_list_len
> is there any simple sql example to demo
> cleanup_in_progress_typentries,  in_progress_list_len> 0.

AFAIK to reproduce cases when `in_progress_list_len > 0`
`lookup_type_cache()` should fail during its execution.

To do so you can call `lookup_type_cache()` with non-existing type_id
from a C function.

-- 
Kind regards,
Artur

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

20 October 2024, 20:47:28

On Tue, Oct 15, 2024 at 12:50 PM jian he <jian.universality@gmail.com> wrote:
>
> On Tue, Oct 15, 2024 at 4:09 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > Hi, Jian!
> >
> > Thank you for your review.
> >
> > On Tue, Oct 15, 2024 at 10:34 AM jian he <jian.universality@gmail.com> wrote:
> > > I don't fully understand all of it. but I did some tests anyway.
> > >
> > > static void
> > > cleanup_in_progress_typentries(void)
> > > {
> > >     int            i;
> > >     if (in_progress_list_len > 1)
> > >         elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
> > >     for (i = 0; i < in_progress_list_len; i++)
> > >     {
> > >         TypeCacheEntry *typentry;
> > >         typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
> > >                                                   &in_progress_list[i],
> > >                                                   HASH_FIND, NULL);
> > >         insert_rel_type_cache_if_needed(typentry);
> > >     }
> > >     in_progress_list_len = 0;
> > > }
> > >
> > > the regress still passed.
> > > I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
> > > So we lack tests for larger in_progress_list_len values or i missed something?
> >
> > Try to run test suite with -DCLOBBER_CACHE_ALWAYS.
> >
>
> build from source, DCLOBBER_CACHE_ALWAYS takes a very long time.
> So I gave up.
>
>
> in lookup_type_cache, we unconditionally do
> in_progress_list_len++;
> in_progress_list_len--;

Yes, this should work OK when no errors.  On error or interruption,
finalize_in_progress_typentries() will clean the things up.

> "static int    in_progress_list_len;"
> means in_progress_list_len value change is confined in
> src/backend/utils/cache/typcache.c.

Yep.

> based on above information, i am still  confused with
> cleanup_in_progress_typentries, in_progress_list_len
> is there any simple sql example to demo
> cleanup_in_progress_typentries,  in_progress_list_len> 0.

I don't think there is simple sql to reliably reproduce that.  In
order to hit that, we must process invalidation messages in some
(short) moment of time during lookup_type_cache().  You can reproduce
that by setting a breakpoint in lookup_type_cache() and in parallel do
something to invalidate the type cache entry (for instance, ALTER
TABLE ... ADD COLUMN ... would invalidate the composite type).  In
principle, we can reproduce that using injection points.  However, I'm
not intended to do that as long as we have buildfarm members with
-DCLOBBER_CACHE_ALWAYS.  FWIW, I will for sure run tests with
-DCLOBBER_CACHE_ALWAYS before committing this.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

20 October 2024, 21:00:18

On Sun, Oct 20, 2024 at 8:47 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Tue, Oct 15, 2024 at 12:50 PM jian he <jian.universality@gmail.com> wrote:
> > build from source, DCLOBBER_CACHE_ALWAYS takes a very long time.
> > So I gave up.
> >
> >
> > in lookup_type_cache, we unconditionally do
> > in_progress_list_len++;
> > in_progress_list_len--;
>
> Yes, this should work OK when no errors.  On error or interruption,
> finalize_in_progress_typentries() will clean the things up.
>
> > "static int    in_progress_list_len;"
> > means in_progress_list_len value change is confined in
> > src/backend/utils/cache/typcache.c.
>
> Yep.
>
> > based on above information, i am still  confused with
> > cleanup_in_progress_typentries, in_progress_list_len
> > is there any simple sql example to demo
> > cleanup_in_progress_typentries,  in_progress_list_len> 0.
>
> I don't think there is simple sql to reliably reproduce that.  In
> order to hit that, we must process invalidation messages in some
> (short) moment of time during lookup_type_cache().  You can reproduce
> that by setting a breakpoint in lookup_type_cache() and in parallel do
> something to invalidate the type cache entry (for instance, ALTER
> TABLE ... ADD COLUMN ... would invalidate the composite type).

Oops, concurrent invalidation message is not enough here.  So,
-DCLOBBER_CACHE_ALWAYS is also not enough to reproduce the situation.
Injection-point test is required.  I'm going to add this.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Dagfinn Ilmari Mannsåker

Date:

21 October 2024, 02:32:08

Alexander Korotkov <aekorotkov@gmail.com> writes:

> +static Oid *in_progress_list;
> +static int    in_progress_list_len;
> +static int    in_progress_list_maxlen;

Is there any particular reason not to use pg_list.h for this?

- ilmari

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

21 October 2024, 08:36:04

On 21/10/2024 00:36, Alexander Korotkov wrote:
> On Thu, Oct 17, 2024 at 12:41 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> I think the first patch may already be committed, and this little burden
>> may be avoided in future versions.
> I've integrated your patch.  But I think
> finalize_in_progress_typentries() is more appropriate place to check
> typentry for NULL.  Also, I've revised the
> finalize_in_progress_typentries() header comment.  I'm going to push
> these patches if no objections.
I agree with your idea. Also, I think it would be more conventional not 
to check the type entry for a NULL value but to test the 'found' value 
instead.
And thanks for the injection points tests! I must have slipped my mind 
about this option.
Now, the patch set looks good for me.

-- 
regards, Andrei Lepikhov

Re: type cache cleanup improvements

From

Andrei Lepikhov

Date:

21 October 2024, 08:40:14

On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> 
>> +static Oid *in_progress_list;
>> +static int    in_progress_list_len;
>> +static int    in_progress_list_maxlen;
> 
> Is there any particular reason not to use pg_list.h for this?
Sure. The type cache lookup has to be as much optimal as possible.
Using an array and relating sequential access to it, we avoid memory 
allocations and deallocations 99.9% of the time. Also, quick access to 
the single element (which we will have in real life almost all of the 
time) is much faster than employing list machinery.

-- 
regards, Andrei Lepikhov

Re: type cache cleanup improvements

From

jian he

Date:

21 October 2024, 10:51:40

thanks for the
INJECTION_POINT("typecache-before-rel-type-cache-insert");
Now I have better understanding of the whole changes.


+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHas
+ * entries, marked as in-progress by lookup_type_cache().  It may happen
+ * in case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
comment typo. "TypeCacheHas" should be "TypeCacheHash"

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

21 October 2024, 11:10:46

On Mon, Oct 21, 2024 at 8:40 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> >
> >> +static Oid *in_progress_list;
> >> +static int  in_progress_list_len;
> >> +static int  in_progress_list_maxlen;
> >
> > Is there any particular reason not to use pg_list.h for this?
> Sure. The type cache lookup has to be as much optimal as possible.
> Using an array and relating sequential access to it, we avoid memory
> allocations and deallocations 99.9% of the time. Also, quick access to
> the single element (which we will have in real life almost all of the
> time) is much faster than employing list machinery.

+1,
List with zero elements has to be NIL.  That means continuous
allocations/deallocations.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Dagfinn Ilmari Mannsåker

Date:

21 October 2024, 13:16:25

Alexander Korotkov <aekorotkov@gmail.com> writes:

> On Mon, Oct 21, 2024 at 8:40 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
>>
>> On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
>> > Alexander Korotkov <aekorotkov@gmail.com> writes:
>> >
>> >> +static Oid *in_progress_list;
>> >> +static int  in_progress_list_len;
>> >> +static int  in_progress_list_maxlen;
>> >
>> > Is there any particular reason not to use pg_list.h for this?
>> Sure. The type cache lookup has to be as much optimal as possible.
>> Using an array and relating sequential access to it, we avoid memory
>> allocations and deallocations 99.9% of the time. Also, quick access to
>> the single element (which we will have in real life almost all of the
>> time) is much faster than employing list machinery.

Lists are actually dynamically resized arrays these days (see commit
1cff1b95ab6ddae32faa3efe0d95a820dbfdc164), not linked lists, so
accessing arbitrary elements is O(1), not O(n). Just like this patch,
the size is doubled (starting at 16) whenever array is full.

> +1,
> List with zero elements has to be NIL.  That means continuous
> allocations/deallocations.

This however is a valid point (unless we keep a dummy zeroth element to
avoid it, which is even uglier than open-coding the array extension
logic), so objection withdrawn.

- ilmari

Re: type cache cleanup improvements

From

Pavel Borisov

Date:

22 October 2024, 18:09:59

Hi, Alexander!

On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Mon, Oct 21, 2024 at 2:30 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Mon, Oct 21, 2024 at 1:16 PM Dagfinn Ilmari Mannsåker
> <ilmari@ilmari.org> wrote:
> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> >
> > > On Mon, Oct 21, 2024 at 8:40 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> > >>
> > >> On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
> > >> > Alexander Korotkov <aekorotkov@gmail.com> writes:
> > >> >
> > >> >> +static Oid *in_progress_list;
> > >> >> +static int in_progress_list_len;
> > >> >> +static int in_progress_list_maxlen;
> > >> >
> > >> > Is there any particular reason not to use pg_list.h for this?
> > >> Sure. The type cache lookup has to be as much optimal as possible.
> > >> Using an array and relating sequential access to it, we avoid memory
> > >> allocations and deallocations 99.9% of the time. Also, quick access to
> > >> the single element (which we will have in real life almost all of the
> > >> time) is much faster than employing list machinery.
> >
> > Lists are actually dynamically resized arrays these days (see commit
> > 1cff1b95ab6ddae32faa3efe0d95a820dbfdc164), not linked lists, so
> > accessing arbitrary elements is O(1), not O(n). Just like this patch,
> > the size is doubled (starting at 16) whenever array is full.
> >
> > > +1,
> > > List with zero elements has to be NIL. That means continuous
> > > allocations/deallocations.
> >
> > This however is a valid point (unless we keep a dummy zeroth element to
> > avoid it, which is even uglier than open-coding the array extension
> > logic), so objection withdrawn.
>
> OK, thank you!
>
> The attached revision fixes EXTRA_INSTALL in
> src/test/modules/typcache/Makefile. Spotted off-list by Arthur
> Zakirov.

I've re-checked that regression tests pass with
-DCLOBBER_CACHE_ALWAYS. Also did some grammar corrections for
comments and commit message. I'm going to push this if no objections.

Thank you for working on this patch!

Looked through the patchset once more.

Patch 0001 (minor): "in the last" -> "after everything else" or "after other TypeCacheEntry contents"

Patch 0002 looks ready to me.

Regards,

Pavel Borisov

Supabase

Re: type cache cleanup improvements

From

Andres Freund

Date:

25 October 2024, 11:35:29

Hi,

On 2024-10-22 20:33:24 +0300, Alexander Korotkov wrote:
> Thank you, Pavel!  0001 revised according to your suggestion.

Starting with this commit CI fails.

https://cirrus-ci.com/task/6668851469877248

https://api.cirrus-ci.com/v1/artifact/task/6668851469877248/testrun/build/testrun/regress-running/regress/regression.diffs

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out
/tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out    2024-10-24 11:38:43.829712000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out    2024-10-24 11:44:57.154238000
+0000
@@ -1338,14 +1338,9 @@
 ERROR:  cannot drop inherited constraint "f1_pos" of relation "p1_c1"
 alter table p1 drop constraint f1_pos;
 \d p1_c1
-               Table "public.p1_c1"
- Column |  Type   | Collation | Nullable | Default
---------+---------+-----------+----------+---------
- f1     | integer |           |          |
-Check constraints:
-    "f1_pos" CHECK (f1 > 0)
-Inherits: p1
-
+ERROR:  error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 4: ORDER BY 1;
+                 ^
 drop table p1 cascade;
 NOTICE:  drop cascades to table p1_c1
 create table p1(f1 int constraint f1_pos CHECK (f1 > 0));

Greetings,

Andres

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

25 October 2024, 12:48:49

On Fri, Oct 25, 2024 at 11:35 AM Andres Freund <andres@anarazel.de> wrote:
> On 2024-10-22 20:33:24 +0300, Alexander Korotkov wrote:
> > Thank you, Pavel!  0001 revised according to your suggestion.
>
> Starting with this commit CI fails.
>
> https://cirrus-ci.com/task/6668851469877248
>
https://api.cirrus-ci.com/v1/artifact/task/6668851469877248/testrun/build/testrun/regress-running/regress/regression.diffs
>
> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out
/tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out
> --- /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out  2024-10-24 11:38:43.829712000 +0000
> +++ /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out      2024-10-24 11:44:57.154238000
+0000
> @@ -1338,14 +1338,9 @@
>  ERROR:  cannot drop inherited constraint "f1_pos" of relation "p1_c1"
>  alter table p1 drop constraint f1_pos;
>  \d p1_c1
> -               Table "public.p1_c1"
> - Column |  Type   | Collation | Nullable | Default
> ---------+---------+-----------+----------+---------
> - f1     | integer |           |          |
> -Check constraints:
> -    "f1_pos" CHECK (f1 > 0)
> -Inherits: p1
> -
> +ERROR:  error triggered for injection point typecache-before-rel-type-cache-insert
> +LINE 4: ORDER BY 1;
> +                 ^
>  drop table p1 cascade;
>  NOTICE:  drop cascades to table p1_c1
>  create table p1(f1 int constraint f1_pos CHECK (f1 > 0));

Thank you for reporting this.
Looks weird that injection point, which isn't used in these tests, got
triggered here.
I'm looking into this.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

25 October 2024, 13:13:29

On Fri, Oct 25, 2024 at 12:48 PM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Fri, Oct 25, 2024 at 11:35 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2024-10-22 20:33:24 +0300, Alexander Korotkov wrote:
> > > Thank you, Pavel!  0001 revised according to your suggestion.
> >
> > Starting with this commit CI fails.
> >
> > https://cirrus-ci.com/task/6668851469877248
> >
https://api.cirrus-ci.com/v1/artifact/task/6668851469877248/testrun/build/testrun/regress-running/regress/regression.diffs
> >
> > diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out
/tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out
> > --- /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out  2024-10-24 11:38:43.829712000 +0000
> > +++ /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out      2024-10-24
11:44:57.154238000+0000 
> > @@ -1338,14 +1338,9 @@
> >  ERROR:  cannot drop inherited constraint "f1_pos" of relation "p1_c1"
> >  alter table p1 drop constraint f1_pos;
> >  \d p1_c1
> > -               Table "public.p1_c1"
> > - Column |  Type   | Collation | Nullable | Default
> > ---------+---------+-----------+----------+---------
> > - f1     | integer |           |          |
> > -Check constraints:
> > -    "f1_pos" CHECK (f1 > 0)
> > -Inherits: p1
> > -
> > +ERROR:  error triggered for injection point typecache-before-rel-type-cache-insert
> > +LINE 4: ORDER BY 1;
> > +                 ^
> >  drop table p1 cascade;
> >  NOTICE:  drop cascades to table p1_c1
> >  create table p1(f1 int constraint f1_pos CHECK (f1 > 0));
>
> Thank you for reporting this.
> Looks weird that injection point, which isn't used in these tests, got
> triggered here.
> I'm looking into this.

Oh, I forgot to make injection points in typcache_rel_type_cache.sql
local.  Thus, it affects concurrent tests.  Must be fixed in
aa1e898dea.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Noah Misch

Date:

11 April, 23:32:41

On Tue, Oct 22, 2024 at 08:33:24PM +0300, Alexander Korotkov wrote:
> On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >> I'm going to push this if no objections.

(This became commit b85a9d0.)

> +    /* Call delete_rel_type_cache() if we actually cleared something */
> +    if (hadTupDescOrOpclass)
> +        delete_rel_type_cache_if_needed(typentry);

I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
entry exists if and only if certain kinds of data appear in the TypeCacheHash
entry.  However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
maintaining RelIdToTypeIdCacheHash.  Is it right to do that?

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

12 April, 00:43:19

On Fri, Apr 11, 2025 at 11:32 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Oct 22, 2024 at 08:33:24PM +0300, Alexander Korotkov wrote:
> > On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > > On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > >> I'm going to push this if no objections.
>
> (This became commit b85a9d0.)
>
> > +     /* Call delete_rel_type_cache() if we actually cleared something */
> > +     if (hadTupDescOrOpclass)
> > +             delete_rel_type_cache_if_needed(typentry);
>
> I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
> entry exists if and only if certain kinds of data appear in the TypeCacheHash
> entry.  However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
> maintaining RelIdToTypeIdCacheHash.  Is it right to do that?

Thank you for the question.  I'll recheck this in next couple of days.

------
Regards,
Alexander Korotkov
Supabase

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

21 April, 04:54:08

Hi, Noah!

On Sat, Apr 12, 2025 at 12:43 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
>
> On Fri, Apr 11, 2025 at 11:32 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Tue, Oct 22, 2024 at 08:33:24PM +0300, Alexander Korotkov wrote:
> > > On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > > > On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > >> I'm going to push this if no objections.
> >
> > (This became commit b85a9d0.)
> >
> > > +     /* Call delete_rel_type_cache() if we actually cleared something */
> > > +     if (hadTupDescOrOpclass)
> > > +             delete_rel_type_cache_if_needed(typentry);
> >
> > I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
> > entry exists if and only if certain kinds of data appear in the TypeCacheHash
> > entry.  However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
> > maintaining RelIdToTypeIdCacheHash.  Is it right to do that?
>
> Thank you for the question.  I'll recheck this in next couple of days.

Sorry for the delay.  Generally, your finding is correct.  But, I
didn't manage to reproduce the situation, where existing code leads to
real error.  In order to have it, we must have typcache entry without
TCFLAGS_HAVE_PG_TYPE_DATA and tupDesc, but with some of
TCFLAGS_OPERATOR_FLAGS.  Reseting TCFLAGS_HAVE_PG_TYPE_DATA for a
composite type doesn't seem to be possible without resetting the rest
at the same time.

Nevertheless, I think it would be fragile to leave the current code
"as is".  If even there is no case of real error (or it's just me
didn't manage to find it), it could appear after further changes of
type cache code.  So, the fix is attached.

------
Regards,
Alexander Korotkov
Supabase

Attachment

v1-0001-Maintain-RelIdToTypeIdCacheHash-in-TypeCacheOpcCa.patch

Re: type cache cleanup improvements

From

Noah Misch

Date:

29 April, 03:56:12

On Mon, Apr 21, 2025 at 04:54:08AM +0300, Alexander Korotkov wrote:
> On Sat, Apr 12, 2025 at 12:43 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > On Fri, Apr 11, 2025 at 11:32 PM Noah Misch <noah@leadboat.com> wrote:
> > > On Tue, Oct 22, 2024 at 08:33:24PM +0300, Alexander Korotkov wrote:
> > > > On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > > > > On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > > >> I'm going to push this if no objections.
> > >
> > > (This became commit b85a9d0.)
> > >
> > > > +     /* Call delete_rel_type_cache() if we actually cleared something */
> > > > +     if (hadTupDescOrOpclass)
> > > > +             delete_rel_type_cache_if_needed(typentry);
> > >
> > > I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
> > > entry exists if and only if certain kinds of data appear in the TypeCacheHash
> > > entry.  However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
> > > maintaining RelIdToTypeIdCacheHash.  Is it right to do that?

> Sorry for the delay.  Generally, your finding is correct.  But, I
> didn't manage to reproduce the situation, where existing code leads to
> real error.  In order to have it, we must have typcache entry without
> TCFLAGS_HAVE_PG_TYPE_DATA and tupDesc, but with some of
> TCFLAGS_OPERATOR_FLAGS.

That makes sense.

> Reseting TCFLAGS_HAVE_PG_TYPE_DATA for a
> composite type doesn't seem to be possible without resetting the rest
> at the same time.
> 
> Nevertheless, I think it would be fragile to leave the current code
> "as is".  If even there is no case of real error (or it's just me
> didn't manage to find it), it could appear after further changes of
> type cache code.  So, the fix is attached.

This change looks appropriate.  Thanks.

Re: type cache cleanup improvements

From

Alexander Korotkov

Date:

29 April, 13:00:05

Hi, Noah!

On Tue, Apr 29, 2025 at 3:56 AM Noah Misch <noah@leadboat.com> wrote:
> On Mon, Apr 21, 2025 at 04:54:08AM +0300, Alexander Korotkov wrote:
> > On Sat, Apr 12, 2025 at 12:43 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > On Fri, Apr 11, 2025 at 11:32 PM Noah Misch <noah@leadboat.com> wrote:
> > > > On Tue, Oct 22, 2024 at 08:33:24PM +0300, Alexander Korotkov wrote:
> > > > > On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> > > > > > On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > > > >> I'm going to push this if no objections.
> > > >
> > > > (This became commit b85a9d0.)
> > > >
> > > > > +     /* Call delete_rel_type_cache() if we actually cleared something */
> > > > > +     if (hadTupDescOrOpclass)
> > > > > +             delete_rel_type_cache_if_needed(typentry);
> > > >
> > > > I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
> > > > entry exists if and only if certain kinds of data appear in the TypeCacheHash
> > > > entry.  However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
> > > > maintaining RelIdToTypeIdCacheHash.  Is it right to do that?
>
> > Sorry for the delay.  Generally, your finding is correct.  But, I
> > didn't manage to reproduce the situation, where existing code leads to
> > real error.  In order to have it, we must have typcache entry without
> > TCFLAGS_HAVE_PG_TYPE_DATA and tupDesc, but with some of
> > TCFLAGS_OPERATOR_FLAGS.
>
> That makes sense.
>
> > Reseting TCFLAGS_HAVE_PG_TYPE_DATA for a
> > composite type doesn't seem to be possible without resetting the rest
> > at the same time.
> >
> > Nevertheless, I think it would be fragile to leave the current code
> > "as is".  If even there is no case of real error (or it's just me
> > didn't manage to find it), it could appear after further changes of
> > type cache code.  So, the fix is attached.
>
> This change looks appropriate.  Thanks.

Thank you for your feedback!

------
Regards,
Alexander Korotkov
Supabase