Re: Optimizing ResouceOwner to speed up COPY - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Optimizing ResouceOwner to speed up COPY
Date
Msg-id 4fe30d40-da8b-4cf6-8394-05e189a9fb1a@iki.fi
Whole thread Raw
In response to Re: Optimizing ResouceOwner to speed up COPY  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: Optimizing ResouceOwner to speed up COPY
List pgsql-hackers
On 17/10/2025 06:13, Chao Li wrote:
> ```
> @@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc
>       idx = hash_resource_elem(value, kind) & mask;
>       for (;;)
>       {
> +        /* found an exact match - just increment the counter */
> +        if ((owner->hash[idx].kind == kind) &&
> +            (owner->hash[idx].item == value))
> +        {
> +            owner->hash[idx].count += count;
> +            return;
> +        }
> +
>           if (owner->hash[idx].kind == NULL)
>               break;                /* found a free slot */
>           idx = (idx + 1) & mask;
>       }
> ```
> 
> I think this is the “trade-off” you mention in your email: if a free slot found earlier, then still gets duplicated
entries.I have no concern to this “trade-off”, but please add a comment here, otherwise future readers may be confused,
andmight potentially consider that were a bug, without reading your original design email.
 

+1 on such a comment. Here or maybe on the ResourceElem struct itself.

>  typedef struct ResourceElem
>  {
>      Datum        item;
> +    uint32        count;                /* number of occurrences */
>      const ResourceOwnerDesc *kind;    /* NULL indicates a free hash table slot */
>  } ResourceElem;

Hmm, the 'count' is not used when the entry is stored in the array. 
Perhaps we should have a separate struct for array and hash elements 
now. Keeping the array small helps it to fit in CPU caches.

> /*
>  * Forget that an object is owned by a ResourceOwner
>  *
>  * Note: If same resource ID is associated with the ResourceOwner more than
>  * once, one instance is removed.
>  *
>  * Note: Forgetting a resource does not guarantee that there is room to
>  * remember a new resource.  One exception is when you forget the most
>  * recently remembered resource; that does make room for a new remember call.
>  * Some code callers rely on that exception.
>  */
> void
> ResourceOwnerForget(ResourceOwner owner, Datum value, const ResourceOwnerDesc *kind)

Does this break the exception noted in the above comment? I guess it 
still holds because we don't deduplicate entries in the array. That's 
very subtle, needs a comment at least.

typo: ocurrence -> occurrence

- Heikki



pgsql-hackers by date:

Previous
From: Nazir Bilal Yavuz
Date:
Subject: ci: Skip minfree file in the cores_backtrace.sh
Next
From: Michael Banck
Date:
Subject: Re: Executing pg_createsubscriber with a non-compatible control file