Re: Optimizing ResouceOwner to speed up COPY - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Optimizing ResouceOwner to speed up COPY |
Date | |
Msg-id | 73dffc49-942a-4502-9963-1b7838958959@iki.fi |
In response to | Re: Optimizing ResouceOwner to speed up COPY (Tomas Vondra <tomas@vondra.me>) |
Responses |
Re: Optimizing ResouceOwner to speed up COPY
|
List | pgsql-hackers |
On 18/10/2025 01:49, Tomas Vondra wrote:
> On 10/17/25 12:32, Tomas Vondra wrote:
>>
>>
>> On 10/17/25 10:31, Heikki Linnakangas wrote:
>>>>  typedef struct ResourceElem
>>>>  {
>>>>      Datum       item;
>>>> +    uint32      count;      /* number of occurrences */
>>>>      const ResourceOwnerDesc *kind;  /* NULL indicates a free hash
>>>> table slot */
>>>>  } ResourceElem;
>>>
>>> Hmm, the 'count' is not used when the entry is stored in the array.
>>> Perhaps we should have a separate struct for array and hash elements
>>> now. Keeping the array small helps it to fit in CPU caches.
>>
>> Agreed. I had the same idea yesterday, but I haven't done it yet.
>
> The attached v2 does that - it adds a separate ResourceHashElem for the
> hash table, and it works. But I'm not sure I like it very much, because
> there are two places that relied on both the array and hash table using
> the same struct to "walk" it the same way.
>
> For ResourceOwnerSort() it's not too bad, but ResourceOwnerReleaseAll()
> now duplicates most of the code. It's not terrible, but also not pretty.
> I can't think of a better way, though.

Looks fine to me. The code duplication is not too bad IMO.

Here's a rebased version of the micro-benchmark I used when I was
working on the ResourceOwner refactoring
(https://www.postgresql.org/message-id/d746cead-a1ef-7efe-fb47-933311e876a3%40iki.fi).

I ran it again on my laptop. It's a different laptop from the one I used
back then, so the results are not comparable with the results from that
old thread.
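To make the struct split discussed in the quoted text concrete, here is a minimal standalone sketch. It is not taken from the v2 patch: the `Datum`, `uint32` and `ResourceOwnerDesc` definitions are stand-ins so that it compiles outside PostgreSQL, the field layout of `ResourceHashElem` is assumed from the quoted diff, and `toy_hash_remember()` and `TOY_HASH_SIZE` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL types, so the sketch compiles on its own. */
typedef uintptr_t Datum;
typedef uint32_t uint32;
typedef struct ResourceOwnerDesc
{
	const char *name;
} ResourceOwnerDesc;

/* Array element: no count field, so the small array stays compact. */
typedef struct ResourceElem
{
	Datum		item;
	const ResourceOwnerDesc *kind;
} ResourceElem;

/* Hash element: carries the occurrence count (layout is an assumption). */
typedef struct ResourceHashElem
{
	Datum		item;
	uint32		count;			/* number of occurrences */
	const ResourceOwnerDesc *kind;	/* NULL indicates a free slot */
} ResourceHashElem;

#define TOY_HASH_SIZE 16		/* hypothetical; must be a power of two */

/*
 * Toy "remember" with linear probing: a duplicate (item, kind) pair bumps
 * the count instead of taking a new slot.  Returns the count after the
 * call, or 0 if the table is full.  The table must start zero-initialized.
 */
static uint32
toy_hash_remember(ResourceHashElem *table, Datum item,
				  const ResourceOwnerDesc *kind)
{
	uint32		start;

	assert(table != NULL && kind != NULL);
	start = (uint32) (item * 2654435761u) & (TOY_HASH_SIZE - 1);

	for (uint32 i = 0; i < TOY_HASH_SIZE; i++)
	{
		ResourceHashElem *e = &table[(start + i) & (TOY_HASH_SIZE - 1)];

		if (e->kind == NULL)
		{
			/* free slot: first occurrence of this pair */
			e->item = item;
			e->kind = kind;
			e->count = 1;
			return 1;
		}
		if (e->kind == kind && e->item == item)
			return ++e->count;	/* duplicate: only the count changes */
	}
	return 0;
}
```

On a typical 64-bit build the array element in this sketch is smaller than the hash element, because the count and its padding are only paid for in the hash table. That is the cache argument made in the quoted review comment.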
Unpatched (commit 18d26140934):

postgres=# \i contrib/resownerbench/snaptest.sql
 numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
       0 |        1 |         11.6 |         11.1
       0 |        5 |         12.1 |         13.1
       0 |       10 |         12.3 |         13.5
       0 |       60 |         14.6 |         19.4
       0 |       70 |         16.0 |         18.1
       0 |      100 |         16.7 |         18.0
       0 |     1000 |         18.1 |         20.7
       0 |    10000 |         21.9 |         29.5
       9 |       10 |         11.0 |         11.1
       9 |      100 |         14.9 |         20.0
       9 |     1000 |         16.1 |         24.4
       9 |    10000 |         21.9 |         25.7
      65 |       70 |         11.7 |         12.5
      65 |      100 |         13.9 |         14.8
      65 |     1000 |         16.7 |         17.8
      65 |    10000 |         22.5 |         27.8
(16 rows)

v2-0001-Deduplicate-entries-in-ResourceOwner.patch:

postgres=# \i contrib/resownerbench/snaptest.sql
 numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
       0 |        1 |         10.8 |         10.6
       0 |        5 |         11.5 |         12.3
       0 |       10 |         12.1 |         13.0
       0 |       60 |         13.9 |         19.4
       0 |       70 |         15.9 |         18.7
       0 |      100 |         16.0 |         18.5
       0 |     1000 |         19.2 |         21.6
       0 |    10000 |         22.4 |         29.0
       9 |       10 |         11.2 |         11.3
       9 |      100 |         14.4 |         19.9
       9 |     1000 |         16.4 |         23.8
       9 |    10000 |         22.4 |         25.7
      65 |       70 |         11.4 |         12.1
      65 |      100 |         14.8 |         17.0
      65 |     1000 |         16.9 |         18.1
      65 |    10000 |         22.5 |         28.5
(16 rows)

v20251016-0001-Deduplicate-entries-in-ResourceOwner.patch:

postgres=# \i contrib/resownerbench/snaptest.sql
 numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
       0 |        1 |         11.3 |         11.1
       0 |        5 |         12.3 |         13.0
       0 |       10 |         13.0 |         14.1
       0 |       60 |         14.7 |         20.5
       0 |       70 |         16.3 |         19.0
       0 |      100 |         16.5 |         18.4
       0 |     1000 |         19.0 |         22.4
       0 |    10000 |         23.2 |         29.6
       9 |       10 |         11.2 |         11.1
       9 |      100 |         14.8 |         20.5
       9 |     1000 |         16.8 |         24.3
       9 |    10000 |         23.3 |         26.5
      65 |       70 |         12.4 |         13.0
      65 |      100 |         15.2 |         16.6
      65 |     1000 |         16.9 |         18.4
      65 |    10000 |         23.4 |         29.3
(16 rows)

These are just a single run on my laptop, the error bars on individual
numbers are significant. But it seems to me that V2 is maybe a little
faster when the entries fit in the array.

- Heikki
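For readers who want to compare the tables numerically, the following is a rough sketch of one way to do it. The helper `mean_relative_change()` is hypothetical and not part of resownerbench. The four sample rows are the lifo_time_ns values for (numkeep, numsnaps) = (0,1), (0,5), (0,10) and (9,10) from the unpatched and v2 tables above; treating those rows as the ones where the entries "fit in the array" is an assumption, since the message does not state the array size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mean relative change of "patched" versus "base"; a negative value means
 * the patched timings are lower on average.  Hypothetical helper, not
 * part of resownerbench.
 */
static double
mean_relative_change(const double *base, const double *patched, size_t n)
{
	double		sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
	{
		assert(base[i] > 0.0);
		sum += (patched[i] - base[i]) / base[i];
	}
	return sum / (double) n;
}

/*
 * lifo_time_ns for (numkeep, numsnaps) = (0,1), (0,5), (0,10), (9,10).
 * The choice of rows is an assumption about which cases stay in the array.
 */
static const double unpatched_lifo_small[] = {11.6, 12.1, 12.3, 11.0};
static const double v2_lifo_small[] = {10.8, 11.5, 12.1, 11.2};
```

Calling `mean_relative_change(unpatched_lifo_small, v2_lifo_small, 4)` gives a single summary figure for those rows. Because each table is one run, the result carries the same large error bars as the individual numbers.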
Attachment