Re: Optimizing ResouceOwner to speed up COPY - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Optimizing ResouceOwner to speed up COPY
Date
Msg-id 1534176.1760638367@sss.pgh.pa.us
In response to Optimizing ResouceOwner to speed up COPY  (Tomas Vondra <tomas@vondra.me>)
Responses Re: Optimizing ResouceOwner to speed up COPY
List pgsql-hackers
Tomas Vondra <tomas@vondra.me> writes:
> The reason is pretty simple - ResourceOwner tracks the resources in a
> very simple hash table, with O(n^2) behavior with duplicates. This
> happens with COPY, because COPY creates an array of 1000 tuple slots,
> and each slot references the same tuple descriptor. And the descriptor
> is added to ResourceOwner for each slot.
> ...
> There's an easy way to improve this by allowing a single hash entry to
> represent multiple references to the same resource. The attached patch
> adds a "count" to the ResourceElem, tracking how many times that
> resource was added. So if you add 1000 tuple slots, the descriptor will
> have just one ResourceElem entry with count=1000.
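
The counting idea quoted above can be sketched roughly as follows. This is
an illustration only, not the attached patch: CountedElem, CountedOwner,
counted_remember and counted_forget are hypothetical names, and a small
linear array stands in for the real hash table to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ELEMS 64            /* arbitrary capacity for this sketch */

/* Hypothetical element: one entry per distinct resource, plus a count. */
typedef struct CountedElem
{
    uintptr_t   item;           /* resource handle, e.g. a descriptor pointer */
    uint32_t    count;          /* how many times it has been remembered */
} CountedElem;

typedef struct CountedOwner
{
    CountedElem elems[MAX_ELEMS];
    size_t      nelems;         /* number of distinct resources tracked */
} CountedOwner;

/* Remember a resource; a duplicate bumps the count instead of adding an entry. */
static void
counted_remember(CountedOwner *owner, uintptr_t item)
{
    for (size_t i = 0; i < owner->nelems; i++)
    {
        if (owner->elems[i].item == item)
        {
            owner->elems[i].count++;
            return;
        }
    }
    assert(owner->nelems < MAX_ELEMS);
    owner->elems[owner->nelems].item = item;
    owner->elems[owner->nelems].count = 1;
    owner->nelems++;
}

/* Forget one reference; returns 1 if the resource was found, 0 otherwise. */
static int
counted_forget(CountedOwner *owner, uintptr_t item)
{
    for (size_t i = 0; i < owner->nelems; i++)
    {
        if (owner->elems[i].item == item)
        {
            /* Drop the entry only when the last reference goes away. */
            if (--owner->elems[i].count == 0)
                owner->elems[i] = owner->elems[--owner->nelems];
            return 1;
        }
    }
    return 0;
}
```

With this shape, remembering the same descriptor 1000 times leaves a single
entry with count = 1000, so a later lookup no longer has to walk past 999
duplicates.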

Hmm.  I don't love the 50% increase in sizeof(ResourceElem) ... maybe
that's negligible, or maybe it isn't.  Can you find evidence of this
change being helpful for anything except this specific scenario in
COPY?  Because we could probably find some way to avoid registering
all the doppelganger slots, if that's the only culprit.
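
To make the size concern concrete, here is a rough sketch of where a 50%
figure would come from. It assumes the element currently holds a
pointer-sized Datum plus a pointer to its kind descriptor, and that the
patch adds a 32-bit counter; the struct names and the stand-in Datum
typedef below are illustrative, not copied from the tree or from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;        /* stand-in for a pointer-sized Datum (assumed) */
struct ResourceOwnerDesc;       /* opaque here; only a pointer to it is stored */

/* Assumed layout without the counter: two pointer-sized fields. */
typedef struct ElemWithoutCount
{
    Datum       item;
    const struct ResourceOwnerDesc *kind;
} ElemWithoutCount;

/* Assumed layout with a 32-bit counter added. */
typedef struct ElemWithCount
{
    Datum       item;
    const struct ResourceOwnerDesc *kind;
    uint32_t    count;          /* alignment padding may follow this field */
} ElemWithCount;

/* Percentage by which the element grows when the counter is added. */
static unsigned
elem_growth_percent(void)
{
    size_t      before = sizeof(ElemWithoutCount);
    size_t      after = sizeof(ElemWithCount);

    assert(after >= before);
    return (unsigned) ((after - before) * 100 / before);
}
```

On a typical 64-bit build the two structs come out at 16 and 24 bytes: the
4-byte counter is padded up to the 8-byte alignment of its neighbors, so
the element grows by half even though the counter itself is small.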

            regards, tom lane


