Re: Patch: ResourceOwner optimization for tables with many partitions - Mailing list pgsql-hackers
From | Aleksander Alekseev |
---|---|
Subject | Re: Patch: ResourceOwner optimization for tables with many partitions |
Date | |
Msg-id | 20151224111925.56e267e9@fujitsu |
In response to | Re: Patch: ResourceOwner optimization for tables with many partitions (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Patch: ResourceOwner optimization for tables with many partitions |
List | pgsql-hackers |
I believe I have fixed all the flaws mentioned so far (see attachment). I also ran a new benchmark to make sure that the new patch makes PostgreSQL faster and doesn't cause performance degradation in other cases. The "usual pgbench" row corresponds to `pgbench -j 8 -c 8 -T 30 pgbench` run on a 4-core PC. The "N partitions" rows correspond to the benchmark described in the first message of this thread, run on the same PC; N is the argument given to the gen.pl script, i.e. the number of partitions in the generated partitioned table. Here are the results (3 runs each, TPS excluding connections establishing):

Test             | Before    | After     | Delta avg
-----------------|-----------|-----------|-------------
                 | 295.7     | 295.0     |
usual pgbench    | 303.1     | 299.6     | ~ 0%
                 | 297.7     | 302.7     |
-----------------|-----------|-----------|-------------
                 | 28022.3   | 27956.1   |
10 partitions    | 27550.1   | 28916.9   | ~ 0%
                 | 28617.0   | 28022.9   |
-----------------|-----------|-----------|-------------
                 | 3021.4    | 3184.0    |
100 partitions   | 2949.1    | 3120.1    | 3% more TPS
                 | 2870.6    | 2825.2    |
-----------------|-----------|-----------|-------------
                 | 106.7     | 158.6     |
1000 partitions  | 105.2     | 168.4     | 53% more TPS
                 | 105.9     | 162.0     |

On Fri, 18 Dec 2015 13:39:01 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
> > I don't know, I'm still not very comfortable with this. And Tom
> > didn't like dictating that hash_any() must be no-fail, though I'm
> > not sure why.
>
> What I definitely didn't like was assuming at a distance that it would
> be no-fail. If we're to depend on that, the patch had better attach
> a comment saying so to the header comments of the function(s) it's
> assuming that about. Otherwise, somebody could hack up hashfunc.c
> in a way that breaks the assumption, without any clue that some code
> in a very-far-away module is critically reliant on it.
>
> > Let's wait to see what others think.
>
> A few observations:
>
> * This bit is too cute by half, if not three-quarters:
>
> +        uint32        itemsizelg:2;    /* sizeof one item log 2 */
> +        uint32        capacity:30;     /* capacity of array */
>
> Is there a good reason to assume that the only things we'll ever store
> in these arrays are of size no more than 8 bytes? Are we so desperate
> to save space that we cannot spare two separate words for itemsize and
> capacity? (ISTM it's a good bet that the extra code for accessing
> these bitfields occupies more space than would be saved, considering
> how few ResourceOwners typically exist at one time.) Let's just make
> it a couple of ints and be done. Actually, maybe nitems and capacity
> should be size_t, just in case.
>
> * An alternative design would be to forget itemsizelg altogether and
> insist that everything stored in the resource arrays be a Datum,
> which could then be coerced to/from some form of integer or some form
> of pointer as appropriate. That would waste some space in the int
> case, but it would considerably simplify both the ResourceArray code
> and the APIs to it, which might be worth the price of assuming we'll
> never store anything bigger than 8 bytes. It also would make this
> look more like some existing APIs such as the on_exit callbacks.
>
> * A lot of the code churn comes from the insistence on defining
> callbacks, which I'm dubious that we need.
> We could instead have a
> function that is "get any convenient one of the array elements" and
> revise the loops in ResourceOwnerReleaseInternal to be like
>
>     while ((item = getconvenientitem(resourcearray)))
>     {
>         drop item in exactly the same way as before
>     }
>
> I find that preferable to the proposed ResourceArrayRemoveAll
>
> +    while (resarr->nitems > 0)
> +    {
> +        releasecb(resarr->itemsarr, isCommit);
> +    }
>
> which certainly looks like it's an infinite loop; it's assuming (again
> with no documentation) that the callback function will cause the array
> to get smaller somehow. With the existing coding, it's much more
> clear why we think the loops will terminate.
>
> * The reason that ResourceOwnerReleaseInternal was not horribly
> inefficient was that its notion of "any convenient one" of the items
> to be deleted next was in fact the one that the corresponding Forget
> function would examine first, thus avoiding an O(N^2) cost to
> re-identify the item to be dropped. I think we should make an effort
> to be more explicit about that connection in any rewrite. In
> particular, it looks to me like when a hash array is in use, things
> will get slower not faster because we'll be adding a hash lookup step
> to each forget operation. Maybe we should consider adjusting the
> APIs so that that can be avoided. Or possibly we could have internal
> state in the ResourceArrays that says "we expect this item to be
> dropped in a moment, check that before going to the trouble of a hash
> lookup".
>
> * Actually, I'm not convinced that the proposed reimplementation of
> ResourceArrayRemove isn't horribly slow much of the time. It sure
> looks like it could degrade to a linear search very easily.
>
> * I still say that the assumption embodied as
> RESOURCE_ARRAY_ZERO_ELEMENT (ie that no valid entry is all-zero-bits)
> is pretty unacceptable. It might work for pointers, but I don't like
> it for resources represented by integer indexes.
>
>             regards, tom lane
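To make Tom's first three bullets concrete, here is a minimal, self-contained sketch of the shape he describes: plain size_t counters instead of the itemsizelg:2/capacity:30 bitfields, a fixed Datum element type in place of a per-array item size, and a release loop driven by a "get any convenient element" call rather than a callback. Every name below is invented for the illustration; none of this is code from the patch or from PostgreSQL.

```c
/* Illustrative sketch only -- names are made up, not taken from the patch. */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t Datum;            /* stand-in for PostgreSQL's Datum */

typedef struct ResourceArray
{
    Datum      *itemsarr;           /* all elements are Datum-sized */
    size_t      capacity;           /* allocated slots */
    size_t      nitems;             /* occupied slots */
} ResourceArray;

static void
ResourceArrayAdd(ResourceArray *resarr, Datum item)
{
    if (resarr->nitems >= resarr->capacity)
    {
        size_t      newcap = resarr->capacity ? resarr->capacity * 2 : 16;

        /* error handling omitted; real code would repalloc in a memory context */
        resarr->itemsarr = realloc(resarr->itemsarr, newcap * sizeof(Datum));
        resarr->capacity = newcap;
    }
    resarr->itemsarr[resarr->nitems++] = item;
}

/* Hand back any convenient element; false when the array is empty. */
static bool
ResourceArrayGetAny(ResourceArray *resarr, Datum *item)
{
    if (resarr->nitems == 0)
        return false;
    *item = resarr->itemsarr[resarr->nitems - 1];   /* last one is cheapest */
    return true;
}

/* Remove one occurrence of item; returns false if it was not present. */
static bool
ResourceArrayRemove(ResourceArray *resarr, Datum item)
{
    /* scan from the end, so the element GetAny just handed out is found first */
    for (size_t i = resarr->nitems; i > 0; i--)
    {
        if (resarr->itemsarr[i - 1] == item)
        {
            /* move the last element into the hole to keep the array dense */
            resarr->itemsarr[i - 1] = resarr->itemsarr[--resarr->nitems];
            return true;
        }
    }
    return false;
}

/*
 * Release everything.  Unlike a callback-driven ResourceArrayRemoveAll,
 * the caller removes each item itself, so it is obvious why the loop ends.
 */
static void
ReleaseAll(ResourceArray *resarr, void (*release_one)(Datum item))
{
    Datum       item;

    while (ResourceArrayGetAny(resarr, &item))
    {
        release_one(item);                  /* same per-item work as before */
        ResourceArrayRemove(resarr, item);  /* array visibly shrinks here */
    }
}
```

Scanning from the end in ResourceArrayRemove keeps the common release-in-reverse-order case cheap, which is the same property Tom notes the existing ResourceOwnerReleaseInternal loops depend on.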
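The bullet about the extra hash lookup per forget operation suggests keeping internal state for the entry expected to be dropped next. A hedged sketch of one way that could look, checking the most recently added slot before consulting a hash index: the struct layout, HashLookup, and HashDelete here are assumptions made up for this example, not the patch's API.

```c
/* Sketch of "check the expected entry before paying for a hash lookup". */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;

typedef struct ResourceArray
{
    Datum      *itemsarr;       /* dense array of items, in insertion order */
    size_t      nitems;
    void       *hashindex;      /* item -> slot index; details omitted here */
} ResourceArray;

/* hypothetical helpers: slot of item via the hash index (or -1), and delete */
extern ptrdiff_t HashLookup(void *hashindex, Datum item);
extern void      HashDelete(void *hashindex, Datum item);

static bool
ResourceArrayRemove(ResourceArray *resarr, Datum item)
{
    ptrdiff_t   slot;

    /*
     * Fast path: resources are typically forgotten in reverse order of
     * registration, so check the most recently added slot before doing
     * a hash lookup at all.
     */
    if (resarr->nitems > 0 &&
        resarr->itemsarr[resarr->nitems - 1] == item)
        slot = (ptrdiff_t) (resarr->nitems - 1);
    else
        slot = HashLookup(resarr->hashindex, item);     /* slow path */

    if (slot < 0)
        return false;           /* not registered with this owner */

    HashDelete(resarr->hashindex, item);
    /*
     * Close the gap by moving the last element into the vacated slot.
     * (A real implementation would also update the moved element's slot
     * in the hash index; omitted in this sketch.)
     */
    resarr->itemsarr[slot] = resarr->itemsarr[--resarr->nitems];
    return true;
}
```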
Attachment