Thread: Patch: ResourceOwner optimization for tables with many partitions
Hello all,

The current implementation of ResourceOwner uses arrays to store resources such as TupleDescs, Snapshots, etc. When we want to release one of these resources, ResourceOwner finds it with a linear scan. Granted, resource arrays are usually small, so it's not a problem most of the time. But it turns out to be a bottleneck when we are working with tables that have a lot of partitions.

To reproduce this issue:
1. run `./gen.pl 10000 | psql my_database postgres`
2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database`
3. in a second terminal run `sudo perf top -u postgres`

Both gen.pl and q.sql are attached to this message.

You will see that postgres spends a lot of time in the ResourceOwnerForget* procedures:

32.80% postgres [.] list_nth
20.29% postgres [.] ResourceOwnerForgetRelationRef
12.87% postgres [.] find_all_inheritors
 7.90% postgres [.] get_tabstat_entry
 6.68% postgres [.] ResourceOwnerForgetTupleDesc
 1.17% postgres [.] hash_search_with_hash_value
 ... < 1% ...

I would like to suggest a patch (see attachment) which fixes this bottleneck. I also discovered that there is a lot of code duplication in ResourceOwner; the patch fixes this too. The idea is quite simple: I just replaced the arrays with something that could be considered hash tables, but with some specific optimizations.

After applying this patch we can see that the bottleneck is gone:

42.89% postgres [.] list_nth
18.30% postgres [.] find_all_inheritors
10.97% postgres [.] get_tabstat_entry
 1.82% postgres [.] hash_search_with_hash_value
 1.21% postgres [.] SearchCatCache
 ... < 1% ...

For tables with thousands of partitions we gain on average 32.5% more TPS.

As far as I can tell, the patch doesn't make anything worse. `make check` passes with asserts enabled and disabled. There is no performance degradation according to either the standard pgbench benchmark or the benchmark described above for tables with 10 and 100 partitions.

Best regards,
Aleksander
Attachment
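The bottleneck described above comes from the fact that each ResourceOwnerForget* routine searches its owner's array linearly. The following is only a simplified sketch of that pre-patch shape, with invented names and fields, not the actual resowner.c code (the real routines also report a leak warning when the resource is not found). Scanning from the end makes releases in reverse order of allocation cheap, but releasing in arbitrary order, as happens with thousands of partitions, costs O(N) per forget:

```c
#include "postgres.h"

/* Illustrative only: a flat array of resource handles stored as Datums. */
typedef struct OldStyleResourceArray
{
	Datum	   *items;			/* resource handles */
	int			nitems;			/* number of occupied slots */
} OldStyleResourceArray;

static void
OldStyleResourceArrayRemove(OldStyleResourceArray *arr, Datum value)
{
	int			i;

	/* reverse scan: the most recently remembered item is found first */
	for (i = arr->nitems - 1; i >= 0; i--)
	{
		if (arr->items[i] == value)
		{
			/* fill the hole with the last element and shrink the array */
			arr->items[i] = arr->items[arr->nitems - 1];
			arr->nitems--;
			return;
		}
	}
	/* the real code would elog(WARNING, ...) about a lost resource here */
}
```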
On Fri, Dec 4, 2015 at 7:15 AM, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
> Current implementation of ResourceOwner uses arrays to store resources
> like TupleDescs, Snapshots, etc. When we want to release one of these
> resources ResourceOwner finds it with linear scan. [...]

I do think it's interesting to try to speed up the ResourceOwner mechanism when there are many resources owned. We've had some forays in that direction in the past, and I think it's useful to continue that work. But I also think that this patch, while interesting, is not something we can seriously consider committing in its current form, because:

1. It lacks adequate comments and submission notes to explain the general theory of operation of the patch.

2. It seems to use uint8 * rather freely as a substitute for char * or void *, which doesn't seem like a great idea.

3. It doesn't follow PostgreSQL formatting conventions (uncuddled curly braces, space after if and before open paren, etc.).

4. It mixes together multiple ideas in a single patch, not only introducing a hashing concept but also striping a brand-new layer of abstraction across the resource-owner mechanism. I am not sure that layer of abstraction is a very good idea, but if it needs to be done, I think it should be a separate patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
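Point 3 refers to the project's brace and spacing conventions; a tiny hypothetical fragment in the expected style looks roughly like this (braces on their own lines, a space after `if` and before the opening parenthesis):

```c
/* Hypothetical fragment, only to illustrate the formatting conventions. */
if (nitems >= capacity)
{
	capacity *= 2;
	items = repalloc(items, capacity * sizeof(Datum));
}
```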
From: Aleksander Alekseev
Hello, Robert

Thanks for your review. I believe I fixed items 1, 2 and 3 (see attachment). I would also like to clarify item 4.

> 4. It mixes together multiple ideas in a single patch, not only
> introducing a hashing concept but also striping a brand-new layer of
> abstraction across the resource-owner mechanism. I am not sure that
> layer of abstraction is a very good idea, but if it needs to be done,
> I think it should be a separate patch.

Do I understand correctly that you are suggesting the following?

The current patch should be split in two parts. The first patch creates and uses ResourceArray with an array-based implementation (the abstraction layer). Then we apply a second patch which changes the ResourceArray implementation to a hashing-based one (the optimization).

Best regards,
Aleksander

On Tue, 8 Dec 2015 17:30:33 -0500
Robert Haas <robertmhaas@gmail.com> wrote:
> I do think it's interesting to try to speed up the ResourceOwner
> mechanism when there are many resources owned. [...]
Attachment
On Wed, Dec 9, 2015 at 8:30 AM, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
> Do I understand correctly that you are suggesting the following?
>
> The current patch should be split in two parts. The first patch creates
> and uses ResourceArray with an array-based implementation (the
> abstraction layer). Then we apply a second patch which changes the
> ResourceArray implementation to a hashing-based one (the optimization).

Well, sorta. To be honest, I think this patch is really ugly. If we were going to do this then, yes, I would want to split the patch into two parts along those lines. But actually I don't really want to do it this way at all. It's not that I don't want the performance benefits: I do. But the current code is really easy to read and extremely simple, and this changes it into something that is a heck of a lot harder to read and understand. I'm not sure exactly what to do about that, but it seems like a problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Well, sorta. To be honest, I think this patch is really ugly. If we
> were going to do this then, yes, I would want to split the patch into
> two parts along those lines. But actually I don't really want to do
> it this way at all. It's not that I don't want the performance
> benefits: I do. But the current code is really easy to read and
> extremely simple, and this changes it into something that is a heck of
> a lot harder to read and understand. I'm not sure exactly what to do
> about that, but it seems like a problem.

I haven't read the patch in any detail, but keep in mind that part of the API contract for ResourceOwners is that ResourceOwnerRememberFoo() cannot fail. Period. So any patch that makes those functions less than trivial is going to be really suspect from a reliability standpoint. It would be advisable for example that hash_any not suddenly become covered by the "must not fail" requirement.

BTW, I do not think you can get away with the requirement that all-zeroes isn't a valid resource representation. It might be okay today but it's hardly future-proof.

			regards, tom lane
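For readers unfamiliar with that contract: callers reserve space in the owner before acquiring a resource, so the later Remember call only fills a preallocated slot and cannot fail partway. Below is a rough sketch of that calling pattern with a hypothetical "Widget" resource standing in for buffers, catcache references, etc.; the Widget type and its functions are invented, only the shape of the calls is the point:

```c
#include "postgres.h"
#include "utils/resowner.h"

/* Hypothetical resource type and helpers, for illustration only. */
typedef int Widget;
extern Widget AcquireWidget(void);
extern void ResourceOwnerEnlargeWidgets(ResourceOwner owner);
extern void ResourceOwnerRememberWidget(ResourceOwner owner, Widget w);

/*
 * Reserve-then-remember: any allocation happens in the Enlarge step, before
 * the resource exists, so the Remember step is trivial, cannot fail, and the
 * widget cannot be leaked between acquisition and registration.
 */
static Widget
AcquireTrackedWidget(ResourceOwner owner)
{
	Widget		w;

	ResourceOwnerEnlargeWidgets(owner); /* may allocate, may elog(ERROR) */
	w = AcquireWidget();				/* actually grab the resource */
	ResourceOwnerRememberWidget(owner, w);	/* must not fail */
	return w;
}
```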
On Thu, Dec 10, 2015 at 2:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, sorta. To be honest, I think this patch is really ugly. [...]
>
> I haven't read the patch in any detail, but keep in mind that part of the
> API contract for ResourceOwners is that ResourceOwnerRememberFoo() cannot
> fail. Period. So any patch that makes those functions less than trivial
> is going to be really suspect from a reliability standpoint. It would be
> advisable for example that hash_any not suddenly become covered by the
> "must not fail" requirement.

Hmm. I hadn't thought about that.

I wonder if there's a simpler approach to this whole problem. What the patch aims to do is allow resources to be freed in arbitrary order without slowdown. But do we really need that? Resource owners are currently optimized for freeing resources in the order opposite to the order of allocation. I bet in this case the order in which they are freed isn't random, but is exactly in order of allocation. If so, we might be able to either (a) jigger things so that freeing in order of allocation is efficient or (b) jigger things so that the resources are released in reverse order of allocation as the resource owner code expects. That might be simpler and less invasive than this fix. Then again, it's somehow appealing for higher-level code not to have to care about this...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: Aleksander Alekseev
>> To be honest, I think this patch is really ugly. [...] I'm not sure
>> exactly what to do about that, but it seems like a problem.

I have an idea. There are actually two types of resources - int-like (buffers, files) and void*-like (RelationRef, TupleDesc, ...). What if I split ResourceArray into IntResourceArray and PointerResourceArray? It would definitely solve the ugliness problem --- no more memcpy's, char[] buffers, etc.

>> It would be advisable for example that hash_any not suddenly become
>> covered by the "must not fail" requirement.

Frankly, I can't think of any case when hash_any could or should fail. Maybe we should just add a "must not fail" constraint to hash_any's description? Alternatively, I could use some other hash implementation. That may be reasonable in this case, since the size of the data I would like to hash is small and known in advance.

>> BTW, I do not think you can get away with the requirement that
>> all-zeroes isn't a valid resource representation. It might be okay
>> today but it's hardly future-proof.

Agreed. I could store a value that should be considered "zero" in ResourceArray. It would be InvalidBuffer for buffers, -1 for files and NULL for all void*-like types. Does such a solution sound OK?

Best regards,
Aleksander
From: Aleksander Alekseev
> It would be InvalidBuffer for buffers, -1 for files and NULL for all
> void*-like types. Does such a solution sound OK?

On second thought, I believe there is no reason to store anything for void*-like types. I could just hardcode NULL in PointerResourceArray.

Best regards,
Aleksander
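For illustration, a minimal sketch of the per-array "invalid value" idea under discussion; the names and fields are invented, not the actual patch. The only point is that each array records at init time which value means "empty slot", so a resource type whose valid values include zero can still be stored:

```c
#include "postgres.h"

/* Hypothetical array that carries its own "empty slot" sentinel. */
typedef struct SketchResourceArray
{
	Datum	   *items;			/* resource handles, stored as Datums */
	Datum		invalidval;		/* value that marks an unused slot */
	int			capacity;		/* allocated slots */
	int			nitems;			/* occupied slots */
} SketchResourceArray;

static void
SketchResourceArrayInit(SketchResourceArray *arr, Datum invalidval)
{
	arr->items = NULL;
	arr->invalidval = invalidval;
	arr->capacity = 0;
	arr->nitems = 0;
}

/*
 * Usage, roughly (values taken from the message above):
 *   SketchResourceArrayInit(&bufferarr, Int32GetDatum(InvalidBuffer));
 *   SketchResourceArrayInit(&filearr,   Int32GetDatum(-1));
 *   SketchResourceArrayInit(&reldescarr, PointerGetDatum(NULL));
 */
```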
From: Aleksander Alekseev
Hello, Robert

Here is my fix for item 4.

Best regards,
Aleksander

On Thu, 10 Dec 2015 11:37:23 -0500
Robert Haas <robertmhaas@gmail.com> wrote:
> Well, sorta. To be honest, I think this patch is really ugly. [...]
> I'm not sure exactly what to do about that, but it seems like a problem.
Attachment
On Mon, Dec 14, 2015 at 6:47 AM, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
> Here is my fix for item 4.

I don't know, I'm still not very comfortable with this. And Tom didn't like dictating that hash_any() must be no-fail, though I'm not sure why. Let's wait to see what others think. I kind of hope there's a way of getting the benefits we want here without so much code churn.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> I don't know, I'm still not very comfortable with this. And Tom
> didn't like dictating that hash_any() must be no-fail, though I'm not
> sure why.

What I definitely didn't like was assuming at a distance that it would be no-fail. If we're to depend on that, the patch had better attach a comment saying so to the header comments of the function(s) it's assuming that about. Otherwise, somebody could hack up hashfunc.c in a way that breaks the assumption, without any clue that some code in a very-far-away module is critically reliant on it.

> Let's wait to see what others think.

A few observations:

* This bit is too cute by half, if not three-quarters:

+ uint32 itemsizelg:2;   /* sizeof one item log 2 */
+ uint32 capacity:30;    /* capacity of array */

Is there a good reason to assume that the only things we'll ever store in these arrays are of size no more than 8 bytes? Are we so desperate to save space that we cannot spare two separate words for itemsize and capacity? (ISTM it's a good bet that the extra code for accessing these bitfields occupies more space than would be saved, considering how few ResourceOwners typically exist at one time.) Let's just make it a couple of ints and be done. Actually, maybe nitems and capacity should be size_t, just in case.

* An alternative design would be to forget itemsizelg altogether and insist that everything stored in the resource arrays be a Datum, which could then be coerced to/from some form of integer or some form of pointer as appropriate. That would waste some space in the int case, but it would considerably simplify both the ResourceArray code and the APIs to it, which might be worth the price of assuming we'll never store anything bigger than 8 bytes. It also would make this look more like some existing APIs such as the on_exit callbacks.

* A lot of the code churn comes from the insistence on defining callbacks, which I'm dubious that we need. We could instead have a function that is "get any convenient one of the array elements" and revise the loops in ResourceOwnerReleaseInternal to be like

while ((item = getconvenientitem(resourcearray)))
{
    drop item in exactly the same way as before
}

I find that preferable to the proposed ResourceArrayRemoveAll

+ while (resarr->nitems > 0)
+ {
+     releasecb(resarr->itemsarr, isCommit);
+ }

which certainly looks like it's an infinite loop; it's assuming (again with no documentation) that the callback function will cause the array to get smaller somehow. With the existing coding, it's much more clear why we think the loops will terminate.

* The reason that ResourceOwnerReleaseInternal was not horribly inefficient was that its notion of "any convenient one" of the items to be deleted next was in fact the one that the corresponding Forget function would examine first, thus avoiding an O(N^2) cost to re-identify the item to be dropped. I think we should make an effort to be more explicit about that connection in any rewrite. In particular, it looks to me like when a hash array is in use, things will get slower not faster because we'll be adding a hash lookup step to each forget operation. Maybe we should consider adjusting the APIs so that that can be avoided. Or possibly we could have internal state in the ResourceArrays that says "we expect this item to be dropped in a moment, check that before going to the trouble of a hash lookup".

* Actually, I'm not convinced that the proposed reimplementation of ResourceArrayRemove isn't horribly slow much of the time. It sure looks like it could degrade to a linear search very easily.

* I still say that the assumption embodied as RESOURCE_ARRAY_ZERO_ELEMENT (ie that no valid entry is all-zero-bits) is pretty unacceptable. It might work for pointers, but I don't like it for resources represented by integer indexes.

			regards, tom lane
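A hedged sketch of the release-loop shape Tom describes, using buffer pins as the example. The GetAny helper and the array type are assumptions carried over from the earlier sketches; PrintBufferLeakWarning and ReleaseBuffer are existing bufmgr calls. The point is that the caller performs the drop, so the array visibly shrinks on every iteration and termination is obvious at the call site:

```c
#include "postgres.h"
#include "storage/bufmgr.h"

/* Invented helper: returns false once the array is empty. */
extern bool SketchResourceArrayGetAny(SketchResourceArray *arr, Datum *item);

static void
SketchReleaseAllBufferPins(SketchResourceArray *bufferarr, bool isCommit)
{
	Datum		item;

	while (SketchResourceArrayGetAny(bufferarr, &item))
	{
		Buffer		buf = DatumGetInt32(item);

		/* same per-item work as today: warn about leaks, then release */
		if (isCommit)
			PrintBufferLeakWarning(buf);
		ReleaseBuffer(buf);		/* unpins and forgets buf from bufferarr */
	}
}
```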
From: Aleksander Alekseev
I believe I fixed all flaws mentioned so far (see attachment).

I also ran a new benchmark to make sure that the new patch makes PostgreSQL faster and doesn't cause performance degradation in some cases. The "usual pgbench" row corresponds to `pgbench -j 8 -c 8 -T 30 pgbench` performed on a 4-core PC. The "N partitions" rows correspond to the benchmark described in the first message of this thread, performed on the same PC, where N is the argument given to the gen.pl script, i.e. the number of partitions in the generated partitioned table.

Here are the results (3 tests, TPS excluding connections establishing):

 Test            | Before   | After    | Delta avg
-----------------|----------|----------|--------------
                 | 295.7    | 295.0    |
 usual pgbench   | 303.1    | 299.6    | ~ 0%
                 | 297.7    | 302.7    |
-----------------|----------|----------|--------------
                 | 28022.3  | 27956.1  |
 10 partitions   | 27550.1  | 28916.9  | ~ 0%
                 | 28617.0  | 28022.9  |
-----------------|----------|----------|--------------
                 | 3021.4   | 3184.0   |
 100 partitions  | 2949.1   | 3120.1   | 3% more TPS
                 | 2870.6   | 2825.2   |
-----------------|----------|----------|--------------
                 | 106.7    | 158.6    |
 1000 partitions | 105.2    | 168.4    | 53% more TPS
                 | 105.9    | 162.0    |

On Fri, 18 Dec 2015 13:39:01 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I definitely didn't like was assuming at a distance that it would
> be no-fail. [...]
Attachment
From: Aleksander Alekseev
Oops, wrong patches - here are correct ones.
Attachment
Hello.

I have applied this patch and can confirm a ~10% speedup from it in the presence of a large number of inherited tables. The test case was as suggested by Aleksander: create 1000 inherited tables, no constraints, insert a row in each one, and issue single-row queries over this table.

Xeon-based server, 12C/24T, 50 connections, 30-min average:

TPS, no patch: 393 tps
TPS, with patch: 441 tps

The same setup but with a single table with 1000 rows gives performance of about 188,000 tps, so the overall speed of requests over an inherited table in this scenario is quite pathological (probably this is expected, because the database just executes 1000 selects, one against each inherited table). I've also tried to set range constraints for all inherited tables, so that only one table was affected by the query, but planning time increased a lot and total tps was again about 500 tps.

Also attaching two flame graphs measured during the tests. It's clearly visible that PortalDrop takes 4x less time after the patch.

Stas.

> On 24 Dec 2015, at 12:24, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
>
> Oops, wrong patches - here are correct ones. <resource-owner-optimization-v4-step1.patch><resource-owner-optimization-v4-step2.patch>

---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes:
> Oops, wrong patches - here are correct ones.

This is certainly not doing what I had in mind for communication between ResourceOwnerGetAny and ResourceOwnerRelease, because the latter pays zero attention to "lastidx", but instead does a blind hash search regardless.

In general, I'm suspicious of the impact of this patch on "normal" cases with not-very-large resource arrays. It might be hard to measure that because in such cases resowner.c is not a bottleneck; but that doesn't mean I want to give up performance in cases that perform well today. You could probably set up a test harness that exercises ResourceOwnerAdd/Release/etc in isolation and get good timings for them that way, to confirm or disprove whether there's a performance loss for small arrays.

I still suspect it would be advantageous to have two different operating modes depending on the size of an individual resource array, and not bother with hashing until you got to more than a couple dozen entries. Given that you're rehashing anyway when you enlarge the array, this wouldn't cost anything except a few more lines of code, ISTM --- and you already have a net code savings because of the refactoring.

Also, there are defined ways to convert between Datum format and other formats (DatumGetPointer() etc). This code isn't using 'em.

			regards, tom lane
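Concretely, the two-mode idea might look something like the sketch below. The struct is the one assumed in the earlier sketches, and the cutoff value and helper name are illustrative only, not the committed code: small arrays keep today's linear scan, and only large ones pay for an open-addressing hash probe.

```c
#include "postgres.h"
#include "access/hash.h"		/* hash_any(), as of the 2015-era tree */

#define SKETCH_HASH_CUTOFF	64	/* illustrative cutoff, not a tuned value */

/* Returns the index of value in arr, or -1 if it is not present. */
static int
SketchResourceArrayLookup(SketchResourceArray *arr, Datum value)
{
	if (arr->capacity <= SKETCH_HASH_CUTOFF)
	{
		int			i;

		/* small array: reverse linear scan, cheap for LIFO releases */
		for (i = arr->nitems - 1; i >= 0; i--)
		{
			if (arr->items[i] == value)
				return i;
		}
	}
	else
	{
		/*
		 * Large array treated as an open-addressing hash table: capacity is
		 * assumed to be a power of two, unused slots hold invalidval, and
		 * the table is kept under-full so an empty slot always ends the
		 * probe loop.
		 */
		uint32		mask = (uint32) arr->capacity - 1;
		uint32		idx;

		idx = DatumGetUInt32(hash_any((const unsigned char *) &value,
									  sizeof(Datum))) & mask;
		while (arr->items[idx] != arr->invalidval)
		{
			if (arr->items[idx] == value)
				return (int) idx;
			idx = (idx + 1) & mask;
		}
	}
	return -1;					/* not found */
}
```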
From: Aleksander Alekseev
Hello, Tom

> Also, there are defined ways to convert between Datum format and
> other formats (DatumGetPointer() etc). This code isn't using 'em.

Fixed. But I was not sure how to deal with the File and Buffer types, since they are ints (see fd.h and buf.h) and there is no DatumGetInt macro, only DatumGetInt32/Int64. I don't know if there is a good reason for this. So for now I just added these definitions right into resowner.c:

```
/* Convert File to Datum */
#define FileGetDatum(file) ((Datum) (file))

/* Convert Datum to File */
#define DatumGetFile(datum) ((File) (datum))

/* Convert Buffer to Datum */
#define BufferGetDatum(buffer) ((Datum) (buffer))

/* Convert Datum to Buffer */
#define DatumGetBuffer(datum) ((Buffer) (datum))
```

... to make all the code look similar and all intentions clear. I have a feeling you could suggest a better solution :)

> This is certainly not doing what I had in mind for communication
> between ResourceOwnerGetAny and ResourceOwnerRelease, because the
> latter pays zero attention to "lastidx", but instead does a blind
> hash search regardless.
>
> In general, I'm suspicious of the impact of this patch on "normal"
> cases with not-very-large resource arrays. It might be hard to
> measure that because in such cases resowner.c is not a bottleneck;
> but that doesn't mean I want to give up performance in cases that
> perform well today. You could probably set up a test harness that
> exercises ResourceOwnerAdd/Release/etc in isolation and get good
> timings for them that way, to confirm or disprove whether there's
> a performance loss for small arrays.
>
> I still suspect it would be advantageous to have two different
> operating modes depending on the size of an individual resource
> array, and not bother with hashing until you got to more than a
> couple dozen entries. Given that you're rehashing anyway when you
> enlarge the array, this wouldn't cost anything except a few more
> lines of code, ISTM --- and you already have a net code savings
> because of the refactoring.

To be honest, I don't have much faith in such micro-benchmarks. They don't consider how the code will be executed under real load, so any result from such a benchmark wouldn't give us a clear answer as to which solution is preferable. Besides, different users can execute the same code in different ways.

I compared two implementations - "always use hashing" (step2a.patch) and "use hashing only for large arrays" (step2b.patch). Both patches give the same performance according to the benchmark I described in the first message of this thread.

Since neither of these patches is better with respect to the problem I'm trying to solve, I suggest using step2b.patch. It seems to be a good idea not to use hashing for searching in small arrays. Besides, in this case the behaviour of PostgreSQL will change less after applying the patch. Users will probably appreciate this.

Best regards,
Aleksander
Attachment
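As a hypothetical usage example (the Sketch* helpers are assumptions carried over from the earlier sketches, not necessarily the names used in the patch), the macros above let int-like resources such as Files flow through the same Datum-based array code as pointer-like ones:

```c
#include "postgres.h"
#include "storage/fd.h"			/* File */

/* Invented array helpers, assuming the SketchResourceArray from above. */
extern void SketchResourceArrayAdd(SketchResourceArray *arr, Datum value);
extern bool SketchResourceArrayRemove(SketchResourceArray *arr, Datum value);

static void
SketchRememberFile(SketchResourceArray *filearr, File file)
{
	SketchResourceArrayAdd(filearr, FileGetDatum(file));
}

static void
SketchForgetFile(SketchResourceArray *filearr, File file)
{
	if (!SketchResourceArrayRemove(filearr, FileGetDatum(file)))
		elog(WARNING, "file %d is not owned by this resource owner", file);
}
```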
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes: > I compared two implementations - "always use hashing" (step2a.path) and > "use hashing only for large arrays" (step2b.path). Both patches give > the same performance according to benchmark I described in a first > message of this thread. Um, that's not too surprising, because they're exactly the same patch? regards, tom lane
From: Aleksander Alekseev
> Um, that's not too surprising, because they're exactly the same patch?

Wrong diff. Here is the correct one. Everything else is right, I just re-checked :)

step2a:
number of transactions actually processed: 16325
latency average: 49.008 ms
latency stddev: 8.780 ms
tps = 163.182594 (including connections establishing)
tps = 163.189818 (excluding connections establishing)

step2b-fixed:
number of transactions actually processed: 16374
latency average: 48.867 ms
latency stddev: 8.223 ms
tps = 163.658269 (including connections establishing)
tps = 163.666318 (excluding connections establishing)
Attachment
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes:
>> Um, that's not too surprising, because they're exactly the same patch?

> Wrong diff. Here is the correct one.

This still had quite a few bugs, but I fixed them (hope I caught everything) and pushed it.

I did some performance testing of the ResourceArray code in isolation. It appears that you need 100 or so elements before the hash code path is competitive with the array path, so I increased the cutover point quite a bit from what you had.

			regards, tom lane
From: Aleksander Alekseev
Hello, Tom.

I'm a bit concerned about the assumption that sizeof(int) never exceeds 4 bytes. While this could be true today for most C compilers, the standard [1][2] doesn't guarantee it. Perhaps we should add something like:

StaticAssertStmt(sizeof(int) <= sizeof(int32),
                 "int size exceeds int32 size");

It costs nothing but could save some unlucky user a lot of time (not to mention data loss).

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
[2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
On Wed, Jan 27, 2016 at 3:57 AM, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
> I'm a bit concerned about the assumption that sizeof(int) never exceeds
> 4 bytes. While this could be true today for most C compilers, the
> standard [1][2] doesn't guarantee it. Perhaps we should add something like:
>
> StaticAssertStmt(sizeof(int) <= sizeof(int32),
>                  "int size exceeds int32 size");

I suspect that if this ever failed to be true, resowner.c would not be the only thing having problems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company