Re: Patch: ResourceOwner optimization for tables with many partitions - Mailing list pgsql-hackers
From | Aleksander Alekseev |
---|---|
Subject | Re: Patch: ResourceOwner optimization for tables with many partitions |
Date | |
Msg-id | 20151209163052.48d2d43a@fujitsu Whole thread Raw |
In response to | Re: Patch: ResourceOwner optimization for tables with many partitions (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Patch: ResourceOwner optimization for tables with many partitions
|
List | pgsql-hackers |
Hello, Robert Thanks for your review. I believe I fixed items 1, 2 and 3 (see attachment). Also I would like to clarify item 4. > 4. It mixes together multiple ideas in a single patch, not only > introducing a hashing concept but also striping a brand-new layer of > abstraction across the resource-owner mechanism. I am not sure that > layer of abstraction is a very good idea, but if it needs to be done, > I think it should be a separate patch. Do I right understand that you suggest following? Current patch should be split in two parts. In first patch we create and use ResourceArray with array-based implementation (abstraction layer). Then we apply second patch which change ResourceArray implementation to hashing based (optimization). Best regards, Aleksander On Tue, 8 Dec 2015 17:30:33 -0500 Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Dec 4, 2015 at 7:15 AM, Aleksander Alekseev > <a.alekseev@postgrespro.ru> wrote: > > Current implementation of ResourceOwner uses arrays to store > > resources like TupleDescs, Snapshots, etc. When we want to release > > one of these resources ResourceOwner finds it with linear scan. > > Granted, resource array are usually small so its not a problem most > > of the time. But it appears to be a bottleneck when we are working > > with tables which have a lot of partitions. > > > > To reproduce this issue: > > 1. run `./gen.pl 10000 | psql my_database postgres` > > 2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database` > > 3. in second terminal run `sudo perf top -u postgres` > > > > Both gen.pl and q.sql are attached to this message. > > > > You will see that postgres spends a lot of time in > > ResourceOwnerForget* procedures: > > > > 32.80% postgres [.] list_nth > > 20.29% postgres [.] ResourceOwnerForgetRelationRef > > 12.87% postgres [.] find_all_inheritors > > 7.90% postgres [.] get_tabstat_entry > > 6.68% postgres [.] ResourceOwnerForgetTupleDesc > > 1.17% postgres [.] hash_search_with_hash_value > > ... < 1% ... > > > > I would like to suggest a patch (see attachment) witch fixes this > > bottleneck. Also I discovered that there is a lot of code > > duplication in ResourceOwner. Patch fixes this too. The idea is > > quite simple. I just replaced arrays with something that could be > > considered hash tables, but with some specific optimizations. > > > > After applying this patch we can see that bottleneck is gone: > > > > 42.89% postgres [.] list_nth > > 18.30% postgres [.] find_all_inheritors > > 10.97% postgres [.] get_tabstat_entry > > 1.82% postgres [.] hash_search_with_hash_value > > 1.21% postgres [.] SearchCatCache > > ... < 1% ... > > > > For tables with thousands partitions we gain in average 32.5% more > > TPS. > > > > As far as I can see in the same time patch doesn't make anything > > worse. `make check` passes with asserts enabled and disabled. There > > is no performance degradation according to both standard pgbench > > benchmark and benchmark described above for tables with 10 and 100 > > partitions. > > I do think it's interesting to try to speed up the ResourceOwner > mechanism when there are many resources owned. We've had some forays > in that direction in the past, and I think it's useful to continue > that work. But I also think that this patch, while interesting, is > not something we can seriously consider committing in its current > form, because: > > 1. It lacks adequate comments and submission notes to explain the > general theory of operation of the patch. > > 2. It seems to use uint8 * rather freely as a substitute for char * or > void *, which doesn't seem like a great idea. > > 3. It doesn't follow PostgreSQL formatting conventions (uncuddled > curly braces, space after if and before open paren, etc.). > > 4. It mixes together multiple ideas in a single patch, not only > introducing a hashing concept but also striping a brand-new layer of > abstraction across the resource-owner mechanism. I am not sure that > layer of abstraction is a very good idea, but if it needs to be done, > I think it should be a separate patch. >
Attachment
pgsql-hackers by date: