Patch: ResourceOwner optimization for tables with many partitions - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Patch: ResourceOwner optimization for tables with many partitions
Date
Msg-id 20151204151504.5c7e4278@fujitsu
Whole thread Raw
Responses Re: Patch: ResourceOwner optimization for tables with many partitions
List pgsql-hackers
Hello all,

Current implementation of ResourceOwner uses arrays to store resources
like TupleDescs, Snapshots, etc. When we want to release one of these
resources ResourceOwner finds it with linear scan. Granted, resource
array are usually small so its not a problem most of the time. But it
appears to be a bottleneck when we are working with tables which have a
lot of partitions.

To reproduce this issue:
1. run `./gen.pl 10000 | psql my_database postgres`
2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database`
3. in second terminal run `sudo perf top -u postgres`

Both gen.pl and q.sql are attached to this message.

You will see that postgres spends a lot of time in ResourceOwnerForget*
procedures:

 32.80%  postgres               [.] list_nth
 20.29%  postgres               [.] ResourceOwnerForgetRelationRef
 12.87%  postgres               [.] find_all_inheritors
  7.90%  postgres               [.] get_tabstat_entry
  6.68%  postgres               [.] ResourceOwnerForgetTupleDesc
  1.17%  postgres               [.] hash_search_with_hash_value
 ... < 1% ...

I would like to suggest a patch (see attachment) witch fixes this
bottleneck. Also I discovered that there is a lot of code duplication in
ResourceOwner. Patch fixes this too. The idea is quite simple. I just
replaced arrays with something that could be considered hash tables,
but with some specific optimizations.

After applying this patch we can see that bottleneck is gone:

 42.89%  postgres               [.] list_nth
 18.30%  postgres               [.] find_all_inheritors
 10.97%  postgres               [.] get_tabstat_entry
  1.82%  postgres               [.] hash_search_with_hash_value
  1.21%  postgres               [.] SearchCatCache
 ... < 1% ...

For tables with thousands partitions we gain in average 32.5% more TPS.

As far as I can see in the same time patch doesn't make anything worse.
`make check` passes with asserts enabled and disabled. There is no
performance degradation according to both standard pgbench benchmark
and benchmark described above for tables with 10 and 100 partitions.

Best regards,
Aleksander
Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: proposal: add 'waiting for replication' to pg_stat_activity.state
Next
From: Greg Stark
Date:
Subject: Re: atomic reads & writes (with no barriers)