Re: Speed up transaction completion faster after many relations areaccessed in a transaction - Mailing list pgsql-hackers

From David Rowley
Subject Re: Speed up transaction completion faster after many relations areaccessed in a transaction
Date
Msg-id CAKJS1f-ffb-R5RV6iDKbiXpMYCG71=0hqaCU3maDjPOktbf7AQ@mail.gmail.com
Whole thread Raw
In response to RE: Speed up transaction completion faster after many relations areaccessed in a transaction  ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
Responses Re: Speed up transaction completion faster after many relations areaccessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Thu, 27 Jun 2019 at 12:59, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
>
> From: David Rowley [mailto:david.rowley@2ndquadrant.com]
> Thank you, looks good.  I find it ready for committer (I noticed the status is already set so.)

Thanks for looking.

I've just been looking at this again and I thought I'd better check
the performance of the worst case for the patch, where the hash table
is rebuilt each query.

To do this I first created a single column 70 partition partitioned
table ("p") and left it empty.

I then checked the performance of:

SELECT * FROM p;

Having 70 partitions means that the lock table's max bucket goes over
the LOCKMETHODLOCALHASH_SHRINK_THRESHOLD which is set to 64 and
results in the table being rebuilt each time the query is run.

The performance was as follows:

70 partitions: LOCKMETHODLOCALHASH_SHRINK_THRESHOLD = 64

master + shrink_bloated_locallocktable_v5.patch:

ubuntu@ip-10-0-0-201:~$ pgbench -n -T 60 -f select1.sql -M prepared postgres
tps = 8427.053378 (excluding connections establishing)
tps = 8583.251821 (excluding connections establishing)
tps = 8569.587268 (excluding connections establishing)
tps = 8552.988483 (excluding connections establishing)
tps = 8527.735108 (excluding connections establishing)

master (93907478):

ubuntu@ip-10-0-0-201:~$ pgbench -n -T 60 -f select1.sql -M prepared postgres
tps = 8712.919411 (excluding connections establishing)
tps = 8760.190372 (excluding connections establishing)
tps = 8755.069470 (excluding connections establishing)
tps = 8747.389735 (excluding connections establishing)
tps = 8758.275202 (excluding connections establishing)

patched is 2.45% slower


If I increase the partition count to 140 and put the
LOCKMETHODLOCALHASH_SHRINK_THRESHOLD up to 128, then the performance
is as follows:

master + shrink_bloated_locallocktable_v5.patch:

ubuntu@ip-10-0-0-201:~$ pgbench -n -T 60 -f select1.sql -M prepared postgres
tps = 2548.917856 (excluding connections establishing)
tps = 2561.283564 (excluding connections establishing)
tps = 2549.669870 (excluding connections establishing)
tps = 2421.971864 (excluding connections establishing)
tps = 2428.983660 (excluding connections establishing)

Master (93907478):

ubuntu@ip-10-0-0-201:~$ pgbench -n -T 60 -f select1.sql -M prepared postgres
tps = 2605.407529 (excluding connections establishing)
tps = 2600.691426 (excluding connections establishing)
tps = 2594.123983 (excluding connections establishing)
tps = 2455.745644 (excluding connections establishing)
tps = 2450.061483 (excluding connections establishing)

patched is 1.54% slower

I'd rather not put the LOCKMETHODLOCALHASH_SHRINK_THRESHOLD up any
higher than 128 since it can detract from the improvement we're trying
to make with this patch.

Now, this case of querying a partitioned table that happens to be
completely empty seems a bit unrealistic. Something more realistic
might be index scanning all partitions to find a value that only
exists in a single partition. Assuming the partitions actually have
some records, then that's going to be a more expensive query, so the
overhead of rebuilding the table will be less noticeable.

A previous version of the patch has already had some heuristics to try
to only rebuild the hash table when it's likely beneficial. I'd rather
not go exploring in that area again.

Is anyone particularly concerned about the worst-case slowdown here
being about 1.54%?  The best case, and arguably a more realistic case
above showed a 34% speedup for the best case.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Comment typo in tableam.h
Next
From: Ryan Lambert
Date:
Subject: Re: Built-in connection pooler