Re: Speed up transaction completion faster after many relations areaccessed in a transaction - Mailing list pgsql-hackers

From David Rowley
Subject Re: Speed up transaction completion faster after many relations areaccessed in a transaction
Date
Msg-id CAKJS1f_UoT6FsmDA=vqjPNu+XmtDCwuieAQXxDp9HRSAVZzicA@mail.gmail.com
Whole thread Raw
In response to Re: Speed up transaction completion faster after many relations areaccessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Speed up transaction completion faster after many relations areaccessed in a transaction  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Wed, 24 Jul 2019 at 15:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
> To be able to reduce the threshold down again we'd need to make a
> hash_get_num_entries(LockMethodLocalHash) call before performing the
> guts of LockReleaseAll(). We could then weight that onto some running
> average counter with a weight of, say... 10, so we react to changes
> fairly quickly, but not instantly. We could then have some sort of
> logic like "rebuild the hash table if running average 4 times less
> than max_bucket"
>
> I've attached a spreadsheet of that idea and the algorithm we could
> use to track the running average.  Initially, I've mocked it up a
> series of transactions that use 1000 locks, then at row 123 dropped
> that to 10 locks. If we assume the max_bucket is 1000, then it takes
> until row 136 for the running average to drop below the max_bucket
> count, i.e 13 xacts. There we'd reset there at the init size of 16. If
> the average went up again, then we'd automatically expand the table as
> we do now.  To make this work we'd need an additional call to
> hash_get_num_entries(), before we release the locks, so there is more
> overhead.

Here's a patch with this implemented. I've left a NOTICE in there to
make it easier for people to follow along at home and see when the
lock table is reset.

There will be a bit of additional overhead to the reset detection
logic over the v7 patch. Namely: additional hash_get_num_entries()
call before releasing the locks, and more complex and floating-point
maths instead of more simple integer maths in v7.

Here's a demo with the debug NOTICE in there to show us what's going on.

-- setup
create table a (a int) partition by range (a);
select 'create table a'||x||' partition of a for values from('||x||')
to ('||x+1||');' from generate_Series(1,1000) x;
\gexec

$ psql postgres
NOTICE:  max_bucket = 15, threshold = 64.000000, running_avg_locks
0.100000 Reset? No
psql (13devel)
# \o /dev/null
# select * from a where a < 100;
NOTICE:  max_bucket = 101, threshold = 64.000000, running_avg_locks
10.090000 Reset? Yes
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 76.324005, running_avg_locks
19.081001 Reset? Yes
# select * from a where a < 100;

A couple of needless resets there... Maybe we can get rid of those by
setting the initial running average up to something higher than 0.0.

NOTICE:  max_bucket = 99, threshold = 108.691605, running_avg_locks
27.172901 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 137.822449, running_avg_locks
34.455612 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 164.040207, running_avg_locks
41.010052 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 187.636185, running_avg_locks
46.909046 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 208.872559, running_avg_locks
52.218140 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 227.985306, running_avg_locks
56.996326 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 245.186768, running_avg_locks
61.296692 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 260.668091, running_avg_locks
65.167023 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 274.601288, running_avg_locks
68.650322 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 287.141174, running_avg_locks
71.785294 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 298.427063, running_avg_locks
74.606766 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 308.584351, running_avg_locks
77.146088 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 317.725922, running_avg_locks
79.431480 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 325.953339, running_avg_locks
81.488335 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 333.358002, running_avg_locks
83.339500 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 340.022217, running_avg_locks
85.005554 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 346.019989, running_avg_locks
86.504997 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 351.417999, running_avg_locks
87.854500 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 356.276184, running_avg_locks
89.069046 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 360.648560, running_avg_locks
90.162140 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 364.583710, running_avg_locks
91.145927 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 368.125336, running_avg_locks
92.031334 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 371.312805, running_avg_locks
92.828201 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 374.181519, running_avg_locks
93.545380 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 376.763367, running_avg_locks
94.190842 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 379.087036, running_avg_locks
94.771759 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 381.178345, running_avg_locks
95.294586 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 383.060516, running_avg_locks
95.765129 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 384.754456, running_avg_locks
96.188614 Reset? No
# select * from a where a < 100;
NOTICE:  max_bucket = 99, threshold = 386.279022, running_avg_locks
96.569756 Reset? No

-- Here I switch to only selecting from 9 partitions instead of 99.

# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 351.651123, running_avg_locks
87.912781 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 320.486023, running_avg_locks
80.121506 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 292.437408, running_avg_locks
73.109352 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 267.193665, running_avg_locks
66.798416 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 244.474304, running_avg_locks
61.118576 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 224.026871, running_avg_locks
56.006718 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 205.624176, running_avg_locks
51.406044 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 189.061752, running_avg_locks
47.265438 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 174.155579, running_avg_locks
43.538895 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 160.740021, running_avg_locks
40.185005 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 148.666016, running_avg_locks
37.166504 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 137.799408, running_avg_locks
34.449852 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 128.019470, running_avg_locks
32.004868 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 119.217522, running_avg_locks
29.804380 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 111.295769, running_avg_locks
27.823942 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 104.166191, running_avg_locks
26.041548 Reset? No
# select * from a where a < 10;
NOTICE:  max_bucket = 99, threshold = 97.749573, running_avg_locks
24.437393 Reset? Yes

It took 17 xacts to react to the change and reset the lock table.

# select * from a where a < 10;
NOTICE:  max_bucket = 15, threshold = 91.974617, running_avg_locks
22.993654 Reset? No

notice max_bucket is back at 15 again.

Any thoughts on this?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Ian Barwick
Date:
Subject: Re: [PATCH] minor bugfix for pg_basebackup (9.6 ~ )
Next
From: Dilip Kumar
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs