Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Date
Msg-id CAFiTN-tfzyvDiftvAq6rayH69Gy5Ah6o4oo4+1_Y-DjTq5Tf6Q@mail.gmail.com
Whole thread Raw
In response to Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
List pgsql-hackers
On Tue, Nov 21, 2023 at 2:03 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Nov 20, 2023 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Nov 20, 2023 at 2:37 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > > > On 20 Nov 2023, at 13:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > 2) Do we really need one separate lwlock tranche for each SLRU?
> > > >
> > > > IMHO if we use the same lwlock tranche then the wait event will show
> > > > the same wait event name, right? And that would be confusing for the
> > > > user, whether we are waiting for Subtransaction or Multixact or
> > > > anything else.  Is my understanding no correct here?
> > >
> > > If we give to a user multiple GUCs to tweak, I think we should give a way to understand which GUC to tweak when
theyobserve wait times. 
>
> PFA, updated patch set, I have worked on review comments by Alvaro and
> Andrey.  So the only open comments are about clog group commit
> testing, for that my question was as I sent in the previous email
> exactly what part we are worried about in the coverage report?
>
> The second point is, if we want to generate a group update we will
> have to create the injection point after we hold the control lock so
> that other processes go for group update and then for waking up the
> waiting process who is holding the SLRU control lock in the exclusive
> mode we would need to call a function ('test_injection_points_wake()')
> to wake that up and for calling the function we would need to again
> acquire the SLRU lock in read mode for visibility check in the catalog
> for fetching the procedure row and now this wake up session will block
> on control lock for the session which is waiting on injection point so
> now it will create a deadlock.   Maybe with bank-wise lock we can
> create a lot of transaction so that these 2 falls in different banks
> and then we can somehow test this, but then we will have to generate
> 16 * 4096 = 64k transaction so that the SLRU banks are different for
> the transaction which inserted procedure row in system table from the
> transaction in which we are trying to do the group commit

I have attached a POC patch for testing the group update using the
injection point framework.  This is just for testing the group update
part and is not yet a committable test.  I have added a bunch of logs
in the code so that we can see what's going on with the group update.
From the below logs, we can see that multiple processes are getting
accumulated for the group update and the leader is updating their xid
status.


Note: With this testing, we have found a bug in the bank-wise
approach, basically we are clearing a procglobal->clogGroupFirst, even
before acquiring the bank lock that means in most of the cases there
will be a single process in each group as a group leader (I think this
is what Alvaro was pointing in his coverage report).  I have added
this fix in this POC just for testing purposes but in my next version
I will add this fix to my proper patch version after a proper review
and a bit more testing.


here is the output after running the test
==============
2023-11-23 05:55:29.399 UTC [93367] 003_clog_group_commit.pl LOG:
procno 6 got the lock
2023-11-23 05:55:29.399 UTC [93367] 003_clog_group_commit.pl
STATEMENT:  SELECT txid_current();
2023-11-23 05:55:29.406 UTC [93369] 003_clog_group_commit.pl LOG:
statement: SELECT test_injection_points_attach('ClogGroupCommit',
'wait');
2023-11-23 05:55:29.415 UTC [93371] 003_clog_group_commit.pl LOG:
statement: INSERT INTO test VALUES(1);
2023-11-23 05:55:29.416 UTC [93371] 003_clog_group_commit.pl LOG:
procno 4 got the lock
2023-11-23 05:55:29.416 UTC [93371] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(1);
2023-11-23 05:55:29.424 UTC [93373] 003_clog_group_commit.pl LOG:
statement: INSERT INTO test VALUES(2);
2023-11-23 05:55:29.425 UTC [93373] 003_clog_group_commit.pl LOG:
procno 3 for xid 128742 added for group update
2023-11-23 05:55:29.425 UTC [93373] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(2);
2023-11-23 05:55:29.431 UTC [93376] 003_clog_group_commit.pl LOG:
statement: INSERT INTO test VALUES(3);
2023-11-23 05:55:29.438 UTC [93378] 003_clog_group_commit.pl LOG:
statement: INSERT INTO test VALUES(4);
2023-11-23 05:55:29.438 UTC [93376] 003_clog_group_commit.pl LOG:
procno 2 for xid 128743 added for group update
2023-11-23 05:55:29.438 UTC [93376] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(3);
2023-11-23 05:55:29.438 UTC [93376] 003_clog_group_commit.pl LOG:
procno 2 is follower and wait for group leader to update commit status
of xid 128743
2023-11-23 05:55:29.438 UTC [93376] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(3);
2023-11-23 05:55:29.439 UTC [93378] 003_clog_group_commit.pl LOG:
procno 1 for xid 128744 added for group update
2023-11-23 05:55:29.439 UTC [93378] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(4);
2023-11-23 05:55:29.439 UTC [93378] 003_clog_group_commit.pl LOG:
procno 1 is follower and wait for group leader to update commit status
of xid 128744
2023-11-23 05:55:29.439 UTC [93378] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(4);
2023-11-23 05:55:29.445 UTC [93380] 003_clog_group_commit.pl LOG:
statement: INSERT INTO test VALUES(5);
2023-11-23 05:55:29.446 UTC [93380] 003_clog_group_commit.pl LOG:
procno 0 for xid 128745 added for group update
2023-11-23 05:55:29.446 UTC [93380] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(5);
2023-11-23 05:55:29.446 UTC [93380] 003_clog_group_commit.pl LOG:
procno 0 is follower and wait for group leader to update commit status
of xid 128745
2023-11-23 05:55:29.446 UTC [93380] 003_clog_group_commit.pl
STATEMENT:  INSERT INTO test VALUES(5);
2023-11-23 05:55:29.451 UTC [93382] 003_clog_group_commit.pl LOG:
statement: SELECT test_injection_points_wake();
2023-11-23 05:55:29.460 UTC [93384] 003_clog_group_commit.pl LOG:
statement: SELECT test_injection_points_detach('ClogGroupCommit');

=============

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: jian he
Date:
Subject: Re: remaining sql/json patches