Re: MultiXact\SLRU buffers configuration - Mailing list pgsql-hackers
From | Shawn Debnath |
---|---|
Subject | Re: MultiXact\SLRU buffers configuration |
Date | |
Msg-id | YemDdpMrsoJFQJnU@f01898859afd.ant.amazon.com |
In response to | Re: MultiXact\SLRU buffers configuration (Andrey Borodin <x4mmm@yandex-team.ru>) |
Responses | Re: MultiXact\SLRU buffers configuration |
List | pgsql-hackers |
On Sat, Jan 15, 2022 at 12:16:59PM +0500, Andrey Borodin wrote:
> > I was planning on running a set of stress tests on these patches. Could
> > we confirm which ones we plan to include in the commitfest?
>
> Many thanks for your interest. Here's the latest version.

Here are the results of the multixact perf test I ran on the patch that
splits the linear SLRU caches into banks. With my test setup, binaries
with the patch applied performed marginally slower than unpatched
binaries across the test matrix:

+-------------------------------+---------------------+-----------------------+------------+
| workload                      | patched average tps | unpatched average tps | difference |
+-------------------------------+---------------------+-----------------------+------------+
| create only                   | 10250.54396         | 10349.67487           | -1.0%      |
| create and select             | 9677.711286         | 9991.065037           | -3.2%      |
| large cache create only       | 10310.96646         | 10337.16455           | -0.3%      |
| large cache create and select | 9654.24077          | 9924.270242           | -2.8%      |
+-------------------------------+---------------------+-----------------------+------------+

The test was configured in the following manner:

- AWS EC2 c5d.24xlarge instances, located in the same AZ, were used as
  the database host and the test driver. These systems have 96 vCPUs
  and 184 GB memory. NVMe drives were configured as RAID5.

- GUCs were changed from defaults to be the following:
    max_connections = 5000
    shared_buffers = 96GB
    max_wal_size = 2GB
    min_wal_size = 192MB

- pgbench runs were done with -c 1000 -j 1000 and a scale of 10,000.

- Two multixact workloads were tested. The first [0] was a create only
  script which selected 100 pgbench_accounts rows for share. The second
  workload [1] added a select statement to visit rows touched in the
  past which had multixacts generated for them. The pgbench test
  script [2] wraps the call to the functions inside an explicit
  transaction.

- Large cache tests have the multixact offsets cache size hard coded to
  128 and the members cache size hard coded to 256.

- Row selection uses a time-based approach that lets all client
  connections coordinate which rows to work with based on the
  millisecond they start executing. To allow for more multixacts to be
  generated and reduce contention, the workload uses offsets ahead of
  the start id based on a random number.

- The one bummer about these runs was that they only ran for 600
  seconds for create only and 400 seconds for create and select. I
  consistently ran into the checkpointer getting oom-killed on this
  instance after that timeframe. Will dig into this separately. But
  the TPS was consistent.

- Each test was repeated at least 3 times and the average of those
  runs was used.

- I am using the master branch and changes were applied on commit
  f47ed79cc8a0cfa154dc7f01faaf59822552363f.

I think patch 1 is a must-have. Regarding patch 2, I would propose we
avoid introducing more complexity into the SimpleLRU cache and instead
focus on making the SLRU to buffer cache effort [3] a reality. I would
also add that we have a few customers in our fleet who have been
successfully running the large cache configuration on the regular SLRU
without any issues. With cache sizes this small, the linear searches
are still quite efficient.

If my test workload can be made better, please let me know. Happy to
re-run tests as needed.

[0] https://gist.github.com/sdebnath/e015561811adf721dd40dd6638969c69
[1] https://gist.github.com/sdebnath/2f3802e1fe288594b6661a7a59a7ca07
[2] https://gist.github.com/sdebnath/6bbfd5f87945a7d819e30a9a1701bc97
[3] https://www.postgresql.org/message-id/CA%2BhUKGKAYze99B-jk9NoMp-2BDqAgiRC4oJv%2BbFxghNgdieq8Q%40mail.gmail.com

-- 
Shawn Debnath
Amazon Web Services (AWS)
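
For readers who want to re-check the arithmetic, the difference column in the results table can be re-derived from the two tps averages. The following is a minimal Python sketch, not part of the original test scripts: the `results` dictionary, `relative_difference`, and `difference_column` are illustrative names. As an inference from the numbers rather than anything stated in the message, the reported percentages match a computation that divides by the patched tps; dividing by the unpatched tps instead yields -3.1% and -2.7% for the two "create and select" rows.

```python
# Average tps values copied from the results table in the message:
# workload -> (patched average tps, unpatched average tps)
results = {
    "create only": (10250.54396, 10349.67487),
    "create and select": (9677.711286, 9991.065037),
    "large cache create only": (10310.96646, 10337.16455),
    "large cache create and select": (9654.24077, 9924.270242),
}


def relative_difference(patched, unpatched, baseline):
    """Percentage change of patched vs. unpatched tps, relative to `baseline`."""
    return (patched - unpatched) / baseline * 100.0


def difference_column(table, use_patched_baseline=True):
    """Return {workload: percent difference rounded to one decimal place}.

    The choice of baseline is an assumption; the message does not say
    which of the two averages the percentages were computed against.
    """
    column = {}
    for workload, (patched, unpatched) in table.items():
        baseline = patched if use_patched_baseline else unpatched
        column[workload] = round(
            relative_difference(patched, unpatched, baseline), 1
        )
    return column


if __name__ == "__main__":
    for workload, pct in difference_column(results).items():
        print(f"{workload:<30} {pct:+.1f}%")
```

Either baseline leaves the conclusion unchanged: the patched binaries are slower by roughly 0.3% to 3% in every workload.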