RE: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers
From: k.jamison@fujitsu.com
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date:
Msg-id: OSBPR01MB2341672E9A95E5EC6D2E79B5EF020@OSBPR01MB2341.jpnprd01.prod.outlook.com
In response to: RE: [Patch] Optimize dropping of relation buffers using dlist ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses: RE: [Patch] Optimize dropping of relation buffers using dlist
           RE: [Patch] Optimize dropping of relation buffers using dlist
List: pgsql-hackers
On Tuesday, October 13, 2020 10:09 AM, Tsunakawa-san wrote:

> Why do you do it this way?  I think the previous patch was more correct
> (while agreeing with Horiguchi-san that nTotalBlocks may be unnecessary).
> What you want to do is "if the size of any fork could be inaccurate, do the
> traditional full buffer scan without performing any optimization for any
> fork," right?  But the above code performs optimization for forks until it
> finds a fork with an inaccurate size.
>
> (2)
> +	 * Get the total number of cached blocks and to-be-invalidated blocks
> +	 * of the relation.  The cached value returned by smgrnblocks could be
> +	 * smaller than the actual number of existing buffers of the file.
>
> As you changed the meaning of the smgrnblocks() argument from cached to
> accurate, and you no longer calculate the total blocks, the comment should
> reflect that.
>
> (3)
> In smgrnblocks(), accurate is not set to false when mdnblocks() is called.
> The caller doesn't initialize the value either, so it can see a garbage value.
>
> (4)
> +		if (nForkBlocks[i] != InvalidBlockNumber &&
> +			nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
> +		{
> ...
> +		}
> +		else
> +			break;
> +	}
>
> In cases like this, it's better to reverse the if and else.  That way, you can
> reduce the nesting depth.

Thank you for the review!

1. I have revised the patch to address your comments/feedback. Attached is the latest set of patches.

2. Non-recovery Performance

I also included a debug version of the patch (0004) where I removed the recovery-related checks in order to measure non-recovery performance. However, I still can't find the cause of why the non-recovery performance does not change compared to master (1 min 15 s for the test case below).

> -	if (InRecovery && reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
> +	if (reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)

Here's how I measured it:

0. postgresql.conf settings
shared_buffers = 100GB
autovacuum = off
full_page_writes = off
checkpoint_timeout = 30min
max_locks_per_transaction = 100
wal_log_hints = on
wal_keep_size = 100
max_wal_size = 20GB

1. createdb test

2. Create the tables: SELECT create_tables(1000);

create or replace function create_tables(numtabs int)
returns void as $$
declare
  query_string text;
begin
  for i in 1..numtabs loop
    query_string := 'create table tab_' || i::text || ' (a int);';
    execute query_string;
  end loop;
end;
$$ language plpgsql;

3. Insert rows into the tables (3.5 GB database): SELECT insert_tables(1000);

create or replace function insert_tables(numtabs int)
returns void as $$
declare
  query_string text;
begin
  for i in 1..numtabs loop
    query_string := 'insert into tab_' || i::text || ' SELECT generate_series(1, 100000);';
    execute query_string;
  end loop;
end;
$$ language plpgsql;

4. Delete from the tables: SELECT delfrom_tables(1000);

create or replace function delfrom_tables(numtabs int)
returns void as $$
declare
  query_string text;
begin
  for i in 1..numtabs loop
    query_string := 'delete from tab_' || i::text;
    execute query_string;
  end loop;
end;
$$ language plpgsql;

5. Measure VACUUM timing:
\timing
VACUUM;

Using the debug version of the patch, I have confirmed that it enters the optimization path when the conditions are met.
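To make the intended condition easier to see, below is a tiny standalone sketch of just the decision logic. This is an illustration only, not code from the patch: the names mirror the fragment quoted in (4) and the debug log that follows, while the example fork sizes and the use_optimization flag are made up, and the threshold value is the one shown in the log output below.

/*
 * Standalone sketch of the decision logic only -- NOT the patch code.
 * The names mirror the quoted review fragment and the debug log below;
 * the per-fork numbers here are made-up example values.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber	((BlockNumber) 0xFFFFFFFF)
#define BUF_DROP_FULL_SCAN_THRESHOLD 32		/* value shown in the log below */

int
main(void)
{
	/* Hypothetical relation with two forks, truncated to block 0. */
	BlockNumber nForkBlocks[]   = {10, 3};	/* cached sizes from smgrnblocks() */
	BlockNumber firstDelBlock[] = {0, 0};
	int			nforks = 2;
	BlockNumber nBlocksToInvalidate = 0;
	bool		use_optimization = true;

	for (int i = 0; i < nforks; i++)
	{
		/*
		 * If the cached size of ANY fork is inaccurate, give up on the
		 * optimization for ALL forks and fall back to the full scan,
		 * which is the behavior requested in the review.
		 */
		if (nForkBlocks[i] == InvalidBlockNumber)
		{
			use_optimization = false;
			break;
		}
		nBlocksToInvalidate += nForkBlocks[i] - firstDelBlock[i];
	}

	if (use_optimization && nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)
		printf("optimized path: look up and invalidate %u buffers fork by fork\n",
			   (unsigned) nBlocksToInvalidate);
	else
		printf("traditional path: full scan of shared buffers\n");

	return 0;
}

In this sketch the threshold check happens only after every fork has passed the accuracy check, so no fork is optimized while another one falls back to the full scan.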
Here are some printed logs from 018_wal_optimize_node_replica.log:

> make world -j4 -s && make -C src/test/recovery/ check PROVE_TESTS=t/018_wal_optimize.pl

WARNING:  current fork 0, nForkBlocks[i] 1, accurate: 1
CONTEXT:  WAL redo at 0/162B4E0 for Storage/TRUNCATE: base/13751/24577 to 0 blocks flags 7
WARNING:  Optimization Loop. buf_id = 41. nforks = 1. current fork = 0. forkNum: 0 == tag's forkNum: 0. curBlock: 0 < nForkBlocks[i] = 1. tag blockNum: 0 >= firstDelBlock[i]: 0. nBlocksToInvalidate = 1 < threshold = 32.
--

3. Recovery Performance (hot standby, failover)

OTOH, when measuring recovery performance (using the 0003 patch), the results were great.

| s_b   | master | patched | %reg   |
|-------|--------|---------|--------|
| 128MB | 3.043  | 2.977   | -2.22% |
| 1GB   | 3.417  | 3.41    | -0.21% |
| 20GB  | 20.597 | 2.409   | -755%  |
| 100GB | 66.862 | 2.409   | -2676% |

To run this on a hot standby setup (after inserting rows into the tables):

1. [Standby] Pause WAL replay.
SELECT pg_wal_replay_pause();

2. [Master] Measure VACUUM timing, then stop the server.
\timing
VACUUM;
\q
pg_ctl stop -mi -w

3. [Standby] Use the attached script to promote the standby and measure the performance.
# test.sh recovery

So the issue I'm still investigating is why the non-recovery performance is bad, while it's good when InRecovery.

Regards,
Kirk Jamison