Re: [PATCH] Speedup truncates of relation forks - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: [PATCH] Speedup truncates of relation forks
Msg-id: 20190611230900.ctj4sz5zflzdxkqr@development
In response to: [PATCH] Speedup truncates of relation forks ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
Responses: Re: [PATCH] Speedup truncates of relation forks; RE: [PATCH] Speedup truncates of relation forks
List: pgsql-hackers
On Tue, Jun 11, 2019 at 07:34:35AM +0000, Jamison, Kirk wrote:
>Hi all,
>
>Attached is a patch to speed up the performance of truncates of relations.
>This is also my first time contributing my own patch,
>and I'd gladly appreciate your feedback and advice.
>

Thanks for the patch. Please add it to the commitfest app, so that we
don't forget about it: https://commitfest.postgresql.org/23/

>
>A. Summary
>
>Whenever we truncate relations, we scan the shared buffers thrice
>(once per fork), which can be time-consuming. This patch improves
>the performance of relation truncates by first marking the
>pages-to-be-truncated of the relation forks, then truncating them
>simultaneously, resulting in improved performance of VACUUM,
>autovacuum operations, and their recovery.
>

OK, so essentially the whole point is to scan the buffers only once,
for all forks at the same time (instead of three times).

>
>B. Patch Details
>The following functions were modified:
>
>1. FreeSpaceMapTruncateRel() and visibilitymap_truncate()
>a. CURRENT HEAD: These functions truncate the FSM pages and unused VM pages.
>b. PATCH: Both functions only mark the pages to truncate and return a block number.
>- We used to call smgrtruncate() in these functions, but those calls are now moved inside RelationTruncate() and smgr_redo().
>- The tentative new names of the functions are MarkFreeSpaceMapTruncateRel() and visibilitymap_mark_truncate(). Feel free to suggest better names.
>
>2. RelationTruncate()
>a. HEAD: Truncate FSM and VM first, then write WAL, and lastly truncate the main fork.
>b. PATCH: Now we mark FSM and VM pages first, write WAL, mark MAIN fork pages, then truncate all forks (MAIN, FSM, VM) simultaneously.
>
>3. smgr_redo()
>a. HEAD: Truncate main fork and the relation during XLOG replay, create fake rel cache for FSM and VM, truncate FSM, truncate VM, then free fake rel cache.
>b.
>PATCH: Mark main fork dirty buffers, create fake rel cache, mark FSM and VM buffers, truncate marked pages of the relation forks simultaneously, truncate relation during XLOG replay, then free fake rel cache.
>
>4. smgrtruncate(), DropRelFileNodeBuffers()
>- Input arguments are changed to arrays of fork numbers and block numbers, plus int nforks (the size of the forkNum array).
>- Truncates the pages of the relation forks simultaneously.
>
>5. smgrdounlinkfork()
>I modified the function because it calls DropRelFileNodeBuffers. However, this is dead code that can be removed.
>I did not remove it for now because that's not for me but the community to decide.
>

You really don't need to extract the changes like this - such changes
are generally obvious from the diff. You only need to explain things
that are not obvious from the code itself, e.g. non-trivial design
decisions, etc.

>
>C. Performance Test
>I set up synchronous streaming replication between a master and a standby.
>
>In postgresql.conf:
>autovacuum = off
>wal_level = replica
>max_wal_senders = 5
>wal_keep_segments = 16
>max_locks_per_transaction = 10000
>#shared_buffers = 8GB
>#shared_buffers = 24GB
>
>Objective: Measure VACUUM execution time while varying shared_buffers size.
>
>1. Create tables (ex. 10,000 tables). Insert data into the tables.
>2. DELETE FROM TABLE (ex. all rows of the 10,000 tables)
>3. psql -c "\timing on" (measures total execution time of SQL queries)
>4. VACUUM (whole db)
>
>If you want to test with a large number of relations,
>you may use the stored functions I used here:
>http://bit.ly/reltruncates
>
>D. Results
>
>HEAD results
>1) 128MB shared_buffers = 48.885 seconds
>2) 8GB shared_buffers = 5 min 30.695 s
>3) 24GB shared_buffers = 14 min 13.598 s
>
>PATCH results
>1) 128MB shared_buffers = 42.736 s
>2) 8GB shared_buffers = 2 min 26.464 s
>3) 24GB shared_buffers = 5 min 35.848 s
>
>The performance improved significantly compared to HEAD,
>especially for large shared buffers.
>

Right, that seems nice.
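For illustration, the single-pass idea can be sketched in miniature. The types and names below are toy stand-ins, not PostgreSQL's actual buffer structs; the point is only that one scan of the pool serves all three forks:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for PostgreSQL's fork numbers (illustrative only). */
enum { TOY_MAIN_FORKNUM, TOY_FSM_FORKNUM, TOY_VM_FORKNUM };

/* A toy buffer header: which fork/block the cached page belongs to. */
typedef struct ToyBuffer
{
    int      forknum;
    uint32_t blockno;
    int      valid;     /* 1 while the buffer is usable */
} ToyBuffer;

/*
 * One scan of the buffer pool serves all forks: a buffer is dropped
 * when its fork appears in forks[] and its block number is at or past
 * that fork's new length in nblocks[].  The pre-patch code effectively
 * made one such scan per fork (three scans in total).
 */
static int
toy_drop_buffers_one_pass(ToyBuffer *pool, size_t npool,
                          const int *forks, const uint32_t *nblocks,
                          int nforks)
{
    int dropped = 0;

    for (size_t i = 0; i < npool; i++)
        for (int f = 0; f < nforks; f++)
            if (pool[i].valid &&
                pool[i].forknum == forks[f] &&
                pool[i].blockno >= nblocks[f])
            {
                pool[i].valid = 0;
                dropped++;
                break;
            }

    return dropped;
}
```

This is why the patched smgrtruncate()/DropRelFileNodeBuffers() take arrays of fork and block numbers: one pass over shared buffers instead of one pass per fork.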
And it matches the expected 1:3 speedup, at least for the larger
shared_buffers cases.

Years ago I implemented an optimization for many DROP TABLE commands in
a single transaction - instead of scanning the buffers for each relation,
the code now accumulates a small number of relations into an array, and
then does a bsearch for each buffer.

Would something like that be applicable/useful here? That is, if we do
multiple TRUNCATE commands in a single transaction, can we optimize it
like this?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
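The accumulate-and-bsearch approach described above can be sketched as follows. The types are toy stand-ins (not the real RelFileNode or bufmgr structures), but the shape of the optimization is the same: sort the dropped relations once, then test each buffer with a binary search:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t ToyRelId;      /* toy stand-in for a relation identifier */

/* Three-way comparator for qsort()/bsearch(). */
static int
toy_relid_cmp(const void *a, const void *b)
{
    ToyRelId x = *(const ToyRelId *) a;
    ToyRelId y = *(const ToyRelId *) b;

    return (x > y) - (x < y);
}

/*
 * Drop buffers for many relations in a single scan of the pool:
 * sort the K dropped relations once, then bsearch per buffer to test
 * membership -- O(N log K) overall, instead of one full scan of the
 * N buffers per relation, O(N * K).
 */
static int
toy_drop_rels_one_scan(const ToyRelId *buf_rel, int *buf_valid, size_t nbuf,
                       ToyRelId *rels, size_t nrels)
{
    int dropped = 0;

    qsort(rels, nrels, sizeof(ToyRelId), toy_relid_cmp);

    for (size_t i = 0; i < nbuf; i++)
        if (buf_valid[i] &&
            bsearch(&buf_rel[i], rels, nrels, sizeof(ToyRelId),
                    toy_relid_cmp) != NULL)
        {
            buf_valid[i] = 0;
            dropped++;
        }

    return dropped;
}
```

Applying the same trick to multiple TRUNCATEs in one transaction would mean batching the (relation, fork, block) truncation points and resolving them all in one pass, as Tomas suggests.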