Re: PATCH: optimized DROP of multiple tables within a transaction - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: PATCH: optimized DROP of multiple tables within a transaction |
Date | |
Msg-id | 509FFE4E.1070002@fuzzy.cz Whole thread Raw |
In response to | Re: PATCH: optimized DROP of multiple tables within a transaction (Tomas Vondra <tv@fuzzy.cz>) |
Responses |
Re: PATCH: optimized DROP of multiple tables within a transaction
|
List | pgsql-hackers |
Hi, I've prepared a slightly updated patch, based on the previous review. See it attached. On 18.10.2012 04:28, 花田 茂 wrote:> Hi Tomas, > > On 2012/10/17, at 20:45, Tomas Vondra <tv@fuzzy.cz> wrote: >> >> Dne 17.10.2012 12:34, Shigeru HANADA napsal: >>> Performance test >>> ================ >>> I tested 1000 tables case (each is copy of pgbench_branches with >>> 100000 rows) on 1GB shared_buffers server. Please note that I >>> tested on MacBook air, i.e. storage is not HDD but SSD. Here is >>> the test procedure: >>> >>> 1) loop 1000 times >>> 1-1) create copy table of pgbench_accounts as accounts$i >>> 1-2) load 100000 rows >>> 1-3) add primary key >>> 1-4) select all rows to cache pages in shared buffer >>> 2) BEGIN >>> 3) loop 1000 times >>> 3-1) DROP TABLE accounts$i >>> 4) COMMIT >> >> I don't think the 'load rows' and 'select all rows' is really >> necessary. And AFAIK sequential scans use small circular buffer >> not to pollute sharedbuffers, so I'd guess the pages are not cached >> in shared buffers anyway. Have you verified that, e.g. by >> pg_buffercache? > > Oops, you're right. I omitted 1-3 and 1-4 in actual measurement, but > IMO loading data is necessary to fill the shared buffer up, because > # of buffers which are deleted during COMMIT is major factor of this > patch. And, yes, I verified that all shared buffers are used after > all loading have been finished. I don't think it's an important factor, as the original code does this: for (i = 0; i < NBuffers; i++) { volatile BufferDesc *bufHdr = &BufferDescriptors[i]; ... } in the DropRelFileNodeAllBuffers. That loops through all shared buffers no matter what, so IMHO the performance in this case depends on the total size of the shared buffers. But maybe I'm missing something important. >>>> Our system creates a lot of "working tables" (even 100.000) >>>> and we need to perform garbage collection (dropping obsolete >>>> tables) regularly. Thisoften took ~ 1 hour, because we're using >>>> big AWS instances with lots of RAM (which tends to be slower >>>> than RAM on bare hw). After applying this patch and dropping >>>> tables in groups of 100, the gc runs in less than 4 minutes >>>> (i.e. a 15x speed-up). >>> >>> Hm, my environment seems very different from yours. Could you >>> show the setting of shared_buffers in your environment? I'd like >>> to make my test environment as similar as possible to yours. >> >> We're using m2.4xlarge instances (70G of RAM) with 10GB shared >> buffers. > > Thank you, it's more huge than I expected. I'm not sure whether my > boss allows me to use such rich environment... :( I've done a simple benchmark on my laptop with 2GB shared buffers, it's attached in the drop-test.py (it's a bit messy, but it works). It does four things: 1) creates N tables 2) drops them one by one 3) creates N tables 4) drops them in batches (batch = one transaction) To run the script, simply specify number of tables you want to work with (say 10000), size of the batch (e.g. 100) and connection string (e.g. 'host=localhost dbname=mydb'). With those parameters, I got these numbers on the laptop: creating 10000 tables all tables created in 3.298694 seconds dropping 10000 tables one by one ... all tables dropped in 32.692478 seconds creating 10000 tables all tables created in 3.458178 seconds dropping 10000 tables in batches by 100... all tables dropped in 3.28268 seconds So it's 33 seconds vs. 3.3 seconds, i.e. 10x speedup. On AWS we usually get larger differences, as we use larger shared buffers and the memory is significantly slower there. Regarding the other comments: > * As Robert commented, this patch adds DropRelFileNodeAllBuffersList > by copying code from DropRelFileNodeAllBuffers. Please refactor it > to avoid code duplication. Yes, I've merged the two functions, i.e. I've removed the original DropRelFileNodeAllBuffers and used the name for the new one (taking array instead of single relnode). Then I've modified the existing calls to to use DropRelFileNodeAllBuffers(&relnode, 1) instead of the original one DropRelFileNodeAllBuffers(relnode) Maybe this should be done for smgrdounlink/smgrdounlinkall too. > * In smgrDoPendingDeletes, you free srels explicitly. Can't we leave > them to memory context stuff? Even it is required, at least pfree > must be called in the case nrels == 0 too. Hmmm, right. Not sure which choice is better, so for now I've moved the pfree out of the 'if' block, so it's executed for 'nrels==0' too. > * In smgrDoPendingDeletes, the buffer srels is expanded in every > iteration. This seems a bit inefficient. How about doubling the > capacity when used it up? This requires additional variable, but > reduces repalloc call considerably. Done, although I haven't seen any significant speed improvement. > * Just an idea, but if we can remove entries for local relations from > rnodes array before buffer loop in DropRelFileNodeAllBuffersList, > following bsearch might be more efficient, though dropping many > temporary tables might be rare. My reasoning, exactly. But maybe it should be done to keep the code clean, i.e. not letting temp tables to code paths where they are not expected. Tomas
Attachment
pgsql-hackers by date: