Re: PATCH: optimized DROP of multiple tables within a transaction - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: PATCH: optimized DROP of multiple tables within a transaction
Date: 2012-12-10 15:38:01
Msg-id: 20121210153801.GA16200@awork2.anarazel.de
In response to: Re: PATCH: optimized DROP of multiple tables within a transaction (Tomas Vondra <tv@fuzzy.cz>)
Responses: Re: PATCH: optimized DROP of multiple tables within a transaction
List: pgsql-hackers
On 2012-12-08 17:07:38 +0100, Tomas Vondra wrote:
> On 8.12.2012 15:49, Tomas Vondra wrote:
> > On 8.12.2012 15:26, Andres Freund wrote:
> >> On 2012-12-06 23:38:59 +0100, Tomas Vondra wrote:
> >>> I've re-run the tests with the current patch on my home workstation, and
> >>> the results are these (again 10k tables, dropped either one-by-one or in
> >>> batches of 100).
> >>>
> >>> 1) unpatched
> >>>
> >>> dropping one-by-one: 15.5 seconds
> >>> dropping in batches of 100: 12.3 sec
> >>>
> >>> 2) patched (v3.1)
> >>>
> >>> dropping one-by-one: 32.8 seconds
> >>> dropping in batches of 100: 3.0 sec
> >>>
> >>> The problem here is that when dropping the tables one-by-one, the
> >>> bsearch overhead is tremendous and significantly increases the runtime.
> >>> I've done a simple check (if dropping a single table, use the original
> >>> simple comparison) and I got this:
> >>>
> >>> 3) patched (v3.2)
> >>>
> >>> dropping one-by-one: 16.0 seconds
> >>> dropping in batches of 100: 3.3 sec
> >>>
> >>> i.e. the best of both. But it seems like an unnecessary complexity to me
> >>> - if you need to drop a lot of tables you'll probably do that in a
> >>> transaction anyway.
> >>>
> >>
> >> Imo that's still a pretty bad performance difference. And your
> >> single-table optimization will probably fall short as soon as the table
> >> has indexes and/or a toast table...
> >
> > Why? I haven't checked the code, but either those objects are dropped
> > one-by-one (which seems unlikely) or they're added to the pending list,
> > and then the new code will kick in, making it actually faster.
> >
> > Yes, the original code might be faster even for 2 or 3 objects, but I
> > can't think of a simple solution to fix this, given that it really
> > depends on CPU/RAM speed and shared_buffers size.
>
> I've done some tests and yes - once there are other objects the
> optimization falls short. For example for tables with one index, it
> looks like this:
>
> 1) unpatched
>
> one by one: 28.9 s
> 100 batches: 23.9 s
>
> 2) patched
>
> one by one: 44.1 s
> 100 batches: 4.7 s
>
> So the patched code is about 50% slower, but this difference quickly
> disappears as the number of indexes / toast tables etc. grows.
>
> I see this as an argument AGAINST such special-case optimization. My
> reasoning is this:
>
> * This difference is significant only if you're dropping a table with a
> low number of indexes / toast tables. In reality this is not going to
> be very frequent.
>
> * If you're dropping a single table, it really does not matter - the
> difference will be like 100 ms vs. 200 ms or something like that.

I don't particularly buy that argument. There are good reasons (like
avoiding deadlocks, long transactions) to drop multiple tables in
individual transactions. Not that I have a good plan for how to work
around that, though :(

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
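[Editorial note: the heuristic benchmarked above can be sketched in
standalone C roughly as follows: a plain equality test when a single
relation is being dropped, and bsearch over a sorted array when many are
dropped at once. This is a simplified illustration under assumed names
(drop_buffers_for_rels, the buffer-tag array, NUM_BUFFERS-style scan);
it is not the actual patch code, which operates on PostgreSQL's shared
buffer headers under the buffer mapping locks.]

    #include <stdio.h>
    #include <stdlib.h>

    typedef unsigned int Oid;

    /* Simplified stand-in for PostgreSQL's RelFileNode. */
    typedef struct RelFileNode
    {
        Oid spcNode;            /* tablespace */
        Oid dbNode;             /* database */
        Oid relNode;            /* relation */
    } RelFileNode;

    /* Total ordering on RelFileNode, usable by qsort() and bsearch(). */
    static int
    rnode_comparator(const void *p1, const void *p2)
    {
        const RelFileNode *n1 = (const RelFileNode *) p1;
        const RelFileNode *n2 = (const RelFileNode *) p2;

        if (n1->spcNode != n2->spcNode)
            return (n1->spcNode < n2->spcNode) ? -1 : 1;
        if (n1->dbNode != n2->dbNode)
            return (n1->dbNode < n2->dbNode) ? -1 : 1;
        if (n1->relNode != n2->relNode)
            return (n1->relNode < n2->relNode) ? -1 : 1;
        return 0;
    }

    /*
     * Scan all buffer tags and count those belonging to dropped relations.
     * For nrels == 1, a direct comparison avoids the per-buffer bsearch
     * call overhead seen in the one-by-one DROP benchmarks; for batches,
     * the O(log n) lookup per buffer wins.
     */
    static int
    drop_buffers_for_rels(const RelFileNode *buftags, int nbuffers,
                          RelFileNode *rels, int nrels)
    {
        int hits = 0;

        if (nrels > 1)
            qsort(rels, nrels, sizeof(RelFileNode), rnode_comparator);

        for (int i = 0; i < nbuffers; i++)
        {
            int match;

            if (nrels == 1)
                match = rnode_comparator(&buftags[i], &rels[0]) == 0;
            else
                match = bsearch(&buftags[i], rels, nrels,
                                sizeof(RelFileNode),
                                rnode_comparator) != NULL;
            if (match)
                hits++;         /* real code would invalidate the buffer */
        }
        return hits;
    }

    int
    main(void)
    {
        RelFileNode buftags[] = {
            {1663, 1, 16384}, {1663, 1, 16390}, {1663, 1, 16401}
        };
        RelFileNode dropped[] = { {1663, 1, 16401}, {1663, 1, 16384} };

        printf("buffers to invalidate: %d\n",
               drop_buffers_for_rels(buftags, 3, dropped, 2));
        return 0;
    }

[Where the crossover between the two strategies lies depends on
shared_buffers size and CPU/RAM speed, which is why, as noted in the
thread, a fixed cutoff is hard to pick.]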