Re: PATCH: optimized DROP of multiple tables within a transaction - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: PATCH: optimized DROP of multiple tables within a transaction
Date	December 10, 2012 15:38:06
Msg-id	20121210153801.GA16200@awork2.anarazel.de
In response to	Re: PATCH: optimized DROP of multiple tables within a transaction (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: PATCH: optimized DROP of multiple tables within a transaction
List	pgsql-hackers

On 2012-12-08 17:07:38 +0100, Tomas Vondra wrote:
> On 8.12.2012 15:49, Tomas Vondra wrote:
> > On 8.12.2012 15:26, Andres Freund wrote:
> >> On 2012-12-06 23:38:59 +0100, Tomas Vondra wrote:
> >>> I've re-run the tests with the current patch on my home workstation, and
> >>> the results are these (again 10k tables, dropped either one-by-one or in
> >>> batches of 100).
> >>>
> >>> 1) unpatched
> >>>
> >>> dropping one-by-one:        15.5 seconds
> >>> dropping in batches of 100: 12.3 sec
> >>>
> >>> 2) patched (v3.1)
> >>>
> >>> dropping one-by-one:        32.8 seconds
> >>> dropping in batches of 100:  3.0 sec
> >>>
> >>> The problem here is that when dropping the tables one-by-one, the
> >>> bsearch overhead is tremendous and significantly increases the runtime.
> >>> I've done a simple check (if dropping a single table, use the original
> >>> simple comparison) and I got this:
> >>>
> >>> 3) patched (v3.2)
> >>>
> >>> dropping one-by-one:        16.0 seconds
> >>> dropping in batches of 100:  3.3 sec
> >>>
> >>> i.e. the best of both. But it seems like an unnecessary complexity to me
> >>> - if you need to drop a lot of tables you'll probably do that in a
> >>> transaction anyway.
> >>>
> >>
> >> Imo that's still a pretty bad performance difference. And your
> >> single-table optimization will probably fall short as soon as the table
> >> has indexes and/or a toast table...
> >
> > > Why? I haven't checked the code but either those objects are dropped
> > one-by-one (which seems unlikely) or they're added to the pending list
> > and then the new code will kick in, making it actually faster.
> >
> > Yes, the original code might be faster even for 2 or 3 objects, but I
> > can't think of a simple solution to fix this, given that it really
> > depends on CPU/RAM speed and shared_buffers size.
>
> I've done some tests and yes - once there are other objects the
> optimization falls short. For example for tables with one index, it
> looks like this:
>
>   1) unpatched
>
>   one by one:  28.9 s
>   100 batches: 23.9 s
>
>   2) patched
>
>   one by one:  44.1 s
>   100 batches:  4.7 s
>
> So the patched code is about 50% slower, but this difference quickly
> disappears with the number of indexes / toast tables etc.
>
> I see this as an argument AGAINST such special-case optimization. My
> reasoning is this:
>
> * This difference is significant only if you're dropping a table with
>   a low number of indexes / toast tables. In reality this is not going
>   to be very frequent.
>
> * If you're dropping a single table, it really does not matter - the
>   difference will be like 100 ms vs. 200 ms or something like that.

I don't particularly buy that argument. There are good reasons (like
avoiding deadlocks and long transactions) to drop multiple tables
in individual transactions.
Not that I have a good plan for how to work around that, though :(
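
To make the trade-off concrete, here is a minimal C sketch of the idea from
the quoted v3.2 description: a plain comparison when a single relation is
dropped, and a sorted array plus bsearch when several are dropped at once.
RelId, relid_cmp and count_buffers_to_drop are hypothetical names invented
for illustration, and the flat buffer_owner array is only a stand-in for the
shared buffer headers. This is not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a relation identifier (not a PostgreSQL type). */
typedef unsigned int RelId;

/* Three-way comparator usable by both qsort() and bsearch(). */
static int
relid_cmp(const void *a, const void *b)
{
    RelId x = *(const RelId *) a;
    RelId y = *(const RelId *) b;

    return (x > y) - (x < y);
}

/*
 * Count the entries of buffer_owner[] that belong to one of the relations
 * in rels[]. rels[] is sorted in place when the bsearch path is taken.
 */
static size_t
count_buffers_to_drop(const RelId *buffer_owner, size_t nbuffers,
                      RelId *rels, size_t nrels)
{
    size_t hits = 0;
    size_t i;

    assert(nrels > 0);

    if (nrels == 1)
    {
        /* Single relation: one integer comparison per buffer. */
        for (i = 0; i < nbuffers; i++)
            if (buffer_owner[i] == rels[0])
                hits++;
        return hits;
    }

    /* Several relations: sort once, then one bsearch call per buffer. */
    qsort(rels, nrels, sizeof(RelId), relid_cmp);
    for (i = 0; i < nbuffers; i++)
        if (bsearch(&buffer_owner[i], rels, nrels,
                    sizeof(RelId), relid_cmp) != NULL)
            hits++;
    return hits;
}
```

In this sketch the single-relation path never calls bsearch, which is the
overhead blamed for the one-by-one slowdown in the quoted numbers. It also
shows the limitation raised above: a table with one index already has two
relations to drop, so it takes the bsearch path.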

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
