Re: PATCH: optimized DROP of multiple tables within a transaction - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: PATCH: optimized DROP of multiple tables within a transaction
Date:
Msg-id: 50C365CA.4070408@fuzzy.cz
In response to: Re: PATCH: optimized DROP of multiple tables within a transaction (Tomas Vondra <tv@fuzzy.cz>)
Responses:
  Re: PATCH: optimized DROP of multiple tables within a transaction
  Re: PATCH: optimized DROP of multiple tables within a transaction
List: pgsql-hackers
On 8.12.2012 15:49, Tomas Vondra wrote:
> On 8.12.2012 15:26, Andres Freund wrote:
>> On 2012-12-06 23:38:59 +0100, Tomas Vondra wrote:
>>> I've re-run the tests with the current patch on my home workstation, and
>>> the results are these (again 10k tables, dropped either one-by-one or in
>>> batches of 100).
>>>
>>> 1) unpatched
>>>
>>>    dropping one-by-one: 15.5 seconds
>>>    dropping in batches of 100: 12.3 sec
>>>
>>> 2) patched (v3.1)
>>>
>>>    dropping one-by-one: 32.8 seconds
>>>    dropping in batches of 100: 3.0 sec
>>>
>>> The problem here is that when dropping the tables one-by-one, the
>>> bsearch overhead is tremendous and significantly increases the runtime.
>>> I've done a simple check (if dropping a single table, use the original
>>> simple comparison) and I got this:
>>>
>>> 3) patched (v3.2)
>>>
>>>    dropping one-by-one: 16.0 seconds
>>>    dropping in batches of 100: 3.3 sec
>>>
>>> i.e. the best of both. But it seems like unnecessary complexity to me -
>>> if you need to drop a lot of tables you'll probably do that in a
>>> transaction anyway.
>>
>> Imo that's still a pretty bad performance difference. And your
>> single-table optimization will probably fall short as soon as the table
>> has indexes and/or a toast table...
>
> Why? I haven't checked the code, but either those objects are dropped
> one-by-one (which seems unlikely) or they're added to the pending list,
> and then the new code kicks in, making it actually faster.
>
> Yes, the original code might be faster even for 2 or 3 objects, but I
> can't think of a simple solution to fix this, given that it really
> depends on CPU/RAM speed and shared_buffers size.

I've done some tests and yes - once there are other objects, the
optimization falls short.
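For context, the single-table fast path from v3.2 can be sketched roughly like this. Everything below is a simplified illustration, not the actual patch: the struct is a stand-in for PostgreSQL's RelFileNode (the real patch works with RelFileNodeBackend), and `pending_contains` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for PostgreSQL's RelFileNode: three OIDs, no padding. */
typedef unsigned int Oid;

typedef struct RelFileNode
{
	Oid spcNode;  /* tablespace */
	Oid dbNode;   /* database */
	Oid relNode;  /* relation */
} RelFileNode;

/* memcmp-based comparator; valid because the struct has no padding. */
static int
rnode_comparator(const void *p1, const void *p2)
{
	return memcmp(p1, p2, sizeof(RelFileNode));
}

/*
 * Hypothetical lookup: is "target" among the nodes pending unlink?
 * For a single pending node, a direct comparison avoids the bsearch()
 * call overhead (the v3.2 fast path).  For larger arrays the caller is
 * assumed to have sorted "nodes" with qsort() using the same comparator,
 * so bsearch() applies.
 */
static int
pending_contains(const RelFileNode *nodes, size_t nnodes,
				 const RelFileNode *target)
{
	if (nnodes == 1)
		return rnode_comparator(target, &nodes[0]) == 0;

	return bsearch(target, nodes, nnodes, sizeof(RelFileNode),
				   rnode_comparator) != NULL;
}
```

The key invariant is that qsort() and bsearch() must use the same comparator; the single-element branch merely skips the function-pointer and binary-search machinery when there is nothing to search.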
For example, for tables with one index, it looks like this:

1) unpatched

   one by one: 28.9 s
   100 batches: 23.9 s

2) patched

   one by one: 44.1 s
   100 batches: 4.7 s

So the patched code is about 50% slower for one-by-one drops, but this
difference quickly disappears as the number of indexes / toast tables
etc. grows.

I see this as an argument AGAINST such special-case optimization. My
reasoning is this:

* The difference is significant only if you're dropping tables with a
  low number of indexes / toast tables. In reality this is not going to
  be very frequent.

* If you're dropping a single table, it really does not matter - the
  difference will be like 100 ms vs. 200 ms or something like that.

* If you're dropping a lot of tables, you should do that in a
  transaction anyway, or you should be aware that doing it in a
  transaction will be faster (we can add this info to the DROP TABLE
  docs). IMHO this is similar to doing a lot of INSERTs - doing all of
  them in a single transaction is faster.

The possibility that someone needs to drop a lot of tables in separate
transactions exists in theory, but I don't know of a realistic
real-world use case. So I'd vote to ditch that special-case
optimization.

>>> +
>>> +/*
>>> + * Used to sort relfilenode array (ordered by [relnode, dbnode,
>>> + * spcnode]), so that it's suitable for bsearch.
>>> + */
>>> +static int
>>> +rnode_comparator(const void * p1, const void * p2)
>>> +{
>>> +	RelFileNodeBackend n1 = * (RelFileNodeBackend *) p1;
>>> +	RelFileNodeBackend n2 = * (RelFileNodeBackend *) p2;
>>> +
>>> +	if (n1.node.relNode < n2.node.relNode)
>>> +		return -1;
>>> +	else if (n1.node.relNode > n2.node.relNode)
>>> +		return 1;
>>> +
>>> +	if (n1.node.dbNode < n2.node.dbNode)
>>> +		return -1;
>>> +	else if (n1.node.dbNode > n2.node.dbNode)
>>> +		return 1;
>>> +
>>> +	if (n1.node.spcNode < n2.node.spcNode)
>>> +		return -1;
>>> +	else if (n1.node.spcNode > n2.node.spcNode)
>>> +		return 1;
>>> +	else
>>> +		return 0;
>>> +}
>>
>> ISTM that this whole comparator could be replaced by memcmp(). That
>> could quite possibly lower the overhead of the bsearch() in simple
>> cases. We already rely on the fact that RelFileNode's have no padding
>> atm (c.f. buf_internal.h), so a memcmp() seems fine to me.
>
> Good point, I'll try that. The original code used a macro that simply
> compared the fields and I copied that logic, but if we can remove the
> code ...

Hmm, I've replaced the comparator with this one:

static int
rnode_comparator(const void * p1, const void * p2)
{
	return memcmp(p1, p2, sizeof(RelFileNode));
}

and while it's not significantly faster (actually it's often a bit
slower than the original comparator), it's much simpler code.

Tomas
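As a standalone illustration of the two comparator styles discussed above (again with a simplified stand-in struct, not the real RelFileNodeBackend): the field-wise version imposes the explicit (relNode, dbNode, spcNode) order, while the memcmp() version compares raw bytes in declaration order and is endianness-dependent, so it yields a different (but still valid) total order. For qsort()+bsearch() all that matters is that both use the same comparator.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for PostgreSQL's RelFileNode: three OIDs, no padding.
 * The no-padding assumption is exactly what makes memcmp() legitimate here. */
typedef unsigned int Oid;

typedef struct RelFileNode
{
	Oid spcNode;  /* tablespace */
	Oid dbNode;   /* database */
	Oid relNode;  /* relation */
} RelFileNode;

/* Field-wise comparator, ordered by (relNode, dbNode, spcNode). */
static int
rnode_comparator_fields(const void *p1, const void *p2)
{
	const RelFileNode *n1 = (const RelFileNode *) p1;
	const RelFileNode *n2 = (const RelFileNode *) p2;

	if (n1->relNode != n2->relNode)
		return (n1->relNode < n2->relNode) ? -1 : 1;
	if (n1->dbNode != n2->dbNode)
		return (n1->dbNode < n2->dbNode) ? -1 : 1;
	if (n1->spcNode != n2->spcNode)
		return (n1->spcNode < n2->spcNode) ? -1 : 1;
	return 0;
}

/* memcmp-based comparator: byte order, endianness-dependent, but a
 * consistent total order - which is all bsearch() requires. */
static int
rnode_comparator_memcmp(const void *p1, const void *p2)
{
	return memcmp(p1, p2, sizeof(RelFileNode));
}
```

Both comparators agree on equality (same bytes compare equal either way); they only disagree on how unequal nodes are ordered, which is invisible to a bsearch() lookup as long as the array was sorted with the same function.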