Avoiding tuple construction/deconstruction during joining - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Avoiding tuple construction/deconstruction during joining |
Date | |
Msg-id | 14577.1110821062@sss.pgh.pa.us Whole thread Raw |
In response to | How to read query plan (Miroslav Šulc <miroslav.sulc@startnet.cz>) |
Responses |
Re: Avoiding tuple construction/deconstruction during joining
Re: Avoiding tuple construction/deconstruction during joining Re: Avoiding tuple construction/deconstruction during joining |
List | pgsql-performance |
=?ISO-8859-2?Q?Miroslav_=A9ulc?= <miroslav.sulc@startnet.cz> writes: > [ concerning a deeply nested LEFT JOIN to get data from a star schema ] > So I have some results. I have tested the query on both PostgreSQL 8.0.1 > and MySQL 4.1.8 with LIMIT set to 30 and OFFSET set to 6000. PostgreSQL > result is 11,667.916 ms, MySQL result is 448.4 ms. That's a fairly impressive discrepancy :-(, and even the slot_getattr() patch that Atsushi Ogawa provided isn't going to close the gap. (I got about a 4x speedup on Miroslav's example in my testing, which leaves us still maybe 6x slower than MySQL.) Looking at the post-patch profile for the test case, there is still quite a lot of cycles going into tuple assembly and disassembly: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 24.47 4.49 4.49 _mcount 8.01 5.96 1.47 9143692 0.00 0.00 ExecEvalVar 6.92 7.23 1.27 6614373 0.00 0.00 slot_deformtuple 6.54 8.43 1.20 9143692 0.00 0.00 slot_getattr 6.21 9.57 1.14 103737 0.01 0.03 ExecTargetList 5.56 10.59 1.02 103775 0.01 0.01 DataFill 3.22 11.18 0.59 103775 0.01 0.01 ComputeDataSize 2.83 11.70 0.52 ExecEvalVar 2.72 12.20 0.50 9094122 0.00 0.00 memcpy 2.51 12.66 0.46 encore 2.40 13.10 0.44 427448 0.00 0.00 nocachegetattr 2.13 13.49 0.39 103775 0.00 0.02 heap_formtuple 2.07 13.87 0.38 noshlibs 1.20 14.09 0.22 225329 0.00 0.00 _doprnt 1.20 14.31 0.22 msquadloop 1.14 14.52 0.21 chunks 0.98 14.70 0.18 871885 0.00 0.00 AllocSetAlloc 0.98 14.88 0.18 $$dyncall 0.76 15.02 0.14 594242 0.00 0.00 FunctionCall3 0.71 15.15 0.13 213312 0.00 0.00 comparetup_heap 0.65 15.27 0.12 6364 0.02 0.13 printtup 0.60 15.38 0.11 790702 0.00 0.00 pfree (_mcount is profiling overhead, ignore it.) It looks to me like just about everything in the top dozen functions is there as a result of the fact that join steps form new tuples that are the merge of their input tuples. Even our favorite villains, palloc and pfree, are down in the sub-percent range. I am guessing that the reason MySQL wins on this is that they avoid doing any data copying during a join step. I wonder whether we could accomplish the same by taking Ogawa's patch to the next level: allow a TupleTableSlot to contain either a "materialized" tuple as now, or a "virtual" tuple that is simply an array of Datums and null flags. (It's virtual in the sense that any pass-by-reference Datums would have to be pointing to data at the next level down.) This would essentially turn the formtuple and deformtuple operations into no-ops, and get rid of a lot of the associated overhead such as ComputeDataSize and DataFill. The only operations that would have to forcibly materialize a tuple would be ones that need to keep the tuple till after they fetch their next input tuple --- hashing and sorting are examples, but very many plan node types don't ever need to do that. I haven't worked out the details, but it seems likely that this could be a relatively nonintrusive patch. The main thing that would be an issue would be that direct reference to slot->val would become verboten (since you could no longer be sure there was a materialized tuple there). I think this would possibly affect some contrib stuff, which is a strong hint that it'd break some existing user-written code out there. Thoughts? regards, tom lane
pgsql-performance by date: