Thread: tuplestore_putvalues()
Attached is a patch that allows an array of Datums + nulls to be inserted into a tuplestore without first creating a HeapTuple, per recent suggestion on -hackers. This avoids making an unnecessary copy. There isn't a really analogous optimization to be applied to tuplesort: it takes either a TTS, an IndexTuple or a basic Datum, none of which involve an extra copy. BTW, I notice that almost all of the callers of the various tuplestore_put methods switch into the tuplestore's context first. We could simplify their lives a bit by having the tuplestore remember the context in which it is allocated and do the switch itself. Perhaps it's not worth bothering with at this point, though. -Neil
Attachment
Neil Conway <neilc@samurai.com> writes: > Attached is a patch that allows an array of Datums + nulls to be > inserted into a tuplestore without first creating a HeapTuple, per > recent suggestion on -hackers. This avoids making an unnecessary copy. A small thought here: we were jousting recently over a point that came down to whether or not tuplestore kept track of the tupdesc for the tuples it was storing. I can hardly imagine a use-case for a tuplestore in which the tuples don't all have the same tupdesc. I think I dropped tupdesc from tuplestore's original API on the grounds that it wasn't doing anything much with the tupdesc. But now this patch adds back a tuplestore API call that needs the tupdesc. Would it be saner to supply the tupdesc to tuplestore_begin_heap instead, as tuplesort does? I haven't looked at all into what the implications of this would be, either from a performance or number-of-places-to-change standpoint. But it seems worth a bit of investigation while we're touching the code. Other than that issue, the patch seems OK in a quick once-over. regards, tom lane
Neil Conway <neilc@samurai.com> writes: > Attached is a patch that allows an array of Datums + nulls to be > inserted into a tuplestore without first creating a HeapTuple, per > recent suggestion on -hackers. This avoids making an unnecessary copy. > There isn't a really analogous optimization to be applied to tuplesort: > it takes either a TTS, an IndexTuple or a basic Datum, none of which > involve an extra copy. After a quick read, looks sane except for one stylistic gripe: in exec_stmt_return_next, you added an initialization of tuple = NULL in order to remove a couple of lines like tuple = NULL; /* keep compiler quiet */ I think this is not good coding style. The point of not having the initialization is so that the compiler will warn us if there are any paths through the function in which we fail to set "tuple". You've just disabled that bit of early-warning error detection. It's better if each switch arm has to set tuple for itself. (If only a minority of them needed to do it, I might think differently, but in this case I'd vote for sticking with the way it's being done.) regards, tom lane
On Sat, 2008-03-22 at 21:35 -0400, Tom Lane wrote: > After a quick read, looks sane except for one stylistic gripe: > in exec_stmt_return_next, you added an initialization of tuple = NULL > in order to remove a couple of lines like > > tuple = NULL; /* keep compiler quiet */ > > I think this is not good coding style. The point of not having the > initialization is so that the compiler will warn us if there are > any paths through the function in which we fail to set "tuple". > You've just disabled that bit of early-warning error detection. > It's better if each switch arm has to set tuple for itself. > (If only a minority of them needed to do it, I might think > differently, but in this case I'd vote for sticking with the > way it's being done.) I agree with your general reasoning, but in this case, it seemed cleaner to initialize "tuple" when it is declared: only 2 of the 6 branches in the function actually initialize tuple to a non-NULL value. I think the warn-about-uninitialized-value argument doesn't hold much water in this particular case, either: each branch of the function either makes use of "tuple", or does not. When "tuple" is utilized, we want to push it into the tuplestore; otherwise, we want to do nothing. Since "tuple" is only used in two cases, it seems cleaner to me to have the default be "do nothing", and only deviate from it where needed. -Neil
On Thu, 2008-02-28 at 16:37 -0800, Neil Conway wrote: > Attached is a patch that allows an array of Datums + nulls to be > inserted into a tuplestore without first creating a HeapTuple, per > recent suggestion on -hackers. This avoids making an unnecessary copy. Applied to HEAD. -Neil