Re: Rethinking TupleTableSlot deforming - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Rethinking TupleTableSlot deforming
Date
Msg-id 6941.1469214032@sss.pgh.pa.us
In response to Re: Rethinking TupleTableSlot deforming  (Andres Freund <andres@anarazel.de>)
Responses Re: Rethinking TupleTableSlot deforming  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2016-07-22 10:09:18 -0400, Tom Lane wrote:
>> I'm really suspicious of this line of argument as well.  It's possible
>> that if you only consider all-fixed-width, never-null columns, it might
>> look like deforming the columns before the one you need is a waste of
>> effort.  But as soon as you relax either of those assumptions, you have
>> to crawl over the earlier columns anyway, and saving aside the results
>> is going to be close to free.

> Not according to my measurements. And that doesn't seem that
> surprising. For a null check you'll need to access (not cheaply so!) the
> null bitmap, to skip a varlena datum one needs to access the varlena
> header. Copying the actual datum, *especially* in the varlena case, is
> more expensive than that; especially because otherwise you'll often not
> have to touch the source cachelines at all.

But that's nonsense.  We don't copy varlena datums during tuple deforming,
only save aside pointers to them.  And we would have to access the
column's null bit and varlena header in any case if our goal is to find a
column located later.  Yeah, we could skip physically storing the pointer
into the slot's Datum array, but not much more than that.
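
To make that concrete, here is a minimal toy sketch in C. It is not
PostgreSQL's actual deforming code: the tuple layout (a one-byte not-null
bitmap, unaligned int32 columns, and varlena-like columns with a one-byte
length header) and every name in it (ToyTuple, ToySlot, toy_deform,
toy_locate) are assumptions invented for illustration. It shows that
locating a later column and deforming up to it walk the same null bits and
length headers; the deforming variant only adds a pointer store per column.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ATTS 8              /* toy limit: one bitmap byte */

typedef enum { ATT_INT32, ATT_VARLEN } AttKind;

/* Hypothetical tuple: packed column data plus a not-null bitmap. */
typedef struct {
    const uint8_t *data;        /* column bytes, no alignment padding */
    uint8_t        notnull;     /* bit i set => column i is present */
} ToyTuple;

/* Hypothetical slot: holds pointers into the tuple, never copies. */
typedef struct {
    const uint8_t *values[MAX_ATTS];
    int            isnull[MAX_ATTS];
    int            nvalid;      /* number of columns deformed so far */
    size_t         off;         /* data offset where deforming stopped */
} ToySlot;

/* Bytes occupied by a non-null column starting at data[off]. */
size_t toy_att_len(const ToyTuple *tup, AttKind kind, size_t off)
{
    if (kind == ATT_INT32)
        return sizeof(int32_t);
    return 1 + (size_t) tup->data[off];   /* header byte + payload */
}

/* Deform columns [slot->nvalid, upto), saving aside a pointer to each. */
void toy_deform(const ToyTuple *tup, const AttKind *kinds, int upto,
                ToySlot *slot)
{
    size_t off = slot->off;

    assert(upto <= MAX_ATTS);
    for (int i = slot->nvalid; i < upto; i++) {
        if (!(tup->notnull & (1u << i))) {        /* null-bit access */
            slot->isnull[i] = 1;
            slot->values[i] = NULL;
            continue;
        }
        slot->isnull[i] = 0;
        slot->values[i] = tup->data + off;        /* the only extra work */
        off += toy_att_len(tup, kinds[i], off);   /* header access */
    }
    if (upto > slot->nvalid)
        slot->nvalid = upto;
    slot->off = off;
}

/* Find column 'target' without storing anything about earlier columns. */
const uint8_t *toy_locate(const ToyTuple *tup, const AttKind *kinds,
                          int target)
{
    size_t off = 0;

    assert(target < MAX_ATTS);
    for (int i = 0; i < target; i++) {
        if (!(tup->notnull & (1u << i)))          /* same null-bit access */
            continue;
        off += toy_att_len(tup, kinds[i], off);   /* same header access */
    }
    if (!(tup->notnull & (1u << target)))
        return NULL;
    return tup->data + off;
}
```

The two loops differ only in the stores into values[] and isnull[].
Whether those stores are measurably expensive on real tuples is exactly
the point in dispute, and this sketch does not settle it.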

>> Maybe we could even go further, and require the planner
>> to always set up the input so that the sort/group columns are exactly 1..N
>> in order, removing the need for the executor to cope with any variation.

> I guess that columns both returned to the upper node, and involved in
> sorting would make it hard to entirely and efficiently rely on that,
> without forcing more expensive projections. But it's probably worth
> tinkering with.

Well, it's a question of whether an extra projection at the scan level is
worth the savings in column access during the sort or group stage.  My gut
feeling is that it very likely would be a win for a multicolumn sort.
(For a single sort key column we already amortize fetching of the column
datum, so maybe not so much in that case.)

Whether the column is needed at upper levels doesn't seem relevant to me.
setrefs.c will find it wherever it is.
        regards, tom lane


