Rethinking TupleTableSlot deforming - Mailing list pgsql-hackers

From Andres Freund
Subject Rethinking TupleTableSlot deforming
Msg-id 20160722015605.hpthk7axm6sx2mur@alap3.anarazel.de
List pgsql-hackers
Hi,

I've previously mentioned (e.g. [1]) that tuple deforming is a serious
bottleneck. I've also experimented successfully [2] with making
slot_deform_tuple() faster.

But nonetheless, tuple deforming is still a *major* bottleneck in many
cases, if not *the* major bottleneck.

We could partially address that by JITing the work slot_deform_tuple
does. Various people have, with good but not raving success, played with
that.

Alternatively/Additionally we can change the tuple format to make
deforming faster.


But I think the bigger issue than the above is actually that we're just
performing a lot of useless work in a number of common scenarios. We're
always deforming all columns up to the one needed. Very often that's a
lot of useless work.  I've experimented with selectively replacing
slot_getattr() calls with heap_getattr(), and for some queries that can
yield massive speedups. And obviously significant slowdowns in others.
The speedups occur even when preceding columns are varlena and/or
contain nulls.
I.e. a good chunk of the problem is storing the results of deforming,
not accessing the data.
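
To make the distinction concrete, here is a minimal, self-contained
sketch. It is a toy model, not PostgreSQL code: the ToyTuple/ToySlot
types, the toy_* functions and the length-prefixed attribute layout are
all made up for illustration, and the two functions are only loosely
modeled on what slot_getattr() and heap_getattr() do. The point is just
that both have to walk over the preceding attributes, but only the
slot-style path also stores a result for each of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ATTS 8

/* Hypothetical tuple: each attribute is a 1-byte length plus payload,
 * so an attribute's offset is only known after walking its predecessors. */
typedef struct ToyTuple
{
    const uint8_t *data;
    int            natts;
} ToyTuple;

/* Hypothetical slot caching deformed attributes (not TupleTableSlot). */
typedef struct ToySlot
{
    const ToyTuple *tuple;
    int             nvalid;     /* attributes deformed so far */
    size_t          off;        /* offset at which to resume deforming */
    const uint8_t  *values[TOY_MAX_ATTS];
    uint8_t         lengths[TOY_MAX_ATTS];
    long            stores;     /* instrumentation: stored results */
} ToySlot;

/* Slot-style access: deform and store every attribute up to attnum
 * (1-based), then return the requested one from the cache. */
static const uint8_t *
toy_slot_getattr(ToySlot *slot, int attnum, uint8_t *len)
{
    assert(attnum >= 1 && attnum <= slot->tuple->natts &&
           attnum <= TOY_MAX_ATTS);
    while (slot->nvalid < attnum)
    {
        const uint8_t *p = slot->tuple->data + slot->off;

        slot->lengths[slot->nvalid] = p[0];
        slot->values[slot->nvalid] = p + 1;
        slot->off += 1 + (size_t) p[0];
        slot->nvalid++;
        slot->stores++;
    }
    *len = slot->lengths[attnum - 1];
    return slot->values[attnum - 1];
}

/* Direct access: walk past the preceding attributes without storing
 * anything, and return only the requested attribute (1-based). */
static const uint8_t *
toy_heap_getattr(const ToyTuple *tuple, int attnum, uint8_t *len)
{
    const uint8_t *p = tuple->data;

    assert(attnum >= 1 && attnum <= tuple->natts);
    for (int i = 1; i < attnum; i++)
        p += 1 + (size_t) p[0];
    *len = p[0];
    return p + 1;
}
```

In this toy, fetching the fourth attribute through the slot path stores
four results, while the direct path stores none. The flip side, which
matches the slowdowns mentioned above, is that the direct path re-walks
the tuple on every call, whereas the slot path pays only once.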


ISTM, we need to change slots so that they contain information about
which columns are interesting. For the hot paths we'd then only ever
allow access to those columns, and we'd only ever deform them.  Combined
with the approach in [2] that allows us to deform tuples a lot more
efficiently.

What I'm basically thinking is that expression evaluation would always
make sure the slots have computed the relevant column set, and deform at
the beginning. There are some cases where we likely would still need to
fall back to a slower path (e.g. whole row refs), but that seems fine.
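
A rough sketch of what "slots know which columns are interesting" could
look like, again as a toy model under stated assumptions: ToyColSlot,
the 32-column bitmask and the toy_* functions are hypothetical, and the
length-prefixed attribute layout is invented for the example. Setup
code marks the needed columns once; deforming then walks the tuple a
single time, stores only the marked columns, and stops after the last
one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ATTS 32

/* Hypothetical slot that remembers which columns will be accessed. */
typedef struct ToyColSlot
{
    uint32_t       interesting;  /* bit i set => column i (0-based) needed */
    int            last_needed;  /* highest needed column, -1 if none */
    const uint8_t *values[TOY_MAX_ATTS];
    uint8_t        lengths[TOY_MAX_ATTS];
    int            stores;       /* instrumentation: stored results */
} ToyColSlot;

static void
toy_slot_init(ToyColSlot *slot)
{
    memset(slot, 0, sizeof(*slot));
    slot->last_needed = -1;
}

/* Called while setting up expressions, once per referenced column. */
static void
toy_slot_mark_interesting(ToyColSlot *slot, int col)
{
    assert(col >= 0 && col < TOY_MAX_ATTS);
    slot->interesting |= (uint32_t) 1 << col;
    if (col > slot->last_needed)
        slot->last_needed = col;
}

/* Deform once per tuple. Attributes are a 1-byte length plus payload,
 * so every column up to last_needed is skipped over, but only the
 * interesting ones are stored. */
static void
toy_slot_deform_interesting(ToyColSlot *slot, const uint8_t *data, int natts)
{
    const uint8_t *p = data;

    assert(slot->last_needed < natts);
    for (int col = 0; col <= slot->last_needed; col++)
    {
        if (slot->interesting & ((uint32_t) 1 << col))
        {
            slot->lengths[col] = p[0];
            slot->values[col] = p + 1;
            slot->stores++;
        }
        p += 1 + (size_t) p[0];
    }
}
```

A whole-row reference would not fit this scheme as-is; in the toy it
would amount to marking every column, which is the slower fallback path
mentioned above.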

That then also allows us to nearly always avoid the slot_getattr() call,
and instead look at tts_values/nulls directly. The checks slot_getattr()
performs, and the call itself, are quite expensive.


What I'm thinking about is
a) a new ExecInitExpr()/ExecBuildProjectionInfo() which always computes
   a set of interesting columns.
b) replacing all accesses to tts_values/isnull with an inline
   function. In optimized builds that function won't do anything but
   reference the relevant element, but in assert enabled builds it'd
   check whether said column is actually known to be accessed.
c) Make ExecEvalExpr(), ExecProject(), ExecQual() (and perhaps some
   other places) call the new deforming function which ensures the
   relevant columns are available.
d) Replace nearly all slot_getattr/slot_getsomeattrs calls with the
   function introduced in b).
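
The accessor from b) could look roughly like the sketch below. This is
a toy under stated assumptions: ToyAccessSlot, ToyDatum and the toy_*
names are hypothetical, and it uses the standard C assert() (compiled
out with NDEBUG) as a stand-in for PostgreSQL's Assert(), which is only
active in assert enabled builds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ATTS 32

typedef uintptr_t ToyDatum;     /* stand-in for Datum */

/* Hypothetical slot tracking which columns have been deformed. */
typedef struct ToyAccessSlot
{
    int      natts;             /* must be <= TOY_MAX_ATTS */
    uint32_t deformed;          /* bit i set => values[i]/isnull[i] valid */
    ToyDatum values[TOY_MAX_ATTS];
    bool     isnull[TOY_MAX_ATTS];
} ToyAccessSlot;

/* True if column col (0-based) is in range and has been deformed. */
static inline bool
toy_slot_attr_available(const ToyAccessSlot *slot, int col)
{
    return col >= 0 && col < slot->natts &&
        ((slot->deformed >> col) & 1u) != 0;
}

/* With NDEBUG defined this is just two array loads; otherwise it also
 * verifies that the column was declared interesting and deformed. */
static inline ToyDatum
toy_slot_value(const ToyAccessSlot *slot, int col, bool *isnull)
{
    assert(toy_slot_attr_available(slot, col));
    *isnull = slot->isnull[col];
    return slot->values[col];
}
```

Callers on the hot path would use toy_slot_value() in place of a
slot_getattr()-style call, relying on the deforming step from c) having
already made the column available.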

To me it seems this work will be a good bit easier once [2] is actually
implemented instead of prototyped, because treating ExecInitExpr()
non-recursively makes it possible to build such 'column sets' more
easily / naturally.


Comments? Alternative suggestions?


Greetings,

Andres Freund

[1] http://archives.postgresql.org/20160624232953.beub22r6yqux4gcp@alap3.anarazel.de
[2] http://archives.postgresql.org/message-id/20160714011850.bd5zhu35szle3n3c%40alap3.anarazel.de


