Re: [PATCH] Use optimized single-datum tuplesort in ExecSort - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Date
Msg-id CAApHDvprO=D-R63bdcw6OE1zhcqme2C5AAPhE=Qj8_zKJWG_Ww@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use optimized single-datum tuplesort in ExecSort  (James Coleman <jtc331@gmail.com>)
Responses Re: [PATCH] Use optimized single-datum tuplesort in ExecSort  (Ranier Vilela <ranier.vf@gmail.com>)
Re: [PATCH] Use optimized single-datum tuplesort in ExecSort  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Re: [PATCH] Use optimized single-datum tuplesort in ExecSort  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Fri, 16 Jul 2021 at 01:44, James Coleman <jtc331@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 9:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
> >
> > On Thu, 15 Jul 2021 at 12:30, Ranier Vilela <ranier.vf@gmail.com> wrote:
> > >
> > > Em qua., 14 de jul. de 2021 às 21:21, David Rowley <dgrowleyml@gmail.com> escreveu:
> > >> But, in v8 there is no additional branch, so no branch to mispredict.
> > >> I don't really see how your explanation fits.
> > >
> > > In v8 the branch occurs at :
> > > + if (ExecGetResultType(outerPlanState(sortstate))->natts == 1)
> >
> > You do know that branch is in a function that's only executed once
> > during executor initialization, right?
>
> This is why I have a hard time believing there's a "real" change here
> and not the result of either noise or something not really
> controllable like executable layout changing.

Yeah, I think we likely are at the level where layout changes in the
compiled code are going to make things hard to measure.  I just want
to make sure we're not going to end up with some regression that's
actual and not random depending on layout changes of unrelated code.
I think a branch that's taken consistently *should* be predicted
correctly each time.

Anyway, I think all the comparisons with v7b can safely be ignored. As
Ronan pointed out, v7b has some issues due to it not recording the
sort method in the executor state that leads to it forgetting which
method it used once we start pulling tuples from it. The reproductions
of that are it calling tuplesort_gettupleslot() from the 2nd tuple
onwards regardless of if we've done a datum or tuple sort.

Ronan's latest results plus John's make me think there's no need to
separate out the node function as I did in v8.  However, I do think v6
could learn a little from v8. I think I'd rather see the sort method
determined in ExecInitSort() rather than ExecSort(). I think
minimising those few extra instructions in ExecSort() might help the
L1 instruction cache.

David



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Git revision in tarballs
Next
From: Ranier Vilela
Date:
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort