Re: [HACKERS] WIP: Faster Expression Processing v4 - Mailing list pgsql-hackers

From Douglas Doole
Subject Re: [HACKERS] WIP: Faster Expression Processing v4
Date
Msg-id CADE5jYL7a8_b2LtwTJJk4ALLAKfCN1o9PWY3jS6D0e+3niOX8g@mail.gmail.com
In response to Re: [HACKERS] WIP: Faster Expression Processing v4  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] WIP: Faster Expression Processing v4  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres, sorry I haven't had a chance to look at this great stuff you've been doing. I've wanted to get to it, but work keeps getting in the way. ;-)

I do have one observation based on my experiments with your first version of the code. In my tests, I found that expression init becomes a lot more expensive in this new model. (That's neither a surprise, nor a concern.) In particular, the function ExprEvalPushStep() is quite hot. In my code I made the following changes:

  * Declare ExprEvalPushStep() "inline".
  * Remove the "if (es->steps_alloc == 0)" condition from ExprEvalPushStep().
  * In ExecInitExpr(), add:
       state->steps_alloc = 16;
       state->steps = palloc(sizeof(ExprEvalStep) * state->steps_alloc);

I found that this cut the cost of initializing the expression by about 20%. (Of course, that was on version 1 of your code, so the benefit may well be different now.)
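To make the three changes above concrete, here is a minimal standalone sketch of the idea. It is not the patch itself: the struct layouts are simplified stand-ins, malloc()/realloc() replace palloc()/repalloc() so it compiles outside the backend, and the names expr_state_init() and expr_push_step() are made up for illustration. Doubling on overflow is likewise an assumption about the growth policy.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the real structs; the field layout is assumed. */
typedef struct ExprEvalStep
{
    int         opcode;
} ExprEvalStep;

typedef struct ExprState
{
    ExprEvalStep *steps;        /* array of steps built during init */
    int         steps_len;      /* number of steps currently in use */
    int         steps_alloc;    /* number of steps allocated */
} ExprState;

#define INITIAL_STEPS_ALLOC 16

/*
 * Hypothetical init helper: allocate the step array up front, so the push
 * path never has to handle the steps_alloc == 0 case.
 */
static void
expr_state_init(ExprState *state)
{
    state->steps_len = 0;
    state->steps_alloc = INITIAL_STEPS_ALLOC;
    state->steps = malloc(sizeof(ExprEvalStep) * state->steps_alloc);
    if (state->steps == NULL)
        abort();
}

/*
 * Hypothetical push helper, declared inline.  The only branch left on the
 * hot path is the "array is full" check.
 */
static inline void
expr_push_step(ExprState *state, const ExprEvalStep *step)
{
    /* Precondition: expr_state_init() has already run. */
    assert(state->steps_alloc > 0);

    if (state->steps_len == state->steps_alloc)
    {
        state->steps_alloc *= 2;
        state->steps = realloc(state->steps,
                               sizeof(ExprEvalStep) * state->steps_alloc);
        if (state->steps == NULL)
            abort();
    }
    state->steps[state->steps_len++] = *step;
}
```

Expressions with 16 or fewer steps never reallocate in this sketch; whether 16 is the right starting size for the current patch would need measuring.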

On Tue, Mar 14, 2017 at 11:51 AM Andres Freund <andres@anarazel.de> wrote:
> > Hmm. Could we make the instructions variable size? It would allow packing
> > the small instructions even more tight, and we wouldn't need to obsess over
> > a particular maximum size for more complicated instructions.
>
> That makes jumps a lot more complicated.  I'd experimented with it and
> given it up as "not worth it".

Back when I was at IBM, I spent a lot of time doing stuff like this. If you want to commit with the fixed size arrays, I'm happy to volunteer to look at packing it tighter as a follow-on piece of work. (It was already on my list of things to try anyhow.)
 
> If we were to try to do so, we'd also
> not like storing the pointer and enum variants both, since it'd again
> reduce the density.
 
From my experience, it's worth the small loss in density to carry around both the pointer and the enum - it makes debugging so much easier.
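As a rough illustration of what I mean by carrying both, here is a small standalone sketch. Everything in it is hypothetical: the Step struct, the StepOp enum, and the use of a function pointer for dispatch are made up for the example and are not taken from the patch. The point is only that the enum costs a few bytes per step but lets a debugger or a log line show a readable opcode instead of a bare address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode enum, kept only for humans and debuggers. */
typedef enum StepOp
{
    STEP_CONST,
    STEP_ADD,
    STEP_DONE
} StepOp;

struct Step;
typedef int (*StepHandler) (const struct Step *step, int acc);

/* Hypothetical step layout carrying both the pointer and the enum. */
typedef struct Step
{
    StepHandler handler;        /* what the evaluator dispatches through */
    StepOp      op;             /* redundant for execution, handy in gdb */
    int         arg;
} Step;

static int
do_const(const Step *step, int acc)
{
    (void) acc;
    return step->arg;
}

static int
do_add(const Step *step, int acc)
{
    return acc + step->arg;
}

/* Turn the enum into a name for log messages or debugger output. */
static const char *
step_op_name(StepOp op)
{
    switch (op)
    {
        case STEP_CONST:
            return "CONST";
        case STEP_ADD:
            return "ADD";
        case STEP_DONE:
            return "DONE";
    }
    return "UNKNOWN";
}

/* Execution uses only the pointer; a NULL handler ends the program. */
static int
run_steps(const Step *steps)
{
    int         acc = 0;

    for (const Step *s = steps; s->handler != NULL; s++)
        acc = s->handler(s, acc);
    return acc;
}
```

With only the pointer stored, finding out which operation a step performs means mapping an address back to a symbol; with the enum alongside it, printing the op field of a step is enough.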

- Doug
Salesforce
