Re: WIP: expression evaluation improvements - Mailing list pgsql-hackers

From Andres Freund
Subject Re: WIP: expression evaluation improvements
Date
Msg-id 20211104234742.ao2qzqf5kpr2wgzn@alap3.anarazel.de
In response to Re: WIP: expression evaluation improvements  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: WIP: expression evaluation improvements
List pgsql-hackers
Hi,

I pushed a rebased (ugh, that was painful) version of the patches to
https://github.com/anarazel/postgres/tree/jit-relative-offsets

Besides rebasing I dropped a few patches and did some *minor* cleanup. Beyond
that there's one substantial improvement, namely that I got rid of one more
absolute pointer reference (in the aggregate steps).

The main sources of pointers that remain are FunctionCallInfo->{flinfo,
context}. There's also WindowFuncExprState->wfuncno (which isn't yet known at
"expression compile time"), but that's not too hard to solve differently.


On 2021-11-04 12:30:00 -0400, Robert Haas wrote:
> As a general comment, I think the patches could really benefit from
> more meaningful commit messages and more comments on individual
> functions. It would definitely help me review, and it might help other
> people review, or modify the code later.

Sure. I was mostly exploring what would be necessary to change expression
evaluation so that there are no absolute pointers in it. I still haven't
figured out all the necessary bits.


> For example, I'm looking at ExprEvalStep. If the intent here is that we
> don't want the union members to point to data that might differ from one
> execution of the plan to the next, it's surely important to mention that and
> explain to people who are trying to add steps later what they should do
> instead.  But I'm also not entirely sure that's the intended rule. It kind
> of surprises me that the only things that we'd be pointing to here that
> would fall into that category would be a bool, a NullableDatum, a
> NullableDatum array, and a FunctionCallInfo ... but I've been surprised by a
> lot of things that turned out to be true.

The immediate goal is to be able to generate JITed code/LLVM-IR that doesn't
contain any absolute pointer values. If the generated code doesn't change
regardless of any of the other contents of ExprEvalStep, we can still cache
the JIT optimization / code emission steps - which are the expensive bits.

With the exception of what I listed at the top, the types you listed really
are what's needed to avoid such pointer constants. The steps do contain more
data than that, but each piece is either constant (and thus can just be
embedded into the generated code), handled by passing the whole expression
step to an ExecEval* helper, or loaded from the ExprEvalStep at runtime
(although that makes the generated code slower).
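
To illustrate the shape of it (a hypothetical helper spelled out as a
function here; the actual code uses macros like RELPTR_RESOLVE(), and the
struct contents may differ), the idea is to store an offset into the
ExprState's mutable area instead of an absolute pointer, and resolve it
against a base pointer at execution time:

    /* sketch only; assumes PostgreSQL's uint32 / NullableDatum */
    typedef struct RelNullableDatum
    {
        uint32      off;        /* offset from the ExprState's mutable area */
    } RelNullableDatum;

    static inline NullableDatum *
    relptr_resolve(char *base, RelNullableDatum rel)
    {
        return (NullableDatum *) (base + rel.off);
    }

Only the constant offset ends up in the emitted code, so the same machine
code works for every execution, independent of where the mutable data happens
to be allocated.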


There's a "more advanced" version of this where we can avoid recreating
ExprStates for e.g. prepared statements. Then we'd need to make a bit more of
the data use relative pointers. But that's likely a bit further off.  A more
moderate version would be to just store the number of steps an expression
needs inside the expression itself - for simple queries the allocation /
growing / copying of ExprEvalSteps is quite visible.

FWIW, interpreted execution does seem to win a bit from the higher allocation
density for variable data that this provides.


> I am not a huge fan of the various Rel[Whatever] typedefs. I am not
> sure that's really adding any clarity. On the other hand I would be a
> big fan of renaming the structure members in some systematic way. This
> kind of thing doesn't sit well with me:

I initially had all the Rel* use the same type, and it was much more error
prone because the compiler couldn't tell the different kinds of relative
pointers apart.
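
I.e. each kind of relative pointer gets its own single-member wrapper struct,
roughly (simplified sketch; the actual struct contents may differ):

    typedef struct RelNullableDatum     { uint32 off; } RelNullableDatum;
    typedef struct RelFunctionCallInfo  { uint32 off; } RelFunctionCallInfo;

With distinct types, accidentally passing a RelFunctionCallInfo to code
expecting a RelNullableDatum is a compile-time error; with a single shared
offset type the compiler couldn't catch that.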


> - NullableDatum *value; /* value to return */
> + RelNullableDatum value; /* value to return */
> 
> Well, if NullableDatum was the value to return, then RelNullableDatum
> isn't. It's some kind of thing that lets you find the value to return.

I don't really know what you mean? It's essentially just a different type of
pointer?


> Is it true that allocno is only used for, err, some kind of LLVM
> thing, and not in the regular interpreted path? As far as I can see,
> outside of the LLVM code, we only ever test whether it's 0, and don't
> actually care about the specific value.

I'd expect it to be useful for a few interpreted cases as well, but right now
it's not.


> I hope that the fact that this patch reverses the order of the first
> two arguments to ExecInitExprRec is only something you did to make it
> so that the compiler would find places you needed to update. Because
> otherwise it makes no sense to introduce a new thing called an
> ExprStateBuilder in 0017, make it an argument to that function, and
> then turn around and change the signature again in 0026. Anyway, a
> final patch shouldn't include this kind of churn.

Yes, that definitely needs to go.


> + offsetof(ExprState, steps) * esb->steps_len * sizeof(ExprEvalStep) +
> + state->mutable_off = offsetof(ExprState, steps) * esb->steps_len *
> sizeof(ExprEvalStep);
> 
> Well, either I'm confused here, or the first * should be a + in each
> case. I wonder how this works at all.

Oh, yes, that doesn't look right. I assume the result is just always too big,
and that's why it doesn't cause problems...
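
I.e. presumably both places want

    offsetof(ExprState, steps) + esb->steps_len * sizeof(ExprEvalStep)

rather than multiplying all three factors together.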


> + /* copy in step data */
> + {
> + ListCell *lc;
> + int off = 0;
> +
> + foreach(lc, esb->steps)
> + {
> + memcpy(&state->steps[off], lfirst(lc), sizeof(ExprEvalStep));
> + off++;
> + }
> + }
> 
> This seems incredibly pointless to me. Why use a List in the first
> place if we're going to have to flatten it using this kind of code?

We don't know how many steps an expression is going to require. It turns out
that in the current code we spend a good amount of time just growing
->steps. Using a list (even if it's an array of pointers as List now is)
during building makes appending fairly cheap. Building a dense array after all
steps have been computed keeps the execution time benefit.
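
Roughly, the building side then looks like this (simplified, not the exact
code in the branch):

    /* append one step: no repalloc / copy of the whole step array */
    ExprEvalStep *as = palloc(sizeof(ExprEvalStep));

    *as = *scratch;                 /* scratch step that was just filled in */
    esb->steps = lappend(esb->steps, as);
    esb->steps_len++;

and the loop you quoted then copies the steps into one dense allocation once
the final count is known.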


> I think stuff like RelFCIOff() and RelFCIIdx() and RelArrayIdx() is
> just pretty much incomprehensible. Now, the executor is full of
> badly-named stuff already -- ExecInitExprRec being a fine example of a
> name nobody is going to understand on first reading, or maybe ever --
> but we ought to try not to make things worse. I also do understand
> that anything with relative pointers is bound to involve a bunch of
> crappy notation that we're just going to have to accept as the price
> of doing business. But it would help to pick names that are not so
> heavily abbreviated. Like, if RelFCIIdx() were called
> find_function_argument_in_relative_fcinfo() or even
> get_fnarg_from_relfcinfo() the casual reader might have a chance of
> guessing what it does.

Yea, they're crappily named. If this were C++ it'd be easy to wrap the
relative pointers in something that then makes them behave like normal
pointers, but ...


> Sure, the code might be longer, but if you can tell what it does without
> cross-referencing, it's still better.

Unfortunately it's really hard to keep the code legible and keep pgindent
happy with long names :(. But I'm sure that we can do better than these.


> I would welcome changes that make it clearer which things happen just
> once and which things happen at execution time; that said, it seems
> like RELPTR_RESOLVE() happens at execution time, and it surprises me a
> bit that this is OK from a performance perspective.

It's actually fairly cheap, at least on x86, because every relative pointer
dereference is just an offset from one base pointer. That base address can be
kept in a register. In some initial benchmarking the gains from the higher
allocation density of the variable data are bigger than the potential losses.


> The pointers can change from execution to execution, but not within an
> individual execution, or so I think. So it doesn't need to be resolved every
> time, if somehow that can be avoided. But maybe CPUs are sufficiently
> well-optimized for computing a pointer address as a+b*c that it does not
> matter.

It should just be a + b, right? Well, for arrays it's more complicated, but
it's also more complicated for "normal arrays".
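
I.e. the array case is roughly (illustrative names, not the actual macros)

    datum = *(NullableDatum *) (base + rel.off + i * sizeof(NullableDatum));

but an absolute-pointer array access needs the same multiply, so relative
pointers don't add anything there.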


> I'm not sure how helpful any of these comments are, but those are my
> initial thoughts.

It's helpful.


The biggest issue I see with getting to the point of actually caching JITed
code is the ->flinfo, ->context thing mentioned above. The best thing I
can come up with is moving the allocation of those into the ExprState as well,
but my gut says there must be a better approach that I'm not quite seeing.


Greetings,

Andres Freund


