Thread: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0
Hi,

I previously had an early prototype of JITing [1] expression evaluation and tuple deforming. I've since then worked a lot on this.

Here's an initial, not really pretty but functional, submission. This supports all types of expressions and tuples, and allows, albeit with some drawbacks, inlining of builtin functions. Between the version at [1] and this I had done some work in C++, because that allowed me to experiment more with LLVM, but I've now translated everything back. Some features I had to re-implement due to limitations of the C API.

As a teaser:

tpch_5[9586][1]=# set jit_expressions=0;set jit_tuple_deforming=0;
tpch_5[9586][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
┌──────────────┬──────────────┬───────────┬──────────────────┬──────────────────┬──────────────────┬──────────────────┬──────────────────┬────────────────────┬─────────────┐
│ l_returnflag │ l_linestatus │  sum_qty  │  sum_base_price  │  sum_disc_price  │    sum_charge    │     avg_qty      │    avg_price     │      avg_disc      │ count_order │
├──────────────┼──────────────┼───────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼────────────────────┼─────────────┤
│ A            │ F            │ 188818373 │ 283107483036.109 │ 268952035589.054 │  279714361804.23 │ 25.5025937044707 │ 38237.6725307617 │ 0.0499976863510723 │     7403889 │
│ N            │ F            │   4913382 │ 7364213967.94998 │  6995782725.6633 │ 7275821143.98952 │ 25.5321530459003 │ 38267.7833908406 │ 0.0500308669240696 │      192439 │
│ N            │ O            │ 375088356 │ 562442339707.852 │ 534321895537.884 │ 555701690243.972 │ 25.4978961033505 │ 38233.9150565265 │ 0.0499956453049625 │    14710561 │
│ R            │ F            │ 188960009 │ 283310887148.206 │ 269147687267.211 │ 279912972474.866 │ 25.5132328961366 │ 38252.4148049933 │ 0.0499958481590264 │     7406353 │
└──────────────┴──────────────┴───────────┴──────────────────┴──────────────────┴──────────────────┴──────────────────┴──────────────────┴────────────────────┴─────────────┘
(4 rows)

Time: 4367.486 ms (00:04.367)

tpch_5[9586][1]=# set jit_expressions=1;set jit_tuple_deforming=1;
tpch_5[9586][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
<repeat>
(4 rows)

Time: 3158.575 ms (00:03.159)

tpch_5[9586][1]=# set jit_expressions=0;set jit_tuple_deforming=0;
tpch_5[9586][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
<repeat>
(4 rows)

Time: 4383.562 ms (00:04.384)

The potential wins of the JITing itself are considerably larger than the already significant gains demonstrated above - this version doesn't exactly generate the nicest native code around. After these patches the bottlenecks for TPC-H's Q01 are largely inside the float* functions and the non-expressionified execGrouping.c code. The latter needs to be expressionified to gain benefits from JITing - that shouldn't be very hard. The code generation can be improved by moving more of the variable data into LLVM-allocated stack data; that also has other benefits.

The patch series currently consists of the following:

0001-Rely-on-executor-utils-to-build-targetlist-for-DML-R.patch
- boring prep work

0002-WIP-Allow-tupleslots-to-have-a-fixed-tupledesc-use-i.patch
- for JITed deforming we need to know whether a slot's tupledesc will change

0003-WIP-Add-configure-infrastructure-to-enable-LLVM.patch
- boring

0004-WIP-Beginning-of-a-LLVM-JIT-infrastructure.patch
- infrastructure for LLVM, including memory lifetime management and bulk emission of functions.

0005-Perform-slot-validity-checks-in-a-separate-pass-over.patch
- boring, prep work for expression JITing

0006-WIP-deduplicate-int-float-overflow-handling-code.patch
- boring

0007-Pass-through-PlanState-parent-to-expression-instanti.patch
- boring

0008-WIP-JIT-compile-expression.patch
- that's the biggest patch, actually adding JITing - code needs to be better documented, tested, and deduplicated

0009-Simplify-aggregate-code-a-bit.patch
0010-More-efficient-AggState-pertrans-iteration.patch
0011-Avoid-dereferencing-tts_values-nulls-repeatedly.patch
0012-Centralize-slot-deforming-logic-a-bit.patch
- boring, mostly to make the comparison between JITed and non-JITed a bit fairer and to remove unnecessary other bottlenecks.

0013-WIP-Make-scan-desc-available-for-all-PlanStates.patch
- this isn't clean enough.

0014-WIP-JITed-tuple-deforming.patch
- do JITing of deforming, but only when called from within an expression; there we know which columns we want deformed, etc. It's not clear what'd be a good way to also JIT other deforming without additional infrastructure - doing a separate function emission for every slot_deform_tuple() is unattractive performance-wise and memory-lifetime-wise; I did have that at first.

0015-WIP-Expression-based-agg-transition.patch
- allows JITing of aggregate transition invocation, but also speeds up aggregates without JIT.

0016-Hacky-Preliminary-inlining-implementation.patch
- allows inlining of functions, by using bitcode. That bitcode can be loaded from a list of directories - as long as it is compatibly configured, the bitcode doesn't have to be generated by the same compiler as the postgres binary, i.e. gcc postgres + clang bitcode works.

I've whacked this around quite heavily today, so this likely has some new bugs, sorry for that :(

I plan to spend some considerable time over the next weeks cleaning this up and addressing some of the areas where the performance isn't yet as good as desirable.
Greetings,

Andres Freund

[1] http://archives.postgresql.org/message-id/20161206034955.bh33paeralxbtluv%40alap3.anarazel.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
- 0001-Rely-on-executor-utils-to-build-targetlist-for-DML-R.patch
- 0002-WIP-Allow-tupleslots-to-have-a-fixed-tupledesc-use-i.patch
- 0003-WIP-Add-configure-infrastructure-to-enable-LLVM.patch
- 0004-WIP-Beginning-of-a-LLVM-JIT-infrastructure.patch
- 0005-Perform-slot-validity-checks-in-a-separate-pass-over.patch
- 0006-WIP-deduplicate-int-float-overflow-handling-code.patch
- 0007-Pass-through-PlanState-parent-to-expression-instanti.patch
- 0008-WIP-JIT-compile-expression.patch
- 0009-Simplify-aggregate-code-a-bit.patch
- 0010-More-efficient-AggState-pertrans-iteration.patch
- 0011-Avoid-dereferencing-tts_values-nulls-repeatedly.patch
- 0012-Centralize-slot-deforming-logic-a-bit.patch
- 0013-WIP-Make-scan-desc-available-for-all-PlanStates.patch
- 0014-WIP-JITed-tuple-deforming.patch
- 0015-WIP-Expression-based-agg-transition.patch
- 0016-Hacky-Preliminary-inlining-implementation.patch
Hi,

On 2017-08-31 23:41:31 -0700, Andres Freund wrote:
> I previously had an early prototype of JITing [1] expression evaluation
> and tuple deforming. I've since then worked a lot on this.
>
> Here's an initial, not really pretty but functional, submission.

One of the things I'm not really happy about yet is the naming of the generated functions. Those primarily matter when doing profiling, where the function name will show up when the profiler supports JIT stuff (e.g. with a patch I proposed to LLVM that emits perf-compatible output; there's also existing LLVM support for a profiler by Intel and for oprofile).

Currently there's essentially a per-EState counter and the generated functions get named deform$n and evalexpr$n. That allows for profiling of a single query, because different compiled expressions are disambiguated. It even allows running the same query over and over, still giving meaningful results. But it breaks down when running multiple queries while profiling - evalexpr0 can mean something entirely different for different queries.

The best idea I have so far would be to name the functions like evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support outside of pg_stat_statements, which seems painful-ish.

Perhaps somebody has a better idea?

Regards,

Andres
On 09/03/2017 02:59 AM, Andres Freund wrote:
> One of the things I'm not really happy about yet is the naming of the
> generated functions. Those primarily matter when doing profiling, where
> the function name will show up when the profiler supports JIT stuff
> (e.g. with a patch I proposed to LLVM that emits perf compatible output,
> there's also existing LLVM support for a profiler by intel and
> oprofile).
>
> Currently there's essentially a per EState counter and the generated
> functions get named deform$n and evalexpr$n. That allows for profiling
> of a single query, because different compiled expressions are
> disambiguated. It even allows running the same query over and over, still
> giving meaningful results. But it breaks down when running multiple
> queries while profiling - evalexpr0 can mean something entirely
> different for different queries.
>
> The best idea I have so far would be to name the functions like
> evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
> outside of pg_stat_statements, which seems painful-ish.
>
> Perhaps somebody has a better idea?

As far as I understand, we do not need a precise fingerprint, so maybe just calculate some lightweight one? For example, take the query text (es_sourceText from EState), replace all non-alphanumeric characters with '_', and take the first N (16?) characters of the result. It seems to me that in most cases that will help to identify the query...

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Andres Freund <andres@anarazel.de> writes:
> Currently there's essentially a per EState counter and the generated
> functions get named deform$n and evalexpr$n. That allows for profiling
> of a single query, because different compiled expressions are
> disambiguated. It even allows running the same query over and over, still
> giving meaningful results. But it breaks down when running multiple
> queries while profiling - evalexpr0 can mean something entirely
> different for different queries.

> The best idea I have so far would be to name the functions like
> evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
> outside of pg_stat_statements, which seems painful-ish.

Yeah. Why not just use a static counter to give successive unique IDs to each query that gets JIT-compiled? Then the function names would be like deform_$querynumber_$subexprnumber.

			regards, tom lane
On 2017-09-03 10:11:37 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Currently there's essentially a per EState counter and the generated
> > functions get named deform$n and evalexpr$n. That allows for profiling
> > of a single query, because different compiled expressions are
> > disambiguated. It even allows running the same query over and over, still
> > giving meaningful results. But it breaks down when running multiple
> > queries while profiling - evalexpr0 can mean something entirely
> > different for different queries.
>
> > The best idea I have so far would be to name the functions like
> > evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
> > outside of pg_stat_statements, which seems painful-ish.
>
> Yeah. Why not just use a static counter to give successive unique IDs
> to each query that gets JIT-compiled? Then the function names would
> be like deform_$querynumber_$subexprnumber.

That works, but unfortunately it doesn't keep the names the same over reruns. So if you rerun the query inside the same session - a quite reasonable thing to do to get more accurate profiles - the names in the profile will change. That makes it quite hard to compare profiles, especially when a single execution of the query is too quick to show something meaningful.

Greetings,

Andres Freund
Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0
From
Konstantin Knizhnik
On 01.09.2017 09:41, Andres Freund wrote:
> Hi,
>
> I previously had an early prototype of JITing [1] expression evaluation
> and tuple deforming. I've since then worked a lot on this.
>
> Here's an initial, not really pretty but functional, submission. This
> supports all types of expressions, and tuples, and allows, albeit with
> some drawbacks, inlining of builtin functions. Between the version at
> [1] and this I had done some work in C++, because that allowed me to
> experiment more with LLVM, but I've now translated everything back.
> Some features I had to re-implement due to limitations of the C API.
>
> I've whacked this around quite heavily today, this likely has some new
> bugs, sorry for that :(

Can you please clarify the following fragment calculating attribute alignment:

+			/* compute what following columns are aligned to */
+			if (att->attlen < 0)
+			{
+				/* can't guarantee any alignment after varlen field */
+				attcuralign = -1;
+			}
+			else if (att->attnotnull && attcuralign >= 0)
+			{
+				Assert(att->attlen > 0);
+				attcuralign += att->attlen;
+			}
+			else if (att->attnotnull)
+			{
+				/*
+				 * After a NOT NULL fixed-width column, alignment is
+				 * guaranteed to be the minimum of the forced alignment and
+				 * length. XXX
+				 */
+				attcuralign = alignto + att->attlen;
+				Assert(attcuralign > 0);
+			}
+			else
+			{
+				//elog(LOG, "attnotnullreset: %d", attnum);
+				attcuralign = -1;
+			}

I wonder why in this branch (att->attnotnull && attcuralign >= 0) we are not adding "alignto", and the comment in the following branch, else if (att->attnotnull), seems not to be related to that branch, because in this case attcuralign is expected to be less than zero, which means that the previous attribute is a varlen field.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi,

On 2017-09-04 20:01:03 +0300, Konstantin Knizhnik wrote:
> > I previously had an early prototype of JITing [1] expression evaluation
> > and tuple deforming. I've since then worked a lot on this.
> >
> > Here's an initial, not really pretty but functional, submission. This
> > supports all types of expressions, and tuples, and allows, albeit with
> > some drawbacks, inlining of builtin functions.
> >
> > I've whacked this around quite heavily today, this likely has some new
> > bugs, sorry for that :(
>
> Can you please clarify the following fragment calculating attribute
> alignment:

Hi. That piece of code isn't particularly clear (and has a bug in the submitted version); I'm revising it.

> +			/* compute what following columns are aligned to */
> +			if (att->attlen < 0)
> +			{
> +				/* can't guarantee any alignment after varlen field */
> +				attcuralign = -1;
> +			}
> +			else if (att->attnotnull && attcuralign >= 0)
> +			{
> +				Assert(att->attlen > 0);
> +				attcuralign += att->attlen;
> +			}
> +			else if (att->attnotnull)
> +			{
> +				/*
> +				 * After a NOT NULL fixed-width column, alignment is
> +				 * guaranteed to be the minimum of the forced alignment and
> +				 * length. XXX
> +				 */
> +				attcuralign = alignto + att->attlen;
> +				Assert(attcuralign > 0);
> +			}
> +			else
> +			{
> +				//elog(LOG, "attnotnullreset: %d", attnum);
> +				attcuralign = -1;
> +			}
>
> I wonder why in this branch (att->attnotnull && attcuralign >= 0)
> we are not adding "alignto", and the comment in the following branch,
> else if (att->attnotnull), seems not to be related to that branch,
> because in this case attcuralign is expected to be less than zero,
> which means that the previous attribute is a varlen field.

Yea, I've changed that already, although it's currently added earlier, because the alignment is needed before that, to access the column correctly. I've also made a number of efficiency improvements, primarily to access columns with an absolute offset if all preceding ones are fixed-width NOT NULL columns - that is quite noticeable performance-wise.

Greetings,

Andres Freund
Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0
From
Konstantin Knizhnik
On 04.09.2017 23:52, Andres Freund wrote:
> Yea, I've changed that already, although it's currently added earlier,
> because the alignment is needed before, to access the column correctly.
> I've also made a number of efficiency improvements, primarily to access
> columns with an absolute offset if all preceding ones are fixed width
> not null columns - that is quite noticeable performancewise.

Unfortunately, in most real tables columns are nullable. I wonder if we can perform some optimization in this case (assuming that in typical cases a column either contains mostly non-null values or mostly null values).

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0
From
andres@anarazel.de (Andres Freund)
On 2017-09-05 13:58:56 +0300, Konstantin Knizhnik wrote:
> On 04.09.2017 23:52, Andres Freund wrote:
> > Yea, I've changed that already, although it's currently added earlier,
> > because the alignment is needed before, to access the column correctly.
> > I've also made a number of efficiency improvements, primarily to access
> > columns with an absolute offset if all preceding ones are fixed width
> > not null columns - that is quite noticeable performancewise.
>
> Unfortunately, in most real tables columns are nullable.

I'm not sure I agree with that assertion, but:

> I wonder if we can perform some optimization in this case (assuming that in
> typical cases a column either contains mostly non-null values or mostly
> null values).

Even if all columns are NULLABLE, the JITed code is still a good chunk faster (a significant part of that is the slot->tts_{nulls,values} accesses). Alignment is still cheaper with constants, and often enough the alignment computation can be avoided entirely (consider e.g. a table full of nullable ints - everything is guaranteed to be aligned, and columns after an individual NOT NULL column are also guaranteed to be aligned). What largely changes is that the 'offset' from the start of the tuple has to be tracked.

Greetings,

Andres Freund
On 5 September 2017 at 11:58, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
> I wonder if we can perform some optimization in this case (assuming that in
> typical cases a column either contains mostly non-null values or mostly
> null values).

If you really wanted to go crazy here you could do lookup tables of bits of null bitmaps. I.e., you look at the first byte of the null bitmap, index into an array, and it points to 8 offsets for the 8 fields covered by that much of the bitmap. The lookup table might be kind of large, since offsets are 16 bits: you're talking 256 * 8 * 2 bytes, or 4kB, for every 8 columns up until the first variable-size column (or I suppose you could even continue in the case where the variable-size column is null).

--
greg
On 2017-09-05 19:43:33 +0100, Greg Stark wrote:
> On 5 September 2017 at 11:58, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
> >
> > I wonder if we can perform some optimization in this case (assuming that in
> > typical cases a column either contains mostly non-null values or mostly
> > null values).
>
> If you really wanted to go crazy here you could do lookup tables of
> bits of null bitmaps. I.e., you look at the first byte of the null
> bitmap, index into an array, and it points to 8 offsets for the 8
> fields covered by that much of the bitmap. The lookup table might be
> kind of large since offsets are 16-bits, so you're talking 256 * 16
> bytes for every 8 columns up until the first variable size
> column (or I suppose you could even continue in the case where the
> variable size column is null).

I'm missing something here - what's this saving? The code for lookups with NULLs after JITing effectively is:

a) one load for every 8 columns (could be optimized to one load every sizeof(void*) cols)
b) one bitmask for every column + one branch for null
c) a load for the datum, indexed by register
d) saving the column value - that's independent of NULLness
e) one addi adding the length to the offset

Greetings,

Andres Freund
Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0
From
Konstantin Knizhnik
On 04.09.2017 23:52, Andres Freund wrote:
> Hi. That piece of code isn't particularly clear (and has a bug in the
> submitted version), I'm revising it.
...
> Yea, I've changed that already, although it's currently added earlier,
> because the alignment is needed before, to access the column correctly.
> I've also made a number of efficiency improvements, primarily to access
> columns with an absolute offset if all preceding ones are fixed width
> not null columns - that is quite noticeable performancewise.

Should I wait for a new version of your patch, or continue reviewing this code?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 2017-09-19 12:57:33 +0300, Konstantin Knizhnik wrote:
> On 04.09.2017 23:52, Andres Freund wrote:
> > Hi. That piece of code isn't particularly clear (and has a bug in the
> > submitted version), I'm revising it.
> ...
> > Yea, I've changed that already, although it's currently added earlier,
> > because the alignment is needed before, to access the column correctly.
> > I've also made a number of efficiency improvements, primarily to access
> > columns with an absolute offset if all preceding ones are fixed width
> > not null columns - that is quite noticeable performancewise.
>
> Should I wait for a new version of your patch, or continue reviewing this code?

I'll update the posted version later this week, sorry for the delay.

Regards,

Andres
Hi,

Here's an updated version of the patchset. There are some substantial changes here, but it's still very obviously very far from committable as a whole. There are some helper commits that are simple and independent enough to be committable earlier on.

The git tree of this work, which is *frequently* rebased, is at:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit

The biggest changes are:

- The JIT "infrastructure" is less bad than before, and starting to shape up.

- The tuple deforming logic is considerably faster than before due to various optimizations:
  - build deforming exactly to the required natts for the specific caller
  - avoid checking the tuple's natts for attributes that have "following" NOT NULL columns
  - a bunch of minor codegen improvements.

- The tuple deforming codegen also got simpler by relying on LLVM to promote a stack variable to a register, instead of working with a register manually - the need to keep IR in SSA form makes doing so manually rather painful.

- WIP patch to do execGrouping.c TupleHashTableMatch() via JIT. That makes the column comparison faster, but more importantly it JITs the deforming (at least one side always is a MinimalTuple).

- All tests pass with JITed expressions, tuple deforming, agg transition value computation and execGrouping logic. There were a number of bugs - who would have imagined that.

- some more experimental changes later in the series to address some bottlenecks.

Functionally this covers all of what I think a sensible goal for v11 is. There are a lot of details to figure out, and the inlining *implementation* isn't what I think we should do. I'll follow up, not tonight though, with an email outlining the first few design decisions we're going to have to finalize, which'll be around the memory/lifetime management of functions, and other infrastructure pieces (currently patch 0006).
As the patchset is pretty large already, and not going to get any smaller, I'll make smaller adjustments solely via the git tree, rather than doing full reposts.

Greetings,

Andres Freund
Attachment
- 0001-Rely-on-executor-utils-to-build-targetlist-for-DM.v4.patch.gz
- 0002-WIP-Allow-tupleslots-to-have-a-fixed-tupledesc-us.v4.patch.gz
- 0003-Perform-slot-validity-checks-in-a-separate-pass-o.v4.patch.gz
- 0004-Pass-through-PlanState-parent-to-expression-insta.v4.patch.gz
- 0005-Add-configure-infrastructure-to-enable-LLVM.v4.patch.gz
- 0006-Beginning-of-a-LLVM-JIT-infrastructure.v4.patch.gz
- 0007-JIT-compile-expressions.v4.patch.gz
- 0008-Centralize-slot-deforming-logic-a-bit.v4.patch.gz
- 0009-WIP-Make-scan-desc-available-for-all-PlanStates.v4.patch.gz
- 0010-JITed-tuple-deforming.v4.patch.gz
- 0011-Simplify-aggregate-code-a-bit.v4.patch.gz
- 0012-More-efficient-AggState-pertrans-iteration.v4.patch.gz
- 0013-Avoid-dereferencing-tts_values-nulls-repeatedly-i.v4.patch.gz
- 0014-WIP-Expression-based-agg-transition.v4.patch.gz
- 0015-Hacky-Preliminary-inlining-implementation.v4.patch.gz
- 0016-WIP-Inline-ExecScan-mostly-to-make-profiles-easie.v4.patch.gz
- 0017-WIP-Do-execGrouping.c-via-expression-eval-machine.v4.patch.gz
- 0018-WIP-deduplicate-int-float-overflow-handling-code.v4.patch.gz
- 0019-Make-timestamp_cmp_internal-an-inline-function.v4.patch.gz
- 0020-Make-hot-path-of-pg_detoast_datum-an-inline-funct.v4.patch.gz
- 0021-WIP-Inline-additional-function.v4.patch.gz
- 0022-WIP-Faster-order.v4.patch.gz
On Wed, Oct 4, 2017 at 9:48 AM, Andres Freund <andres@anarazel.de> wrote:
> Here's an updated version of the patchset. There's some substantial
> changes here, but it's still very obviously very far from committable as
> a whole. There's some helper commits that are simple and independent
> enough to be committable earlier on.

Looks pretty impressive already. I wanted to take it for a spin, but got errors about the following symbols being missing:

LLVMOrcUnregisterPerf
LLVMOrcRegisterGDB
LLVMOrcRegisterPerf
LLVMOrcGetSymbolAddressIn
LLVMLinkModules2Needed

As far as I can tell these are not in mainline LLVM. Is there a branch or patchset of LLVM available somewhere that I need in order to use this?

Regards,
Ants Aasma
On 2017-10-04 11:56:47 +0300, Ants Aasma wrote:
> On Wed, Oct 4, 2017 at 9:48 AM, Andres Freund <andres@anarazel.de> wrote:
> > Here's an updated version of the patchset. There's some substantial
> > changes here, but it's still very obviously very far from committable as
> > a whole. There's some helper commits that are simple and independent
> > enough to be committable earlier on.
>
> Looks pretty impressive already.

Thanks!

> I wanted to take it for a spin, but got errors about the following
> symbols being missing:
>
> LLVMOrcUnregisterPerf
> LLVMOrcRegisterGDB
> LLVMOrcRegisterPerf
> LLVMOrcGetSymbolAddressIn
> LLVMLinkModules2Needed
>
> As far as I can tell these are not in mainline LLVM. Is there a branch
> or patchset of LLVM available somewhere that I need in order to use this?

Oops, I'd forgotten about the modifications, sorry. I've attached them here. The GDB and perf stuff should now be an optional dependency, too. The required changes are fairly small, so they hopefully shouldn't be too hard to upstream.

Please check the git tree for a rebased version of the pg patches, with a bunch of bugfixes (oops, some last-minute "cleanups") and performance fixes.

Here are some numbers for a TPC-H scale-5 run. Obviously the Q01 numbers are pretty nice in particular. But it's also visible that the shorter queries can lose, which is largely due to the JIT overhead - that can be ameliorated to some degree, but JITing obviously isn't always going to be a win.

It's pretty impressive that in Q01, even after all of this, expression evaluation *still* is 35% of the total time (25% in the aggregate transition function). That's partially just because the query does primarily aggregation, but also because the generated code can stand a good chunk of improvements.
master q01 min: 14146.498
dev min: 11479.05 [diff -23.24]
dev-jit min: 8659.961 [diff -63.36]
dev-jit-deform min: 7279.395 [diff -94.34]
dev-jit-deform-inline min: 6997.956 [diff -102.15]

master q02 min: 1234.229
dev min: 1208.102 [diff -2.16]
dev-jit min: 1292.983 [diff +4.54]
dev-jit-deform min: 1580.505 [diff +21.91]
dev-jit-deform-inline min: 1809.046 [diff +31.77]

master q03 min: 6220.814
dev min: 5424.107 [diff -14.69]
dev-jit min: 5175.125 [diff -20.21]
dev-jit-deform min: 4257.368 [diff -46.12]
dev-jit-deform-inline min: 4218.115 [diff -47.48]

master q04 min: 947.476
dev min: 970.608 [diff +2.38]
dev-jit min: 969.944 [diff +2.32]
dev-jit-deform min: 999.006 [diff +5.16]
dev-jit-deform-inline min: 1033.78 [diff +8.35]

master q05 min: 4729.9
dev min: 4059.665 [diff -16.51]
dev-jit min: 4182.941 [diff -13.08]
dev-jit-deform min: 4147.493 [diff -14.04]
dev-jit-deform-inline min: 4284.473 [diff -10.40]

master q06 min: 1603.708
dev min: 1592.107 [diff -0.73]
dev-jit min: 1556.216 [diff -3.05]
dev-jit-deform min: 1516.078 [diff -5.78]
dev-jit-deform-inline min: 1579.839 [diff -1.51]

master q07 min: 4549.738
dev min: 4331.565 [diff -5.04]
dev-jit min: 4475.654 [diff -1.66]
dev-jit-deform min: 4645.773 [diff +2.07]
dev-jit-deform-inline min: 4885.781 [diff +6.88]

master q08 min: 1394.428
dev min: 1350.363 [diff -3.26]
dev-jit min: 1434.366 [diff +2.78]
dev-jit-deform min: 1716.65 [diff +18.77]
dev-jit-deform-inline min: 1938.152 [diff +28.05]

master q09 min: 5958.198
dev min: 5700.329 [diff -4.52]
dev-jit min: 5491.683 [diff -8.49]
dev-jit-deform min: 5582.431 [diff -6.73]
dev-jit-deform-inline min: 5797.475 [diff -2.77]

master q10 min: 5228.69
dev min: 4475.154 [diff -16.84]
dev-jit min: 4269.365 [diff -22.47]
dev-jit-deform min: 3767.888 [diff -38.77]
dev-jit-deform-inline min: 3962.084 [diff -31.97]

master q11 min: 281.201
dev min: 280.132 [diff -0.38]
dev-jit min: 351.85 [diff +20.08]
dev-jit-deform min: 455.885 [diff +38.32]
dev-jit-deform-inline min: 532.093 [diff +47.15]

master q12 min: 4289.268
dev min: 4082.359 [diff -5.07]
dev-jit min: 4007.199 [diff -7.04]
dev-jit-deform min: 3752.396 [diff -14.31]
dev-jit-deform-inline min: 3916.653 [diff -9.51]

master q13 min: 7110.545
dev min: 6898.576 [diff -3.07]
dev-jit min: 6579.554 [diff -8.07]
dev-jit-deform min: 6304.15 [diff -12.79]
dev-jit-deform-inline min: 6135.952 [diff -15.88]

master q14 min: 678.024
dev min: 650.943 [diff -4.16]
dev-jit min: 682.387 [diff +0.64]
dev-jit-deform min: 746.354 [diff +9.16]
dev-jit-deform-inline min: 878.437 [diff +22.81]

master q15 min: 1641.897
dev min: 1650.57 [diff +0.53]
dev-jit min: 1661.591 [diff +1.19]
dev-jit-deform min: 1821.02 [diff +9.84]
dev-jit-deform-inline min: 1863.304 [diff +11.88]

master q16 min: 1890.246
dev min: 1819.423 [diff -3.89]
dev-jit min: 1838.079 [diff -2.84]
dev-jit-deform min: 1962.274 [diff +3.67]
dev-jit-deform-inline min: 2096.154 [diff +9.82]

master q17 min: 502.605
dev min: 462.881 [diff -8.58]
dev-jit min: 495.648 [diff -1.40]
dev-jit-deform min: 537.666 [diff +6.52]
dev-jit-deform-inline min: 613.144 [diff +18.03]

master q18 min: 12863.972
dev min: 11257.57 [diff -14.27]
dev-jit min: 10847.61 [diff -18.59]
dev-jit-deform min: 10119.769 [diff -27.12]
dev-jit-deform-inline min: 10103.051 [diff -27.33]

master q19 min: 281.991
dev min: 264.191 [diff -6.74]
dev-jit min: 331.102 [diff +14.83]
dev-jit-deform min: 373.759 [diff +24.55]
dev-jit-deform-inline min: 531.07 [diff +46.90]

master q20 min: 541.154
dev min: 511.372 [diff -5.82]
dev-jit min: 565.378 [diff +4.28]
dev-jit-deform min: 662.926 [diff +18.37]
dev-jit-deform-inline min: 805.835 [diff +32.85]

master q22 min: 678.266
dev min: 656.643 [diff -3.29]
dev-jit min: 676.886 [diff -0.20]
dev-jit-deform min: 735.058 [diff +7.73]
dev-jit-deform-inline min: 943.013 [diff +28.07]

master total min: 76772.848
dev min: 69125.71 [diff -11.06]
dev-jit min: 65545.522 [diff -17.13]
dev-jit-deform min: 62963.844 [diff -21.93]
dev-jit-deform-inline min: 64925.407 [diff -18.25]
Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
- 0001-ORC-Add-findSymbolIn-wrapper-to-C-bindings.patch
- 0002-C-API-WIP-Add-LLVMGetHostCPUName.patch
- 0003-C-API-Add-LLVMLinkModules2Needed.patch
- 0004-MCJIT-Call-JIT-notifiers-only-after-code-sections-ar.patch
- 0005-Add-PerfJITEventListener-for-perf-profiling-support.patch
- 0006-ORC-JIT-event-listener-support.patch
On 5 October 2017 at 19:57, Andres Freund <andres@anarazel.de> wrote:
> Here's some numbers for a TPC-H scale 5 run. Obviously the Q01 numbers
> are pretty nice in particular. But it's also visible that the shorter
> queries can lose, which is largely due to the JIT overhead - that can be
> ameliorated to some degree, but JITing obviously isn't always going to
> be a win.

It's pretty exciting to see this being worked on.

I've not looked at the code, but I'm thinking, could you not just JIT if the total cost of the plan is estimated to be > X? Where X is some JIT threshold GUC.

--
David Rowley
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2017-10-05 23:43:37 +1300, David Rowley wrote:
> On 5 October 2017 at 19:57, Andres Freund <andres@anarazel.de> wrote:
> > Here's some numbers for a TPC-H scale 5 run. Obviously the Q01 numbers
> > are pretty nice in particular. But it's also visible that the shorter
> > queries can lose, which is largely due to the JIT overhead - that can be
> > ameliorated to some degree, but JITing obviously isn't always going to
> > be a win.
>
> It's pretty exciting to see this being worked on.
>
> I've not looked at the code, but I'm thinking, could you not just JIT
> if the total cost of the plan is estimated to be > X? Where X is some
> JIT threshold GUC.

Right, that's the plan. But it seems fairly important to make the envelope in which it is beneficial as broad as possible. Also, test coverage is more interesting for me right now ;)

Greetings,

Andres Freund
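The cost gate being discussed can be sketched in a few lines of C. This is only an illustration: the name jit_above_cost and the "-1 disables" convention are borrowed from the GUC description later in the thread, and the default value here is invented.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the "JIT only above a cost threshold" idea.
 * jit_above_cost is a placeholder; -1 acts as a hard disable,
 * as described for the GUCs elsewhere in the thread.
 */
static double jit_above_cost = 100000.0;    /* illustrative default */

static bool
jit_worthwhile(double plan_total_cost)
{
    if (jit_above_cost < 0)         /* -1: never JIT */
        return false;
    return plan_total_cost > jit_above_cost;
}
```

The planner would consult such a gate once per query, trading compilation overhead against expected execution time.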
On Thu, Oct 5, 2017 at 2:57 AM, Andres Freund <andres@anarazel.de> wrote:
> master q01 min: 14146.498 dev min: 11479.05 [diff -23.24] dev-jit min: 8659.961 [diff -63.36] dev-jit-deform min: 7279.395 [diff -94.34] dev-jit-deform-inline min: 6997.956 [diff -102.15]

I think this is a really strange way to display this information. Instead of computing the percentage of time that you saved, you've computed the negative of the percentage that you would have lost if the patch were already committed and you reverted it. That's just confusing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
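Robert's point can be checked against the quoted q01 numbers. The two functions below contrast the formula the posted results use (change relative to the new time) with the more conventional "percent saved" (change relative to the old time); this is a small sanity check, not postgres code.

```c
#include <assert.h>
#include <math.h>

/* q01 figures quoted above: master vs. dev, in milliseconds */
static const double master_ms = 14146.498;
static const double dev_ms    = 11479.05;

/*
 * The percentage shown in the posted results: the change relative to
 * the NEW time, i.e. the negated slowdown you'd see by reverting.
 */
static double
diff_vs_new(double old_ms, double new_ms)
{
    return (new_ms - old_ms) / new_ms * 100.0;
}

/* the conventional figure: time saved relative to the OLD time */
static double
pct_saved(double old_ms, double new_ms)
{
    return (old_ms - new_ms) / old_ms * 100.0;
}
```

For q01, diff_vs_new() reproduces the -23.24 in the results, while pct_saved() gives roughly 18.9%, which is what "percentage of time saved" would report.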
On Thu, Sep 21, 2017 at 2:52 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-09-19 12:57:33 +0300, Konstantin Knizhnik wrote:
>> On 04.09.2017 23:52, Andres Freund wrote:
>> > Hi. That piece of code isn't particularly clear (and has a bug in the
>> > submitted version), I'm revising it.
>> ...
>> > Yea, I've changed that already, although it's currently added earlier,
>> > because the alignment is needed before, to access the column correctly.
>> > I've also made a number of efficiency improvements, primarily to access
>> > columns with an absolute offset if all preceding ones are fixed width
>> > not null columns - that is quite noticeable performancewise.
>>
>> Should I wait for new version of your patch or continue review of this code?
>
> I'll update the posted version later this week, sorry for the delay.

I know that you are working on this actively per the set of patches you have sent lately, but this thread has stalled, so I am marking it as returned with feedback. There is now only one CF entry to track this work: https://commitfest.postgresql.org/15/1285/. Depending on the work you are doing you may want to spawn a CF entry for each sub-item. Just an idea.
--
Michael
Hi,

One part of the work to make JITing worth its while is JITing tuple deforming. That's currently often the biggest consumer of time, and where it isn't the biggest, it's usually among the top entries.

My experimentation shows that tuple deforming is primarily beneficial when it happens as *part* of JIT compiling expressions. I'd originally tried to JIT compile deforming inside heaptuple.c, and cache the deforming program inside the tuple slot. That turns out to not work very well, because a lot of tuple descriptors are very short lived, computed during ExecInitNode(). Even if that were not the case, compiling for each deforming on demand has significant downsides:

- it requires emitting code in smaller increments (whenever something new is deformed)
- because the generated code has to be generic for all potential deformers, the number of branches to check for that is significant. If instead the deforming code is generated for a specific callsite, no branches for the number of to-be-deformed columns have to be generated. The primary remaining branches then are the ones checking for NULLs and the number of attributes in the tuple, and those can often be optimized away if NOT NULL columns are present.
- the call overhead is still noticeable
- the memory / function lifetime management is awkward.

If the JITing of expressions is instead done as part of expression evaluation we can emit all the necessary code for the whole plantree during executor startup, in one go. And, more importantly, LLVM's optimizer is free to inline the deforming code into the expression code, often yielding noticeable improvements (although that still could use some improvements).

To allow doing JITing at ExecReadyExpr() time, we need to know the tuple descriptor an EEOP_{INNER,OUTER,SCAN}_FETCHSOME step refers to. There are currently two major impediments to that.

1) At a lot of ExecInitExpr() callsites the tupledescs for inner, outer, scan aren't yet known.
Therefore that code needs to be reordered so we (if applicable):
a) initialize subsidiary nodes, thereby determining the left/right (inner/outer) tupledescs
b) initialize the scan tuple desc; often that refers to a)
c) determine the result tuple desc, required to build the projection
d) build projections
e) build expressions

Attached is a patch doing so. Currently it only applies with a few preliminary patches applied, but that could be easily reordered. The patch is relatively large, as I decided to try to get the different ExecInitNode functions to look a bit more similar. There are some judgment calls involved, but I think the result looks a good bit better, regardless of the later need.

I'm not really happy with the, preexisting, split of functions between execScan.c, execTuples.c, execUtils.c. I wonder if the majority, except the low-level slot ones, shouldn't be moved to execUtils.c; I think that'd be clearer. There seems to be no justification for execScan.c to contain ExecAssignScanProjectionInfo[WithVarno].

2) TupleSlots need to describe whether they'll contain a fixed tupledesc for all their lifetime, or whether they can change their nature. Most places don't need to ever change a slot's identity, but in a few places it's quite convenient. I've introduced the notion that a slot can be marked as "fixed", by passing a tupledesc at its creation. That also gains a bit of efficiency (memory management overhead, higher cache hit ratio) because the slot, tts_values, tts_isnull can be allocated in one chunk.

3) At expression initialization time we need to figure out what slots (or just descs) INNER/OUTER/SCAN refer to. I've solved that by looking up inner/outer/scan via the provided parent node, which required adding a new field to store the scan slot. Currently no expressions initialized with a parent node have an INNER/OUTER/SCAN slot + desc that doesn't refer to the relevant node, but I'm not sure I like that as a requirement.
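The single-chunk allocation mentioned for fixed slots above can be sketched as follows. This is a toy layout, not the actual TupleTableSlot structure; the field names merely echo the real ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Toy model of a "fixed" slot: when the descriptor is known at
 * creation time, the slot header and both per-attribute arrays can
 * live in a single allocation, improving locality and cutting
 * allocator overhead. Not the real TupleTableSlot layout.
 */
typedef struct ToySlot
{
    int       natts;
    bool      fixed;          /* descriptor can never change */
    int64_t  *tts_values;
    bool     *tts_isnull;
} ToySlot;

static ToySlot *
make_fixed_slot(int natts)
{
    /* one chunk: header, then values array, then isnull array */
    char    *mem = malloc(sizeof(ToySlot)
                          + natts * sizeof(int64_t)
                          + natts * sizeof(bool));
    ToySlot *slot = (ToySlot *) mem;

    slot->natts = natts;
    slot->fixed = true;
    slot->tts_values = (int64_t *) (mem + sizeof(ToySlot));
    slot->tts_isnull = (bool *) (mem + sizeof(ToySlot)
                                 + natts * sizeof(int64_t));
    return slot;
}
```

A non-fixed slot would instead allocate the arrays lazily, once a descriptor is assigned.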
Attached is a patch that implements 1 + 2. I'd welcome a quick look through it. It currently only applies on top of a few other recently submitted patches, but it'd just be an hour's work or so to reorder that.

Comments about either the outline above or the patch?

Regards,

Andres
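As context for the deforming argument earlier in the thread - a generic deformer must branch per column at runtime, while one generated for a specific descriptor is straight-line code with constant offsets - here's a toy comparison in plain C, standing in for the generated IR. None of this is postgres code; the layouts and names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* toy "descriptor": just per-attribute widths */
typedef struct { int len; } ToyAttr;

/* generic: walks a runtime descriptor, branching per column */
static void
deform_generic(const char *tup, const ToyAttr *attrs, int natts,
               int64_t *values)
{
    int off = 0;

    for (int i = 0; i < natts; i++)
    {
        if (attrs[i].len == 4)          /* branch taken per column */
        {
            int32_t v;
            memcpy(&v, tup + off, 4);
            values[i] = v;
        }
        else                            /* assume 8-byte otherwise */
            memcpy(&values[i], tup + off, 8);
        off += attrs[i].len;
    }
}

/*
 * "JITed" equivalent for one known (int4, int8) descriptor:
 * no loop, no branches, absolute constant offsets.
 */
static void
deform_int4_int8(const char *tup, int64_t *values)
{
    int32_t v0;

    memcpy(&v0, tup, 4);
    values[0] = v0;
    memcpy(&values[1], tup + 4, 8);
}
```

The specialized variant is what callsite-specific code generation buys: the optimizer sees constant offsets and a fixed column count, so the per-column dispatch disappears entirely.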
Attachment
Hi,

I've spent the last weeks working on my LLVM compilation patchset. In the course of that I *heavily* revised it. While still a good bit away from committable, it's IMO definitely not a prototype anymore.

There are too many small changes, so I'm only going to list the major things. A good bit of that is new. The actual LLVM IR emission itself hasn't changed that drastically. Since I've not described them in detail before, I'll describe a few cases from scratch, even if things haven't fully changed.

== JIT Interface ==

To avoid emitting code in very small increments (increases mmap/mremap rw vs exec remapping, compile/optimization time), code generation doesn't happen for every single expression individually, but in batches.

The basic object to emit code via is a jit context created with:

  extern LLVMJitContext *llvm_create_context(bool optimize);

which in the case of expressions is stored on demand in the EState. For other use cases that might not be the right location.

To emit LLVM IR (i.e. the portable code that LLVM then optimizes and generates native code for), one gets a module from that with:

  extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);

to which "arbitrary" numbers of functions can be added. In the case of expression evaluation, we get the module once for every expression, and emit one function for the expression itself, and one for every applicable/referenced deform function.

As explained above, we do not want to emit code immediately from within ExecInitExpr()/ExecReadyExpr(). To facilitate that, readying a JITed expression sets the function to a callback, which gets the actual native function on the first actual call. That allows batching together the generation of all native functions that are defined before the first expression is evaluated - in a lot of queries that'll be all.
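That "resolve on first call" indirection might look roughly like this. It's a toy model: the type and function names are invented stand-ins, not the patch's actual API.

```c
#include <assert.h>

/*
 * Toy model of the deferred-compilation callback: the expression's
 * function pointer initially points at a resolver; the first call
 * triggers code emission for all pending functions, installs the
 * real implementation, and re-invokes it.
 */
typedef struct ToyExpr ToyExpr;
typedef long (*eval_fn)(ToyExpr *expr);

struct ToyExpr
{
    eval_fn evalfunc;
    int     compiled;       /* did "code generation" run? */
};

static long
compiled_eval(ToyExpr *expr)
{
    (void) expr;
    return 42;              /* stand-in for the JITed code */
}

static long
first_call_resolver(ToyExpr *expr)
{
    /* here the real code would emit + optimize the whole module */
    expr->compiled = 1;
    expr->evalfunc = compiled_eval;
    return expr->evalfunc(expr);
}
```

After the first call the indirection is gone: subsequent evaluations jump straight to the generated code.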
Said callback then calls

  extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);

which'll emit code for the "in progress" mutable module if necessary, and then searches all generated functions for the name. The names are currently "evalexpr" and "deform" with a generation and counter suffix.

Currently expressions which do not have access to an EState - basically all "parent"-less expressions - aren't JIT compiled. That could be changed, but I so far do not see a huge need.

== Error handling ==

There are two aspects to error handling.

Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments) need to be cleaned up both after a successful query execution and after an error. I've settled on a fairly boring resowner-based mechanism. On errors all expressions owned by a resowner are released; upon success expressions are reassigned to the parent / released on commit (unless executor shutdown has cleaned them up of course).

A second, less pretty and newly developed, aspect of error handling is OOM handling inside LLVM itself. The above resowner-based mechanism takes care of cleaning up emitted code upon ERROR, but there's also the chance that LLVM itself runs out of memory. LLVM by default does *not* use any C++ exceptions. Its allocations are primarily funneled through the standard "new" handlers, and some direct use of malloc() and mmap(). For the former a 'new handler' exists (http://en.cppreference.com/w/cpp/memory/new/set_new_handler); for the latter LLVM provides callbacks that get called upon failure (unfortunately mmap() failures are treated as fatal rather than OOM errors).

What I've chosen to do, and I'd be interested to get some input about that, is to have two functions that LLVM-using code must use:

  extern void llvm_enter_fatal_on_oom(void);
  extern void llvm_leave_fatal_on_oom(void);

Before interacting with LLVM code (i.e.
emitting IR, or using the above functions), llvm_enter_fatal_on_oom() needs to be called. When a libstdc++ new or LLVM error occurs, the handlers set up by the above functions trigger a FATAL error. We have to use FATAL rather than ERROR, as we *cannot* reliably throw ERROR inside a foreign library without risking corrupting its internal state.

Users of the above sections do *not* have to use PG_TRY/CATCH blocks; the handlers instead are reset at the toplevel sigsetjmp() level. Using a relatively small enter/leave protected section of code, rather than setting up these handlers globally, avoids negative interactions with extensions that might use C++, like e.g. postgis. As LLVM code generation should never execute arbitrary code, just setting these handlers temporarily ought to suffice.

== LLVM Interface / patches ==

Unfortunately a bit of required LLVM functionality, particularly around error handling but also initialization, isn't currently fully exposed via LLVM's C API. A bit more *optional* API isn't exposed either. Instead of requiring a brand-new version of LLVM that has exposed this functionality, I decided it's better to have a small C++ wrapper that can provide it. Due to that new wrapper, significantly older LLVM versions can now be used (for now I've only runtime-tested 5.0 and master; 4.0 would be possible with a few ifdefs, a bit older probably doable as well). Given that LLVM is written in C++ itself, an optional dependency on a C++ compiler for one file doesn't seem too bad.

== Inlining ==

One big advantage of JITing expressions is that it can significantly reduce the overhead of postgres' extensible function/operator mechanism, by inlining the body of called operators. This is the part of the code that I've worked on most significantly. While I think JITing is an entirely viable project without committed inlining, I felt that we definitely need to know how exactly we want to do inlining before merging other parts.
3 different implementations later, I'm fairly confident that I have a good concept, even though a few corners still need to be smoothed.

As a quick background, LLVM works on the basis of a high-level "abstract" assembly representation (llvm.org/docs/LangRef.html). This can be generated in memory, stored in binary form (bitcode files ending in .bc) or text representation (.ll files). The clang compiler always generates the in-memory representation, and the -emit-llvm flag tells it to write that out to disk rather than to .o files/binaries. This facility allows us to get the bitcode for all operators (e.g. int8eq, float8pl etc), without maintaining two copies.

The way I've currently set it up is that, if --with-llvm is passed to configure, all backend files are also compiled to bitcode files. These bitcode files get installed into the server's $pkglibdir/bitcode/postgres/ under their original subfolder, e.g.
~/build/postgres/dev-assert/install/lib/bitcode/postgres/utils/adt/float.bc
Using existing LLVM functionality (for parallel LTO compilation), an index over these is additionally stored to $pkglibdir/bitcode/postgres.index.bc

When deciding to JIT for the first time, $pkglibdir/bitcode/ is scanned for all .index.bc files and a *combined* index over all these files is built in memory. The reason for doing so is that that allows "easy" inlining access for extensions - they can install code into $pkglibdir/bitcode/[extension]/ accompanied by $pkglibdir/bitcode/[extension].index.bc just alongside the actual library.

The inlining implementation (I had to write my own; LLVM's isn't suitable for a number of reasons) can then use the combined in-memory index to look up all 'extern' function references, judge their size, and then open just the file containing the implementation (i.e. the above float.bc). Currently there's a limit of 150 instructions for functions to be inlined; functions used by inlined functions have a budget of 0.5 * limit, and so on.
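The budget decay described above (150 instructions at the top level, half the caller's budget for anything referenced from an inlined function) can be written down directly. The function names here are illustrative, not the patch's; the numbers match the debugging output shown later in the thread, where errstart at 150 instructions is rejected against a depth-2 budget of 37.

```c
#include <assert.h>

/* top-level instruction-count limit for inlining candidates */
#define INLINE_INSTR_LIMIT 150

/* each level of indirect inlining halves the available budget */
static int
inline_budget(int depth)
{
    int budget = INLINE_INSTR_LIMIT;

    while (depth-- > 0)
        budget /= 2;        /* integer division: 150, 75, 37, ... */
    return budget;
}

static int
eligible_for_inlining(int instcount, int depth)
{
    return instcount <= inline_budget(depth);
}
```

The halving keeps transitive inlining from ballooning: a function reached through two layers of inlined callers only qualifies if it is quite small.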
This gets rid of most operators in the queries I tested, although there are a few that resist inlining due to references to file-local static variables - but those largely don't seem to be performance relevant.

== Type Synchronization ==

For my current two main avenues of performance optimization due to JITing, expression eval and tuple deforming, it's obviously required that code generation knows about at least a few postgres types (tuple slots, heap tuples, expr context/state, etc). Initially I'd provided LLVM with types by emitting them manually like:

  {
      LLVMTypeRef members[15];

      members[ 0] = LLVMInt32Type();  /* type */
      members[ 1] = LLVMInt8Type();   /* isempty */
      members[ 2] = LLVMInt8Type();   /* shouldFree */
      members[ 3] = LLVMInt8Type();   /* shouldFreeMin */
      members[ 4] = LLVMInt8Type();   /* slow */
      members[ 5] = LLVMPointerType(StructHeapTupleData, 0);  /* tuple */
      members[ 6] = LLVMPointerType(StructtupleDesc, 0);      /* tupleDescriptor */
      members[ 7] = TypeMemoryContext;                        /* mcxt */
      members[ 8] = LLVMInt32Type();  /* buffer */
      members[ 9] = LLVMInt32Type();  /* nvalid */
      members[10] = LLVMPointerType(TypeSizeT, 0);            /* values */
      members[11] = LLVMPointerType(LLVMInt8Type(), 0);       /* nulls */
      members[12] = LLVMPointerType(StructMinimalTupleData, 0);  /* mintuple */
      members[13] = StructHeapTupleData;                      /* minhdr */
      members[14] = LLVMInt64Type();  /* off */

      StructTupleTableSlot = LLVMStructCreateNamed(LLVMGetGlobalContext(), "struct.TupleTableSlot");
      LLVMStructSetBody(StructTupleTableSlot, members, lengthof(members), false);
  }

and then using numeric offsets when emitting code like:

  LLVMBuildStructGEP(builder, v_slot, 9, "")

to compute the address of the nvalid field of a slot at runtime. But that obviously duplicates a lot of information and is incredibly failure prone. Doesn't seem acceptable.

What I've now instead done is have one small file (llvmjit_types.c) which references each of the types required for JITing.
That file is translated to bitcode at compile time, and loaded when LLVM is initialized in a backend. That works very well to synchronize the type definitions; unfortunately it does *not* synchronize offsets, as the IR-level representation doesn't know field names. Instead I've added defines to the original struct definition that provide access to the relevant offsets. E.g.

  #define FIELDNO_TUPLETABLESLOT_NVALID 9
      int         tts_nvalid;     /* # of valid values in tts_values */

While that still needs to be defined, it's only required for a relatively small number of fields, and it's bunched together with the struct definition, so it's easily kept synchronized.

A significant downside of this is that clang needs to be around to create that bitcode file, but that doesn't seem that bad as an optional *build*-time, *not* runtime, dependency. Not a perfect solution, but I don't quite see a better approach.

== Minimal cost based planning & config ==

Currently there's a number of GUCs that influence JITing:

- jit_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost get JITed, *without* optimization (expensive part), corresponding to -O0. This commonly already results in significant speedups if expression/deforming is a bottleneck (removing dynamic branches mostly).
- jit_optimize_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost get JITed, *with* optimization (expensive part).
- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if the query has higher cost.

For all of these -1 is a hard disable.

There currently also exist:
- jit_expressions = 0/1
- jit_deform = 0/1
- jit_perform_inlining = 0/1
but I think they could just be removed in favor of the above.

Additionally there's a few debugging/other GUCs:

- jit_debugging_support = 0/1 - register generated functions with the debugger.
Unfortunately GDB's JIT integration scales O(#functions^2), albeit with a very small constant, so it cannot always be enabled :(
- jit_profiling_support = 0/1 - emit information so perf gets notified about JITed functions. As this logs data to disk that is not automatically cleaned up (otherwise it'd be useless), this definitely cannot be enabled by default.
- jit_dump_bitcode = 0/1 - log generated pre/post optimization bitcode to disk. This is quite useful for development, so I'd want to keep it.
- jit_log_ir = 0/1 - dump generated IR to the logfile. I found this to be too verbose, and I think it should be yanked.

Do people feel these should be hidden behind #ifdefs, always present but prevented from being set to a meaningful value, or unrestricted?

== Remaining work ==

These I'm planning to tackle in the near future, and they need to be tackled before merging:
- Add a big readme
- Add docs
- Add / check LLVM 4.0 support
- reconsider location of JITing code (lib/ and heaptuple.c specifically)
- Split llvmjit_wrap.cpp into three files (error handling, inlining, temporary LLVM C API extensions)
- Split the bigger commit, improve commit messages
- Significant amounts of local code cleanup and comments
  - duplicated code in expression emission for very related step types
  - more consistent LLVM variable naming
  - pgindent
- timing information about JITing needs to be fewer messages, and hidden behind a GUC
- improve logging (mostly remove)

== Future Todo (some already in-progress) ==

- JITed hash computation for nodeAgg & nodeHash. That's currently a major bottleneck.
- Increase quality of generated code. There's a *lot* left still on the table. The generated code currently spills far too much into memory, and LLVM can only optimize that away to a limited degree. I've experimented some, and for TPCH Q01 it's possible to get at least another x1.8 due to that, with expression eval *still* being the bottleneck afterwards...
- Caching of the generated code, drastically reducing overhead and allowing JITing to be beneficial in OLTP cases. Currently the biggest obstacle to that is the number of specific memory locations referenced in the expression representation, but that definitely can be improved (a lot of it by the above point alone).
- More elaborate planning model
- The cloning of modules could be reduced to only cloning the required parts. As that's the most expensive part of inlining and most of the time only a few functions are used, this should probably be done soon.

== Code ==

As the patchset is large (500kb) and I'm still quickly evolving it, I do not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit

To build, --with-llvm has to be passed to configure; llvm-config either needs to be in PATH or provided to make via LLVM_CONFIG. A C++ compiler and clang need to be available under common names or provided via CXX / CLANG respectively.

Regards,

Andres Freund
On Wednesday, January 24, 2018 8:20:38 AM CET Andres Freund wrote:
> As the patchset is large (500kb) and I'm still quickly evolving it, I do
> not yet want to attach it. The git tree is at
> https://git.postgresql.org/git/users/andresfreund/postgres.git
> in the jit branch
> https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
>
> to build --with-llvm has to be passed to configure, llvm-config either
> needs to be in PATH or provided with LLVM_CONFIG to make. A c++ compiler
> and clang need to be available under common names or provided via CXX /
> CLANG respectively.
>
> Regards,
>
> Andres Freund

Hi

I tried to build on Debian sid, using GCC 7 and LLVM 5. I used the following to compile, using your branch @3195c2821d:

$ export LLVM_CONFIG=/usr/bin/llvm-config-5.0
$ ./configure --with-llvm
$ make

And I had the following build error:

llvmjit_wrap.cpp:32:10: fatal error: llvm-c/DebugInfo.h: No such file or directory
 #include "llvm-c/DebugInfo.h"
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a C++ API in llvm/IR/DebugInfo.h. For 'sport' (I have not played with the LLVM API in more than a year), I tried to fix it, changing it to the C++ include. The DebugInfo-related one was easy; only one function was used. But I still could not build, because the LLVM API changed between 5.0 and 6.0 regarding value info SummaryList:

llvmjit_wrap.cpp: In function ‘std::unique_ptr<llvm::StringMap<llvm::StringSet<> > > llvm_build_inline_plan(llvm::Module*)’:
llvmjit_wrap.cpp:285:48: error: ‘class llvm::GlobalValueSummary’ has no member named ‘getBaseObject’
   fs = llvm::cast<llvm::FunctionSummary>(gvs->getBaseObject());
                                               ^~~~~~~~~~~~~

That one was a bit uglier. I'm not sure how to test everything properly, so the patch is attached for both these issues; do as you wish with it… :)

Regards

Pierre Ducroquet
Attachment
Hi,

On 2018-01-24 22:35:08 +0100, Pierre Ducroquet wrote:
> I tried to build on Debian sid, using GCC 7 and LLVM 5. I used the following
> to compile, using your branch @3195c2821d :

Thanks!

> $ export LLVM_CONFIG=/usr/bin/llvm-config-5.0
> $ ./configure --with-llvm
> $ make
>
> And I had the following build error :
> llvmjit_wrap.cpp:32:10: fatal error: llvm-c/DebugInfo.h: No such file or directory
>  #include "llvm-c/DebugInfo.h"
> compilation terminated.
>
> In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
> C++ API in llvm/IR/DebugInfo.h.

Hm, I compiled against 5.0 quite recently, but added the stripping of debuginfo later on. I'll add a fallback method, thanks for pointing that out!

> But I still could not build because the LLVM API changed between 5.0 and 6.0
> regarding value info SummaryList.

Hm, thought these changes were from before my 5.0 test. But the code evolved heavily, so I might misremember. Let me see. Thanks, I'll try to push fixes into the tree soon-ish..

> I'm not sure how to test everything properly, so the patch is attached for
> both these issues, do as you wish with it… :)

What I do for testing is running postgres' tests against a started server that has all cost-based behaviour turned off (which makes no sense from a runtime optimization perspective, but increases coverage...). The flags I pass to the server are:

  -c jit_expressions=1
  -c jit_tuple_deforming=1
  -c jit_perform_inlining=1
  -c jit_above_cost=0
  -c jit_optimize_above_cost=0

then I run
  make -s installcheck-parallel
to see whether things pass. The flags make the tests slow-ish, but test everything under jit. In particular errors.sql's recursion check takes a while... Obviously none of the standard tests are interesting from a performance perspective...
FWIW, here's a shortened excerpt of the debugging output for a TPCH query:

DEBUG: checking inlinability of ExecAggInitGroup
DEBUG: considering extern function datumCopy at 75 for inlining
DEBUG: inline top function ExecAggInitGroup total_instcount: 24, partial: 21

So the inliner found a reference to ExecAggInitGroup, inlined it, and scheduled datumCopy, externally referenced from ExecAggInitGroup, to be checked later.

DEBUG: uneligible to import errstart due to early threshold: 150 vs 37

elog stuff wasn't inlined because errstart has 150 insns, but at this point the limit was 37 (aka 150 / 2 / 2). "Early" means this was decided based on the summary. There are also 'late' checks preventing inlining if dependencies of the inlined function (local static functions, constant static global variables) make it bigger than the summary knows about.

Then we get to execute the importing:

DEBUG: performing import of postgres/utils/fmgr/fmgr.bc pg_detoast_datum, pg_detoast_datum_packed
DEBUG: performing import of postgres/utils/adt/arrayfuncs.bc construct_array
DEBUG: performing import of postgres/utils/error/assert.bc ExceptionalCondition, .str.1, .str
DEBUG: performing import of postgres/utils/adt/expandeddatum.bc EOH_flatten_into, DeleteExpandedObject, .str.1, .str.2, .str.4, EOH_get_flat_size
DEBUG: performing import of postgres/utils/adt/int8.bc __func__.overflowerr, .str, .str.12, int8inc, overflowerr, pg_add_s64_overflow
...
DEBUG: performing import of postgres/utils/adt/date.bc date_le_timestamp, date2timestamp, .str, __func__.date2timestamp, .str.26

And there's a timing summary (debugging build):

DEBUG: time to inline: 0.145s
DEBUG: time to opt: 0.156s
DEBUG: time to emit: 0.078s

Same debugging build:

tpch_10[6930][1]=# set jit_expressions = 1;
tpch_10[6930][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
...
Time: 28442.870 ms (00:28.443)
tpch_10[6930][1]=# set jit_expressions = 0;
tpch_10[6930][1]=# \i ~/tmp/tpch/pg-tpch/queries/q01.sql
...
Time: 70357.830 ms (01:10.358)
tpch_10[6930][1]=# show max_parallel_workers_per_gather;
┌─────────────────────────────────┐
│ max_parallel_workers_per_gather │
├─────────────────────────────────┤
│ 0                               │
└─────────────────────────────────┘

Now admittedly a debugging/assertion-enabled build isn't quite a fair fight, but it's not that much smaller a win without that.

- Andres
Hi,

On 2018-01-24 14:06:30 -0800, Andres Freund wrote:
> > In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
> > C++ API in llvm/IR/DebugInfo.h.
>
> Hm, I compiled against 5.0 quite recently, but added the stripping of
> debuginfo later on. I'll add a fallback method, thanks for pointing that
> out!

Went more with your fix; there's not much point in using the C API here. Should probably remove the use of it nearly entirely from the .cpp file (save for wrap/unwrap() use). But man, the 'class Error' usage is one major ugly pain.

> > But I still could not build because the LLVM API changed between 5.0 and 6.0
> > regarding value info SummaryList.
>
> Hm, thought these changes were from before my 5.0 test. But the code
> evolved heavily, so I might misremember. Let me see.

Ah, that one was actually easier to fix. There's no need to get the base object at all, so it's just a one-line change.

> Thanks, I'll try to push fixes into the tree soon-ish..

Pushed. Thanks again for looking!

- Andres
On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet <p.psql@pinaraf.info> wrote:
> In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
> C++ API in llvm/IR/DebugInfo.h.

The LLVM APIs don't seem to be very stable; won't there just be a continuous stream of similar issues? Pinning major postgresql versions to specific LLVM versions doesn't seem very appealing. Even if you aren't interested in the latest changes in LLVM, trying to get the right version on your machine will be annoying.

Regards,

Jeff Davis
Hi,

On 2018-01-24 22:33:30 -0800, Jeff Davis wrote:
> On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet <p.psql@pinaraf.info> wrote:
> > In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only as a
> > C++ API in llvm/IR/DebugInfo.h.
>
> The LLVM APIs don't seem to be very stable; won't there just be a
> continuous stream of similar issues?

There'll be some of that, yes. But the entire difference between 5 and what will be 6 was not including one header, and not calling one unneeded function. That doesn't seem like a crazy amount of adaptation that needs to be done. From a quick look at porting to 4, it'll be a bit more effort, but not much more.

The reason I'm using the C API where possible is that it's largely forward compatible (i.e. new features are added, but things are seldom removed). The C++ code changes a bit more, but it's not that much code we're interfacing with either.

I think we'll have to make do with a number of ifdefs - I don't really see an alternative. Unless you've a better idea?

Greetings,

Andres Freund
On Tue, Jan 23, 2018 at 11:20 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.

Great! A couple high-level questions:

1. I notice a lot of use of the LLVM builder, for example, in slot_compile_deform(). Why can't you do the same thing you did with function code, where you create the ".bc" at build time from plain C code, and then load it at runtime?

2. I'm glad you considered extensions. How far can we go with this in the future? Can we have bitcode-only extensions that don't need a .so file? Can we store the bitcode in pg_proc, simplifying deployment and allowing extensions to travel over replication? I am not asking for this now, of course, but I'd like to get the idea out there so we leave room.

Regards,

Jeff Davis
Hi! On 2018-01-24 22:51:36 -0800, Jeff Davis wrote: > A couple high-level questions: > > 1. I notice a lot of use of the LLVM builder, for example, in > slot_compile_deform(). Why can't you do the same thing you did with > function code, where you create the ".bc" at build time from plain C > code, and then load it at runtime? Not entirely sure what you mean. You mean why I don't inline slot_getsomeattrs() etc and instead generate code manually? The reason is that the generated code is a *lot* smarter due to knowing the specific tupledesc. > 2. I'm glad you considered extensions. How far can we go with this in > the future? > Can we have bitcode-only extensions that don't need a .so > file? Hm. I don't see a big problem introducing this. There'd be some complexity in how to manage the lifetime of JITed functions generated that way, but that should be solvable. > Can we store the bitcode in pg_proc, simplifying deployment and > allowing extensions to travel over replication? Yes, we could. You'd need to be a bit careful that all the machines have similar-ish cpu generations or compile with defensive settings, but that seems okay. Greetings, Andres Freund
On Thursday, January 25, 2018 7:38:16 AM CET Andres Freund wrote:
> Hi,
>
> On 2018-01-24 22:33:30 -0800, Jeff Davis wrote:
> > On Wed, Jan 24, 2018 at 1:35 PM, Pierre Ducroquet <p.psql@pinaraf.info> wrote:
> > > In LLVM 5.0, it looks like DebugInfo.h is not available in llvm-c, only
> > > as a C++ API in llvm/IR/DebugInfo.h.
> >
> > The LLVM APIs don't seem to be very stable; won't there just be a
> > continuous stream of similar issues?
>
> There'll be some of that yes. But the entire difference between 5 and
> what will be 6 was not including one header, and not calling one unneeded
> function. That doesn't seem like a crazy amount of adaptation that needs
> to be done. From a quick look about porting to 4, it'll be a bit, but
> not much more effort.

I don't know when this would be released, but the minimum supported LLVM version will have a strong influence on the availability of that feature. If this JIT compilation were released today with only LLVM 5/6 support, it would be unusable for most Debian users (llvm-5 is only available in sid); even llvm-4 is not in the latest stable.

I'm already trying to build with llvm-4 and am going to try further with llvm 3.9 (Debian Stretch has nothing more recent, and I won't have anything better to play with my data); I'll keep you informed. For sport, I may also try llvm 3.5 (for Debian Jessie).

Pierre
On 24.01.2018 10:20, Andres Freund wrote:
> Hi,
>
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.
>
> There are too many small changes, so I'm only going to list the major
> things. A good bit of that is new. The actual LLVM IR emission itself
> hasn't changed that drastically. Since I've not described them in
> detail before I'll describe from scratch in a few cases, even if things
> haven't fully changed.
>
>
> == JIT Interface ==
>
> To avoid emitting code in very small increments (which increases mmap/mremap
> rw vs exec remapping, and compile/optimization time), code generation
> doesn't happen for every single expression individually, but in batches.
>
> The basic object to emit code via is a JIT context created with:
>   extern LLVMJitContext *llvm_create_context(bool optimize);
> which in the case of expressions is stored on demand in the EState. For other
> use cases that might not be the right location.
>
> To emit LLVM IR (i.e. the portable code that LLVM then optimizes and
> generates native code for), one gets a module from that with:
>   extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
> to which "arbitrary" numbers of functions can be added. In the case of
> expression evaluation, we get the module once for every expression, and
> emit one function for the expression itself, and one for every
> applicable/referenced deform function.
>
> As explained above, we do not want to emit code immediately from within
> ExecInitExpr()/ExecReadyExpr(). To facilitate that, readying a JITed
> expression sets the evaluation function to a callback, which fetches the
> actual native function on the first call. That allows batching together the
> generation of all native functions that are defined before the first
> expression is evaluated - in a lot of queries that'll be all of them.
> Said callback then calls
>   extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
> which'll emit code for the "in progress" mutable module if necessary,
> and then searches all generated functions for the name. The names are
> currently "evalexpr" and "deform" with a generation and counter suffix.
>
> Currently expressions which do not have access to an EState, basically
> all "parent"-less expressions, aren't JIT compiled. That could be
> changed, but I so far do not see a huge need.

Hi,

As far as I understand, generation of native code is currently always done for all supported expressions, and individually by each backend. I wonder whether it would be useful to put more effort into understanding when compilation to native code should be done and when interpretation is better.

For example, many JIT-able languages like Lua use traces: a query is first interpreted and a trace is generated. If the same trace is followed more than N times, native code is generated for it. In the context of a DBMS executor it is obvious that only frequently executed or expensive queries should be compiled. So we could use the estimated plan cost and the number of query executions as simple criteria for JIT-ing a query. Maybe compilation of simple queries (with small cost) should be done only for prepared statements...

Another question is whether it is sensible to redundantly do expensive work (LLVM compilation) in all backends. This question refers to a shared prepared statement cache. But even without such a cache, it seems possible to use some signature of the compiled expression as the library name and allow these libraries to be shared between backends. So before starting code generation, ExecReadyCompiledExpr can first build the signature and check whether the corresponding library is already present. It would also be easier to control the space used by compiled libraries in this case.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
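The trace-style triggering Konstantin describes (interpret first, compile once hot) can be sketched in plain C. This is purely illustrative: the names, the threshold, and the self-replacing function pointer are invented for the sketch and are not PostgreSQL's actual API; "compilation" is simulated by swapping in another function.

```c
#include <assert.h>

#define JIT_THRESHOLD 100        /* invented threshold for illustration */

typedef struct ExprState ExprState;
typedef long (*ExprEvalFn) (ExprState *state, long arg);

struct ExprState
{
	ExprEvalFn	evalfunc;		/* current evaluator */
	long		ninvocations;	/* how often we've been called */
	int			jitted;			/* did we switch to "native" code? */
};

/* stand-in for the native code a JIT would emit */
static long
eval_compiled(ExprState *state, long arg)
{
	(void) state;
	return arg * 2 + 1;
}

/* interpreted path: installs the compiled version once it gets hot */
static long
eval_interpreted(ExprState *state, long arg)
{
	if (++state->ninvocations > JIT_THRESHOLD)
	{
		/* hot: pretend we JIT-compiled and install the result */
		state->evalfunc = eval_compiled;
		state->jitted = 1;
	}
	return arg * 2 + 1;
}

static long
eval_expr(ExprState *state, long arg)
{
	return state->evalfunc(state, arg);
}

/* run the expression in a loop; report whether it got "JITed" */
static int
run_hot_loop(int iterations)
{
	ExprState	state = {eval_interpreted, 0, 0};
	int			i;

	for (i = 0; i < iterations; i++)
		(void) eval_expr(&state, i);
	return state.jitted;
}
```

A short loop never crosses the threshold and stays interpreted, while a long one triggers the switch; the thread's later discussion argues that per-expression counting like this carries overhead that a planner-cost-based decision avoids.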
Hi, On 2018-01-25 10:00:14 +0100, Pierre Ducroquet wrote: > I don't know when this would be released, August-October range. > but the minimal supported LLVM > version will have a strong influence on the availability of that feature. If > today this JIT compiling was released with only LLVM 5/6 support, it would be > unusable for most Debian users (llvm-5 is only available in sid). Even llvm 4 > is not available in latest stable. > I'm already trying to build with llvm-4 and I'm going to try further with llvm > 3.9 (Debian Stretch doesn't have a more recent than this one, and I won't have > something better to play with my data), I'll keep you informed. For sport, I > may also try llvm 3.5 (for Debian Jessie). I don't think it's unreasonable to not support super old llvm versions. This is a complex feature, and will take some time to mature. Supporting too many LLVM versions at the outset will have some cost. Versions before 3.8 would require supporting mcjit rather than orc, and I don't think that'd be worth doing. I think 3.9 might be a reasonable baseline... Greetings, Andres Freund
Hi,

On 2018-01-25 18:40:53 +0300, Konstantin Knizhnik wrote:
> As far as I understand generation of native code is now always done for all
> supported expressions and individually by each backend.

Mostly, yes. It's not done unconditionally though; there are cost-based checks on whether to do so or not.

> I wonder it will be useful to do more efforts to understand when compilation
> to native code should be done and when interpretation is better.
> For example many JIT-able languages like Lua are using traces, i.e. query is
> first interpreted and trace is generated. If the same trace is followed
> more than N times, then native code is generated for it.

Right. That's where I actually had started out, but my experimentation showed that that's not that interesting a path to pursue. Emitting code in much smaller increments (as you'd do for individual expressions) has considerable overhead. We also have a planner that allows us reasonable guesses about when to JIT and when not - something not available in many other languages. That said, nothing in the infrastructure would prevent you from pursuing that; it'd just be a wrapper function for the generated exprs that tracks invocations.

> Another question is whether it is sensible to redundantly do expensive work
> (llvm compilation) in all backends.

Right now we kinda have to, but I really want to get rid of that. There are some pointers included as constants in the generated code. I plan to work on getting rid of that requirement, but after getting the basics in (i.e. realistically not this release). Even after that I'm personally much more interested in caching the generated code inside a backend, rather than across backends. Function addresses et al being different between backends would add some complications; those can be overcome, but I'm doubtful it's immediately worth it.

> So before starting code generation, ExecReadyCompiledExpr can first
> build signature and check if correspondent library is already present.
> Also it will be easier to control space used by compiled libraries in
> this case.

Right, I definitely think we want to do that at some point not too far away in the future. That makes the applicability of JITing much broader.

More advanced forms of this are that you JIT in the background for frequently executed code (so as not to incur latency the first time somebody executes it). And/or you emit unoptimized code the first time through, which is quite quick, and run the optimizer after the query has been executed a number of times.

Greetings,

Andres Freund
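The planner-based gating Andres mentions can be sketched as a simple cost comparison, including the tiered "unoptimized first, optimize when clearly worth it" idea. The variable names and threshold values below are invented for illustration; they are not the patch's actual GUCs.

```c
#include <assert.h>

/* illustrative thresholds, not real settings */
static double jit_above_cost = 100000.0;			/* compile at all */
static double jit_optimize_above_cost = 500000.0;	/* run expensive passes */

typedef enum JitDecision
{
	JIT_NONE,					/* interpret only */
	JIT_UNOPTIMIZED,			/* quick, unoptimized native code */
	JIT_OPTIMIZED				/* fully optimized native code */
} JitDecision;

/*
 * Decide how much JIT effort a plan deserves, based on the planner's
 * total cost estimate: cheap plans aren't worth compiling at all, and
 * only clearly expensive ones justify the slow optimization passes.
 */
static JitDecision
decide_jit(double plan_total_cost)
{
	if (plan_total_cost < jit_above_cost)
		return JIT_NONE;
	if (plan_total_cost < jit_optimize_above_cost)
		return JIT_UNOPTIMIZED;
	return JIT_OPTIMIZED;
}
```

The point of the two thresholds is that (per the numbers later in the thread) emitting code with just a mem2reg pass costs ~1ms while full optimization costs 70-100ms, so the break-even plan cost differs by roughly two orders of magnitude.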
Hi, I've spent the last weeks working on my LLVM compilation patchset. In the course of that I *heavily* revised it. While still a good bit away from committable, it's IMO definitely not a prototype anymore.
Below are results on my system for Q1, TPC-H scale 10 (~13GB database):

Options                 | Time (ms)
------------------------+----------
Default                 |     20075
jit_expressions=on      |     16105
jit_tuple_deforming=on  |     14734
jit_perform_inlining=on |     13441
Also I noticed that parallel execution disables JIT.
On my computer with 4 cores, the time of Q1 with parallel execution is 6549 ms.
Are there any problems in principle with combining JIT and parallel execution?
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Hi,

Thanks for testing things out!

On 2018-01-26 10:44:24 +0300, Konstantin Knizhnik wrote:
> Also I noticed that parallel execution didsables JIT.

Oh, oops, I broke that recently by moving where the decision about whether to JIT or not is made. There actually is JITing, but only in the leader.

> Are there any principle problems with combining JIT and parallel execution?

No, there aren't; I just need to send the JIT flag down to the workers. Will look at it tomorrow. If you want to measure / play around till then, you can manually hack the PGJIT_* checks in execExprCompile.c.

With that done, on my laptop, tpch-Q01, scale 10:

SET max_parallel_workers_per_gather=0; SET jit_expressions = 1;  15145.508 ms
SET max_parallel_workers_per_gather=0; SET jit_expressions = 0;  23808.809 ms
SET max_parallel_workers_per_gather=4; SET jit_expressions = 1;   4775.170 ms
SET max_parallel_workers_per_gather=4; SET jit_expressions = 0;   7173.483 ms

(that's with inlining and deforming enabled too)

Greetings,

Andres Freund
On 26.01.2018 11:23, Andres Freund wrote:
> Hi,
>
> Thanks for testing things out!

Thank you for this work.

One more question: do you have any idea how to profile JITed code? There is no LLVMOrcRegisterPerf in LLVM 5, so the jit_profiling_support option does nothing, and without it perf is not able to unwind stack traces for generated code. I attached the produced profile; it looks like the "unknown" bar corresponds to JIT code.

There is a NoFramePointerElim option in the LLVMMCJITCompilerOptions structure, but it requires use of an ExecutionEngine. Something like this:

    mod = llvm_mutable_module(context);
    {
        struct LLVMMCJITCompilerOptions options;
        LLVMExecutionEngineRef jit;
        char *error;

        LLVMCreateExecutionEngineForModule(&jit, mod, &error);
        LLVMInitializeMCJITCompilerOptions(&options, sizeof(options));
        options.NoFramePointerElim = 1;
        LLVMCreateMCJITCompilerForModule(&jit, mod, &options,
                                         sizeof(options), &error);
    }
    ...

But you are compiling code using LLVMOrcAddEagerlyCompiledIR, and I find no way to pass the no-omit-frame-pointer option there.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Thu, Jan 25, 2018 at 11:20:28AM -0800, Andres Freund wrote: > On 2018-01-25 18:40:53 +0300, Konstantin Knizhnik wrote: > > Another question is whether it is sensible to redundantly do > > expensive work (llvm compilation) in all backends. > > Right now we kinda have to, but I really want to get rid of that. > There's some pointers included as constants in the generated code. I > plan to work on getting rid of that requirement, but after getting > the basics in (i.e. realistically not this release). Even after > that I'm personally much more interested in caching the generated > code inside a backend, rather than across backends. Function > addresses et al being different between backends would add some > complications, can be overcome, but I'm doubtful it's immediately > worth it. If we go with threading for this part, sharing that state may be simpler. It seems a lot of work is going into things that threading does at a much lower developer cost, but that's a different conversation. > > So before starting code generation, ExecReadyCompiledExpr can first > > build signature and check if correspondent library is already present. > > Also it will be easier to control space used by compiled libraries in > > this > > Right, I definitely think we want to do that at some point not too far > away in the future. That makes the applicability of JITing much broader. > > More advanced forms of this are that you JIT in the background for > frequently executed code (so not to incur latency the first time > somebody executes). Aand/or that you emit unoptimized code the first > time through, which is quite quick, and run the optimizer after the > query has been executed a number of times. Both sound pretty neat. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi,

On 2018-01-26 13:06:27 +0300, Konstantin Knizhnik wrote:
> One more question: do you have any idea how to profile JITed code?

Yes ;). It depends a bit on what exactly you want to do. Is it sufficient to get time associated with the parent caller, or do you need instruction-level access?

> There is no LLVMOrcRegisterPerf in LLVM 5, so jit_profiling_support option
> does nothing.

Right, it's a patch I'm trying to get into the next version of llvm. With that you get access to the shared object and everything.

> And without it perf is not able to unwind stack trace for generated
> code.

You can work around that by using --call-graph lbr with a sufficiently new perf. That'll not know function names et al, but at least the parent will be associated correctly.

> But you are compiling code using LLVMOrcAddEagerlyCompiledIR
> and I find no way to pass no-omit-frame pointer option here.

It shouldn't be too hard to open code support for it, encapsulated in a function:

    // Set function attribute "no-frame-pointer-elim" based on
    // NoFramePointerElim.
    for (auto &F : *Mod)
    {
        auto Attrs = F.getAttributes();
        StringRef Value(options.NoFramePointerElim ? "true" : "false");

        Attrs = Attrs.addAttribute(F.getContext(),
                                   AttributeList::FunctionIndex,
                                   "no-frame-pointer-elim", Value);
        F.setAttributes(Attrs);
    }

that's all that option did for mcjit.

Greetings,

Andres Freund
On Wed, Jan 24, 2018 at 11:02 PM, Andres Freund <andres@anarazel.de> wrote:
> Not entirely sure what you mean. You mean why I don't inline
> slot_getsomeattrs() etc and instead generate code manually? The reason
> is that the generated code is a *lot* smarter due to knowing the
> specific tupledesc.

I would like to see if we can get a combination of JIT and LTO to work together to specialize generic code at runtime.

Let's say you have a function f(int x, int y, int z). You want to be able to specialize it on y at runtime, so that a loop gets unrolled in the common case where y is small.

1. At build time, create bitcode for the generic implementation of f().
2. At run time, load the generic bitcode into a module (let's call it the "generic module").
3. At run time, create a new module (let's call it the "bind module") that only does the following things:
   a. declares a global variable bind_y, and initializes it to the value 3
   b. declares a wrapper function f_wrapper(int x, int z), and all the function does is call f(x, bind_y, z)
4. Link the generic module and the bind module together (let's call the result the "linked module").
5. Optimize the linked module.

After sorting out a few details about symbols and inlining, what will happen is that the generic f() will be inlined into f_wrapper, which will see that bind_y is a constant, and then unroll the "for" loop over y. I experimented a bit before and it works for basic cases, but I'm not sure if it's as good as your hand-generated LLVM.

If we can make this work, it would be a big win for readability/maintainability. The hand-generated LLVM is limited to the bind module, which is very simple and doesn't need to change when the implementation of f() changes.

Regards, Jeff Davis
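The generic-plus-bind-module idea above can be illustrated in plain C, where an inlinable wrapper plays the role of the run-time "bind module" and the compiler's optimizer plays the role of step 5. This is only an analogy sketch (in the real proposal the binding happens by linking LLVM bitcode modules at run time); all names are from the example in the mail, and the loop body is invented.

```c
#include <assert.h>

/* step 1: the generic implementation ("generic module") */
static int
f(int x, int y, int z)
{
	int		acc = x;
	int		i;

	/* the loop over y that we hope gets unrolled once y is constant */
	for (i = 0; i < y; i++)
		acc += z;
	return acc;
}

/* step 3: the "bind module": fix y to a constant and wrap f() */
enum { BIND_Y = 3 };

static int
f_wrapper(int x, int z)
{
	/*
	 * After "linking" (here: both functions in one translation unit) and
	 * optimization, f() can be inlined and the loop unrolled, leaving
	 * effectively acc = x + 3 * z.
	 */
	return f(x, BIND_Y, z);
}
```

With optimization enabled, a compiler will typically reduce f_wrapper to straight-line code; the thread's later exchange is about whether LLVM's optimizer reliably does this for real tuple-deforming code, and at what compile-time cost.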
Hi,

On 2018-01-26 18:26:03 -0800, Jeff Davis wrote:
> On Wed, Jan 24, 2018 at 11:02 PM, Andres Freund <andres@anarazel.de> wrote:
> > Not entirely sure what you mean. You mean why I don't inline
> > slot_getsomeattrs() etc and instead generate code manually? The reason
> > is that the generated code is a *lot* smarter due to knowing the
> > specific tupledesc.
>
> I would like to see if we can get a combination of JIT and LTO to work
> together to specialize generic code at runtime.

Well, LTO can't quite work. It relies on being able to mark code in modules linked together as externally visible - and clearly we can't do that for a running postgres binary, at least in all incarnations I'm aware of. But that's why the tree I posted supports inlining of code.

> Let's say you have a function f(int x, int y, int z). You want to be
> able to specialize it on y at runtime, so that a loop gets unrolled in
> the common case where y is small.
>
> 1. At build time, create bitcode for the generic implementation of f().
> 2. At run time, load the generic bitcode into a module (let's call it
> the "generic module")
> 3. At run time, create a new module (let's call it the "bind module")
> that only does the following things:
>   a. declares a global variable bind_y, and initialize it to the value 3
>   b. declares a wrapper function f_wrapper(int x, int z), and all the
> function does is call f(x, bind_y, z)
> 4. Link the generic module and the bind module together (let's call
> the result the "linked module")
> 5. Optimize the linked module

Afaict that's effectively what I've already implemented. We could export more input as constants to the generated program, but other than that...

Whenever any extern functions are referenced, and jit_inlining=1, the code will check whether the called external code is available as JIT bitcode. Based on a simple instruction-based cost limit that function will get inlined (unless it references file-local non-constant static variables and such).
Now the JITed expression tree currently makes it hard for LLVM to recognize some constant input as constant, but what's largely needed for that to be better are some improvements in where temporary values are stored (they should be in allocas rather than local memory, so mem2reg can do its thing). It's a TODO... Right now LLVM will figure out constant inputs to non-strict functions, but not strict ones; after fixing some of what I've mentioned previously it works pretty universally.

Have I misunderstood, and is there some significant functional difference?

> I experimented a bit before and it works for basic cases, but I'm not
> sure if it's as good as your hand-generated LLVM.

For deforming it doesn't even remotely get as good in my experiments.

> If we can make this work, it would be a big win for
> readability/maintainability. The hand-generated LLVM is limited to the
> bind module, which is very simple, and doesn't need to be changed when
> the implementation of f() changes.

Right. That's why I think we definitely want that for the large majority of referenced functionality.

Greetings,

Andres Freund
Hi,

On Fri, Jan 26, 2018 at 6:40 PM, Andres Freund <andres@anarazel.de> wrote:
>> I would like to see if we can get a combination of JIT and LTO to work
>> together to specialize generic code at runtime.
>
> Well, LTO can't quite work. It relies on being able to mark code in
> modules linked together as externally visible - and cleary we can't do
> that for a running postgres binary. At least in all incarnations I'm
> aware of. But that's why the tree I posted supports inlining of code.

I meant a more narrow use of LTO: since we are doing linking in step #4 and optimization in step #5, it's optimizing the code after linking, which is a kind of LTO (though perhaps I'm misusing the term?). The version of LLVM that I tried this against had a linker option called "InternalizeLinkedSymbols" that would prevent the visibility problem you mention (assuming I understand you correctly). That option is no longer there, so I will have to figure out how to do it with the current LLVM API.

> Afaict that's effectively what I've already implemented. We could export
> more input as constants to the generated program, but other than that...

I brought this up in the context of slot_compile_deform(). In your patch, you have code like:

+       if (!att->attnotnull)
+       {
...
+               v_nullbyte = LLVMBuildLoad(
+                       builder,
+                       LLVMBuildGEP(builder, v_bits,
+                                    &v_nullbyteno, 1, ""),
+                       "attnullbyte");
+
+               v_nullbit = LLVMBuildICmp(
+                       builder,
+                       LLVMIntEQ,
+                       LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
+                       LLVMConstInt(LLVMInt8Type(), 0, false),
+                       "attisnull");
...

So it looks like you are reimplementing the generic code, but with conditional code generation. If the generic code changes, someone will need to read, understand, and change this code too, right?

With my approach, it would initially do *un*conditional code generation, and be less efficient and less specialized than the code generated by your current patch.
But then it would link in the constant tupledesc, and optimize, and the optimizer will realize that they are constants (hopefully) and then cut out a lot of the dead code and specialize it to the given tupledesc. This places a lot of faith in the optimizer and I realize it may not happen as nicely with real code as it did with my earlier experiments. Maybe you already tried and you are saying that's a dead end? I'll give it a shot, though. > Now the JITed expressions tree currently makes it hard for LLVM to > recognize some constant input as constant, but what's largely needed for > that to be better is some improvements in where temporary values are > stored (should be in alloca's rather than local memory, so mem2reg can > do its thing). It's a TODO... Right now LLVM will figure out constant > inputs to non-strict functions, but not strict ones, but after fixing > some of what I've mentioned previously it works pretty universally. > > > Have I misunderstood adn there's some significant functional difference? I'll try to explain with code, and then we can know for sure ;-) Sorry for the ambiguity, I'm probably misusing a few terms. >> I experimented a bit before and it works for basic cases, but I'm not >> sure if it's as good as your hand-generated LLVM. > > For deforming it doesn't even remotely get as good in my experiments. I'd like some more information here -- what didn't work? It didn't recognize constants? Or did recognize them, but didn't optimize as well as you did by hand? Regards, Jeff Davis
Hi,

On 2018-01-26 22:52:35 -0800, Jeff Davis wrote:
> The version of LLVM that I tried this against had a linker option
> called "InternalizeLinkedSymbols" that would prevent the visibility
> problem you mention (assuming I understand you correctly).

I don't think they're fully solvable - you can't really internalize a reference to a mutable static variable in another translation unit. Unless you modify that translation unit, which doesn't work while postgres is running.

> That option is no longer there so I will have to figure out how to do
> it with the current LLVM API.

Look at the llvmjit_wrap.c code invoking FunctionImporter - that pretty much does that. I'll push a cleaned-up version of that code sometime this weekend (it'll then live in llvmjit_inline.cpp).

> > Afaict that's effectively what I've already implemented. We could export
> > more input as constants to the generated program, but other than that...
>
> I brought this up in the context of slot_compile_deform(). In your
> patch, you have code like:
>
> +       if (!att->attnotnull)
> +       {
> ...
> +               v_nullbyte = LLVMBuildLoad(
> +                       builder,
> +                       LLVMBuildGEP(builder, v_bits,
> +                                    &v_nullbyteno, 1, ""),
> +                       "attnullbyte");
> +
> +               v_nullbit = LLVMBuildICmp(
> +                       builder,
> +                       LLVMIntEQ,
> +                       LLVMBuildAnd(builder, v_nullbyte, v_nullbytemask, ""),
> +                       LLVMConstInt(LLVMInt8Type(), 0, false),
> +                       "attisnull");
> ...
>
> So it looks like you are reimplementing the generic code, but with
> conditional code gen. If the generic code changes, someone will need
> to read, understand, and change this code, too, right?

Right. Not that that's code that has changed much...

> With my approach, then it would initially do *un*conditional code gen,
> and be less efficient and less specialized than the code generated by
> your current patch.
> But then it would link in the constant tupledesc,
> and optimize, and the optimizer will realize that they are constants
> (hopefully) and then cut out a lot of the dead code and specialize it
> to the given tupledesc.

Right.

> This places a lot of faith in the optimizer and I realize it may not
> happen as nicely with real code as it did with my earlier experiments.
> Maybe you already tried and you are saying that's a dead end? I'll
> give it a shot, though.

I did that, yes. There are two major downsides:

a) The code isn't as efficient as the handrolled code. The handrolled code can e.g. take into account that it doesn't need to access the NULL bitmap for a NOT NULL column, and that we don't need to check the tuple's number of attributes if there's a following NOT NULL attribute. Those save a good number of cycles.

b) The optimizations needed to take advantage of the constants and make the code faster with the constant tupledesc are fairly slow (you pretty much need at least an -O2 equivalent), whereas the handrolled tuple deforming is faster than slot_getsomeattrs with just a single, pretty cheap, mem2reg pass. We're talking about ~1ms vs 70-100ms in a lot of cases. The optimizer often will not actually unroll the loop with many attributes despite that being beneficial.

I think in most cases using the approach you advocate makes sense, to avoid duplication, but tuple deforming is such a major bottleneck that I think it's clearly worth doing it manually. Being able to use llvm with just an always-inline and a mem2reg pass makes it so much more widely applicable than doing the full inlining and optimization work.

> >> I experimented a bit before and it works for basic cases, but I'm not
> >> sure if it's as good as your hand-generated LLVM.
> >
> > For deforming it doesn't even remotely get as good in my experiments.
>
> I'd like some more information here -- what didn't work? It didn't
> recognize constants?
> Or did recognize them, but didn't optimize as
> well as you did by hand?

It didn't optimize as well as I did by hand, without significantly complicating (and slowing down) the originating code. It sometimes decided not to unroll the loop, and it takes a *lot* longer than direct emission of the code.

I'm hoping to work on making more of the executor JITed, and there I do think it's largely going to be what you're proposing, due to the sheer mass of code.

Greetings,

Andres Freund
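Point a) above - that a deformer generated for a known tuple descriptor can skip null-bitmap tests for NOT NULL columns - can be sketched in self-contained C. The layout and names below are simplified stand-ins, not the real HeapTuple format: a generic loop tests the bitmap for every attribute, while the "specialized" function shows the shape of unrolled code emitted for one particular descriptor.

```c
#include <assert.h>
#include <stdint.h>

#define MAXATTRS 8

typedef struct FakeTupleDesc
{
	int			natts;
	int			attnotnull[MAXATTRS];	/* known from the descriptor */
} FakeTupleDesc;

typedef struct FakeTuple
{
	uint8_t		nullbitmap;		/* bit i set => attribute i is NULL */
	int32_t		values[MAXATTRS];
} FakeTuple;

/* generic path: must consult the bitmap for every attribute */
static int
deform_generic(const FakeTupleDesc *desc, const FakeTuple *tup,
			   int32_t *values, int *isnull)
{
	int			bitmap_tests = 0;
	int			i;

	for (i = 0; i < desc->natts; i++)
	{
		bitmap_tests++;
		isnull[i] = (tup->nullbitmap >> i) & 1;
		values[i] = isnull[i] ? 0 : tup->values[i];
	}
	return bitmap_tests;
}

/*
 * "Specialized" path for a descriptor where attributes 0 and 1 are
 * NOT NULL and attribute 2 is nullable: the shape emitted code could
 * take after unrolling, with the bitmap touched only once.
 */
static int
deform_specialized(const FakeTuple *tup, int32_t *values, int *isnull)
{
	int			bitmap_tests = 0;

	/* attrs 0, 1: known NOT NULL, no bitmap access needed */
	isnull[0] = 0;
	values[0] = tup->values[0];
	isnull[1] = 0;
	values[1] = tup->values[1];

	/* attr 2: nullable, must test the bitmap */
	bitmap_tests++;
	isnull[2] = (tup->nullbitmap >> 2) & 1;
	values[2] = isnull[2] ? 0 : tup->values[2];

	return bitmap_tests;
}

/* both paths must agree, but the specialized one does fewer bitmap tests */
static int
specialization_matches(void)
{
	FakeTupleDesc desc = {3, {1, 1, 0}};
	FakeTuple	tup = {1 << 2, {10, 20, 30}};	/* attr 2 is NULL */
	int32_t		v1[MAXATTRS],
				v2[MAXATTRS];
	int			n1[MAXATTRS],
				n2[MAXATTRS];
	int			g = deform_generic(&desc, &tup, v1, n1);
	int			s = deform_specialized(&tup, v2, n2);

	return g == 3 && s == 1 &&
		v1[0] == v2[0] && v1[1] == v2[1] &&
		n1[2] == 1 && n2[2] == 1;
}
```

This is the kind of tupledesc-specific knowledge that, per the mail, an optimizer working on generic bitcode only recovers with expensive passes, while direct emission gets it for free.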
On Sat, Jan 27, 2018 at 1:20 PM, Andres Freund <andres@anarazel.de> wrote:
> b) The optimizations to take advantage of the constants and make the
> code faster with the constant tupledesc is fairly slow (you pretty
> much need at least an -O2 equivalent), whereas the handrolled tuple
> deforming is faster than the slot_getsomeattrs with just a single,
> pretty cheap, mem2reg pass. We're talking about ~1ms vs 70-100ms in
> a lot of cases. The optimizer often will not actually unroll the
> loop with many attributes despite that being beneficial.

This seems like the major point. We would have to customize the optimization passes a lot and/or choose carefully which ones we apply.

> I think in most cases using the approach you advocate makes sense, to
> avoid duplication, but tuple deforming is such a major bottleneck that I
> think it's clearly worth doing it manually. Being able to use llvm with
> just a always-inline and a mem2reg pass makes it so much more widely
> applicable than doing the full inlining and optimization work.

OK.

On another topic, I'm trying to find a way we could break this patch into smaller pieces. For instance, if we concentrate on tuple deforming, maybe it would be committable in time for v11? I see that you added some optimizations to the existing generic code. Do those offer a measurable improvement, and if so, can you commit those first to make the JIT stuff more readable?

Also, I'm sure you considered this, but I'd like to ask if we can try harder to make the JIT itself happen in an extension. It has some pretty huge benefits:

* The JIT code is likely to go through a lot of changes, and it would be nice if it wasn't tied to a yearly release cycle.
* Postgres itself wouldn't be dependent on a huge library like llvm, which just seems like a good idea from a packaging standpoint.
* It may give GCC or something else a chance to compete with its own JIT.
* It may make it easier to get something into v11.
It appears reasonable to make the slot deforming and expression evaluator parts an extension. execExpr.h only exports a couple new functions; heaptuple.c has a lot of changes but they seem like they could be separated (unless I'm missing something). The biggest problem is that the inlining would be much harder to separate out, because you are building the .bc files at build time. I really like the idea of inlining, but it doesn't necessarily need to be in the first commit. Regards, Jeff Davis
Hi,

On 2018-01-27 16:56:17 -0800, Jeff Davis wrote:
> On another topic, I'm trying to find a way we could break this patch
> into smaller pieces. For instance, if we concentrate on tuple
> deforming, maybe it would be committable in time for v11?

Yea, I'd planned and started to do so. I actually hope we can get more committed than just the tuple deforming code - for one, it currently integrates directly with the expression evaluation code, and my experiences with trying to do so outside of it have not gone well.

> I see that you added some optimizations to the existing generic code.
> Do those offer a measurable improvement, and if so, can you commit
> those first to make the JIT stuff more readable?

I think basically the later a patch currently is in the series, the less important it is. I've already committed a lot of preparatory patches (like that aggs now use the expression engine), and I plan to continue doing so.

> Also, I'm sure you considered this, but I'd like to ask if we can try
> harder make the JIT itself happen in an extension. It has some pretty
> huge benefits:

I'm very strongly against this. To the point that I'll not pursue JITing further if that becomes a requirement. I could be persuaded to put it into a shared library instead of the main binary itself, but I think developing it outside of core is entirely infeasible, because quite frequently both non-JITed code and JITed code need adjustments. That'd solve your concern about

> * Would mean postgres itself isn't dependent on a huge library like
> llvm, which just seems like a good idea from a packaging standpoint.

to some degree.

I think it's a fool's errand to try to keep in sync with core changes on the expression evaluation and struct definition side of things. There's planner integration, error handling integration and similar related things too, all of which require core changes. Therefore I don't think there's a reasonable chance of success doing this outside of core postgres.
> It appears reasonable to make the slot deforming and expression > evaluator parts an extension. execExpr.h only exports a couple new > functions; heaptuple.c has a lot of changes but they seem like they > could be separated (unless I'm missing something). The heaptuple.c stuff could largely be dropped, that was more an effort to level the playing field a bit to make the comparison fairer. I kinda wondered about putting the JIT code in a heaptuple_jit.c file instead of heaptuple.c. > The biggest problem is that the inlining would be much harder to > separate out, because you are building the .bc files at build time. I > really like the idea of inlining, but it doesn't necessarily need to > be in the first commit. Well, but doing this outside of core would pretty much prohibit doing so forever, no? Getting the inlining design right has influenced several other parts of the code. I think it's right that the inlining doesn't necessarily have to be part of the initial set of commits (and I plan to separate it out in the next revision), but I do think it has to be written in a reasonably ready form at the time of commit. Greetings, Andres Freund
On Sat, Jan 27, 2018 at 5:15 PM, Andres Freund <andres@anarazel.de> wrote: >> Also, I'm sure you considered this, but I'd like to ask if we can try >> harder to make the JIT itself happen in an extension. It has some pretty >> huge benefits: > > I'm very strongly against this. To the point that I'll not pursue JITing > further if that becomes a requirement. I would like to see this feature succeed and I'm not making any specific demands. > infeasible because quite frequently both non-JITed code and JITed code > need adjustments. That'd solve your concern about Can you explain further? > I think it's a fool's errand to try to keep in sync with core changes on > the expression evaluation and struct definition side of things. There's > planner integration, error handling integration and similar related > things too, all of which require core changes. Therefore I don't think > there's a reasonable chance of success of doing this outside of core > postgres. I wasn't suggesting the entire patch be done outside of core. Core will certainly need to know about JIT compilation, but I am not convinced that it needs to know about the details of LLVM. All the references to the LLVM library itself are contained in a few files, so you've already got it well organized. What's stopping us from putting that code into a "jit provider" extension that implements the proper interfaces? > Well, but doing this outside of core would pretty much prohibit doing so > forever, no? First of all, building .bc files at build time is much less invasive than linking to the LLVM library. Any version of clang will produce bitcode that can be read by any LLVM library or tool later (more or less). Second, we could change our minds later. Mark any extension APIs as experimental, and decide we want to move LLVM into postgres whenever it is needed. 
Third, there's lots of cool stuff we can do here: * put the source in the catalog * an extension could have its own catalog and build the source into bitcode and cache it there * the source for functions would flow to replicas, etc. * security-conscious environments might even choose to run some of the C code in a safe C interpreter rather than machine code So I really don't see this as permanently closing off our options. Regards, Jeff Davis
On Thursday, January 25, 2018 8:12:42 PM CET Andres Freund wrote: > Hi, > > On 2018-01-25 10:00:14 +0100, Pierre Ducroquet wrote: > > I don't know when this would be released, > > August-October range. > > > but the minimal supported LLVM > > version will have a strong influence on the availability of that feature. > > If today this JIT compiling was released with only LLVM 5/6 support, it > > would be unusable for most Debian users (llvm-5 is only available in > > sid). Even llvm 4 is not available in latest stable. > > I'm already trying to build with llvm-4 and I'm going to try further with > > llvm 3.9 (Debian Stretch doesn't have a more recent than this one, and I > > won't have something better to play with my data), I'll keep you > > informed. For sport, I may also try llvm 3.5 (for Debian Jessie). > > I don't think it's unreasonable to not support super old llvm > versions. This is a complex feature, and will take some time to > mature. Supporting too many LLVM versions at the outset will have some > cost. Versions before 3.8 would require supporting mcjit rather than > orc, and I don't think that'd be worth doing. I think 3.9 might be a > reasonable baseline... > > Greetings, > > Andres Freund Hi I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM documentation is really lacking when it comes to porting from version x to x+1. The only really missing part I found is that in 3.9, GlobalValueSummary has no flag showing if it's not EligibleToImport. I am not sure about the consequences. I'm still fixing some runtime issues so I will not bother you with the patch right now. BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc file when cleaning, and doesn't seem to install in the right folder. Regards Pierre
On Thursday, January 25, 2018 8:02:54 AM CET Andres Freund wrote: > Hi! > > On 2018-01-24 22:51:36 -0800, Jeff Davis wrote: > > Can we store the bitcode in pg_proc, simplifying deployment and > > allowing extensions to travel over replication? > > Yes, we could. You'd need to be a bit careful that all the machines have > similar-ish cpu generations or compile with defensive settings, but that > seems okay. Hi Doing this would 'bind' the database to the LLVM release used. LLVM can, as far as I know, generate bitcode only for the current version, and will only be able to read bitcode from previous versions. So you can't have, for instance, a master server with LLVM 5 and a standby server with LLVM 4. So maybe PostgreSQL would have to expose what LLVM version is currently used? Or a major PostgreSQL release could accept only one major LLVM release, as was suggested in another thread? Pierre
Hi, On 2018-01-27 22:06:59 -0800, Jeff Davis wrote: > > infeasible because quite frequently both non-JITed code and JITed code > > need adjustments. That'd solve your concern about > > Can you explain further? There's already a *lot* of integration points in the patchseries. Error handling needs to happen in parts of code we do not want to make extensible, the definition of expression steps has to exactly match, the core code needs to emit the right types for syncing, the core code needs to define the right FIELDNO accessors, there needs to be planner integrations. Many of those aren't doable with even remotely the same effort, both initial and continual, from non-core code.... I think those alone make it bad, but there'll be more. Short to medium term, expression evaluation needs to evolve further to make JITing cacheable: http://archives.postgresql.org/message-id/20180124203616.3gx4vm45hpoijpw3%40alap3.anarazel.de which again definitely has to happen in core and will require corresponding changes on the JIT side every step of the way. Then we'll need to introduce something like plancache (or something similar?) support for JITing to reuse JITed functions. Then there's also a significant difference in how large the adoption's going to be, and how all the core code that'd need to be added is supposed to be testable without the JIT emitting side in core. > > I think it's a fool's errand to try to keep in sync with core changes on > > the expression evaluation and struct definition side of things. There's > > planner integration, error handling integration and similar related > > things too, all of which require core changes. Therefore I don't think > > there's a reasonable chance of success of doing this outside of core > > postgres. > > I wasn't suggesting the entire patch be done outside of core. Core > will certainly need to know about JIT compilation, but I am not > convinced that it needs to know about the details of LLVM. 
All the > references to the LLVM library itself are contained in a few files, so > you've already got it well organized. What's stopping us from putting > that code into a "jit provider" extension that implements the proper > interfaces? The above hopefully answers that? What we could do, imo somewhat realistically, is to put most of the provider into a dynamically loaded shared library that lives in core (similar to how we build the pgoutput output plugin shared library as part of core). But that still would end up hard coding things like LLVM specific error handling etc, which we currently do *NOT* want to be extensible. > > Well, but doing this outside of core would pretty much prohibit doing so > > forever, no? > > First of all, building .bc files at build time is much less invasive > than linking to the LLVM library. Could you expand on that, I don't understand why that'd be the case? > Any version of clang will produce bitcode that can be read by any LLVM > library or tool later (more or less). Well, forward portable, not backward portable. > Second, we could change our minds later. Mark any extension APIs as > experimental, and decide we want to move LLVM into postgres whenever > it is needed. > > Third, there's lots of cool stuff we can do here: > * put the source in the catalog > * an extension could have its own catalog and build the source into > bitcode and cache it there > * the source for functions would flow to replicas, etc. > * security-conscious environments might even choose to run some of > the C code in a safe C interpreter rather than machine code I agree, but what does that have to do with the llvmjit stuff being an extension or not? Greetings, Andres Freund
Hi, On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote: > I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM documentation is > really lacking when it comes to porting from version x to x+1. > The only really missing part I found is that in 3.9, GlobalValueSummary has no > flag showing if it's not EligibleToImport. I am not sure about the > consequences. I think that'd not be too bad, it'd just lead to some small increase in overhead as more modules would be loaded. > BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc > file when cleaning, and doesn't seem to install in the right folder. Hm, both seem to be right here? Note that the llvmjit_types.bc file should *not* go into the bitcode/ directory, as it's about syncing types, not inlining. I've added a comment to that effect. Greetings, Andres Freund
Hi, On 2018-01-23 23:20:38 -0800, Andres Freund wrote: > == Code == > > As the patchset is large (500kb) and I'm still quickly evolving it, I do > not yet want to attach it. The git tree is at > https://git.postgresql.org/git/users/andresfreund/postgres.git > in the jit branch > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit I've just pushed an updated and rebased version of the tree: - Split the large "jit infrastructure" commits into a number of smaller commits - Split the C++ file - Dropped some of the performance stuff done to heaptuple.c - that was mostly to make performance comparisons a bit more interesting, but doesn't seem important enough to deal with. - Added a commit renaming datetime.h symbols so they don't conflict with LLVM variables anymore, removing ugly #undef PM/#define PM dance around includes. Will post separately. - Reduced the number of pointer constants in the generated LLVM IR, by doing more getelementptr accesses (stem from before the time types were automatically synced) - Increased number of comments a bit There's a jit-before-rebase-2018-01-29 tag, for the state of the tree before the rebase. Regards, Andres
On Monday, January 29, 2018 10:46:13 AM CET Andres Freund wrote: > Hi, > > On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote: > > I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM > > documentation is really lacking when it comes to porting from version x > > to x+1. > > The only really missing part I found is that in 3.9, GlobalValueSummary > > has no flag showing if it's not EligibleToImport. I am not sure about the > > consequences. > > I think that'd not be too bad, it'd just lead to some small increase in > overhead as more modules would be loaded. > > > BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc > > file when cleaning, and doesn't seem to install in the right folder. > > Hm, both seems to be right here? Note that the llvmjit_types.bc file > should *not* go into the bitcode/ directory, as it's about syncing types > not inlining. I've added a comment to that effect. The file was installed in lib/ while the code expected it in lib/postgresql. So there was something wrong here. And deleting the file when cleaning is needed in case configure is later run against another LLVM version. The file must be generated with a clang release that is not more recent than the llvm version linked to postgresql. Otherwise, the bitcode generated is not accepted by llvm. Regards Pierre
On 26.01.2018 22:38, Andres Freund wrote:
> And without it perf is not able to unwind stack trace for generated
>> code.
> You can work around that by using --call-graph lbr with a sufficiently
> new perf. That'll not know function names et al, but at least the parent
> will be associated correctly.

With --call-graph lbr the result is ... slightly different (see attached profile) but still there is an "unknown" bar.

>> But you are compiling code using LLVMOrcAddEagerlyCompiledIR
>> and I find no way to pass the no-omit-frame-pointer option here.
> It shouldn't be too hard to open code support for it, encapsulated in a
> function:
> // Set function attribute "no-frame-pointer-elim" based on
> // NoFramePointerElim.
> for (auto &F : *Mod) {
>     auto Attrs = F.getAttributes();
>     StringRef Value(options.NoFramePointerElim ? "true" : "false");
>     Attrs = Attrs.addAttribute(F.getContext(), AttributeList::FunctionIndex,
>                                "no-frame-pointer-elim", Value);
>     F.setAttributes(Attrs);
> }
> that's all that option did for mcjit.

I have implemented the following function:

void llvm_no_frame_pointer_elimination(LLVMModuleRef mod)
{
    llvm::Module *module = llvm::unwrap(mod);

    for (auto &F : *module)
    {
        auto Attrs = F.getAttributes();
        Attrs = Attrs.addAttribute(F.getContext(),
                                   llvm::AttributeList::FunctionIndex,
                                   "no-frame-pointer-elim", "true");
        F.setAttributes(Attrs);
    }
}

and call it before LLVMOrcAddEagerlyCompiledIR in llvm_compile_module:

llvm_no_frame_pointer_elimination(context->module);
smod = LLVMOrcMakeSharedModule(context->module);
if (LLVMOrcAddEagerlyCompiledIR(compile_orc, &orc_handle, smod,
                                llvm_resolve_symbol, NULL))
{
    elog(ERROR, "failed to jit module");
}

... but it has no effect: the produced profile is the same (with --call-graph dwarf). Maybe you can point out my mistake... 
Actually I am trying to find an answer to the question why your version of JIT provides a ~2x speedup on Q1, while the ISPRAS version (https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf) speeds up Q1 by 5.5x. Maybe it is because they are using the double type to calculate aggregates while, as far as I understand, you are using standard Postgres aggregate functions? Or maybe because the ISPRAS version is not checking for NULL values... -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Hi, On 2018-01-29 15:45:56 +0300, Konstantin Knizhnik wrote: > On 26.01.2018 22:38, Andres Freund wrote: > > And without it perf is not able to unwind stack trace for generated > > > code. > > You can work around that by using --call-graph lbr with a sufficiently > > new perf. That'll not know function names et al, but at least the parent > > will be associated correctly. > > With --call-graph lbr the result is ... slightly different (see attached > profile) but still there is an "unknown" bar. Right. All that allows is to attribute the cost below the parent in the perf report --children case. For it to be attributed to proper symbols you need my llvm patch to support perf. > Actually I am trying to find an answer to the question why your version of JIT > provides a ~2x speedup on Q1, while the ISPRAS version (https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf) > speeds up Q1 by 5.5x. > Maybe it is because they are using the double type to calculate aggregates > while, as far as I understand, you are using standard Postgres aggregate > functions? > Or maybe because the ISPRAS version is not checking for NULL values... All of those together, yes. And in addition, I'm aiming to work incrementally towards core inclusion, rather than getting the best results. There's a *lot* that can be done to improve the generated code - after e.g. hacking together an improvement to the argument passing (by allocating isnull / nargs / arg[] / argnull[] on-stack, separately from FunctionCallInfoData), I get another 1.8x. Eliminating redundant float overflow checks gives another 1.2x. And so on. Greetings, Andres Freund
On Mon, Jan 29, 2018 at 1:36 AM, Andres Freund <andres@anarazel.de> wrote: > There's already a *lot* of integration points in the patchseries. Error > handling needs to happen in parts of code we do not want to make > extensible, the definition of expression steps has to exactly match, the > core code needs to emit the right types for syncing, the core code needs > to define the right FIELDNO accessors, there needs to be planner > integrations. Many of those aren't doable with even remotely the same > effort, both initial and continual, from non-core code.... OK. How about this: are you open to changes that move us in the direction of extensibility later? (By this I do *not* mean imposing a bunch of requirements on you... either small changes to your patches or something part of another commit.) Or are you determined that this always should be a part of core? I don't want to stand in your way, but I am also hesitant to dive head first into LLVM and not look back. Postgres has always been lean, fast building, and with few dependencies. Who knows what LLVM will do in the future and how that will affect postgres? Especially when, on day one, we already know that it causes a few annoyances? In other words, are you "strongly against [extensibility being a requirement for the first commit]" or "strongly against [extensible JIT]"? >> > Well, but doing this outside of core would pretty much prohibit doing so >> > forever, no? >> >> First of all, building .bc files at build time is much less invasive >> than linking to the LLVM library. > > Could you expand on that, I don't understand why that'd be the case? Building the .bc files at build time depends on LLVM, but is not very version-dependent and has no impact on the resulting binary. That's less invasive than a dependency on a library with an unstable API that doesn't entirely work with our error reporting facility. 
>> Third, there's lots of cool stuff we can do here: >> * put the source in the catalog >> * an extension could have its own catalog and build the source into >> bitcode and cache it there >> * the source for functions would flow to replicas, etc. >> * security-conscious environments might even choose to run some of >> the C code in a safe C interpreter rather than machine code > > I agree, but what does that have to do with the llvmjit stuff being an > extension or not? If the source for functions is in the catalog, we could build the bitcode at runtime and still do the inlining. We wouldn't need to do anything at build time. (Again, this would be "cool stuff for the future", I am not asking you for it now.) Regards, Jeff Davis
Hi, On 2018-01-29 10:28:18 -0800, Jeff Davis wrote: > OK. How about this: are you open to changes that move us in the > direction of extensibility later? (By this I do *not* mean imposing a > bunch of requirements on you... either small changes to your patches > or something part of another commit.) I'm good with that. > Or are you determined that this always should be a part of core? I do think JIT compilation should be in core, yes. And after quite some looking around that currently means either using LLVM or building our own from scratch, and the latter doesn't seem attractive. But that doesn't mean there can't *also* be extensibility. If somebody wants to experiment with a more advanced version of JIT compilation, develop a gcc-backed version (which can't be in core due to licensing), ... - I'm happy to provide hooks that only require a reasonable effort and don't affect the overall stability of the system (i.e. no callback from PostgresMain()'s sigsetjmp() block). > I don't want to stand in your way, but I am also hesitant to dive head > first into LLVM and not look back. Postgres has always been lean, fast > building, and with few dependencies. It's an optional dependency, and it doesn't increase build time that much... If we were to move the llvm interfacing code to a .so, there'd not even be a packaging issue, you can just package that .so separately and get errors if somebody tries to enable LLVM without that .so being installed. > In other words, are you "strongly against [extensibility being a > requirement for the first commit]" or "strongly against [extensible > JIT]"? I'm strongly against there not being an in-core JIT. I'm not at all against adding APIs that allow doing different JIT implementations out of core. > If the source for functions is in the catalog, we could build the > bitcode at runtime and still do the inlining. We wouldn't need to do > anything at build time. 
(Again, this would be "cool stuff for the > future", I am not asking you for it now.) Well, the source would require an actual compiler around. And the inlining *just* for the function code itself isn't actually that interesting, you e.g. want to also be able to Greetings, Andres Freund
On 01/24/2018 08:20 AM, Andres Freund wrote: > Hi, > > I've spent the last weeks working on my LLVM compilation patchset. In > the course of that I *heavily* revised it. While still a good bit away > from committable, it's IMO definitely not a prototype anymore. > > There's too many small changes, so I'm only going to list the major > things. A good bit of that is new. The actual LLVM IR emissions itself > hasn't changed that drastically. Since I've not described them in > detail before I'll describe from scratch in a few cases, even if things > haven't fully changed. > Hi, I wanted to look at this, but my attempts to build the jit branch fail with some compile-time warnings (uninitialized variables) and errors (unknown types, incorrect number of arguments). See the file attached. I wonder if I'm doing something wrong, or if there's something wrong with my environment. I do have this: $ clang -v clang version 5.0.0 (trunk 299717) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/bin Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Selected multilib: .;@m64 regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote: > Hi, I wanted to look at this, but my attempts to build the jit branch > fail with some compile-time warnings (uninitialized variables) and > errors (unknown types, incorrect number of arguments). See the file > attached. Which git hash are you building? What llvm version is this building against? If you didn't specify LLVM_CONFIG=... what does llvm-config --version return? Greetings, Andres Freund
On 01/29/2018 10:57 PM, Andres Freund wrote: > Hi, > > On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote: >> Hi, I wanted to look at this, but my attempts to build the jit branch >> fail with some compile-time warnings (uninitialized variables) and >> errors (unknown types, incorrect number of arguments). See the file >> attached. > > Which git hash are you building? What llvm version is this building > against? If you didn't specify LLVM_CONFIG=... what does llvm-config > --version return? > I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current HEAD in the jit branch, AFAICS). I'm building like this: $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \ --with-llvm --prefix=/home/postgres/pg-llvm $ make -s -j4 install and llvm-config --version says this: $ llvm-config --version 5.0.0svn regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
> On 01/29/2018 10:57 PM, Andres Freund wrote:
> > On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
> >> Hi, I wanted to look at this, but my attempts to build the jit branch
> >> fail with some compile-time warnings (uninitialized variables) and
> >> errors (unknown types, incorrect number of arguments). See the file
> >> attached.
> >
> > Which git hash are you building? What llvm version is this building
> > against? If you didn't specify LLVM_CONFIG=... what does llvm-config
> > --version return?
>
> I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
> HEAD in the jit branch, AFAICS).

The warnings come from an incomplete patch I probably shouldn't have pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant (although will cause a handful of "ERROR: hm" regression failures), but I'll definitely pop that commit on the next rebase. If you want you can just reset --hard to its parent.

Those errors are weird however:

> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
>                                              ^
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ but argument is of type ‘LLVMOrcTargetAddress* {aka long unsigned int *}’
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>                       ^~~~~~~~~~~~~~~~~~~~~~~
> llvmjit.c:239:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
>       ^~~~~~~~~~~~~~~~~~~~~~~
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>                       ^~~~~~~~~~~~~~~~~~~~~~~
> llvmjit.c:243:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, &addr, mangled))
>                                              ^

> I'm building like this:
>
> $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
>     --with-llvm --prefix=/home/postgres/pg-llvm
>
> $ make -s -j4 install
>
> and llvm-config --version says this:
>
> $ llvm-config --version
> 5.0.0svn

Is that llvm-config the one in /usr/local/include/ referenced by the error message above? Or is it possible that llvm-config is from a different version than the one the compiler picks the headers up from? Could you go to src/backend/lib, rm llvmjit.o, and show the full output of make llvmjit.o? I wonder whether the issue is that my configure patch does

    -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;

rather than

    -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;

and that it thus picks up the wrong header first?

Greetings, Andres Freund
On 01/29/2018 11:17 PM, Andres Freund wrote: > On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote: >> On 01/29/2018 10:57 PM, Andres Freund wrote: >>> Hi, >>> >>> On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote: >>>> Hi, I wanted to look at this, but my attempts to build the jit branch >>>> fail with some compile-time warnings (uninitialized variables) and >>>> errors (unknown types, incorrect number of arguments). See the file >>>> attached. >>> >>> Which git hash are you building? What llvm version is this building >>> against? If you didn't specify LLVM_CONFIG=... what does llvm-config >>> --version return? >>> >> >> I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current >> HEAD in the jit branch, AFAICS). > > The warnings come from an incomplete patch I probably shouldn't have > pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant > (although will cause a handful of "ERROR: hm" regression failures), > but I'll definitely pop that commit on the next rebase. If you want you > can just reset --hard to its parent. > OK > > Those errors are weird however: > >> ... ^ > >> I'm building like this: >> >> $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \ >> --with-llvm --prefix=/home/postgres/pg-llvm >> >> $ make -s -j4 install >> >> and llvm-config --version says this: >> >> $ llvm-config --version >> 5.0.0svn > > Is that llvm-config the one in /usr/local/include/ referenced by the > error message above? I don't see it referenced anywhere, but it comes from here: $ which llvm-config /usr/local/bin/llvm-config > Or is it possible that llvm-config is from a different version than > the one the compiler picks the headers up from? > I don't think so. I don't have any other llvm versions installed, AFAICS. > could you go to src/backend/lib, rm llvmjit.o, and show the full output > of make llvmjit.o? > Attached. 
> I wonder whether the issue is that my configure patch does
>     -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> rather than
>     -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
> and that it thus picks up the wrong header first?

I've tried this configure tweak:

 if test -n "$LLVM_CONFIG"; then
   for pgac_option in `$LLVM_CONFIG --cflags`; do
     case $pgac_option in
-      -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+      -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
     esac
   done

and that indeed changes the failure to this:

Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not a member of ‘llvm’
  llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
  ^~~~
llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a member of ‘llvm’
  llvm::remove_bad_alloc_error_handler();
  ^~~~
llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a member of ‘llvm’
  llvm::remove_bad_alloc_error_handler();
  ^~~~
make[3]: *** [<builtin>: llvmjit_error.o] Error 1
make[2]: *** [common.mk:45: lib-recursive] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:38: all-backend-recurse] Error 2
make: *** [GNUmakefile:11: all-src-recurse] Error 2

I'm not sure what that means, though ... maybe my system really is broken in some strange way.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, On 2018-01-29 23:49:14 +0100, Tomas Vondra wrote: > On 01/29/2018 11:17 PM, Andres Freund wrote: > > On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote: > >> $ llvm-config --version > >> 5.0.0svn > > > > Is that llvm-config the one in /usr/local/include/ referenced by the > > error message above? > > I don't see it referenced anywhere, but it comes from here: > > $ which llvm-config > /usr/local/bin/llvm-config > > > Or is it possible that llvm-config is from a different version than > > the one the compiler picks the headers up from? > > > > I don't think so. I don't have any other llvm versions installed, AFAICS. Hm. > > could you go to src/backend/lib, rm llvmjit.o, and show the full output > > of make llvmjit.o? > > > > Attached. > > > I wonder whether the issue is that my configure patch does > > -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";; > > rather than > > -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";; > > and that it thus picks up the wrong header first? > > > > I've tried this configure tweak: > > if test -n "$LLVM_CONFIG"; then > for pgac_option in `$LLVM_CONFIG --cflags`; do > case $pgac_option in > - -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";; > + -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";; > esac > done > > and that indeed changes the failure to this: Err, huh? I don't understand how that can change anything if you actually have only one version of LLVM installed. Perhaps the effect was just an ordering-related artifact of [parallel] make? I.e. just a question of what failed first? 
> Writing postgres.bki
> Writing schemapg.h
> Writing postgres.description
> Writing postgres.shdescription
> llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
> llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
> a member of ‘llvm’
>    llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
>    ^~~~
> llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
> llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>    llvm::remove_bad_alloc_error_handler();
>    ^~~~
> llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
> llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>    llvm::remove_bad_alloc_error_handler();
>    ^~~~

It's a bit hard to interpret this without the actual compiler invocation. But I've just checked both manually by inspecting 5.0 source and by compiling against 5.0 that that function definition definitely exists:

andres@alap4:~/src/llvm-5$ git branch
  master
* release_50
andres@alap4:~/src/llvm-5$ ack remove_bad_alloc_error_handler
lib/Support/ErrorHandling.cpp
139:void llvm::remove_bad_alloc_error_handler() {

include/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();

So does my system llvm 5:

$ ack remove_bad_alloc_error_handler /usr/include/llvm-5.0/
/usr/include/llvm-5.0/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();

But not in 4.0:

$ ack remove_bad_alloc_error_handler /usr/include/llvm-4.0/

> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -fno-omit-frame-pointer -O2 -I../../../src/include -D_GNU_SOURCE -I/usr/local/include -DNDEBUG -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -c -o llvmjit.o llvmjit.c
> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from incompatible pointer type [-Wincompatible-pointer-types]
>    if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled))
>                                               ^
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ but argument is of type ‘LLVMOrcTargetAddress* {aka long unsigned int *}’
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>                       ^~~~~~~~~~~~~~~~~~~~~~~

To me this looks like those headers are from llvm 4, rather than 5:

$ grep -A2 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-4/include/llvm-c/OrcBindings.h
/**
 * Get symbol address from JIT instance.
 */
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
                                             const char *SymbolName);

$ grep -A3 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-5/include/llvm-c/OrcBindings.h
/**
 * Get symbol address from JIT instance.
 */
LLVMOrcErrorCode LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
                                         LLVMOrcTargetAddress *RetAddr,
                                         const char *SymbolName);

So it does appear that your llvm-config and the actually installed llvm don't quite agree. How did you install llvm?

Greetings,

Andres Freund
On 01/29/2018 11:49 PM, Tomas Vondra wrote: > > ... > > and that indeed changes the failure to this: > > Writing postgres.bki > Writing schemapg.h > Writing postgres.description > Writing postgres.shdescription > llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’: > llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not > a member of ‘llvm’ > llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler); > ^~~~ > llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’: > llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a > member of ‘llvm’ > llvm::remove_bad_alloc_error_handler(); > ^~~~ > llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’: > llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a > member of ‘llvm’ > llvm::remove_bad_alloc_error_handler(); > ^~~~ > make[3]: *** [<builtin>: llvmjit_error.o] Error 1 > make[2]: *** [common.mk:45: lib-recursive] Error 2 > make[2]: *** Waiting for unfinished jobs.... > make[1]: *** [Makefile:38: all-backend-recurse] Error 2 > make: *** [GNUmakefile:11: all-src-recurse] Error 2 > > > I'm not sure what that means, though ... maybe I really have system > broken in some strange way. > FWIW I've installed llvm 5.0.1 from distribution package, and now everything builds fine (I don't even need the configure tweak). I think I had to build the other binaries because there was no 5.x llvm back then, but it's too far back so I don't remember. Anyway, seems I'm fine for now. Sorry for the noise. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
> FWIW I've installed llvm 5.0.1 from distribution package, and now
> everything builds fine (I don't even need the configure tweak).
>
> I think I had to build the other binaries because there was no 5.x llvm
> back then, but it's too far back so I don't remember.
>
> Anyway, seems I'm fine for now.

Phew, I'm relieved. I'd guess you built a 5.0 version while 5.0 was still in development, so not all 5.0 functionality was available. Hence the inconsistent-looking result. While I think we can support 4.0 without too much problem, there's obviously no point in trying to support old between-releases versions...

> Sorry for the noise.

No worries.

- Andres
On 29 January 2018 at 22:53, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-01-23 23:20:38 -0800, Andres Freund wrote:
> == Code ==
>
> As the patchset is large (500kb) and I'm still quickly evolving it, I do
> not yet want to attach it. The git tree is at
https://git.postgresql.org/git/users/andresfreund/postgres.git
> in the jit branch
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
I've just pushed an updated and rebased version of the tree:
- Split the large "jit infrastructure" commits into a number of smaller
commits
- Split the C++ file
- Dropped some of the performance stuff done to heaptuple.c - that was
mostly to make performance comparisons a bit more interesting, but
doesn't seem important enough to deal with.
- Added a commit renaming datetime.h symbols so they don't conflict with
LLVM variables anymore, removing ugly #undef PM/#define PM dance
around includes. Will post separately.
- Reduced the number of pointer constants in the generated LLVM IR, by
doing more getelementptr accesses (these stem from before the time
types were automatically synced)
- Increased number of comments a bit
There's a jit-before-rebase-2018-01-29 tag, for the state of the tree
before the rebase.
If you submit the C++ support separately I'd like to sign up as a reviewer and get that in. It's non-intrusive and just makes our existing C++ compilation support actually work properly. Your patch is a more complete version of the C++ support I hacked up during linux.conf.au - I should've thought to look in your tree.
The only part I had to add that I don't see in yours is a workaround for mismatched throw() annotations on our redefinition of inet_net_ntop:
src/include/port.h:
@@ -421,7 +425,7 @@ extern int pg_codepage_to_encoding(UINT cp);
/* port/inet_net_ntop.c */
extern char *inet_net_ntop(int af, const void *src, int bits,
- char *dst, size_t size);
+ char *dst, size_t size) __THROW;
src/include/c.h:
@@ -1131,6 +1131,16 @@ extern int fdatasync(int fildes);
#define NON_EXEC_STATIC static
#endif
+/*
+ * glibc uses __THROW when compiling with the C++ compiler, but port.h redeclares
+ * inet_net_ntop. If we don't annotate it the same way as the prototype in
+ * <arpa/inet.h> we'll upset g++, so we must use __THROW from <sys/cdefs.h>. If
+ * we're not on glibc, we need to define it away.
+ */
+#ifndef __GNU_LIBRARY__
+#define __THROW
+#endif
+
/* /port compatibility functions */
#include "port.h"
This might be better solved by renaming it to pg_inet_net_ntop so we don't conflict with a standard name.
Hi,

On Mon, Jan 29, 2018 at 10:40 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
>> OK. How about this: are you open to changes that move us in the
>> direction of extensibility later? (By this I do *not* mean imposing a
>> bunch of requirements on you... either small changes to your patches
>> or something part of another commit.)
>
> I'm good with that.
>
>> Or are you determined that this always should be a part of core?
>
> I'm strongly against there not being an in-core JIT. I'm not at all
> against adding APIs that allow to do different JIT implementations out
> of core.

I can live with that. I recommend that you discuss with packagers and a few others, to reduce the chance of disagreement later.

> Well, the source would require an actual compiler around. And the
> inlining *just* for the function code itself isn't actually that
> interesting, you e.g. want to also be able to

I think you hit enter too quickly... what's the rest of that sentence?

Regards,
Jeff Davis
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote: > It's an optional dependency, and it doesn't increase build time that > much... If we were to move the llvm interfacing code to a .so, there'd > not even be a packaging issue, you can just package that .so separately > and get errors if somebody tries to enable LLVM without that .so being > installed. I suspect that would be really valuable. If 'yum install postgresql-server' (or your favorite equivalent) sucks down all of LLVM, some people are going to complain, either because they are trying to build little tiny machine images or because they are subject to policies which preclude the presence of a compiler on a production server. If you can do 'yum install postgresql-server' without additional dependencies and 'yum install postgresql-server-jit' to make it go faster, that issue is solved. Unfortunately, that has the pretty significant downside that a lot of people who actually want the postgresql-server-jit package will not realize that they need to install it, which sucks. But I think it might still be the better way to go. Anyway, it's for individual packagers to cope with that problem; as far as the patch goes, +1 for structuring things in a way which gives packagers the option to divide it up that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jan 24, 2018 at 2:20 AM, Andres Freund <andres@anarazel.de> wrote:
> == Error handling ==
>
> There's two aspects to error handling.
>
> Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments)
> need to be cleaned up both after a successful query execution and after
> an error. I've settled on a fairly boring resowner based mechanism. On
> errors all expressions owned by a resowner are released, upon success
> expressions are reassigned to the parent / released on commit (unless
> executor shutdown has cleaned them up of course).

Cool.

> A second, less pretty and newly developed, aspect of error handling is
> OOM handling inside LLVM itself. The above resowner based mechanism
> takes care of cleaning up emitted code upon ERROR, but there's also the
> chance that LLVM itself runs out of memory. LLVM by default does *not*
> use any C++ exceptions. Its allocations are primarily funneled through
> the standard "new" handlers, and some direct use of malloc() and
> mmap(). For the former a 'new handler' exists:
> http://en.cppreference.com/w/cpp/memory/new/set_new_handler
> For the latter LLVM provides callbacks that get called upon failure
> (unfortunately mmap() failures are treated as fatal rather than OOM
> errors).
>
> What I've chosen to do, and I'd be interested to get some input about
> that, is to have two functions that LLVM using code must use:
>
> extern void llvm_enter_fatal_on_oom(void);
> extern void llvm_leave_fatal_on_oom(void);
>
> Before interacting with LLVM code (i.e. emitting IR, or using the above
> functions) llvm_enter_fatal_on_oom() needs to be called.
>
> When a libstdc++ new or LLVM error occurs, the handlers set up by the
> above functions trigger a FATAL error. We have to use FATAL rather than
> ERROR, as we *cannot* reliably throw ERROR inside a foreign library
> without risking corrupting its internal state.
That bites, although it's probably tolerable if we expect such errors only in exceptional situations such as a needed shared library failing to load or something. Killing the session when we run out of memory during JIT compilation is not very nice at all. Does the LLVM library have any useful hooks that we can leverage here, like a hypothetical function LLVMProvokeFailureAsSoonAsConvenient()? The equivalent function for PostgreSQL would do { InterruptPending = true; QueryCancelPending = true; }. And maybe LLVMSetProgressCallback() that would get called periodically and let us set a handler that could check for interrupts on the PostgreSQL side and then call LLVMProvokeFailureAsSoonAsConvenient() as applicable? This problem can't be completely unique to PostgreSQL; anybody who is using LLVM for JIT from a long-running process needs a solution, so you might think that the library would provide one.

> This facility allows us to get the bitcode for all operators
> (e.g. int8eq, float8pl etc), without maintaining two copies. The way
> I've currently set it up is that, if --with-llvm is passed to configure,
> all backend files are also compiled to bitcode files. These bitcode
> files get installed into the server's
> $pkglibdir/bitcode/postgres/
> under their original subfolder, e.g.
> ~/build/postgres/dev-assert/install/lib/bitcode/postgres/utils/adt/float.bc
> Using existing LLVM functionality (for parallel LTO compilation),
> additionally an index over these is stored to
> $pkglibdir/bitcode/postgres.index.bc

That sounds pretty sweet.

> When deciding to JIT for the first time, $pkglibdir/bitcode/ is scanned
> for all .index.bc files and a *combined* index over all these files is
> built in memory. The reason for doing so is that that allows "easy"
> access to inlining for extensions - they can install code into
> $pkglibdir/bitcode/[extension]/
> accompanied by
> $pkglibdir/bitcode/[extension].index.bc
> just alongside the actual library.
But that means that if an extension is installed after the initial scan has been done, concurrent sessions won't notice the new files. Maybe that's OK, but I wonder if we can do better.

> Do people feel these should be hidden behind #ifdefs, always present but
> prevent from being set to a meaningful, or unrestricted?

We shouldn't allow non-superusers to set any GUC that dumps files to the data directory or provides an easy way to crash the server, run the machine out of memory, or similar. GUCs that just print stuff, or make queries faster/slower, can be set by anyone, I think.

I favor having the debugging stuff available in the default build. This feature has a chance of containing bugs, and those bugs will be hard to troubleshoot if the first step in getting information on what went wrong is "recompile".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi, On 2018-01-30 13:57:50 -0500, Robert Haas wrote: > > When a libstdc++ new or LLVM error occurs, the handlers set up by the > > above functions trigger a FATAL error. We have to use FATAL rather than > > ERROR, as we *cannot* reliably throw ERROR inside a foreign library > > without risking corrupting its internal state. > > That bites, although it's probably tolerable if we expect such errors > only in exceptional situations such as a needed shared library failing > to load or something. Killing the session when we run out of memory > during JIT compilation is not very nice at all. Does the LLVM library > have any useful hooks that we can leverage here, like a hypothetical > function LLVMProvokeFailureAsSoonAsConvenient()? I don't see how that'd help if a memory allocation fails? We can't just continue in that case? You could arguably have reserve memory pool that you release in that case and then try to continue, but that seems awfully fragile. > The equivalent function for PostgreSQL would do { InterruptPending = > true; QueryCancelPending = true; }. And maybe > LLVMSetProgressCallback() that would get called periodically and let > us set a handler that could check for interrupts on the PostgreSQL > side and then call LLVMProvokeFailureAsSoonAsConvenient() as > applicable? This problem can't be completely unique to PostgreSQL; > anybody who is using LLVM for JIT from a long-running process needs a > solution, so you might think that the library would provide one. The ones I looked at just error out. Needing to handle OOM in soft fail manner isn't actually that common a demand, I guess :/. > > for all .index.bc files and a *combined* index over all these files is > > built in memory. The reason for doing so is that that allows "easy" > > access to inlining access for extensions - they can install code into > > $pkglibdir/bitcode/[extension]/ > > accompanied by > > $pkglibdir/bitcode/[extension].index.bc > > just alongside the actual library. 
> > But that means that if an extension is installed after the initial > scan has been done, concurrent sessions won't notice the new files. > Maybe that's OK, but I wonder if we can do better. I mean we could periodically rescan, rescan after sighup, or such? But that seems like something for later to me. It's not going to be super common to install new extensions while a lot of sessions are running. And things will work in that case, the functions just won't get inlined... > > Do people feel these should be hidden behind #ifdefs, always present but > > prevent from being set to a meaningful, or unrestricted? > > We shouldn't allow non-superusers to set any GUC that dumps files to > the data directory or provides an easy to way to crash the server, run > the machine out of memory, or similar. I don't buy the OOM one - there's so so so many of those already... The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying if profiling can only be done by a superuser? Hm :/ Greetings, Andres Freund
On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund <andres@anarazel.de> wrote: >> That bites, although it's probably tolerable if we expect such errors >> only in exceptional situations such as a needed shared library failing >> to load or something. Killing the session when we run out of memory >> during JIT compilation is not very nice at all. Does the LLVM library >> have any useful hooks that we can leverage here, like a hypothetical >> function LLVMProvokeFailureAsSoonAsConvenient()? > > I don't see how that'd help if a memory allocation fails? We can't just > continue in that case? You could arguably have reserve memory pool that > you release in that case and then try to continue, but that seems > awfully fragile. Well, I'm just asking what the library supports. For example: https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html If you had something like that, you could arrange to safely interrupt the library the next time the progress-function was called. > The ones I looked at just error out. Needing to handle OOM in soft fail > manner isn't actually that common a demand, I guess :/. Bummer. > I mean we could periodically rescan, rescan after sighup, or such? But > that seems like something for later to me. It's not going to be super > common to install new extensions while a lot of sessions are > running. And things will work in that case, the functions just won't get inlined... Fair enough. >> > Do people feel these should be hidden behind #ifdefs, always present but >> > prevent from being set to a meaningful, or unrestricted? >> >> We shouldn't allow non-superusers to set any GUC that dumps files to >> the data directory or provides an easy to way to crash the server, run >> the machine out of memory, or similar. > > I don't buy the OOM one - there's so so so many of those already... > > The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying > if profiling can only be done by a superuser? Hm :/ The server's ~/.debug/jit? 
Or are you somehow getting the output to the client? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Unfortunately, that has the pretty significant downside that a lot of > people who actually want the postgresql-server-jit package will not > realize that they need to install it, which sucks. But I think it > might still be the better way to go. Anyway, it's for individual > packagers to cope with that problem; as far as the patch goes, +1 for > structuring things in a way which gives packagers the option to divide > it up that way. I don't know about rpm/yum/dnf, but in dpkg/apt one could declare that postgresql-server recommends postgresql-server-jit, which installs the package by default, but can be overridden by config or on the command line. - ilmari -- "The surreality of the universe tends towards a maximum" -- Skud's Law "Never formulate a law or axiom that you're not prepared to live with the consequences of." -- Skud's Meta-Law
Hi,

On 2018-01-30 15:06:02 -0500, Robert Haas wrote:
> On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
> >> That bites, although it's probably tolerable if we expect such errors
> >> only in exceptional situations such as a needed shared library failing
> >> to load or something. Killing the session when we run out of memory
> >> during JIT compilation is not very nice at all. Does the LLVM library
> >> have any useful hooks that we can leverage here, like a hypothetical
> >> function LLVMProvokeFailureAsSoonAsConvenient()?
> >
> > I don't see how that'd help if a memory allocation fails? We can't just
> > continue in that case? You could arguably have a reserve memory pool that
> > you release in that case and then try to continue, but that seems
> > awfully fragile.
>
> Well, I'm just asking what the library supports. For example:
>
> https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html

I get that type of function; what I don't understand is how that applies to OOM:

> If you had something like that, you could arrange to safely interrupt
> the library the next time the progress-function was called.

Yea, but how are you going to *get* to the next time, given that an allocator just couldn't allocate memory? You can't just return a NULL pointer, because the caller will use that memory?

> > The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
> > if profiling can only be done by a superuser? Hm :/
>
> The server's ~/.debug/jit? Or are you somehow getting the output to the client?

Yes, the server's - I'm not sure I understand the "client" bit? It's about perf profiling, which isn't available to the client either?

Greetings,

Andres Freund
On 01/30/2018 12:24 AM, Andres Freund wrote: > Hi, > > On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote: >> FWIW I've installed llvm 5.0.1 from distribution package, and now >> everything builds fine (I don't even need the configure tweak). >> >> I think I had to build the other binaries because there was no 5.x llvm >> back then, but it's too far back so I don't remember. >> >> Anyway, seems I'm fine for now. > > Phew, I'm relieved. I'd guess you buily a 5.0 version while 5.0 was > still in development, so not all 5.0 functionality was available. Hence > the inconsistent looking result. While I think we can support 4.0 > without too much problem, there's obviously no point in trying to > support old between releases versions... > That's quite possible, but I don't really remember :-/ But I ran into another issue today, where everything builds fine (llvm 5.0.1, gcc 6.4.0), but at runtime I get errors like this: ERROR: LLVMCreateMemoryBufferWithContentsOfFile(/home/tomas/pg-llvm/lib/postgresql/llvmjit_types.bc) failed: No such file or directory It seems the llvmjit_types.bc file ended up in the parent directory (/home/tomas/pg-llvm/lib/) for some reason. After simply copying it to the expected place everything started working. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote: > On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote: > > It's an optional dependency, and it doesn't increase build time > > that much... If we were to move the llvm interfacing code to a > > .so, there'd not even be a packaging issue, you can just package > > that .so separately and get errors if somebody tries to enable > > LLVM without that .so being installed. > > I suspect that would be really valuable. If 'yum install > postgresql-server' (or your favorite equivalent) sucks down all of > LLVM, As I understand it, LLVM is organized in such a way as not to require this. Andres, am I understanding correctly that what you're using doesn't require much of LLVM at runtime? > some people are going to complain, either because they are > trying to build little tiny machine images or because they are > subject to policies which preclude the presence of a compiler on a > production server. If you can do 'yum install postgresql-server' > without additional dependencies and 'yum install > postgresql-server-jit' to make it go faster, that issue is solved. Would you consider it solved if there were some very small part of the LLVM (or similar JIT-capable) toolchain added as a dependency, or does it need to be optional into a long future? > Unfortunately, that has the pretty significant downside that a lot of > people who actually want the postgresql-server-jit package will not > realize that they need to install it, which sucks. It does indeed. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi,

On 2018-01-30 22:57:06 +0100, David Fetter wrote:
> On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
> > On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
> > > It's an optional dependency, and it doesn't increase build time
> > > that much... If we were to move the llvm interfacing code to a
> > > .so, there'd not even be a packaging issue, you can just package
> > > that .so separately and get errors if somebody tries to enable
> > > LLVM without that .so being installed.
> >
> > I suspect that would be really valuable. If 'yum install
> > postgresql-server' (or your favorite equivalent) sucks down all of
> > LLVM,
>
> As I understand it, LLVM is organized in such a way as not to require
> this. Andres, am I understanding correctly that what you're using
> doesn't require much of LLVM at runtime?

I'm not sure what exactly you mean. Yes, you need the llvm library at runtime. Perhaps you're thinking of clang or llvm binaries? The latter we do *not* need. What's required is something like:

$ apt show libllvm5.0
Package: libllvm5.0
Version: 1:5.0.1-2
Priority: optional
Section: libs
Source: llvm-toolchain-5.0
Maintainer: LLVM Packaging Team <pkg-llvm-team@lists.alioth.debian.org>
Installed-Size: 56.9 MB
Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5 (>= 6), zlib1g (>= 1:1.2.0)
Breaks: libllvm3.9v4
Replaces: libllvm3.9v4
Homepage: http://www.llvm.org/
Tag: role::shared-lib
Download-Size: 13.7 MB
APT-Manual-Installed: no
APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages
Description: Modular compiler and toolchain technologies, runtime library
 LLVM is a collection of libraries and tools that make it easy to build
 compilers, optimizers, just-in-time code generators, and many other
 compiler-related programs.
 .
 This package contains the LLVM runtime library.

So ~14MB to download, ~57MB on disk.
We only need a subset of libllvm5.0, and LLVM allows building such a subset. But obviously distributions aren't going to target their LLVM just for postgres.

> > Unfortunately, that has the pretty significant downside that a lot of
> > people who actually want the postgresql-server-jit package will not
> > realize that they need to install it, which sucks.
>
> It does indeed.

With things like apt recommends and such I don't think this is a huge problem. It'll be installed by default unless somebody is on a space-constrained system and doesn't want that...

Greetings,

Andres Freund
Hi,

On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
> > It's an optional dependency, and it doesn't increase build time that
> > much... If we were to move the llvm interfacing code to a .so, there'd
> > not even be a packaging issue, you can just package that .so separately
> > and get errors if somebody tries to enable LLVM without that .so being
> > installed.
>
> I suspect that would be really valuable. If 'yum install
> postgresql-server' (or your favorite equivalent) sucks down all of
> LLVM, some people are going to complain, either because they are
> trying to build little tiny machine images or because they are subject
> to policies which preclude the presence of a compiler on a production
> server. If you can do 'yum install postgresql-server' without
> additional dependencies and 'yum install postgresql-server-jit' to
> make it go faster, that issue is solved.

So, I'm working on that now. In the course of this I'll be painfully rebasing and renaming a lot of code, which I'd like not to repeat unnecessarily.

Right now there primarily is:

src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
src/backend/access/common/heaptuple.c - emit LLVM IR for deforming

Given that we need a shared library it'll be best buildsystem-wise if all of this is in a directory, and there's a separate file containing the stubs that call into it.

I'm not quite sure where to put the code. I'm a bit inclined to add a new src/backend/jit/ because we're dealing with code from across different categories? There we could have a pgjit.c with the stubs, and llvmjit/ with the llvm specific code?

Alternatively I'd say we put the stub into src/backend/executor/pgjit.c, and the actual llvm using code into src/backend/executor/llvmjit/?

Comments?

Andres Freund
On Tue, Jan 30, 2018 at 02:08:30PM -0800, Andres Freund wrote: > Hi, > > On 2018-01-30 22:57:06 +0100, David Fetter wrote: > > On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote: > > > On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote: > > > > It's an optional dependency, and it doesn't increase build > > > > time that much... If we were to move the llvm interfacing code > > > > to a .so, there'd not even be a packaging issue, you can just > > > > package that .so separately and get errors if somebody tries > > > > to enable LLVM without that .so being installed. > > > > > > I suspect that would be really valuable. If 'yum install > > > postgresql-server' (or your favorite equivalent) sucks down all > > > of LLVM, > > > > As I understand it, LLVM is organized in such a way as not to > > require this. Andres, am I understanding correctly that what > > you're using doesn't require much of LLVM at runtime? > > I'm not sure what you exactly mean. Yes, you need the llvm library > at runtime. Perhaps you're thinking of clang or llvm binarieries? > The latter we *not* need. I was, and glad I understood correctly. 
> What's required is something like: > $ apt show libllvm5.0 > Package: libllvm5.0 > Version: 1:5.0.1-2 > Priority: optional > Section: libs > Source: llvm-toolchain-5.0 > Maintainer: LLVM Packaging Team <pkg-llvm-team@lists.alioth.debian.org> > Installed-Size: 56.9 MB > Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5(>= 6), zlib1g (>= 1:1.2.0) > Breaks: libllvm3.9v4 > Replaces: libllvm3.9v4 > Homepage: http://www.llvm.org/ > Tag: role::shared-lib > Download-Size: 13.7 MB > APT-Manual-Installed: no > APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages > Description: Modular compiler and toolchain technologies, runtime library > LLVM is a collection of libraries and tools that make it easy to build > compilers, optimizers, just-in-time code generators, and many other > compiler-related programs. > . > This package contains the LLVM runtime library. > > So ~14MB to download, ~57MB on disk. We only need a subset of > libllvm5.0, and LLVM allows to build such a subset. But obviously > distributions aren't going to target their LLVM just for postgres. True, although if they're using an LLVM only for PostgreSQL and care about 57MB of disk, they're probably also ready to do that work. > > > Unfortunately, that has the pretty significant downside that a > > > lot of people who actually want the postgresql-server-jit > > > package will not realize that they need to install it, which > > > sucks. > > > > It does indeed. > > With things like apt recommends and such I don't think this is a > huge problem. It'll be installed by default unless somebody is on a > space constrained system and doesn't want that... Don't most of the wins for JITing come in the OLAP space anyway? I'm having trouble picturing a severely space-constrained OLAP system, but of course it's a possible scenario. Best, David. 
-- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Jan 30, 2018, at 2:08 PM, Andres Freund <andres@anarazel.de> wrote:
> With things like apt recommends and such I don't think this is a huge problem.
Which means in the rpm packages we’ll have to decide whether this is required or must be opt-in by end users (which as discussed would hurt adoption).
On Wed, Jan 31, 2018 at 11:57 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
>> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:
>> > It's an optional dependency, and it doesn't increase build time that
>> > much... If we were to move the llvm interfacing code to a .so, there'd
>> > not even be a packaging issue, you can just package that .so separately
>> > and get errors if somebody tries to enable LLVM without that .so being
>> > installed.
>>
>> I suspect that would be really valuable. If 'yum install
>> postgresql-server' (or your favorite equivalent) sucks down all of
>> LLVM, some people are going to complain, either because they are
>> trying to build little tiny machine images or because they are subject
>> to policies which preclude the presence of a compiler on a production
>> server. If you can do 'yum install postgresql-server' without
>> additional dependencies and 'yum install postgresql-server-jit' to
>> make it go faster, that issue is solved.
>
> So, I'm working on that now. In the course of this I'll be
> painfully rebasing and renaming a lot of code, which I'd like not to repeat
> unnecessarily.
>
> Right now there primarily is:
>
> src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
> src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
> src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
> src/backend/access/common/heaptuple.c - emit LLVM IR for deforming
>
> Given that we need a shared library it'll be best buildsystem wise if
> all of this is in a directory, and there's a separate file containing
> the stubs that call into it.
>
> I'm not quite sure where to put the code. I'm a bit inclined to add a
> new
> src/backend/jit/
> because we're dealing with code from across different categories? There
> we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> specific code?
>
> Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
> and the actual llvm using code into src/backend/executor/llvmjit/?
>
> Comments?

I'm just starting to look at this (amazing) work, and I don't have a strong opinion yet. But certainly, making it easy for packagers to put the -jit stuff into a separate package for the reasons already given sounds sensible to me. Some systems package LLVM as one gigantic package that'll get you 1GB of compiler/debugger/other stuff and perhaps violate local rules by installing a compiler when you really just wanted libLLVM{whatever}.so. I guess it should be made very clear to users (explain plans, maybe startup message, ...?) whether JIT support is active/installed so that people are at least very aware when they encounter a system that is interpreting stuff it could be compiling.

Putting all the JIT into a separate directory under src/backend/jit certainly looks sensible at first glance, but I'm not sure.

Incidentally, from commit fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (HEAD -> jit, andresfreund/jit) on your branch I get:

ccache c++ -Wall -Wpointer-arith -fno-strict-aliasing -fwrapv -g -g -O2 -fno-exceptions -I../../../src/include -I/usr/local/llvm50/include -DLLVM_BUILD_GLOBAL_ISEL -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/local/include -c -o llvmjit_error.o llvmjit_error.cpp -MMD -MP -MF .deps/llvmjit_error.Po
In file included from llvmjit_error.cpp:26:
In file included from ../../../src/include/lib/llvmjit.h:48:
In file included from /usr/local/llvm50/include/llvm-c/Types.h:17:
In file included from /usr/local/llvm50/include/llvm/Support/DataTypes.h:33:
/usr/include/c++/v1/cmath:555:1: error: templates must have C++ linkage
template <class _A1>
^~~~~~~~~~~~~~~~~~~~
llvmjit_error.cpp:24:1: note: extern "C" language linkage specification begins here
extern "C"
^

$ c++ -v
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)

This seems to be
a valid complaint. I don't think you should be (indirectly) wrapping Types.h in extern "C". At a guess, your llvmjit.h should be doing its own #ifdef __cplusplus'd linkage specifiers, so you can use it from C or C++, but making sure that you don't #include LLVM's headers from a bizarro context where __cplusplus is defined but the linkage is unexpectedly already "C"? -- Thomas Munro http://www.enterprisedb.com
On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
> I'm just starting to look at this (amazing) work, and I don't have a
> strong opinion yet. But certainly, making it easy for packagers to
> put the -jit stuff into a separate package for the reasons already
> given sounds sensible to me. Some systems package LLVM as one
> gigantic package that'll get you 1GB of compiler/debugger/other stuff
> and perhaps violate local rules by installing a compiler when you
> really just wanted libLLVM{whatever}.so. I guess it should be made
> very clear to users (explain plans, maybe startup message, ...?)

I'm not quite sure I understand. You mean have it display whether it's available? I think my plan is to have "just" setting jit_expressions=on (or whatever we're going to name it) fail if the prerequisites aren't available. I personally don't think this should be enabled by default, definitely not in the first release.

> $ c++ -v
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
> LLVM 4.0.0)
>
> This seems to be a valid complaint. I don't think you should be
> (indirectly) wrapping Types.h in extern "C". At a guess, your
> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
> specifiers, so you can use it from C or C++, but making sure that you
> don't #include LLVM's headers from a bizarro context where __cplusplus
> is defined but the linkage is unexpectedly already "C"?

Hm, this seems like a bit of pointless nitpickery by the compiler to me, but I guess...

Greetings,

Andres Freund
On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund <andres@anarazel.de> wrote: > On 2018-01-31 14:42:26 +1300, Thomas Munro wrote: >> I'm just starting to look at this (amazing) work, and I don't have a >> strong opinion yet. But certainly, making it easy for packagers to >> put the -jit stuff into a separate package for the reasons already >> given sounds sensible to me. Some systems package LLVM as one >> gigantic package that'll get you 1GB of compiler/debugger/other stuff >> and perhaps violate local rules by installing a compiler when you >> really just wanted libLLVM{whatever}.so. I guess it should be made >> very clear to users (explain plans, maybe startup message, ...?) > > I'm not quite sure I understand. You mean have it display whether > available? I think my plan is to "just" set jit_expressions=on (or > whatever we're going to name it) fail if the prerequisites aren't > available. I personally don't think this should be enabled by default, > definitely not in the first release. I assumed (incorrectly) that you wanted it to default to on if available, so I was suggesting making it obvious to end users if they've accidentally forgotten to install -jit. If it's not enabled until you actually ask for it and trying to enable it when it's not installed barfs, then that seems sensible. >> This seems to be a valid complaint. I don't think you should be >> (indirectly) wrapping Types.h in extern "C". At a guess, your >> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage >> specifiers, so you can use it from C or C++, but making sure that you >> don't #include LLVM's headers from a bizarro context where __cplusplus >> is defined but the linkage is unexpectedly already "C"? > > Hm, this seems like a bit of pointless nitpickery by the compiler to me, > but I guess... Well that got me curious about how GCC could possibly be accepting that (it certainly doesn't like extern "C" template ... any more than the next compiler). 
I dug a bit and realised that it's the stdlib that's different: libstdc++ has its own extern "C++" in <cmath>, while libc++ doesn't. -- Thomas Munro http://www.enterprisedb.com
Hi,

On 2018-01-31 15:48:09 +1300, Thomas Munro wrote:
> On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund <andres@anarazel.de> wrote:
> > I'm not quite sure I understand. You mean have it display whether
> > available? I think my plan is to "just" set jit_expressions=on (or
> > whatever we're going to name it) fail if the prerequisites aren't
> > available. I personally don't think this should be enabled by default,
> > definitely not in the first release.
>
> I assumed (incorrectly) that you wanted it to default to on if
> available, so I was suggesting making it obvious to end users if
> they've accidentally forgotten to install -jit. If it's not enabled
> until you actually ask for it and trying to enable it when it's not
> installed barfs, then that seems sensible.

I'm open to changing my mind on it, but it seems a bit weird that a feature that relies on a shlib being installed magically turns itself on if available. And leaving that angle aside, ISTM that it's a complex enough feature that it should be opt-in in the first release... Think we roughly did that right for e.g. parallelism.

Greetings,

Andres Freund
On 31.01.2018 05:48, Thomas Munro wrote:
>
>>> This seems to be a valid complaint. I don't think you should be
>>> (indirectly) wrapping Types.h in extern "C". At a guess, your
>>> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
>>> specifiers, so you can use it from C or C++, but making sure that you
>>> don't #include LLVM's headers from a bizarro context where __cplusplus
>>> is defined but the linkage is unexpectedly already "C"?
>> Hm, this seems like a bit of pointless nitpickery by the compiler to me,
>> but I guess...
> Well that got me curious about how GCC could possibly be accepting
> that (it certainly doesn't like extern "C" template ... any more than
> the next compiler). I dug a bit and realised that it's the stdlib
> that's different: libstdc++ has its own extern "C++" in <cmath>,
> while libc++ doesn't.
>
The same problem takes place with old versions of GCC: I had to upgrade GCC to 7.2 to make it possible to compile this code. The problem is not in the compiler itself, but in the libc++ headers.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 1/30/18 21:55, Andres Freund wrote:
> I'm open to changing my mind on it, but it seems a bit weird that a
> feature that relies on a shlib being installed magically turns itself on
> if available. And leaving that angle aside, ISTM that it's a complex
> enough feature that it should be opt-in the first release... Think we
> roughly did that right for e.g. parallelism.

That sounds reasonable, for both of those reasons.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 31, 2018 at 10:22 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 1/30/18 21:55, Andres Freund wrote: >> I'm open to changing my mind on it, but it seems a bit weird that a >> feature that relies on a shlib being installed magically turns itself on >> if avaible. And leaving that angle aside, ISTM, that it's a complex >> enough feature that it should be opt-in the first release... Think we >> roughly did that right for e.g. parallellism. > > That sounds reasonable, for both of those reasons. The first one is a problem that's not going to go away. If the problem of JIT being enabled "magically" is something we're concerned about, we need to figure out a good solution, not just disable the feature by default. As far as the second one, looking back at what happened with parallel query, I found (on a quick read) 13 back-patched commits in REL9_6_STABLE prior to the release of 10.0, 3 of which I would qualify as low-importance (improving documentation, fixing something that's not really a bug, improving a test case). A couple of those were really stupid mistakes on my part. On the other hand, would it have been overall worse for our users if that feature had been turned on in 9.6? I don't know. They would have had those bugs (at least until we fixed them) but they would have had parallel query, too. It's hard for me to judge whether that was a win or a loss, and so here. Like parallel query, this is a feature which seems to have a low risk of data corruption, but a fairly high risk of wrong answers to queries and/or strange errors. Users don't like that. On the other hand, also like parallel query, if you've got the right kind of queries, it can make them go a lot faster. Users DO like that. So I could go either way on whether to enable this in the first release. I definitely would not like to see it stay disabled by default for a second release unless we find a lot of problems with it. 
There's no point in developing new features unless users are going to get the benefit of them, and while SOME users will enable features that aren't turned on by default, many will not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Jan 30, 2018 at 5:57 PM, Andres Freund <andres@anarazel.de> wrote:
> Given that we need a shared library it'll be best buildsystem wise if
> all of this is in a directory, and there's a separate file containing
> the stubs that call into it.
>
> I'm not quite sure where to put the code. I'm a bit inclined to add a
> new
> src/backend/jit/
> because we're dealing with code from across different categories? There
> we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> specific code?

That's kind of ugly, in that if we eventually end up with many different parts of the system using JIT, they're all going to have to put their code in that directory rather than putting it with the subsystem to which it pertains. On the other hand, I don't really have a better idea.

I'd definitely at least try to keep executor-specific considerations in a separate FILE from general JIT infrastructure, and make, as far as possible, a clean separation at the API level.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

On 2018-01-31 11:53:25 -0500, Robert Haas wrote:
> On Wed, Jan 31, 2018 at 10:22 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> > On 1/30/18 21:55, Andres Freund wrote:
> >> I'm open to changing my mind on it, but it seems a bit weird that a
> >> feature that relies on a shlib being installed magically turns itself on
> >> if available. And leaving that angle aside, ISTM that it's a complex
> >> enough feature that it should be opt-in the first release... Think we
> >> roughly did that right for e.g. parallelism.
> >
> > That sounds reasonable, for both of those reasons.
>
> The first one is a problem that's not going to go away. If the
> problem of JIT being enabled "magically" is something we're concerned
> about, we need to figure out a good solution, not just disable the
> feature by default.

That's a fair argument, and I don't really have a good answer to it. We could have a jit = off/try/on, and use that to signal things? I.e. it can be set to try (possibly default in version + 1), and things will work if it's not installed, but if set to on it'll refuse to work if not installed. Similar to how huge pages work now.

Greetings,

Andres Freund
Hi,

On 2018-01-31 11:56:59 -0500, Robert Haas wrote:
> On Tue, Jan 30, 2018 at 5:57 PM, Andres Freund <andres@anarazel.de> wrote:
> > Given that we need a shared library it'll be best buildsystem wise if
> > all of this is in a directory, and there's a separate file containing
> > the stubs that call into it.
> >
> > I'm not quite sure where to put the code. I'm a bit inclined to add a
> > new
> > src/backend/jit/
> > because we're dealing with code from across different categories? There
> > we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> > specific code?
>
> That's kind of ugly, in that if we eventually end up with many
> different parts of the system using JIT, they're all going to have to
> all put their code in that directory rather than putting it with the
> subsystem to which it pertains.

Yea, that's what I really dislike about the idea too.

> On the other hand, I don't really have a better idea.

I guess one alternative would be to leave the individual files in their subsystem directories, but not in the corresponding OBJS lists, and instead pick them up from the makefile in the jit shlib? That might be better... It's a bit weird because the files wouldn't be compiled when make-ing their own directory, but rather when the jit shlib one is made, but that's not too bad.

> I'd definitely at least try to keep executor-specific considerations
> in a separate FILE from general JIT infrastructure, and make, as far
> as possible, a clean separation at the API level.

Absolutely. Right now there are general infrastructure files (error handling, optimization, inlining), expression compilation, and tuple deform compilation, and I thought to continue keeping the files separate just like that.

Greetings,

Andres Freund
On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund <andres@anarazel.de> wrote: >> The first one is a problem that's not going to go away. If the >> problem of JIT being enabled "magically" is something we're concerned >> about, we need to figure out a good solution, not just disable the >> feature by default. > > That's a fair argument, and I don't really have a good answer to it. We > could have a jit = off/try/on, and use that to signal things? I.e. it > can be set to try (possibly default in version + 1), and things will > work if it's not installed, but if set to on it'll refuse to work if not > enabled. Similar to how huge pages work now. We could do that, but I'd be more inclined just to let JIT be magically enabled. In general, if a user could do 'yum install ip4r' (for example) and have that Just Work without any further database configuration, I think a lot of people would consider that to be a huge improvement. Unfortunately we can't really do that for various reasons, the biggest of which is that there's no way for installing an OS package to modify the internal state of a database that may not even be running at the time. But as a general principle, I think having to configure both the OS and the DB is an anti-feature, and that if installing an extra package is sufficient to get the new-and-improved behavior, users will like it. Bonus points if it doesn't require a server restart. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2018-01-31 14:45:46 -0500, Robert Haas wrote:
> On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund <andres@anarazel.de> wrote:
> >> The first one is a problem that's not going to go away. If the
> >> problem of JIT being enabled "magically" is something we're concerned
> >> about, we need to figure out a good solution, not just disable the
> >> feature by default.
> >
> > That's a fair argument, and I don't really have a good answer to it. We
> > could have a jit = off/try/on, and use that to signal things? I.e. it
> > can be set to try (possibly default in version + 1), and things will
> > work if it's not installed, but if set to on it'll refuse to work if not
> > installed. Similar to how huge pages work now.
>
> We could do that, but I'd be more inclined just to let JIT be
> magically enabled. In general, if a user could do 'yum install ip4r'
> (for example) and have that Just Work without any further database
> configuration, I think a lot of people would consider that to be a
> huge improvement. Unfortunately we can't really do that for various
> reasons, the biggest of which is that there's no way for installing an
> OS package to modify the internal state of a database that may not
> even be running at the time. But as a general principle, I think
> having to configure both the OS and the DB is an anti-feature, and
> that if installing an extra package is sufficient to get the
> new-and-improved behavior, users will like it.

I'm not seeing a contradiction between what you describe as desired and what I describe? If it defaulted to try, that'd just do what you want, no? I do think it's important to be able to configure the system so it'll error if JITing is not available.

> Bonus points if it doesn't require a server restart.

I think server restart might be doable (although it'll increase memory usage because the shlib needs to be loaded in each backend rather than postmaster), but once a session is running I'm fairly sure we do not want to retry.
Re-checking whether a shlib is available on the filesystem every query does not sound like a good idea... Greetings, Andres Freund
On Wed, Jan 31, 2018 at 2:49 PM, Andres Freund <andres@anarazel.de> wrote:
>> We could do that, but I'd be more inclined just to let JIT be
>> magically enabled. In general, if a user could do 'yum install ip4r'
>> (for example) and have that Just Work without any further database
>> configuration, I think a lot of people would consider that to be a
>> huge improvement. Unfortunately we can't really do that for various
>> reasons, the biggest of which is that there's no way for installing an
>> OS package to modify the internal state of a database that may not
>> even be running at the time. But as a general principle, I think
>> having to configure both the OS and the DB is an anti-feature, and
>> that if installing an extra package is sufficient to get the
>> new-and-improved behavior, users will like it.
>
> I'm not seeing a contradiction between what you describe as desired, and
> what I describe? If it defaulted to try, that'd just do what you want,
> no? I do think it's important to configure the system so it'll error if
> JITing is not available.

Hmm, I guess that's true. I'm not sure that we really need a way to error out if JIT is not available, but maybe we do.

>> Bonus points if it doesn't require a server restart.
>
> I think server restart might be doable (although it'll increase memory
> usage because the shlib needs to be loaded in each backend rather than
> postmaster), but once a session is running I'm fairly sure we do not
> want to retry. Re-checking whether a shlib is available on the
> filesystem every query does not sound like a good idea...

Agreed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 1/31/18 13:34, Andres Freund wrote: > That's a fair argument, and I don't really have a good answer to it. We > could have a jit = off/try/on, and use that to signal things? I.e. it > can be set to try (possibly default in version + 1), and things will > work if it's not installed, but if set to on it'll refuse to work if not > enabled. Similar to how huge pages work now. But that setup also has the problem that you can't query the setting to know whether it's actually on. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/31/18 14:45, Robert Haas wrote:
> We could do that, but I'd be more inclined just to let JIT be
> magically enabled. In general, if a user could do 'yum install ip4r'
> (for example) and have that Just Work without any further database
> configuration,

One way to do that would be to have a system-wide configuration file like /usr/local/pgsql/etc/postgresql/postgresql.conf, which in turn includes /usr/local/pgsql/etc/postgresql/postgresql.conf.d/*, and have the add-on package install its configuration file with the setting jit = on there.

Then again, if we want to make it simpler, just link the whole thing in and turn it on by default and be done with it. Presumably, there will be planner-level knobs to model the jit startup time, and if you don't like it, you can set that very high to disable it. So we don't necessarily need a separate turn-it-off-it's-broken setting.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
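For what it's worth, PostgreSQL's existing include_dir directive already supports this kind of drop-in layout; a hypothetical packaging arrangement (file names and the 90-jit.conf fragment are invented for illustration) could look like:

```
# postgresql.conf, shipped by the base server package
include_dir = 'conf.d'

# conf.d/90-jit.conf, shipped by the add-on -jit package
jit = on
```

The add-on package installing a file into conf.d would then enable the feature on the next reload, without editing the main configuration file.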
On 2018-02-01 08:46:08 -0500, Peter Eisentraut wrote: > On 1/31/18 14:45, Robert Haas wrote: > > We could do that, but I'd be more inclined just to let JIT be > > magically enabled. In general, if a user could do 'yum install ip4r' > > (for example) and have that Just Work without any further database > > configuration, > > One way to do that would be to have a system-wide configuration file > like /usr/local/pgsql/etc/postgresql/postgresql.conf, which in turn > includes /usr/local/pgsql/etc/postgresql/postgreql.conf.d/*, and have > the add-on package install its configuration file with the setting jit = > on there. I think Robert's comment about extensions wasn't about extensions and jit, just about needing CREATE EXTENSION. I don't see any need for per-extension/shlib configurability of JITing. > Then again, if we want to make it simpler, just link the whole thing in > and turn it on by default and be done with it. I'd personally be ok with that too... Greetings, Andres Freund
On Wed, Jan 31, 2018 at 1:45 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jan 31, 2018 at 1:34 PM, Andres Freund <andres@anarazel.de> wrote: >>> The first one is a problem that's not going to go away. If the >>> problem of JIT being enabled "magically" is something we're concerned >>> about, we need to figure out a good solution, not just disable the >>> feature by default. >> >> That's a fair argument, and I don't really have a good answer to it. We >> could have a jit = off/try/on, and use that to signal things? I.e. it >> can be set to try (possibly default in version + 1), and things will >> work if it's not installed, but if set to on it'll refuse to work if not >> enabled. Similar to how huge pages work now. > > We could do that, but I'd be more inclined just to let JIT be > magically enabled. In general, if a user could do 'yum install ip4r' > (for example) and have that Just Work without any further database > configuration, I think a lot of people would consider that to be a > huge improvement. Unfortunately we can't really do that for various > reasons, the biggest of which is that there's no way for installing an > OS package to modify the internal state of a database that may not > even be running at the time. But as a general principle, I think > having to configure both the OS and the DB is an anti-feature, and > that if installing an extra package is sufficient to get the > new-and-improved behavior, users will like it. Bonus points if it > doesn't require a server restart. You bet. It'd be helpful to have some obvious, well advertised ways to determine when it's enabled and when it isn't, and to have a straightforward process to determine what to fix when it's not enabled and the user thinks it ought to be though. merlin
On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> The same problem takes place with old versions of GCC: I had to upgrade GCC
> to 7.2 to make it possible to compile this code.
> The problem is not in the compiler itself, but in the libc++ headers.

How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0 and gcc-5.4 installed. Do I need to compile with clang or gcc? Any CXXFLAGS required?

Regards,

Jeff Davis
On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
> On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
> > The same problem takes place with old versions of GCC: I have to upgrade GCC
> > to 7.2 to make it possible to compile this code.
> > The problem in not in compiler itself, but in libc++ headers.
>
> How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
> and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
> CXXFLAGS required?

Just to understand: You're running into the issue with the header being included from within the extern "C" {}? Hm, I've pushed a quick fix for that.

Other than that, you can compile with both gcc or clang, but clang needs to be available. It will be guessed from PATH if clang clang-5.0 clang-4.0 (in that order) exist, similarly with llvm-config llvm-config-5.0 being guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides both of that. E.g.

./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config

is what I use, although I also add:

LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'

so I don't have to install llvm anywhere the system knows about.

Greetings,

Andres Freund
On Fri, Feb 2, 2018 at 2:05 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-02-01 09:32:17 -0800, Jeff Davis wrote:
>> On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik
>> <k.knizhnik@postgrespro.ru> wrote:
>> > The same problem takes place with old versions of GCC: I have to upgrade GCC
>> > to 7.2 to make it possible to compile this code.
>> > The problem in not in compiler itself, but in libc++ headers.
>>
>> How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0
>> and gcc-5.4 installed. Do I need to compile with clang or gcc? Any
>> CXXFLAGS required?
>
> Just to understand: You're running into the issue with the header being
> included from within the extern "C" {}? Hm, I've pushed a quick fix for
> that.

That change wasn't quite enough: to get this building against libc++ (Clang's native stdlib) I also needed this change to llvmjit.h so that <llvm-c/Types.h> wouldn't be included with the wrong linkage (perhaps you can find a less ugly way):

+#ifdef __cplusplus
+}
+#endif
 #include <llvm-c/Types.h>
+#ifdef __cplusplus
+extern "C"
+{
+#endif

> Other than that, you can compile with both gcc or clang, but clang needs
> to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0
> (in that order) exist, similar with llvm-config llvm-config-5.0 being
> guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides
> both of that. E.g.
> ./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config
> is what I use, although I also add:
> LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib'
> so I don't have to install llvm anywhere the system knows about.

BTW if you're building with clang (vendor compiler on at least macOS and FreeBSD) you'll probably need CXXFLAGS=-std=c++11 (or later standard) because it's still defaulting to '98.

--
Thomas Munro
http://www.enterprisedb.com
Another small thing which might be environmental... llvmjit_types.bc is getting installed into ${prefix}/lib here, but you're looking for it in ${prefix}/lib/postgresql:

gmake[3]: Entering directory '/usr/home/munro/projects/postgres/src/backend/lib'
/usr/bin/install -c -m 644 llvmjit_types.bc '/home/munro/install/lib'

postgres=# set jit_above_cost = 0;
SET
postgres=# set jit_expressions = on;
SET
postgres=# select 4 + 4;
ERROR: LLVMCreateMemoryBufferWithContentsOfFile(/usr/home/munro/install/lib/postgresql/llvmjit_types.bc) failed: No such file or directory

$ mv ~/install/lib/llvmjit_types.bc ~/install/lib/postgresql/

postgres=# select 4 + 4;
 ?column?
----------
        8
(1 row)

--
Thomas Munro
http://www.enterprisedb.com
On Fri, Feb 2, 2018 at 5:11 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > Another small thing which might be environmental... llvmjit_types.bc > is getting installed into ${prefix}/lib here, but you're looking for > it in ${prefix}/lib/postgresql: Is there something broken about my installation? I see simple arithmetic expressions apparently compiling and working but I can easily find stuff that breaks... so far I think it's anything involving string literals: postgres=# set jit_above_cost = 0; SET postgres=# select quote_ident('x'); ERROR: failed to resolve name MakeExpandedObjectReadOnlyInternal Well actually just select 'hello world' does it. I've attached a backtrace. Tab completion is broken for me with jit_above_cost = 0 due to tab-complete.c queries failing with various other errors including: set <tab>: ERROR: failed to resolve name ExecEvalScalarArrayOp update <tab>: ERROR: failed to resolve name quote_ident show <tab>: ERROR: failed to resolve name slot_getsomeattrs I wasn't sure from your status message how much of this is expected at this stage... This is built from: commit 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 (HEAD -> jit, andresfreund/jit) ... plus the extern "C" tweak I posted earlier to make my clang 4.0 compiler happy, built on a FreeBSD 11.1 box with: ./configure --prefix=/home/munro/install/ --enable-tap-tests --enable-cassert --enable-debug --enable-depend --with-llvm CC="ccache cc" CXX="ccache c++" CXXFLAGS="-std=c++11" LLVM_CONFIG=/usr/local/llvm50/bin/llvm-config --with-libraries="/usr/local/lib" --with-includes="/usr/local/include" The clang that was used for bitcode was the system /usr/bin/clang, version 4.0. Is it a problem that I used that for compiling the bitcode, but LLVM5 for JIT? I actually tried CLANG=/usr/local/llvm50/bin/clang but ran into weird failures I haven't got to the bottom of at ThinLink time so I couldn't get as far as a running system. I installed llvm50 from a package. 
I did need to make a tiny tweak by hand: in src/Makefile.global, llvm-config --system-libs had said -l/usr/lib/libexecinfo.so, which wasn't linking and looks wrong to me, so I changed it to -lexecinfo, noted that it worked, and reported a bug upstream: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225621 -- Thomas Munro http://www.enterprisedb.com
On Thu, Feb 1, 2018 at 5:05 PM, Andres Freund <andres@anarazel.de> wrote: > Just to understand: You're running in the issue with the header being > included from within the extern "C" {}? Hm, I've pushed a quick fix for > that. > > Other than that, you can compile with both gcc or clang, but clang needs > to be available. Will be guessed from PATH if clang clang-5.0 clang-4.0 > (in that order) exist, similar with llvm-config llvm-config-5.0 being > guessed. LLVM_CONFIG/CLANG/CXX= as an argument to configure overrides > both of that. E.g. > ./configure --with-llvm LLVM_CONFIG=~/build/llvm/5/opt/install/bin/llvm-config > is what I use, although I also add: > LDFLAGS='-Wl,-rpath,/home/andres/build/llvm/5/opt/install/lib' > so I don't have to install llvm anywhere the system knows about. On Ubuntu 16.04 SHA1: 302b7a284 gcc (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609 packages: llvm-5.0 llvm-5.0-dev llvm-5.0-runtime libllvm-5.0 clang-5.0 libclang-common-5.0-dev libclang1-5.0 ./configure --with-llvm --prefix=/home/jdavis/install/pgsql-dev ... checking for llvm-config... no checking for llvm-config-5.0... llvm-config-5.0 checking for clang... no checking for clang-5.0... clang-5.0 checking for LLVMOrcGetSymbolAddressIn... no checking for LLVMGetHostCPUName... no checking for LLVMOrcRegisterGDB... no checking for LLVMOrcRegisterPerf... no checking for LLVMOrcUnregisterPerf... no ... That encounters errors like: /usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler an d library support for the ISO C++ 2011 standard. This support must be enabled with the -st d=c++11 or -std=gnu++11 compiler options. ... /usr/include/c++/5/cmath:505:22: error: conflicting declaration of C function ‘long double ... /usr/include/c++/5/cmath:926:3: error: template with C linkage ... So I reconfigure with: CXXFLAGS="-std=c++11" ./configure --with-llvm --prefix=/home/jdavis/install/pgsql-dev I think that got rid of the first error, but the other errors remain. 
I also tried installing libc++-dev and using CC=clang-5.0 CXX=clang++-5.0 and with CXXFLAGS="-std=c++11 -stdlib=libc++" but I am not making much progress, I'm still getting: /usr/include/c++/v1/cmath:316:1: error: templates must have C++ linkage I suggest that you share your exact configuration so we can get past this for now, and you can work on the build issues in the background. We can't be the first ones with this problem; maybe you can just ask on an LLVM channel what the right thing to do is that will work on a variety of machines (or at least reliably detect the problem at configure time)? Regards, Jeff Davis
On Fri, Feb 2, 2018 at 7:06 PM, Jeff Davis <pgsql@j-davis.com> wrote: > /usr/include/c++/5/cmath:505:22: error: conflicting declaration of C > function ‘long double > ... > /usr/include/c++/5/cmath:926:3: error: template with C linkage I suspect you can fix these with this change: +#ifdef __cplusplus +} +#endif #include <llvm-c/Types.h> +#ifdef __cplusplus +extern "C" +{ +#endif ... in llvmjit.h. -- Thomas Munro http://www.enterprisedb.com
On Thu, Feb 1, 2018 at 10:09 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Fri, Feb 2, 2018 at 7:06 PM, Jeff Davis <pgsql@j-davis.com> wrote: >> /usr/include/c++/5/cmath:505:22: error: conflicting declaration of C >> function ‘long double >> ... >> /usr/include/c++/5/cmath:926:3: error: template with C linkage > > I suspect you can fix these with this change: > > +#ifdef __cplusplus > +} > +#endif > #include <llvm-c/Types.h> > +#ifdef __cplusplus > +extern "C" > +{ > +#endif > > ... in llvmjit.h. Thanks! That worked, but I had to remove the "-stdlib=libc++" also, which was causing me problems. Regards, Jeff Davis
Hi, On 2018-02-02 18:22:34 +1300, Thomas Munro wrote: > Is there something broken about my installation? I see simple > arithmetic expressions apparently compiling and working but I can > easily find stuff that breaks... so far I think it's anything > involving string literals: That definitely should all work. Did you compile with lto and force it to internalize all symbols or such? > postgres=# set jit_above_cost = 0; > SET > postgres=# select quote_ident('x'); > ERROR: failed to resolve name MakeExpandedObjectReadOnlyInternal ... > The clang that was used for bitcode was the system /usr/bin/clang, > version 4.0. Is it a problem that I used that for compiling the > bitcode, but LLVM5 for JIT? No, I did that locally without problems. > I actually tried CLANG=/usr/local/llvm50/bin/clang but ran into weird > failures I haven't got to the bottom of at ThinLink time so I couldn't > get as far as a running system. So you hit clang 5 level issues rather than issues with this patchset, do I understand correctly? > I installed llvm50 from a package. I did need to make a tiny tweak by > hand: in src/Makefile.global, llvm-config --system-libs had said > -l/usr/lib/libexecinfo.so which wasn't linking and looks wrong to me > so I changed it to -lexecinfo, noted that it worked and reported a bug > upstream: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225621 Yea, that seems outside of my / our hands. - Andres
On 2018-02-01 22:20:01 -0800, Jeff Davis wrote: > Thanks! That worked, but I had to remove the "-stdlib=libc++" also, > which was causing me problems. That'll be gone as soon as I finish the shlib thing. I hope to have something over the weekend. Right now I'm at FOSDEM and need to prepare a talk for tomorrow. Greetings, Andres Freund
On Monday, January 29, 2018 10:53:50 AM CET Andres Freund wrote: > Hi, > > On 2018-01-23 23:20:38 -0800, Andres Freund wrote: > > == Code == > > > > As the patchset is large (500kb) and I'm still quickly evolving it, I do > > not yet want to attach it. The git tree is at > > > > https://git.postgresql.org/git/users/andresfreund/postgres.git > > > > in the jit branch > > > > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=s > > hortlog;h=refs/heads/jit > I've just pushed an updated and rebased version of the tree: > - Split the large "jit infrastructure" commits into a number of smaller > commits > - Split the C++ file > - Dropped some of the performance stuff done to heaptuple.c - that was > mostly to make performance comparisons a bit more interesting, but > doesn't seem important enough to deal with. > - Added a commit renaming datetime.h symbols so they don't conflict with > LLVM variables anymore, removing ugly #undef PM/#define PM dance > around includes. Will post separately. > - Reduced the number of pointer constants in the generated LLVM IR, by > doing more getelementptr accesses (stem from before the time types > were automatically synced) > - Increased number of comments a bit > > There's a jit-before-rebase-2018-01-29 tag, for the state of the tree > before the rebase. > > Regards, > > Andres Hi I have successfully built the JIT branch against LLVM 4.0.1 on Debian testing. This is not enough for Debian stable (LLVM 3.9 is the latest available there), but it's a first step. I've split the patch in four files. The first three fix the build issues, the last one fixes a runtime issue. I think they are small enough to not be a burden for you in your developments. But if you don't want to carry these ifdefs right now, I maintain them in a branch on a personal git and rebase as frequently as I can. LLVM 3.9 support isn't going to be hard, but I prefer splitting. 
I also hope this will help more people test this wonderful toy… :) Regards Pierre
Hi, On 2018-02-02 18:22:34 +1300, Thomas Munro wrote: > The clang that was used for bitcode was the system /usr/bin/clang, > version 4.0. Is it a problem that I used that for compiling the > bitcode, but LLVM5 for JIT? I actually tried > CLANG=/usr/local/llvm50/bin/clang but ran into weird failures I > haven't got to the bottom of at ThinLink time so I couldn't get as far > as a running system. You're using thinlto to compile pg? Could you provide what you pass to configure for that? IIRC I tried that a while ago and ran into some issues with us creating archives (libpgport, libpgcommon). Greetings, Andres Freund
On Friday, February 2, 2018 10:48:16 AM CET Pierre Ducroquet wrote: > On Monday, January 29, 2018 10:53:50 AM CET Andres Freund wrote: > > Hi, > > > > On 2018-01-23 23:20:38 -0800, Andres Freund wrote: > > > == Code == > > > > > > As the patchset is large (500kb) and I'm still quickly evolving it, I do > > > not yet want to attach it. The git tree is at > > > > > > https://git.postgresql.org/git/users/andresfreund/postgres.git > > > > > > in the jit branch > > > > > > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a > > > =s > > > hortlog;h=refs/heads/jit > > > > I've just pushed an updated and rebased version of the tree: > > - Split the large "jit infrastructure" commits into a number of smaller > > > > commits > > > > - Split the C++ file > > - Dropped some of the performance stuff done to heaptuple.c - that was > > > > mostly to make performance comparisons a bit more interesting, but > > doesn't seem important enough to deal with. > > > > - Added a commit renaming datetime.h symbols so they don't conflict with > > > > LLVM variables anymore, removing ugly #undef PM/#define PM dance > > around includes. Will post separately. > > > > - Reduced the number of pointer constants in the generated LLVM IR, by > > > > doing more getelementptr accesses (stem from before the time types > > were automatically synced) > > > > - Increased number of comments a bit > > > > There's a jit-before-rebase-2018-01-29 tag, for the state of the tree > > before the rebase. > > > > Regards, > > > > Andres > > Hi > > I have successfully built the JIT branch against LLVM 4.0.1 on Debian > testing. This is not enough for Debian stable (LLVM 3.9 is the latest > available there), but it's a first step. > I've split the patch in four files. The first three fix the build issues, > the last one fixes a runtime issue. > I think they are small enough to not be a burden for you in your > developments. 
> But if you don't want to carry these ifdefs right now, I maintain them in a branch on a personal git and rebase as frequently as I can. > > LLVM 3.9 support isn't going to be hard, but I prefer splitting. I also hope > this will help more people test this wonderful toy… :) > > Regards > > Pierre For LLVM 3.9, only small changes were needed. I've attached the patches to this email. I only did very basic, primitive testing, but it seems to work. I'll do more testing in the next few days. Pierre
On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund <andres@anarazel.de> wrote: >> https://git.postgresql.org/git/users/andresfreund/postgres.git There's a patch in there to change the scan order. I suggest that you rename the GUC "synchronize_seqscans" to something more generic like "optimize_scan_order", and use it to control your feature as well (after all, it's the same trade-off: weird scan order vs. performance). Then, go ahead and commit it. FWIW I see about a 7% boost on my laptop[1] from that patch on master, without JIT or anything else. I also see you dropped "7ae518bf Centralize slot deforming logic a bit.". Was that intentional? Do we want it? I think I saw about a 2% gain here over master, but when I applied it on top of the fast scans it did not seem to add anything on top of fast scans. Seems reproducible, but I don't have an explanation. And you are probably already working on this, but it would be helpful to get the following two patches in also: * 3c22065f Do execGrouping via expression eval * a9dde4aa Allow tupleslots to have a fixed tupledesc I took a brief look at those two, but will review them in more detail. Regards, Jeff Davis [1] Simple scan with simple predicate on 50M tuples, after pg_prewarm.
Hi, On 2018-02-02 18:21:12 -0800, Jeff Davis wrote: > On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund <andres@anarazel.de> wrote: > >> https://git.postgresql.org/git/users/andresfreund/postgres.git > > There's a patch in there to change the scan order. Yes - note it's "deactivated" at the moment in the series. I primarily have it in there because I found profiles to be a lot more useful if it's enabled, as otherwise the number of cache misses and related stalls from heap accesses completely swamp everything else. FWIW, there's http://archives.postgresql.org/message-id/20161030073655.rfa6nvbyk4w2kkpk%40alap3.anarazel.de > I suggest that you rename the GUC "synchronize_seqscans" to something > more generic like "optimize_scan_order", and use it to control your > feature as well (after all, it's the same trade-off: weird scan order > vs. performance). Then, go ahead and commit it. FWIW I see about a 7% > boost on my laptop[1] from that patch on master, without JIT or > anything else. Yea, that's roughly the same magnitude as what I'm seeing, some queries even bigger. I'm not sure I want to commit this right now - ISTM we couldn't default this to on without annoying a lot of people, and leaving the performance wins on the table by default seems like a shame. I think we should probably either change the order we store things on the page by default or only use the faster order if the scan above doesn't care about order - the planner could figure that out easily. I personally don't think it is necessary to get this committed at the same time as the JIT stuff, so I'm not planning to push very hard on that front. Should you be interested in taking it up, please feel entirely free. > I also see you dropped "7ae518bf Centralize slot deforming logic a > bit.". Was that intentional? Do we want it? The problem is that there's probably some controversial things in there. I think the checks I dropped largely make no sense, but I don't really want to push for that hard. 
I suspect we probably still want it, but I do not want to put it into the critical path right now. > I think I saw about a 2% gain here over master, but when I applied it > on top of the fast scans it did not seem to add anything on top of > fast scans. Seems reproducible, but I don't have an explanation. Yea, that makes sense. The primary reason the patch is beneficial is that it centralizes the place where the HeapTupleHeader is accessed to a single piece of code (slot_deform_tuple()). In a lot of cases that first access will result in a cache miss in all layers, requiring a memory access. In slot_getsomeattrs() there's very little that can be done in an out-of-order manner, whereas slot_deform_tuple() can continue execution a bit further. Also, the latter will then go and sequentially access the rest (or a significant part) of the tuple, so a centralized access is more prefetchable. > And you are probably already working on this, but it would be helpful > to get the following two patches in also: > * 3c22065f Do execGrouping via expression eval > * a9dde4aa Allow tupleslots to have a fixed tupledesc Yes, I plan to resume working on whipping them into shape as soon as I've finished the move to a shared library. This weekend I'm at FOSDEM, so that's going to be after... Thanks for looking! Andres Freund
Hi, On 2018-02-03 01:13:21 -0800, Andres Freund wrote: > On 2018-02-02 18:21:12 -0800, Jeff Davis wrote: > > I think I saw about a 2% gain here over master, but when I applied it > > on top of the fast scans it did not seem to add anything on top of > > fast scans. Seems reproducible, but I don't have an explanation. > > Yea, that makes sense. The primary reason the patch is beneficial is > that it centralizes the place where the HeapTupleHeader is accessed to a > single piece of code (slot_deform_tuple()). In a lot of cases that first > access will result in a cache miss in all layers, requiring a memory > access. In slot_getsomeattrs() there's very little that can be done in > an out-of-order manner, whereas slot_deform_tuple() can continue > execution a bit further. Also, the latter will then go and sequentially > access the rest (or a significant part of) the tuple, so a centralized > access is more prefetchable. Oops, I missed part of the argument here: the reason that isn't as large an effect anymore with the scan order patch applied is that, due to the better scan order, the accesses are suddenly more likely to be cacheable and prefetchable. So in that case the few additional instructions and branches in slot_getsomeattrs/slot_getattr don't hurt as much anymore. IIRC I could still make it show up, but it's a much smaller win. Greetings, Andres Freund
Hi, I've done some initial benchmarking on the branch over the last couple of days, focusing on analytics workloads using the DBT-3 benchmark. Attached are two spreadsheets with results from two machines (the same two I use for all benchmarks), and a couple of charts illustrating the impact of enabling different JIT options. I did the tests with 10GB and 50GB data sets (loading into the database generally increases the size by a factor of 2-3x). So at least on the larger machine the 10GB dataset should be fully in memory. The numbers are medians for 10 consecutive runs of each query, so the data tends to be well cached. In this round of tests I've disabled parallelism. Based on discussion with Andres I've decided to repeat the tests with parallel queries enabled - that's running now, and will take some time to complete. According to the results, most of the DBT-3 queries see slight improvements in the 5-10% range, but the best JIT options vary depending on the query. What surprised me quite a bit is that the improvement is way more significant on the 50GB dataset (on both machines). I had expected the opposite behavior, i.e. that the JIT impact would be more obvious on the small dataset and then diminish as I/O becomes more prominent. Yet that's not the case, apparently. One possible explanation is that on the 50GB data set the queries switch to plans that are more sensitive to the JIT optimizations. A couple of queries saw much more significant improvements - Q1 and Q20 got about 30%-40% faster, and I have no problem believing that other queries may see even more significant benefits. Other queries (Q19 and Q21) saw regressions - for Q19 it's relatively harmless, I think. It's a short query, so the relative slowdown seems somewhat worse than in absolute terms. Not sure what's going on for Q21, though. But I think we'll need to look at the costing model, and try tweaking it to make the right decision in those cases. 
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 02/02/2018 10:48 AM, Pierre Ducroquet wrote: > I have successfully built the JIT branch against LLVM 4.0.1 on Debian testing. > This is not enough for Debian stable (LLVM 3.9 is the latest available there), > but it's a first step. > I've split the patch in four files. The first three fix the build issues, the > last one fixes a runtime issue. > I think they are small enough to not be a burden for you in your developments. > But if you don't want to carry these ifdefs right now, I maintain them in a > branch on a personal git and rebase as frequently as I can. I tested these patches and while the code built for me and passed the test suite on Debian testing I have a weird bug where the very first query fails to JIT while the rest work as they should. I think I need to dig into LLVM's codebase to see what it is, but can you reproduce this bug on your machine? Code to reproduce: SET jit_expressions = true; SET jit_above_cost = 0; SELECT 1; SELECT 1; Output: postgres=# SELECT 1; ERROR: failed to jit module postgres=# SELECT 1; ?column? ---------- 1 (1 row) Config: Version: Your patches applied on top of 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 OS: Debian testing llvm/clang: 4.0.1-8 Andreas
On Sunday, February 4, 2018 12:45:50 AM CET Andreas Karlsson wrote: > On 02/02/2018 10:48 AM, Pierre Ducroquet wrote: > > I have successfully built the JIT branch against LLVM 4.0.1 on Debian > > testing. This is not enough for Debian stable (LLVM 3.9 is the latest > > available there), but it's a first step. > > I've split the patch in four files. The first three fix the build issues, > > the last one fixes a runtime issue. > > I think they are small enough to not be a burden for you in your > > developments. But if you don't want to carry these ifdefs right now, I > > maintain them in a branch on a personal git and rebase as frequently as I > > can. > > I tested these patches and while the code built for me and passed the > test suite on Debian testing I have a weird bug where the very first > query fails to JIT while the rest work as they should. I think I need to > dig into LLVM's codebase to see what it is, but can you reproduce this > bug at your machine? > > Code to reproduce: > > SET jit_expressions = true; > SET jit_above_cost = 0; > SELECT 1; > SELECT 1; > > Output: > > postgres=# SELECT 1; > ERROR: failed to jit module > postgres=# SELECT 1; > ?column? > ---------- > 1 > (1 row) > > Config: > > Version: You patches applied on top of > 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 > OS: Debian testing > llvm/clang: 4.0.1-8 > > Andreas Hi Indeed, thanks for reporting this. I scripted the testing but failed to see it, I forgot to set on_error_stop. I will look into this and fix it. Thanks Pierre
On Sunday, February 4, 2018 12:45:50 AM CET Andreas Karlsson wrote: > On 02/02/2018 10:48 AM, Pierre Ducroquet wrote: > > I have successfully built the JIT branch against LLVM 4.0.1 on Debian > > testing. This is not enough for Debian stable (LLVM 3.9 is the latest > > available there), but it's a first step. > > I've split the patch in four files. The first three fix the build issues, > > the last one fixes a runtime issue. > > I think they are small enough to not be a burden for you in your > > developments. But if you don't want to carry these ifdefs right now, I > > maintain them in a branch on a personal git and rebase as frequently as I > > can. > > I tested these patches and while the code built for me and passed the > test suite on Debian testing I have a weird bug where the very first > query fails to JIT while the rest work as they should. I think I need to > dig into LLVM's codebase to see what it is, but can you reproduce this > bug at your machine? > > Code to reproduce: > > SET jit_expressions = true; > SET jit_above_cost = 0; > SELECT 1; > SELECT 1; > > Output: > > postgres=# SELECT 1; > ERROR: failed to jit module > postgres=# SELECT 1; > ?column? > ---------- > 1 > (1 row) > > Config: > > Version: You patches applied on top of > 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 > OS: Debian testing > llvm/clang: 4.0.1-8 > > Andreas I have fixed the patches, I was wrong on 'guessing' the migration of the API for one function. I have rebuilt the whole patch set. It is still based on 302b7a284d and has been tested with both LLVM 3.9 and 4.0 on Debian testing. Thanks for your feedback !
OK that fixed the issue, but you have a typo in your patch set. diff --git a/src/backend/lib/llvmjit_inline.cpp b/src/backend/lib/llvmjit_inline.cpp index a785261bea..51f38e10d2 100644 --- a/src/backend/lib/llvmjit_inline.cpp +++ b/src/backend/lib/llvmjit_inline.cpp @@ -37,7 +37,7 @@ extern "C" #include <llvm/ADT/StringSet.h> #include <llvm/ADT/StringMap.h> #include <llvm/Analysis/ModuleSummaryAnalysis.h> -#if LLVM_MAJOR_VERSION > 3 +#if LLVM_VERSION_MAJOR > 3 #include <llvm/Bitcode/BitcodeReader.h> #else #include "llvm/Bitcode/ReaderWriter.h" Also I get some warning. Not sure if they are from your patches or from Andres's. llvmjit_error.cpp:118:1: warning: unused function 'fatal_llvm_new_handler' [-Wunused-function] fatal_llvm_new_handler(void *user_data, ^ 1 warning generated. llvmjit_inline.cpp:114:6: warning: no previous prototype for function 'operator!' [-Wmissing-prototypes] bool operator!(const llvm::ValueInfo &vi) { ^ 1 warning generated. psqlscanslash.l: In function ‘psql_scan_slash_option’: psqlscanslash.l:550:8: warning: variable ‘lexresult’ set but not used [-Wunused-but-set-variable] int final_state; ^~~~~~~~~ Andreas On 02/05/2018 11:39 AM, Pierre Ducroquet wrote: > On Sunday, February 4, 2018 12:45:50 AM CET Andreas Karlsson wrote: >> On 02/02/2018 10:48 AM, Pierre Ducroquet wrote: >>> I have successfully built the JIT branch against LLVM 4.0.1 on Debian >>> testing. This is not enough for Debian stable (LLVM 3.9 is the latest >>> available there), but it's a first step. >>> I've split the patch in four files. The first three fix the build issues, >>> the last one fixes a runtime issue. >>> I think they are small enough to not be a burden for you in your >>> developments. But if you don't want to carry these ifdefs right now, I >>> maintain them in a branch on a personal git and rebase as frequently as I >>> can. 
>> >> I tested these patches and while the code built for me and passed the >> test suite on Debian testing I have a weird bug where the very first >> query fails to JIT while the rest work as they should. I think I need to >> dig into LLVM's codebase to see what it is, but can you reproduce this >> bug at your machine? >> >> Code to reproduce: >> >> SET jit_expressions = true; >> SET jit_above_cost = 0; >> SELECT 1; >> SELECT 1; >> >> Output: >> >> postgres=# SELECT 1; >> ERROR: failed to jit module >> postgres=# SELECT 1; >> ?column? >> ---------- >> 1 >> (1 row) >> >> Config: >> >> Version: You patches applied on top of >> 302b7a284d30fb0e00eb5f0163aa933d4d9bea10 >> OS: Debian testing >> llvm/clang: 4.0.1-8 >> >> Andreas > > > I have fixed the patches, I was wrong on 'guessing' the migration of the API > for one function. > I have rebuilt the whole patch set. It is still based on 302b7a284d and has > been tested with both LLVM 3.9 and 4.0 on Debian testing. > > Thanks for your feedback ! >
On Monday, February 5, 2018 10:20:27 PM CET Andreas Karlsson wrote: > OK that fixed the issue, but you have a typo in your patch set. > > diff --git a/src/backend/lib/llvmjit_inline.cpp > b/src/backend/lib/llvmjit_inline.cpp > index a785261bea..51f38e10d2 100644 > --- a/src/backend/lib/llvmjit_inline.cpp > +++ b/src/backend/lib/llvmjit_inline.cpp > @@ -37,7 +37,7 @@ extern "C" > #include <llvm/ADT/StringSet.h> > #include <llvm/ADT/StringMap.h> > #include <llvm/Analysis/ModuleSummaryAnalysis.h> > -#if LLVM_MAJOR_VERSION > 3 > +#if LLVM_VERSION_MAJOR > 3 > #include <llvm/Bitcode/BitcodeReader.h> > #else > #include "llvm/Bitcode/ReaderWriter.h" Thanks, it's weird I had no issue with it. I will fix in the next patch set. > Also I get some warning. Not sure if they are from your patches or from > Andres's. > > llvmjit_error.cpp:118:1: warning: unused function > 'fatal_llvm_new_handler' [-Wunused-function] > fatal_llvm_new_handler(void *user_data, > ^ > 1 warning generated. > llvmjit_inline.cpp:114:6: warning: no previous prototype for function > 'operator!' [-Wmissing-prototypes] > bool operator!(const llvm::ValueInfo &vi) { > ^ > 1 warning generated. Both are mine, I knew about the first one, but I did not see the second one. I will fix them too, thanks for the review! > psqlscanslash.l: In function ‘psql_scan_slash_option’: > psqlscanslash.l:550:8: warning: variable ‘lexresult’ set but not used > [-Wunused-but-set-variable] > int final_state; > ^~~~~~~~~ I'm not sure Andres's patches have anything to do with psql, it's surprising.
Hi, On 02/03/2018 01:05 PM, Tomas Vondra wrote: > Hi, > > ... > > In this round of tests I've disabled parallelism. Based on > discussion with Andres I've decided to repeat the tests with parallel > queries enabled - that's running now, and will take some time to > complete. > And here are the results with parallelism enabled - same machines, but with max_parallel_workers_per_gather > 0. Based on discussions and Andres' FOSDEM talk I somehow expected more significant JIT benefits in the parallel case, but the results are pretty much exactly the same (modulo speedup thanks to parallelism, of course). In fact, the JIT impact is much noisier with parallelism enabled, for some reason, with regressions where there were no measurable regressions before (particularly for the 10GB case). That is not to say we shouldn't be doing JIT, or that Andres did not observe the speedups/benefits he mentioned during the talk - I have no trouble believing it depends on queries, and DBT-3 may not match that. I don't plan doing any further benchmarks on this patch series unless someone requests that (possibly with ideas what to focus on). I'll keep looking at the patch, of course. I've seen some build issues, so I'll try finding more details. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, I've pushed v10.0. The big (and pretty painful to make) change is that now all the LLVM-specific code lives in src/backend/jit/llvm, which is built as a shared library which is loaded on demand. The layout is now as follows: src/backend/jit/jit.c: Part of JITing always linked into the server. Supports loading the LLVM-using JIT library. src/backend/jit/llvm/ Infrastructure: llvmjit.c: General code generation and optimization infrastructure llvmjit_error.cpp, llvmjit_wrap.cpp: Error / backward compat wrappers llvmjit_inline.cpp: Cross module inlining support Code-Gen: llvmjit_expr.c Expression compilation llvmjit_deform.c Deform compilation I generally like how this shaped up. There's a good amount of followup cleanup needed, but I'd appreciate some early feedback. I've also rebased onto a recent master version. postgres[21915][1]=# SELECT pg_llvmjit_available(); ┌──────────────────────┐ │ pg_llvmjit_available │ ├──────────────────────┤ │ t │ └──────────────────────┘ (1 row) make -C src/backend/jit/llvm/ uninstall postgres[21915][1]=# \c You are now connected to database "postgres" as user "andres". postgres[21922][1]=# SELECT pg_llvmjit_available(); ┌──────────────────────┐ │ pg_llvmjit_available │ ├──────────────────────┤ │ f │ └──────────────────────┘ (1 row) Yeha ;) Greetings, Andres Freund
On Wednesday, February 7, 2018 3:54:05 PM CET Andres Freund wrote: > Hi, > > I've pushed v10.0. The big (and pretty painful to make) change is that > now all the LLVM specific code lives in src/backend/jit/llvm, which is > built as a shared library which is loaded on demand. > > The layout is now as follows: > > src/backend/jit/jit.c: > Part of JITing always linked into the server. Supports loading the > LLVM using JIT library. > > src/backend/jit/llvm/ > Infrastructure: > llvmjit.c: > General code generation and optimization infrastructure > llvmjit_error.cpp, llvmjit_wrap.cpp: > Error / backward compat wrappers > llvmjit_inline.cpp: > Cross module inlining support > Code-Gen: > llvmjit_expr.c > Expression compilation > llvmjit_deform.c > Deform compilation > > I generally like how this shaped out. There's a good amount of followup > cleanup needed, but I'd appreciate some early feedback. Hi I also find it more readable and it looks cleaner; insane guys could even write their own JIT engines for PostgreSQL by patching a single file :) Since it's now in its own .so file, does it still make as much sense to use mostly the LLVM C API? I'll really look into the jit code itself later; right now I've just rebased my previous patches and did a quick check that everything works for LLVM4 and 3.9. I included a small addition to the gitignore file; I'm surprised you were not bothered by the various .bc files generated. Anyway, great work, and I look forward to exploring the code :) Pierre
Attachment
- 0001-Add-support-for-LLVM4-in-llvmjit.c.patch
- 0002-Add-LLVM4-support-in-llvmjit_error.cpp.patch
- 0003-Add-LLVM4-support-in-llvmjit_inline.cpp.patch
- 0004-Don-t-emit-bitcode-depending-on-an-LLVM-5-function.patch
- 0005-Fix-warning.patch
- 0006-Ignore-LLVM-.bc-files.patch
- 0007-Fix-building-with-LLVM-3.9.patch
- 0008-Fix-segfault-with-LLVM-3.9.patch
Hi, On 2018-02-07 20:35:12 +0100, Pierre Ducroquet wrote: > I also find it more readable and it looks cleaner, insane guys could be able > to write their own JIT engines for PostgreSQL by patching a single > file :) Right - we could easily make the libname configurable if requested. > Since it's now in its own .so file, does it still make as much sense using > mostly the LLVM C API ? Yes, I definitely want to continue that. For one the C API is a *lot* more stable, for another postgres is C. > I included a small addition to the gitignore file, I'm surprised you were not > bothered by the various .bc files generated. I use a VPATH build (i.e. source code is in a different directory than the build products), so I do not really see that. But yes, it makes sense to add ignores.... Thanks for looking, Andres Freund
On Thu, Feb 8, 2018 at 3:54 AM, Andres Freund <andres@anarazel.de> wrote: > I've pushed v10.0. The big (and pretty painful to make) change is that > now all the LLVM specific code lives in src/backend/jit/llvm, which is > built as a shared library which is loaded on demand. > > The layout is now as follows: > > src/backend/jit/jit.c: > Part of JITing always linked into the server. Supports loading the > LLVM using JIT library. > > src/backend/jit/llvm/ > Infrastructure: > llvmjit.c: > General code generation and optimization infrastructure > llvmjit_error.cpp, llvmjit_wrap.cpp: > Error / backward compat wrappers > llvmjit_inline.cpp: > Cross module inlining support > Code-Gen: > llvmjit_expr.c > Expression compilation > llvmjit_deform.c > Deform compilation You are asking LLVM to dlopen(""), which doesn't work on my not-Linux, explaining the errors I reported in the older thread. The portable way to dlopen the main binary is dlopen(NULL), so I think you need to pass NULL in to LLVMLoadLibraryPermanently(), even though that isn't really clear from any LLVM documentation I've looked at. diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c index a1bc6449f7..874bddf81e 100644 --- a/src/backend/jit/llvm/llvmjit.c +++ b/src/backend/jit/llvm/llvmjit.c @@ -443,7 +443,7 @@ llvm_session_initialize(void) cpu = NULL; /* force symbols in main binary to be loaded */ - LLVMLoadLibraryPermanently(""); + LLVMLoadLibraryPermanently(NULL); llvm_opt0_orc = LLVMOrcCreateInstance(llvm_opt0_targetmachine); llvm_opt3_orc = LLVMOrcCreateInstance(llvm_opt3_targetmachine); -- Thomas Munro http://www.enterprisedb.com
On 2018-02-08 11:50:17 +1300, Thomas Munro wrote: > You are asking LLVM to dlopen(""), which doesn't work on my not-Linux, > explaining the errors I reported in the older thread. The portable > way to dlopen the main binary is dlopen(NULL), so I think you need to > pass NULL in to LLVMLoadLibraryPermanently(), even though that isn't > really clear from any LLVM documentation I've looked at. Ugh. Thanks for figuring that out, will incorporate! Greetings, Andres Freund
On 02/07/2018 03:54 PM, Andres Freund wrote: > I've pushed v10.0. The big (and pretty painful to make) change is that > now all the LLVM specific code lives in src/backend/jit/llvm, which is > built as a shared library which is loaded on demand. It does not seem to be possible to build without LLVM anymore. Error: In file included from planner.c:32:0: ../../../../src/include/jit/llvmjit.h:13:10: fatal error: llvm-c/Types.h: No such file or directory #include <llvm-c/Types.h> ^~~~~~~~~~~~~~~~ Options: ./configure --prefix=/home/andreas/dev/postgresql-inst --enable-tap-tests --enable-cassert --enable-debug I also noticed the following typo: diff --git a/configure.in b/configure.in index b035966c0a..b89c4a138a 100644 --- a/configure.in +++ b/configure.in @@ -499,7 +499,7 @@ fi if test "$enable_coverage" = yes; then if test "$GCC" = yes; then CFLAGS="$CFLAGS -fprofile-arcs -ftest-coverage" - CFLAGS="$CXXFLAGS -fprofile-arcs -ftest-coverage" + CXXFLAGS="$CXXFLAGS -fprofile-arcs -ftest-coverage" else AC_MSG_ERROR([--enable-coverage is supported only when using GCC]) fi Andreas
> On 8 February 2018 at 10:29, Andreas Karlsson <andreas@proxel.se> wrote: >> On 02/07/2018 03:54 PM, Andres Freund wrote: >> >> I've pushed v10.0. The big (and pretty painful to make) change is that >> now all the LLVM specific code lives in src/backend/jit/llvm, which is >> built as a shared library which is loaded on demand. > > > It does not seem to be possible build without LLVM anymore. Maybe I'm doing something wrong, but I also see some issues during compilation even with llvm included (with gcc 5.4.0 and 7.1.0). Is it expected, do I need to use another version to check it out? $ git rev-parse HEAD e24cac5951575cf86f138080acec663a0a05983e $ ./configure --prefix=/build/postgres-jit/ --with-llvm --enable-debug --enable-depend --enable-cassert In file included from llvmjit_error.cpp:22:0: /usr/lib/llvm-5.0/include/llvm/Support/ErrorHandling.h:47:36: warning: identifier 'nullptr' is a keyword in C++11 [-Wc++0x-compat] void *user_data = nullptr); ^ In file included from /usr/include/c++/5/cinttypes:35:0, from /usr/lib/llvm-5.0/include/llvm/Support/DataTypes.h:39, from /usr/lib/llvm-5.0/include/llvm-c/Types.h:17, from ../../../../src/include/jit/llvmjit.h:13, from llvmjit_error.cpp:24: /usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options. 
#error This file requires compiler and library support \ ^ In file included from llvmjit_error.cpp:22:0: /usr/lib/llvm-5.0/include/llvm/Support/ErrorHandling.h:47:54: error: 'nullptr' was not declared in this scope void *user_data = nullptr); ^ /usr/lib/llvm-5.0/include/llvm/Support/ErrorHandling.h:57:56: error: 'nullptr' was not declared in this scope void *user_data = nullptr) { ^ /usr/lib/llvm-5.0/include/llvm/Support/ErrorHandling.h:98:56: error: 'nullptr' was not declared in this scope void *user_data = nullptr); ^ /usr/lib/llvm-5.0/include/llvm/Support/ErrorHandling.h:121:45: error: 'nullptr' was not declared in this scope llvm_unreachable_internal(const char *msg = nullptr, const char *file = nullptr, ^ /usr/lib/llvm-5.0/include/llvm/Support/ErrorHandling.h:121:73: error: 'nullptr' was not declared in this scope llvm_unreachable_internal(const char *msg = nullptr, const char *file = nullptr, ^ ../../../../src/Makefile.global:838: recipe for target 'llvmjit_error.o' failed make[2]: *** [llvmjit_error.o] Error 1 make[2]: Leaving directory '/postgres/src/backend/jit/llvm' Makefile:42: recipe for target 'all-backend/jit/llvm-recurse' failed make[1]: *** [all-backend/jit/llvm-recurse] Error 2 make[1]: Leaving directory '/postgres/src' GNUmakefile:11: recipe for target 'all-src-recurse' failed make: *** [all-src-recurse] Error 2
On 2018-02-08 15:14:42 +0100, Dmitry Dolgov wrote: > > On 8 February 2018 at 10:29, Andreas Karlsson <andreas@proxel.se> wrote: > >> On 02/07/2018 03:54 PM, Andres Freund wrote: > >> > >> I've pushed v10.0. The big (and pretty painful to make) change is that > >> now all the LLVM specific code lives in src/backend/jit/llvm, which is > >> built as a shared library which is loaded on demand. > > > > > > It does not seem to be possible build without LLVM anymore. Yea, wrong header included. Will fix. > Maybe I'm doing something wrong, but I also see some issues during compilation > even with llvm included (with gcc 5.4.0 and 7.1.0). Is it expected, do I need > to use another version to check it out? > > $ git rev-parse HEAD > e24cac5951575cf86f138080acec663a0a05983e > > $ ./configure --prefix=/build/postgres-jit/ --with-llvm > --enable-debug --enable-depend --enable-cassert Seems you need to provide a decent C++ compiler (via CXX=... to configure). Will test that it actually works with a recent clang. Greetings, Andres Freund
On Fri, Feb 9, 2018 at 3:14 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote: > $ ./configure --prefix=/build/postgres-jit/ --with-llvm > --enable-debug --enable-depend --enable-cassert > /usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This > file requires compiler and library support for the ISO C++ 2011 > standard. This support must be enabled with the -std=c++11 or > -std=gnu++11 compiler options. Did you try passing CXXFLAGS="-std=c++11" to configure? -- Thomas Munro http://www.enterprisedb.com
On Thu, Feb 1, 2018 at 8:16 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Fri, Feb 2, 2018 at 2:05 PM, Andres Freund <andres@anarazel.de> wrote: >> On 2018-02-01 09:32:17 -0800, Jeff Davis wrote: >>> On Wed, Jan 31, 2018 at 12:03 AM, Konstantin Knizhnik >>> <k.knizhnik@postgrespro.ru> wrote: >>> > The same problem takes place with old versions of GCC: I have to upgrade GCC >>> > to 7.2 to make it possible to compile this code. >>> > The problem in not in compiler itself, but in libc++ headers. >>> >>> How can I get this branch to compile on ubuntu 16.04? I have llvm-5.0 >>> and gcc-5.4 installed. Do I need to compile with clang or gcc? Any >>> CXXFLAGS required? >> >> Just to understand: You're running in the issue with the header being >> included from within the extern "C" {}? Hm, I've pushed a quick fix for >> that. > > That change wasn't quite enough: to get this building against libc++ > (Clang's native stdlb) I also needed this change to llvmjit.h so that > <llvm-c/Types.h> wouldn't be included with the wrong linkage (perhaps > you can find a less ugly way): > > +#ifdef __cplusplus > +} > +#endif > #include <llvm-c/Types.h> > +#ifdef __cplusplus > +extern "C" > +{ > +#endif This did the trick -- thanks. Sitting through 20 minute computer crashing link times really brings back C++ nightmares -- if anyone else needs to compile llvm/clang as I did (I'm stuck on 3.2 with my aging mint box), I strongly encourage you to use the gold linker. 
Question: when watching the compilation log, I see quite a few files being compiled with both O2 and O1, for example: clang -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 -O1 -Wno-ignored-attributes -Wno-unknown-warning-option -Wno-ignored-optimization-argument -I../../../../src/include -D_GNU_SOURCE -I/home/mmoncure/llvm/include -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -flto=thin -emit-llvm -c -o nbtsort.bc nbtsort.c Is this intentional? (didn't check standard compilation, it just jumped out). merlin
On 2018-02-09 09:10:25 -0600, Merlin Moncure wrote: > Question: when watching the compilation log, I see quite a few files > being compiled with both O2 and O1, for example: > > clang -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing > -fwrapv -Wno-unused-command-line-argument -O2 -O1 > -Wno-ignored-attributes -Wno-unknown-warning-option > -Wno-ignored-optimization-argument -I../../../../src/include > -D_GNU_SOURCE -I/home/mmoncure/llvm/include -DLLVM_BUILD_GLOBAL_ISEL > -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS > -D__STDC_LIMIT_MACROS -flto=thin -emit-llvm -c -o nbtsort.bc > nbtsort.c > > Is this intentional? (didn't check standard compilation, it just jumped out). It stems from the following hunk in Makefile.global.in about emitting bitcode: # Add -O1 to the options as clang otherwise will emit 'noinline' # attributes everywhere, making JIT inlining impossible to test in a # debugging build. # # FIXME: While LLVM will re-optimize when emitting code (after # inlining), it'd be better to only do this if -O0 is specified. %.bc : CFLAGS +=-O1 %.bc : %.c $(COMPILE.c.bc) -o $@ $< Inspecting the clang source code, it's impossible to stop clang from emitting noinline attributes for every function on -O0. I think it makes sense to change this to filtering out -O0 and only adding -O1 if that's not present. :/ Greetings, Andres Freund
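The change Andres sketches could look roughly like the following (an untested Makefile fragment; the BITCODE_CFLAGS variable name is an assumption for illustration): when the build is at -O0, swap that flag for -O1 for bitcode emission only, and otherwise leave the optimization flags alone instead of unconditionally appending -O1.

```make
# Hypothetical replacement for the "%.bc : CFLAGS +=-O1" hunk quoted
# above: clang at -O0 marks every function 'noinline', which would
# defeat JIT inlining, so bump -O0 to -O1 for the bitcode rule only.
ifneq ($(filter -O0,$(CFLAGS)),)
BITCODE_CFLAGS = $(filter-out -O0,$(CFLAGS)) -O1
else
BITCODE_CFLAGS = $(CFLAGS)
endif

%.bc : CFLAGS = $(BITCODE_CFLAGS)
%.bc : %.c
	$(COMPILE.c.bc) -o $@ $<
```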
> On 8 February 2018 at 21:26, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Fri, Feb 9, 2018 at 3:14 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote: >> $ ./configure --prefix=/build/postgres-jit/ --with-llvm >> --enable-debug --enable-depend --enable-cassert > >> /usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This >> file requires compiler and library support for the ISO C++ 2011 >> standard. This support must be enabled with the -std=c++11 or >> -std=gnu++11 compiler options. > > Did you try passing CXXFLAGS="-std=c++11" to configure? Yes, it solved the issue, thanks.
On Wed, Jan 31, 2018 at 8:53 AM, Robert Haas <robertmhaas@gmail.com> wrote: > As far as the second one, looking back at what happened with parallel > query, I found (on a quick read) 13 back-patched commits in > REL9_6_STABLE prior to the release of 10.0, 3 of which I would qualify > as low-importance (improving documentation, fixing something that's > not really a bug, improving a test case). A couple of those were > really stupid mistakes on my part. On the other hand, would it have > been overall worse for our users if that feature had been turned on in > 9.6? I don't know. They would have had those bugs (at least until we > fixed them) but they would have had parallel query, too. It's hard > for me to judge whether that was a win or a loss, and so here. Like > parallel query, this is a feature which seems to have a low risk of > data corruption, but a fairly high risk of wrong answers to queries > and/or strange errors. Users don't like that. On the other hand, > also like parallel query, if you've got the right kind of queries, it > can make them go a lot faster. Users DO like that. As a data point, I can tell you that Heroku enabled parallel query for 9.6 immediately, and it turned out fine. The first version available as stable was probably 9.6.3 -- there or thereabouts. There were some bugs, of course, but not to the extent that 9.6 was looked upon as being more buggy than the average Postgres release. -- Peter Geoghegan
On Thu, Jan 25, 2018 at 9:40 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > As far as I understand generation of native code is now always done for all > supported expressions and individually by each backend. > I wonder it will be useful to do more efforts to understand when compilation > to native code should be done and when interpretation is better. > For example many JIT-able languages like Lua are using traces, i.e. query is > first interpreted and trace is generated. If the same trace is followed > more than N times, then native code is generated for it. > > In context of DBMS executor it is obvious that only frequently executed or > expensive queries have to be compiled. > So we can use estimated plan cost and number of query executions as simple > criteria for JIT-ing the query. > May be compilation of simple queries (with small cost) should be done only > for prepared statements... > > Another question is whether it is sensible to redundantly do expensive work > (llvm compilation) in all backends. > This question refers to shared prepared statement cache. But even without > such cache, it seems to be possible to use for library name some signature > of the compiled expression and allow > to share this libraries between backends. So before starting code > generation, ExecReadyCompiledExpr can first build signature and check if > correspondent library is already present. > Also it will be easier to control space used by compiled libraries in this > case. Totally agree; these considerations are very important. I tested several queries in my application that had >30 second compile times against a one second run time,. Not being able to manage when compilation happens is making it difficult to get a sense of llvm performance in the general case. Having explain analyze print compile time and being able to prepare llvm compiled queries ought to help measurement and tuning. 
There may be utility here beyond large analytical queries as the ability to optimize spreads through the executor with the right trade off management. This work is very exciting...thank you. merlin
On Sun, Feb 11, 2018 at 10:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > I tested several queries in my application that had >30 second compile > times against a one second run time,. Not being able to manage when > compilation happens is making it difficult to get a sense of llvm > performance in the general case. In theory, the GUCs Andres has added to only compile if the estimated total cost is above some threshold is supposed to help with this. But if the compile time and the cost don't correlate, then we've got trouble. How did you manage to create an expression that took 30 seconds to compile? It doesn't take that long to compile a 5000-line C file. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2018-02-13 13:43:40 -0500, Robert Haas wrote: > On Sun, Feb 11, 2018 at 10:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > > I tested several queries in my application that had >30 second compile > > times against a one second run time,. Not being able to manage when > > compilation happens is making it difficult to get a sense of llvm > > performance in the general case. > > In theory, the GUCs Andres has added to only compile if the estimated > total cost is above some threshold is supposed to help with this. Note that the GUCs as posted are set *way* too low, they're currently toy thresholds. That's easier for testing, but I guess I should set them to something better. It's not unrealistic to expect them to be insufficient however - the overhead roughly linearly grows with the number of expressions, which might not reflect the gain equally. > How did you manage to create an expression that took 30 seconds to > compile? It doesn't take that long to compile a 5000-line C file. Any chance a debug build of LLVM was used? The overhead of that are easily an order of magnitude or more. Greetings, Andres Freund
Hi, On 2018-02-07 06:54:05 -0800, Andres Freund wrote: > I've pushed v10.0. The big (and pretty painful to make) change is that > now all the LLVM specific code lives in src/backend/jit/llvm, which is > built as a shared library which is loaded on demand. > > The layout is now as follows: > > src/backend/jit/jit.c: > Part of JITing always linked into the server. Supports loading the > LLVM using JIT library. > > src/backend/jit/llvm/ > Infrastructure: > llvmjit.c: > General code generation and optimization infrastructure > llvmjit_error.cpp, llvmjit_wrap.cpp: > Error / backward compat wrappers > llvmjit_inline.cpp: > Cross module inlining support > Code-Gen: > llvmjit_expr.c > Expression compilation > llvmjit_deform.c > Deform compilation I've pushed a revised version that hopefully should address Jeff's wish/need of being able to experiment with this out of core. There's now a "jit_provider" PGC_POSTMASTER GUC that's by default set to "llvmjit". llvmjit.so is the .so implementing JIT using LLVM. It fills a set of callbacks via extern void _PG_jit_provider_init(JitProviderCallbacks *cb); which can also be implemented by any other potential provider. The other two biggest changes are that I've added a README https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/jit/README;hb=jit and that I've revised the configure support so it does more error checks, and moved it into config/llvm.m4. There's a larger smattering of small changes too. I'm pretty happy with how the separation of core / shlib looks now. I'm planning to work on cleaning and then pushing some of the preliminary patches (fixed tupledesc, grouping) over the next few days. Greetings, Andres Freund
On Wednesday, February 14, 2018 7:17:10 PM CET Andres Freund wrote: > Hi, > > On 2018-02-07 06:54:05 -0800, Andres Freund wrote: > > I've pushed v10.0. The big (and pretty painful to make) change is that > > now all the LLVM specific code lives in src/backend/jit/llvm, which is > > built as a shared library which is loaded on demand. > > > > The layout is now as follows: > > > > src/backend/jit/jit.c: > > Part of JITing always linked into the server. Supports loading the > > LLVM using JIT library. > > > > src/backend/jit/llvm/ > > > > Infrastructure: > > llvmjit.c: > > General code generation and optimization infrastructure > > > > llvmjit_error.cpp, llvmjit_wrap.cpp: > > Error / backward compat wrappers > > > > llvmjit_inline.cpp: > > Cross module inlining support > > > > Code-Gen: > > llvmjit_expr.c > > > > Expression compilation > > > > llvmjit_deform.c > > > > Deform compilation > > I've pushed a revised version that hopefully should address Jeff's > wish/need of being able to experiment with this out of core. There's now > a "jit_provider" PGC_POSTMASTER GUC that's by default set to > "llvmjit". llvmjit.so is the .so implementing JIT using LLVM. It fills a > set of callbacks via > extern void _PG_jit_provider_init(JitProviderCallbacks *cb); > which can also be implemented by any other potential provider. > > The other two biggest changes are that I've added a README > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob; > f=src/backend/jit/README;hb=jit and that I've revised the configure support > so it does more error > checks, and moved it into config/llvm.m4. > > There's a larger smattering of small changes too. > > I'm pretty happy with how the separation of core / shlib looks now. I'm > planning to work on cleaning and then pushing some of the preliminary > patches (fixed tupledesc, grouping) over the next few days. > > Greetings, > > Andres Freund Hi Here are the LLVM4 and LLVM3.9 compatibility patches. 
Successfully built, and executed some silly queries with JIT forced to make sure it worked. Pierre
Attachment
- 0001-Add-support-for-LLVM4-in-llvmjit.c.patch
- 0002-Add-LLVM4-support-in-llvmjit_error.cpp.patch
- 0003-Add-LLVM4-support-in-llvmjit_inline.cpp.patch
- 0004-Don-t-emit-bitcode-depending-on-an-LLVM-5-function.patch
- 0006-Ignore-LLVM-.bc-files.patch
- 0005-Fix-warning.patch
- 0007-Fix-building-with-LLVM-3.9.patch
- 0008-Fix-segfault-with-LLVM-3.9.patch
Hi, On 2018-02-14 23:32:17 +0100, Pierre Ducroquet wrote: > Here are the LLVM4 and LLVM3.9 compatibility patches. > Successfully built, and executed some silly queries with JIT forced to make > sure it worked. Thanks! I'm going to integrate them into my series in the next few days. Regards, Andres
On 14.02.2018 21:17, Andres Freund wrote:
Hi, On 2018-02-07 06:54:05 -0800, Andres Freund wrote: I've pushed v10.0. The big (and pretty painful to make) change is that now all the LLVM specific code lives in src/backend/jit/llvm, which is built as a shared library which is loaded on demand. The layout is now as follows: src/backend/jit/jit.c: Part of JITing always linked into the server. Supports loading the LLVM using JIT library. src/backend/jit/llvm/ Infrastructure: llvmjit.c: General code generation and optimization infrastructure llvmjit_error.cpp, llvmjit_wrap.cpp: Error / backward compat wrappers llvmjit_inline.cpp: Cross module inlining support Code-Gen: llvmjit_expr.c Expression compilation llvmjit_deform.c Deform compilation I've pushed a revised version that hopefully should address Jeff's wish/need of being able to experiment with this out of core. There's now a "jit_provider" PGC_POSTMASTER GUC that's by default set to "llvmjit". llvmjit.so is the .so implementing JIT using LLVM. It fills a set of callbacks via extern void _PG_jit_provider_init(JitProviderCallbacks *cb); which can also be implemented by any other potential provider. The other two biggest changes are that I've added a README https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/jit/README;hb=jit and that I've revised the configure support so it does more error checks, and moved it into config/llvm.m4. There's a larger smattering of small changes too. I'm pretty happy with how the separation of core / shlib looks now. I'm planning to work on cleaning and then pushing some of the preliminary patches (fixed tupledesc, grouping) over the next few days. Greetings, Andres Freund
I have done some more experiments on the efficiency of JIT-ing tuple deforming and I want to share the results (I hope they will be interesting).
It is a well-known fact that, for sequential scan queries over warm data, Postgres spends a significant part of its time deforming tuples (17% in the case of TPC-H Q1).
Postgres tries to optimize tuple access by caching fixed-size offsets to the fields whenever possible and loading attributes on demand.
It is also a well-known recommendation to put fixed-size, non-null, frequently used attributes at the beginning of a table's attribute list to make this optimization work more efficiently.
You can see in the code of heap_deform_tuple that the first NULL value switches it to "slow" mode:
for (attnum = 0; attnum < natts; attnum++)
{
    Form_pg_attribute thisatt = TupleDescAttr(tupleDesc, attnum);

    if (hasnulls && att_isnull(attnum, bp))
    {
        values[attnum] = (Datum) 0;
        isnull[attnum] = true;
        slow = true;        /* can't use attcacheoff anymore */
        continue;
    }
I tried to investigate the importance of this optimization and the actual penalty of "slow" mode.
At the same time I wanted to understand how JIT helps to speed up tuple deforming.
I populated three tables with data:
create table t1(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
create table t2(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
create table t3(id integer primary key,c1 integer not null,c2 integer not null,c3 integer not null,c4 integer not null,c5 integer not null,c6 integer not null,c7 integer not null,c8 integer not null,c9 integer not null);
insert into t1 (id,c1,c2,c3,c4,c5,c6,c7,c8) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
insert into t2 (id,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
insert into t3 (id,c1,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0,0);
vacuum analyze t1;
vacuum analyze t2;
vacuum analyze t3;
t1 contains NULL in the last column (c9), t2 in the first column (c1), and t3 has all attributes declared as NOT NULL (so JIT can use this knowledge to generate more efficient deforming code).
The whole data set is held in memory (shared buffers are larger than the database) and I intentionally switched off parallel execution to make the results more deterministic.
I ran two queries, calculating aggregates on one / all of the non-null fields:
select sum(c8) from t*;
select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t*;
As expected, 35% of the time was spent in heap_deform_tuple.
But the results (msec) were somewhat confusing and unexpected:
select sum(c8) from t*;
     | w/o JIT | with JIT |
  t1 |     763 |      563 |
  t2 |     772 |      570 |
  t3 |     776 |      592 |
select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t*;
     | w/o JIT | with JIT |
  t1 |    1239 |      742 |
  t2 |    1233 |      747 |
  t3 |    1255 |      803 |
I repeated each query 10 times and took the minimal time (I think that is more meaningful than the average time, which depends on other activity in the system).
So there is no big difference between the "slow" and "fast" ways of deforming a tuple.
Moreover, sometimes the "slow" case is even faster, although I have to say that the variance of the results is quite large: about 10%.
But in any case, I can draw two conclusions from these results:
1. Modern platforms are mostly limited by memory access time; the number of executed instructions is less critical.
This is why the extra processing needed for nullable attributes does not significantly affect performance.
2. For a large number of attributes, JIT-ing of tuple deforming can improve speed by up to two times, which is quite a good result from my point of view.
--
Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 02/05/2018 10:44 PM, Pierre Ducroquet wrote: >> psqlscanslash.l: In function ‘psql_scan_slash_option’: >> psqlscanslash.l:550:8: warning: variable ‘lexresult’ set but not used >> [-Wunused-but-set-variable] >> int final_state; >> ^~~~~~~~~ > > I'm not sure Andres's patches have anything to do with psql, it's surprising. I managed to track down the bug and apparently when building with --with-llvm the -DNDEBUG option is added to CPPFLAGS, but I am not entirely sure what the code in config/llvm.m4 is trying to do in the first place. The two issues I see with what the code does are: 1) Why does config/llvm.m4 modify CPPFLAGS? That affects the building of the binaries too which may be done with gcc like in my case. Shouldn't it use a LLVM_CPPFLAGS or something? 2) When I build with --with-cassert I expect the assertions to be there, both in the binaries and the bitcode. Is that just a bug or is there any thought behind this? Below is the diff in src/Makefile.global between when I run configure with --with-llvm or not. diff src/Makefile.global-nollvm src/Makefile.global-llvm 78c78 < configure_args = '--prefix=/home/andreas/dev/postgresql-inst' '--enable-tap-tests' '--enable-cassert' '--enable-debug' --- > configure_args = '--prefix=/home/andreas/dev/postgresql-inst' '--enable-tap-tests' '--enable-cassert' '--enable-debug' '--with-llvm' 190c190 < with_llvm = no --- > with_llvm = yes 227,229c227,229 < LLVM_CONFIG = < LLVM_BINPATH = < CLANG = --- > LLVM_CONFIG = /usr/bin/llvm-config > LLVM_BINPATH = /usr/lib/llvm-4.0/bin > CLANG = /usr/bin/clang 238c238 < CPPFLAGS = -D_GNU_SOURCE --- > CPPFLAGS = -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -DNDEBUG -I/usr/lib/llvm-4.0/include -D_GNU_SOURCE 261c261 < LLVM_CXXFLAGS = --- > LLVM_CXXFLAGS = -std=c++0x -std=c++11 -fno-exceptions 283c283 < LLVM_LIBS= --- > LLVM_LIBS= -lLLVM-4.0 297c297 < LDFLAGS += -Wl,--as-needed --- > LDFLAGS += -L/usr/lib/llvm-4.0/lib -Wl,--as-needed
Hi, On 2018-02-15 12:54:34 +0100, Andreas Karlsson wrote: > 1) Why does config/llvm.m4 modify CPPFLAGS? That affects the building of the > binaries too which may be done with gcc like in my case. Shouldn't it use a > LLVM_CPPFLAGS or something? Well, most of the time cppflags just are things like additional include directories. And the established precedent is to just add those to the global cppflags (c.f. libxml stuff in configure in). I've no problem changing this, I just followed established practice. > 2) When I build with --with-cassert I expect the assertions to be there, > both in the binaries and the bitcode. Is that just a bug or is there any > thought behind this? Not sure what you mean by that. NDEBUG and cassert are independent mechanisms, no? Greetings, Andres Freund
Hi, On 2018-02-15 11:59:46 +0300, Konstantin Knizhnik wrote: > It is well known fact that Postgres spends most of the time in sequence scan > queries for warm data in deforming tuples (17% in case of TPC-H Q1). I think that the majority of the time therein is not actually bottlenecked by CPU, but by cache misses. It might be worthwhile to repeat your analysis with the last patch of my series applied, and the #define FASTORDER uncommented. > Postgres tries to optimize access to the tuple by caching fixed size > offsets to the fields whenever possible and loading attributes on demand. > It is also well know recommendation to put fixed size, non-null, frequently > used attributes at the beginning of table's attribute list to make this > optimization work more efficiently. FWIW, I think this optimization causes vastly more trouble than it's worth. > You can see in the code of heap_deform_tuple shows that first NULL value > will switch it to "slow" mode: Note that in most workloads the relevant codepath isn't heap_deform_tuple but slot_deform_tuple. > 1. Modern platforms are mostly limited by memory access time, number of > performed instructions is less critical. I don't think this is quite the correct result. Especially because a lot of time is spent accessing memory, having code that the CPU can execute out-of-order (by speculatively executing forward) is hugely beneficial. Some of the benefit of JITing comes from being able to start deforming the next field while memory fetches for the previous one are still ongoing (iff dealing with fixed width cols). > 2. For large number of attributes JIT-ing of deform tuple can improve speed > up to two time. Which is quite good result from my point of view. +1 Note the last version has a small deficiency in decoding varlena datums that I need to fix (varsize_any isn't inlined anymore). Greetings, Andres Freund
On 02/15/2018 06:23 PM, Andres Freund wrote: >> 2) When I build with --with-cassert I expect the assertions to be there, >> both in the binaries and the bitcode. Is that just a bug or is there any >> thought behind this? > > Not sure what you mean by that. NDEBUG and cassert are independent > mechanisms, no? Yeah, I think I just managed to confuse myself there. The actual issue is that --with-llvm changes whether NDEBUG is set or not, which is quite surprising. I would not expect assertions to be disabled in the frontend code just because I compiled PostgreSQL with llvm. Andreas
Hi, On 02/14/2018 01:17 PM, Andres Freund wrote: > On 2018-02-07 06:54:05 -0800, Andres Freund wrote: >> I've pushed v10.0. The big (and pretty painful to make) change is that >> now all the LLVM specific code lives in src/backend/jit/llvm, which is >> built as a shared library which is loaded on demand. I thought https://db.in.tum.de/~leis/papers/adaptiveexecution.pdf?lang=en was relevant for this thread. Best regards, Jesper
Hi, I've pushed a revised version of my JIT patchset. The git tree is at https://git.postgresql.org/git/users/andresfreund/postgres.git in the jit branch https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit Biggest changes: - LLVM 3.9 through master are now supported. This includes a good chunk of work by Pierre Ducroquet. Doing so I found that the patches Pierre provided didn't work when a query was expensive enough to warrant inlining. Turns out LLVM < 5 can't combine multiple thin module summaries. But that actually turned out to be a good thing, because it made me think about symbol resolution preferences. Previously it was basically arbitrary whether a function with a conflicting name would be chosen from core postgres or one of the extension libs providing it. This is now rewritten so we don't build a combined module summary for core postgres and extensions at backend start. Instead summaries for core pg and extensions are loaded separately, and the correct one for a symbol is used. - Functions in extension libraries are now not referred to with their C symbol in LLVM IR, instead we generate a fictitious symbol that includes the library path. E.g. hstore's hstore_from_record is now referenced as @"pgextern.$libdir/hstore.hstore_from_record". Both symbol resolution and inlining know how to properly resolve those. - As hinted at above, the inlining support has evolved considerably. Instead of a combined index built at backend start we now have individual indexes for each extension / shlib. Symbols are searched with a search path (internal functions just in the 'postgres' index; for extensions the path is the main 'postgres' index, then the extension's), and symbols that explicitly reference an extension's function are looked up just within that extension. 
This has the nice advantage that we don't have to process indexes for extensions that aren't used, which in turn means that extensions can be installed at the system level while a backend is running, and JITing will work even for old backends once the extension is created (or rather functions in it). Additionally the inline costing logic has improved, the super verbose logging is #ifdef'ed out ('ilog' wrapper that's just (void) 0). - The installation of bitcode is now a nice separate make function. pgxs (including contrib's kinda use of pgxs) now automatically generates & installs bitcode when the server was compiled --with-llvm. I learned some things about make I didn't know ;). - A bunch of compilation issues (older clang, -D_NDEBUG from llvm-config being used for all of postgres, ...) have been fixed. - Two bigger prerequisite patches have been merged. - lotsa smaller stuff. Regards, Andres
On 3/1/18 03:02, Andres Freund wrote: > I've pushed a revised version of my JIT patchset. > The git tree is at > https://git.postgresql.org/git/users/andresfreund/postgres.git > in the jit branch > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit (testing 2e15e8b8100a61ec092a1e5b2db4a93f07a64cbd) I'm having an interesting time getting this to build on macOS. First, you need to use a CXX that is reasonably similar to the CC. Otherwise, the CXX will complain about things like attributes not being supported etc. That's not surprising, but it's a support issue that we'll have to prepare ourselves for. Using my standard set of CC=gcc-7 and CXX=g++-7, the build fails with g++-7: error: unrecognized command line option '-stdlib=libc++' That comes from llvm-config --cxxflags, which was apparently made for /usr/bin/cc (which is clang). I see here the same problems as we had in the olden days with Perl, where it gave us a bunch of compiler flags that applied to the system compiler but not the compiler currently in use. We should just take the flags that we really need, like -I and -L. Maybe we don't need it at all. 
Trying it again then with CC=/usr/bin/cc and CXX=/usr/bin/c++, it fails with In file included from llvmjit_inline.cpp:25: In file included from ../../../../src/include/jit/llvmjit.h:16: In file included from /usr/local/Cellar/llvm/5.0.1/include/llvm-c/Types.h:17: In file included from /usr/local/Cellar/llvm/5.0.1/include/llvm/Support/DataTypes.h:33: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/cmath:555:1: error: templates must have C++ linkage Using this patch gets it past that: diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp b/src/backend/jit/llvm/llvmjit_inline.cpp index d4204d2cd2..ad87cfd2d9 100644 --- a/src/backend/jit/llvm/llvmjit_inline.cpp +++ b/src/backend/jit/llvm/llvmjit_inline.cpp @@ -22,7 +22,6 @@ extern "C" { #include "postgres.h" -#include "jit/llvmjit.h" #include <fcntl.h> #include <sys/mman.h> @@ -35,6 +34,8 @@ extern "C" #include "storage/fd.h" } +#include "jit/llvmjit.h" + #include <llvm-c/Core.h> #include <llvm-c/BitReader.h> It seems that it was intended that way anyway, since llvmjit.h contains its own provisions for extern C. 
Then, I'm getting this error: /usr/bin/cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -g -O2 -Wno-deprecated-declarations -Werror -bundle -multiply_defined suppress -o llvmjit.so llvmjit.o llvmjit_error.o llvmjit_inline.o llvmjit_wrap.o llvmjit_expr.o llvmjit_deform.o -L../../../../src/port -L../../../../src/common -L/usr/local/lib -L/usr/local/opt/openssl/lib -L/usr/local/opt/readline/lib -L/usr/local/Cellar/libxml2/2.9.7/lib -L/usr/local/Cellar/llvm/5.0.1/lib -L/usr/local/lib -L/usr/local/opt/openssl/lib -L/usr/local/opt/readline/lib -Wl,-dead_strip_dylibs -Werror -lLLVMPasses -lLLVMipo -lLLVMInstrumentation -lLLVMVectorize -lLLVMLinker -lLLVMIRReader -lLLVMAsmParser -lLLVMOrcJIT -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMX86CodeGen -lLLVMGlobalISel -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMDebugInfoCodeView -lLLVMDebugInfoMSF -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMBitWriter -lLLVMX86Desc -lLLVMMCDisassembler -lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils -lLLVMMCJIT -lLLVMExecutionEngine -lLLVMTarget -lLLVMAnalysis -lLLVMProfileData -lLLVMRuntimeDyld -lLLVMDebugInfoDWARF -lLLVMObject -lLLVMMCParser -lLLVMMC -lLLVMBitReader -lLLVMCore -lLLVMBinaryFormat -lLLVMSupport -lLLVMDemangle -lcurses -lz -lm -bundle_loader ../../../../src/backend/postgres Undefined symbols for architecture x86_64: "std::__1::error_code::message() const", referenced from: _LLVMPrintModuleToFile in libLLVMCore.a(Core.cpp.o) _LLVMCreateMemoryBufferWithContentsOfFile in libLLVMCore.a(Core.cpp.o) _LLVMCreateMemoryBufferWithSTDIN in libLLVMCore.a(Core.cpp.o) _LLVMTargetMachineEmitToFile in libLLVMTarget.a(TargetMachineC.cpp.o) llvm::errorToErrorCode(llvm::Error) in libLLVMSupport.a(Error.cpp.o) llvm::ECError::log(llvm::raw_ostream&) const in libLLVMSupport.a(Error.cpp.o) 
llvm::SectionMemoryManager::finalizeMemory(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*) in libLLVMExecutionEngine.a(SectionMemoryManager.cpp.o) [snipped about 900 lines] It seems the problem here is linking C++ with the C compiler. If I hack in to use c++ in the above command, it continues, and the build completes. configure didn't find any of the LLVMOrc* symbols it was looking for. Is that a problem? They seem to be for some debugging support. So, how do I turn this on then? I see a bunch of new GUC settings that are all off by default. Which ones turn the feature(s) on? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2018-03-02 19:13:01 -0500, Peter Eisentraut wrote: > On 3/1/18 03:02, Andres Freund wrote: > > I've pushed a revised version of my JIT patchset. > > The git tree is at > > https://git.postgresql.org/git/users/andresfreund/postgres.git > > in the jit branch > > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit > > (testing 2e15e8b8100a61ec092a1e5b2db4a93f07a64cbd) > > I'm having an interesting time getting this to build on macOS. Sorry for that... > First, you need to use a CXX that is reasonably similar to the CC. > Otherwise, the CXX will complain about things like attributes not > being supported etc. That's not surprising, but it's a support issue > that we'll have to prepare ourselves for. Right. > Using my standard set of CC=gcc-7 and CXX=g++-7, the build fails with > > g++-7: error: unrecognized command line option '-stdlib=libc++' > > That comes from llvm-config --cxxflags, which was apparently made for > /usr/bin/cc (which is clang). > I see here the same problems as we had in the olden days with Perl, > where it gave us a bunch of compiler flags that applied to the system > compiler but not the compiler currently in use. We should just take the > flags that we really need, like -I and -L. Maybe we don't need it at all. It's actually already filtered, I just added -std* because of selecting the C++ standard; I guess I need to filter more aggressively. This is fairly annoying. 
> Using this patch gets it past that: > > diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp > b/src/backend/jit/llvm/llvmjit_inline.cpp > index d4204d2cd2..ad87cfd2d9 100644 > --- a/src/backend/jit/llvm/llvmjit_inline.cpp > +++ b/src/backend/jit/llvm/llvmjit_inline.cpp > @@ -22,7 +22,6 @@ > extern "C" > { > #include "postgres.h" > -#include "jit/llvmjit.h" > > #include <fcntl.h> > #include <sys/mman.h> > @@ -35,6 +34,8 @@ extern "C" > #include "storage/fd.h" > } > > +#include "jit/llvmjit.h" > + > #include <llvm-c/Core.h> > #include <llvm-c/BitReader.h> > > It seems that it was intended that way anyway, since llvmjit.h contains > its own provisions for extern C. Hrmpf, yea, I broke that the third time now. I'm actually inclined to add an appropriate #ifdef ... #error so it's not repeated, what do you think? > Then, I'm getting this error: > It seems the problem here is linking C++ with the C compiler. If I > hack in to use c++ in the above command, it continues, and the build > completes. Yea, I was afraid of that, even if I didn't see it locally. Unfortunately Makefile.shlib has a bunch of references both to $(COMPILER) and $(CC). Most of the relevant platforms (using llvmjit on hpux seems like an edge case somebody desiring it can fix) use $(COMPILER). Does putting an override COMPILER = $(CXX) $(CFLAGS) into src/backend/jit/llvm/Makefile work? It does force the use of CXX for all important platforms if I see it correctly. Verified that it works on linux. > configure didn't find any of the LLVMOrc* symbols it was looking for. > Is that a problem? They seem to be for some debugging support. That's not a problem, except that the symbols won't be registered with the debugger, which is a bit annoying for backtraces. I tried to have configure throw errors in cases llvm is too old or such. > So, how do I turn this on then? I see a bunch of new GUC settings > that are all off by default. Which ones turn the feature(s) on? 
Hm, I'll switch them on in the development branch. Independent of the final decision that's definitely the right thing for now. The "full capability" of the patchset is used if you turn on these three GUCs: -c jit_expressions=1 -c jit_tuple_deforming=1 -c jit_perform_inlining=1 If you set -c log_min_messages=debug1 and run a query you'd see something like: 2018-03-02 16:27:19.717 PST [11077][3/8] DEBUG: time to inline: 0.087s 2018-03-02 16:27:19.724 PST [11077][3/8] DEBUG: time to opt: 0.007s 2018-03-02 16:27:19.750 PST [11077][3/8] DEBUG: time to emit: 0.027s I think I should just remove jit_tuple_deforming=1 and jit_perform_inlining=1; they're better done via the cost settings (-1 disables). I think having -c jit_expressions is helpful leaving the cost settings aside, because it allows enabling/disabling JITing wholesale without changing cost settings, which seems good. Greetings, Andres Freund
On 2018-03-02 16:29:54 -0800, Andres Freund wrote: > > #include <llvm-c/Core.h> > > #include <llvm-c/BitReader.h> > > > > It seems that it was intended that way anyway, since llvmjit.h contains > > its own provisions for extern C. > > Hrmpf, yea, I broke that the third time now. I'm actually inclined to > add an appropriate #ifdef ... #error so it's not repeated, what do you > think? Hm, don't think that's easily possible :( - Andres
On 3/2/18 19:29, Andres Freund wrote: >> Using my standard set of CC=gcc-7 and CXX=g++-7, the build fails with >> >> g++-7: error: unrecognized command line option '-stdlib=libc++' > It's actually already filtered, I just added -std*, because of selecting > the c++ standard, I guess I need to filter more aggressively. This is > fairly fairly annoying. I see you already filter llvm-config --cflags by picking only -I and -D. Why not do the same for --cxxflags? Any other options that we need like -f* should be discovered using the normal does-the-compiler-support-this-option tests. >> It seems that it was intended that way anyway, since llvmjit.h contains >> its own provisions for extern C. > > Hrmpf, yea, I broke that the third time now. I'm actually inclined to > add an appropriate #ifdef ... #error so it's not repeated, what do you > think? Not sure. Why not just move the line and not move it again? ;-) > Does putting an > override COMPILER = $(CXX) $(CFLAGS) > > into src/backend/jit/llvm/Makefile work? It does force the use of CXX > for all important platforms if I see it correctly. Verified that it > works on linux. Your latest HEAD builds out of the box for me now using the system compiler. >> configure didn't find any of the LLVMOrc* symbols it was looking for. >> Is that a problem? They seem to be for some debugging support. > > That's not a problem, except that the symbols won't be registered with > the debugger, which is a bit annoying for backtraces. I tried to have > configure throw errors in cases llvm is too old or such. Where does one get those then? I have LLVM 5.0.1. Is there something even newer? > Hm, I'll switch them on in the development branch. Independent of the > final decision that's definitely the right thing for now. The "full > capability" of the patchset is used if you turn on these three GUCs: > > -c jit_expressions=1 > -c jit_tuple_deforming=1 > -c jit_perform_inlining=1 The last one doesn't seem to exist anymore. 
If I turn on either of the first two, then make installcheck fails. See attached diff. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2018-03-03 09:37:35 -0500, Peter Eisentraut wrote: > On 3/2/18 19:29, Andres Freund wrote: > >> Using my standard set of CC=gcc-7 and CXX=g++-7, the build fails with > >> > >> g++-7: error: unrecognized command line option '-stdlib=libc++' > > > It's actually already filtered, I just added -std*, because of selecting > > the c++ standard, I guess I need to filter more aggressively. This is > > fairly fairly annoying. > > I see you already filter llvm-config --cflags by picking only -I and -D. > Why not do the same for --cxxflags? Any other options that we need > like -f* should be discovered using the normal > does-the-compiler-support-this-option tests. Well, some -f options are ABI / behaviour influencing. You can't, to my knowledge, mix/match code built with -fno-rtti with code built with it (influences symbol names). LLVM builds without rtti by default, but a lot of distros enable it... I narrowed the filter to -std= (from -std), which should take care of the -stdlib bit. I also dropped -fno-exceptions being copied since that should not conflict. > >> It seems that it was intended that way anyway, since llvmjit.h contains > >> its own provisions for extern C. > > > > Hrmpf, yea, I broke that the third time now. I'm actually inclined to > > add an appropriate #ifdef ... #error so it's not repeated, what do you > > think? > > Not sure. Why not just move the line and not move it again? ;-) Heh, done ;). Let's see how long it takes... > > Does putting an > > override COMPILER = $(CXX) $(CFLAGS) > > > > into src/backend/jit/llvm/Makefile work? It does force the use of CXX > > for all important platforms if I see it correctly. Verified that it > > works on linux. > > Your latest HEAD builds out of the box for me now using the system compiler. Cool. > >> configure didn't find any of the LLVMOrc* symbols it was looking for. > >> Is that a problem? They seem to be for some debugging support. 
> > > > That's not a problem, except that the symbols won't be registered with > > the debugger, which is a bit annoying for backtraces. I tried to have > > configure throw errors in case llvm is too old or such. > > Where does one get those then? I have LLVM 5.0.1. Is there something > even newer? I've submitted them upstream, but they're not yet released. > > Hm, I'll switch them on in the development branch. Independent of the > > final decision that's definitely the right thing for now. The "full > > capability" of the patchset is used if you turn on these three GUCs: > > > > -c jit_expressions=1 > > -c jit_tuple_deforming=1 > > -c jit_perform_inlining=1 > > The last one doesn't seem to exist anymore. Yup, as discussed in the earlier reply to you, I decided it's not particularly useful to have. As also threatened in that reply, I've switched the defaults so you shouldn't have to change them anymore. > If I turn on either of the first two, then make installcheck fails. See > attached diff. Hm, so there's definitely something going on here that I don't yet understand. I've pushed something that I've a slight hunch about (dropping the dots from the symbol names, some tooling doesn't seem to like that). I had to rebase to fix a few issues, but I've left the changes made since the last push as separate commits. 
Could you run something like: regression[18425][1]=# set jit_above_cost = 0; SET regression[18425][1]=# set client_min_messages=debug2; SET regression[18425][1]=# SELECT pg_jit_available(); DEBUG: 00000: probing availability of llvm for JIT at /home/andres/build/postgres/dev-assert/install/lib/llvmjit.so LOCATION: provider_init, jit.c:83 DEBUG: 00000: successfully loaded LLVM in current session LOCATION: provider_init, jit.c:107 DEBUG: 00000: time to opt: 0.001s LOCATION: llvm_compile_module, llvmjit.c:435 DEBUG: 00000: time to emit: 0.014s LOCATION: llvm_compile_module, llvmjit.c:481 ┌──────────────────┐ │ pg_jit_available │ ├──────────────────┤ │ t │ └──────────────────┘ (1 row) regression[18425][1]=# select now(); DEBUG: 00000: time to opt: 0.001s LOCATION: llvm_compile_module, llvmjit.c:435 DEBUG: 00000: time to emit: 0.008s LOCATION: llvm_compile_module, llvmjit.c:481 ┌───────────────────────────────┐ │ now │ ├───────────────────────────────┤ │ 2018-03-03 11:33:13.776947-08 │ └───────────────────────────────┘ (1 row) regression[18425][1]=# SET jit_dump_bitcode = 1; SET regression[18425][1]=# select now(); DEBUG: 00000: time to opt: 0.002s LOCATION: llvm_compile_module, llvmjit.c:435 DEBUG: 00000: time to emit: 0.018s LOCATION: llvm_compile_module, llvmjit.c:481 ┌───────────────────────────────┐ │ now │ ├───────────────────────────────┤ │ 2018-03-03 11:33:23.508875-08 │ └───────────────────────────────┘ (1 row) The last command should have dumped something into your data directory, even if it failed like your regression test output showed. Could you attach the two files that something like ls -lstr /srv/dev/pgdev-dev/*.bc would show, if /srv/dev/pgdev-dev/ is your test directory? If neither my random hunch nor this bears fruit, I'm going to have to get access to an OSX machine somehow :( Independent of this, I'm working to make the code pgindent compliant. 
Given the typical coding patterns when emitting IR (nested function calls) that's painful because pgindent insists on indenting everything a lot. I've started adding a few small wrapper functions to make that bearable... Greetings, Andres Freund
On Sun, Mar 4, 2018 at 8:39 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-03 09:37:35 -0500, Peter Eisentraut wrote: >> [discussion of making this work on a Mac] I tried out your "jit" branch on my macOS 10.13.3 system. Vendor "cc" and "c++" are version "Apple LLVM version 9.0.0 (clang-900.0.39.2)". I used MacPorts (whereas Peter E is using HomeBrew) to install LLVM with "sudo port install llvm-5.0". First, I built it like this: ./configure --prefix=$HOME/install/postgres \ --enable-debug --enable-cassert --enable-depend --with-llvm --with-openssl \ --enable-tap-tests \ --with-includes="/opt/local/include" --with-libraries="/opt/local/lib" \ CC="ccache cc" CXX="ccache c++" LLVM_CONFIG=/opt/local/bin/llvm-config-mp-5.0 The build succeeded, initdb ran, the server started up, and then I tried the sequence Andres showed: set jit_above_cost = 0; set client_min_messages=debug2; SELECT pg_jit_available(); On that last command I got: DEBUG: probing availability of llvm for JIT at /Users/munro/install/postgres/lib/llvmjit.so DEBUG: successfully loaded LLVM in current session DEBUG: time to opt: 0.001s DEBUG: time to emit: 0.034s ERROR: failed to JIT: evalexpr_0_0 Looking at the server output I saw: warning: ignoring debug info with an invalid version (700000003) in /Users/munro/install/postgres/lib/llvmjit_types.bc 2018-03-05 16:50:05.888 NZDT [14797] ERROR: failed to JIT: evalexpr_0_0 2018-03-05 16:50:05.888 NZDT [14797] STATEMENT: SELECT pg_jit_available(); I could see that llvmjit_types.bc had been produced by this command: /usr/bin/clang -Wno-ignored-attributes -Wno-unknown-warning-option -Wno-ignored-optimization-argument -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -g -O0 -Wall -Werror -O1 -I../../../../src/include -I/opt/local/include -flto=thin -emit-llvm -c -o pseudotypes.bc pseudotypes.c So I tried 
installing a later clang with "sudo port install clang-5.0" and setting CLANG=/opt/local/bin/clang-mp-5.0. It builds and uses that clang to generate the .bc files, but gives the same error, this time without the "warning" message. Looking at llvm_get_function(), the function that raises that error, I see that there are a few different paths here. I don't have HAVE_DECL_LLVMORCGETSYMBOLADDRESSIN defined, and I don't have LLVM < 5, so I should be getting the symbol address with LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled) or LLVMOrcGetSymbolAddress(llvm_opt3_orc, &addr, mangled), but clearly those are returning NULL. Not sure what's happening yet... -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-05 17:32:09 +1300, Thomas Munro wrote: > I tried out your "jit" branch on my macOS 10.13.3 system. Vendor "cc" > and "c++" are version "Apple LLVM version 9.0.0 (clang-900.0.39.2)". > I used MacPorts (whereas Peter E is using HomeBrew) to install LLVM > with "sudo port install llvm-5.0". Thanks for checking! > warning: ignoring debug info with an invalid version (700000003) in > /Users/munro/install/postgres/lib/llvmjit_types.bc That's harmless, log output aside. Should strip the debug info there, to remove the potential for that issue. > Looking at llvm_get_function(), the function that raises that error, I > see that there are a few different paths here. I don't have > HAVE_DECL_LLVMORCGETSYMBOLADDRESSIN defined, and I don't have LLVM < > 5, so I should be getting the symbol address with > LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled) or > LLVMOrcGetSymbolAddress(llvm_opt3_orc, &addr, mangled), but clearly > those are returning NULL. Yep. I wonder if this is some symbol naming issue or such, because emitting and relocating the object worked without an error. > Not sure what's happening yet... Hm. :/ Greetings, Andres Freund
Hi, On 2018-03-04 21:00:06 -0800, Andres Freund wrote: > > Looking at llvm_get_function(), the function that raises that error, I > > see that there are a few different paths here. I don't have > > HAVE_DECL_LLVMORCGETSYMBOLADDRESSIN defined, and I don't have LLVM < > > 5, so I should be getting the symbol address with > > LLVMOrcGetSymbolAddress(llvm_opt0_orc, &addr, mangled) or > > LLVMOrcGetSymbolAddress(llvm_opt3_orc, &addr, mangled), but clearly > > those are returning NULL. > > Yep. I wonder if this is some symbol naming issue or such, because > emitting and relocating the object worked without an error. Thanks to Thomas helping me get access to an OSX machine I was able to discover what the issue is. OSX prepends, for reasons unbeknownst to me, a leading underscore to all function names. That led to two issues: First, JITed functions do not have that underscore (making us look up a non-existing symbol, because llvm_get_function applied mangling). Secondly, llvm_resolve_symbol failed looking up symbol names, because for $reason dlsym() etc do *not* have the names prefixed by the underscore. Easily enough fixed. After that I discovered another problem, the bitcode files for core pg / contrib modules weren't installed. That turned out to be a make version issue, I'd used define install_llvm_module = # body but older make only likes define install_llvm_module # body Writing up a patch that I can actually push. Thanks both to Thomas and Peter for pointing me towards this issue! Greetings, Andres Freund
Testing 0732ee73cf3ffd18d0f651376d69d4798d351ccc on Debian testing ... The build works out of the box with whatever the default system packages are. Regression tests crash many times. One backtrace looks like this: #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x00007fd5b1730231 in __GI_abort () at abort.c:79 #2 0x000055c10a1555e3 in ExceptionalCondition (conditionName=conditionName@entry=0x7fd5a245c2d8 "!(LLVMGetIntrinsicID(fn))", errorType=errorType@entry=0x7fd5a245bb1d "FailedAssertion", fileName=fileName@entry=0x7fd5a245c294 "llvmjit_expr.c", lineNumber=lineNumber@entry=193) at assert.c:54 #3 0x00007fd5a245510f in get_LifetimeEnd (mod=mod@entry=0x55c10b1db670) at llvmjit_expr.c:193 #4 0x00007fd5a24553c8 in get_LifetimeEnd (mod=0x55c10b1db670) at llvmjit_expr.c:233 #5 BuildFunctionCall (context=context@entry=0x55c10b0ca340, builder=builder@entry=0x55c10b225160, mod=mod@entry=0x55c10b1db670, fcinfo=0x55c10b1a08b0, v_fcinfo_isnull=v_fcinfo_isnull@entry=0x7ffc701f5c60) at llvmjit_expr.c:244 ... #16 0x000055c10a0433ad in exec_simple_query ( query_string=0x55c10b096358 "SELECT COUNT(*) FROM test_tsquery WHERE keyword < 'new & york';") at postgres.c:1082 -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-05 16:19:52 -0500, Peter Eisentraut wrote: > Testing 0732ee73cf3ffd18d0f651376d69d4798d351ccc on Debian testing ... > > The build works out of the box with whatever the default system packages > are. > > Regression tests crash many times. One backtrace looks like this: > > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > #1 0x00007fd5b1730231 in __GI_abort () at abort.c:79 > #2 0x000055c10a1555e3 in ExceptionalCondition > (conditionName=conditionName@entry=0x7fd5a245c2d8 > "!(LLVMGetIntrinsicID(fn))", > errorType=errorType@entry=0x7fd5a245bb1d "FailedAssertion", > fileName=fileName@entry=0x7fd5a245c294 "llvmjit_expr.c", > lineNumber=lineNumber@entry=193) at assert.c:54 > #3 0x00007fd5a245510f in get_LifetimeEnd (mod=mod@entry=0x55c10b1db670) > at llvmjit_expr.c:193 > #4 0x00007fd5a24553c8 in get_LifetimeEnd (mod=0x55c10b1db670) at > llvmjit_expr.c:233 > #5 BuildFunctionCall (context=context@entry=0x55c10b0ca340, > builder=builder@entry=0x55c10b225160, > mod=mod@entry=0x55c10b1db670, fcinfo=0x55c10b1a08b0, > v_fcinfo_isnull=v_fcinfo_isnull@entry=0x7ffc701f5c60) > at llvmjit_expr.c:244 Hm, that should be trivial to fix. Which version of llvm are you building against? There appear to be a lot of them in testing: https://packages.debian.org/search?keywords=llvm+dev&searchon=names&suite=testing&section=all Greetings, Andres Freund
Hi, On 2018-03-05 12:17:30 -0800, Andres Freund wrote: > Writing up a patch that I can actually push. Thanks both to Thomas and > Peter for pointing me towards this issue! After screwing the first attempt at a fix, the second one seems to work nicely. With optimizations, inlining, etc all core tests pass on Thomas' machine. Greetings, Andres Freund
On 2018-03-05 13:36:04 -0800, Andres Freund wrote: > On 2018-03-05 16:19:52 -0500, Peter Eisentraut wrote: > > Testing 0732ee73cf3ffd18d0f651376d69d4798d351ccc on Debian testing ... > > > > The build works out of the box with whatever the default system packages > > are. > > > > Regression tests crash many times. One backtrace looks like this: > > > > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > > #1 0x00007fd5b1730231 in __GI_abort () at abort.c:79 > > #2 0x000055c10a1555e3 in ExceptionalCondition > > (conditionName=conditionName@entry=0x7fd5a245c2d8 > > "!(LLVMGetIntrinsicID(fn))", > > errorType=errorType@entry=0x7fd5a245bb1d "FailedAssertion", > > fileName=fileName@entry=0x7fd5a245c294 "llvmjit_expr.c", > > lineNumber=lineNumber@entry=193) at assert.c:54 > > #3 0x00007fd5a245510f in get_LifetimeEnd (mod=mod@entry=0x55c10b1db670) > > at llvmjit_expr.c:193 > > #4 0x00007fd5a24553c8 in get_LifetimeEnd (mod=0x55c10b1db670) at > > llvmjit_expr.c:233 > > #5 BuildFunctionCall (context=context@entry=0x55c10b0ca340, > > builder=builder@entry=0x55c10b225160, > > mod=mod@entry=0x55c10b1db670, fcinfo=0x55c10b1a08b0, > > v_fcinfo_isnull=v_fcinfo_isnull@entry=0x7ffc701f5c60) > > at llvmjit_expr.c:244 Hm, that should be trivial to fix. Which version of llvm are you building against? There appear to be a lot of them in testing: https://packages.debian.org/search?keywords=llvm+dev&searchon=names&suite=testing&section=all On Debian unstable, I built against a wide variety of branches: for v in 3.9 4.0 5.0 6.0;do rm -f ../config.cache;CLANG="ccache clang-$v" LLVM_CONFIG=/usr/lib/llvm-$v/bin/llvm-config ../config.sh --with-llvm && make -j16 -s install && make -s check;done All of those pass. I'll create a testing chroot. Regards, Andres
On 2018-03-05 14:01:05 -0800, Andres Freund wrote: > On 2018-03-05 13:36:04 -0800, Andres Freund wrote: > > On 2018-03-05 16:19:52 -0500, Peter Eisentraut wrote: > > > Testing 0732ee73cf3ffd18d0f651376d69d4798d351ccc on Debian testing ... > > > > > > The build works out of the box with whatever the default system packages > > > are. > > > > > > Regression tests crash many times. One backtrace looks like this: > > > > > > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > > > #1 0x00007fd5b1730231 in __GI_abort () at abort.c:79 > > > #2 0x000055c10a1555e3 in ExceptionalCondition > > > (conditionName=conditionName@entry=0x7fd5a245c2d8 > > > "!(LLVMGetIntrinsicID(fn))", > > > errorType=errorType@entry=0x7fd5a245bb1d "FailedAssertion", > > > fileName=fileName@entry=0x7fd5a245c294 "llvmjit_expr.c", > > > lineNumber=lineNumber@entry=193) at assert.c:54 > > > #3 0x00007fd5a245510f in get_LifetimeEnd (mod=mod@entry=0x55c10b1db670) > > > at llvmjit_expr.c:193 > > > #4 0x00007fd5a24553c8 in get_LifetimeEnd (mod=0x55c10b1db670) at > > > llvmjit_expr.c:233 > > > #5 BuildFunctionCall (context=context@entry=0x55c10b0ca340, > > > builder=builder@entry=0x55c10b225160, > > > mod=mod@entry=0x55c10b1db670, fcinfo=0x55c10b1a08b0, > > > v_fcinfo_isnull=v_fcinfo_isnull@entry=0x7ffc701f5c60) > > > at llvmjit_expr.c:244 > > > > Hm, that should be trivial to fix. Which version of llvm are you > > building against? There appear to be a lot of them in testing: > > https://packages.debian.org/search?keywords=llvm+dev&searchon=names&suite=testing&section=all > > On Debian unstable, I built against a wide variety of branches: > > for v in 3.9 4.0 5.0 6.0;do rm -f ../config.cache;CLANG="ccache clang-$v" LLVM_CONFIG=/usr/lib/llvm-$v/bin/llvm-config ../config.sh --with-llvm && make -j16 -s install && make -s check;done > > All of those pass. I'll create a testing chroot. I did, and reproduced. Turned out I just missed the error in the above test. 
The bug was caused by one ifdef in get_LifetimeEnd() being wrong (the function is overloaded starting in LLVM 5 rather than 4). The comment above it even had it right... Greetings, Andres Freund
On 3/6/18 04:39, Andres Freund wrote: > I did, and reproduced. Turned out I just missed the error in the above > test. > > The bug was caused by one ifdef in get_LifetimeEnd() being wrong > (function is is overloaded starting in 5 rather than 4). The comment > above it even had it right... OK, it's fixed for me now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
With the build issues in check, I'm looking at the configuration settings. I think taking the total cost as the triggering threshold is probably good enough for a start. The cost modeling can be refined over time. We should document that both jit_optimize_above_cost and jit_inline_above_cost require jit_above_cost to be set, or otherwise nothing happens. One problem I see is that if someone sets things like enable_seqscan=off, the artificial cost increase created by those settings would quite likely bump the query over the jit threshold, which would alter the query performance characteristics in a way that the user would not have intended. I don't have an idea how to address this right now. I ran some performance assessments: merge base (0b1d1a038babff4aadf0862c28e7b667f1b12a30) make installcheck 3.14s user 3.34s system 17% cpu 37.954 total jit branch default settings make installcheck 3.17s user 3.30s system 13% cpu 46.596 total jit_above_cost=0 make installcheck 3.30s user 3.53s system 5% cpu 1:59.89 total jit_optimize_above_cost=0 (and jit_above_cost=0) make installcheck 3.44s user 3.76s system 1% cpu 8:12.42 total jit_inline_above_cost=0 (and jit_above_cost=0) make installcheck 3.32s user 3.62s system 2% cpu 5:35.58 total One can see the CPU savings quite nicely. One obvious problem is that with the default settings, the test suite run gets about 15% slower. (These figures are reproducible over several runs.) Is there some debugging stuff turned on that would explain this? Or would just loading the jit module in each session cause this? From the other results, we can see that one clearly needs quite a big database to see a solid benefit from this. Do you have any information gathered about this so far? Any scripts to create test databases and test queries? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
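For reference, the threshold interaction Peter describes can be sketched as a session-level setup (GUC names are the ones from this thread; the values are purely illustrative, not tuned recommendations):

```sql
-- jit_above_cost is the gate; the other two thresholds are only
-- consulted once a plan has already crossed it.
SET jit_above_cost = 500000;           -- JIT plans costlier than this
SET jit_optimize_above_cost = 500000;  -- additionally run LLVM optimization passes
SET jit_inline_above_cost = 500000;    -- additionally inline built-in functions
```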
Hi, On 2018-03-06 10:29:47 -0500, Peter Eisentraut wrote: > I think taking the total cost as the triggering threshold is probably > good enough for a start. The cost modeling can be refined over time. Cool. > We should document that both jit_optimize_above_cost and > jit_inline_above_cost require jit_above_cost to be set, or otherwise > nothing happens. Yea, that's a good plan. We could also change it so it would, but I don't think there's much point? > One problem I see is that if someone sets things like > enable_seqscan=off, the artificial cost increase created by those > settings would quite likely bump the query over the jit threshold, which > would alter the query performance characteristics in a way that the user > would not have intended. I don't have an idea how to address this right > now. I'm not too worried about that scenario. If, for a cheap plan, the planner ends up with a seqscan despite it being disabled, you're pretty close to randomly choosing plans already, as the pruning doesn't work well anymore (as the 1% fuzz factor in compare_path_costs_fuzzily() swamps the actual plan costs). > I ran some performance assessments: > > merge base (0b1d1a038babff4aadf0862c28e7b667f1b12a30) > > make installcheck 3.14s user 3.34s system 17% cpu 37.954 total > > jit branch default settings > > make installcheck 3.17s user 3.30s system 13% cpu 46.596 total > > jit_above_cost=0 > > make installcheck 3.30s user 3.53s system 5% cpu 1:59.89 total > > jit_optimize_above_cost=0 (and jit_above_cost=0) > > make installcheck 3.44s user 3.76s system 1% cpu 8:12.42 total > > jit_inline_above_cost=0 (and jit_above_cost=0) > > make installcheck 3.32s user 3.62s system 2% cpu 5:35.58 total > > One can see the CPU savings quite nicely. I'm not quite sure what you mean by that. > One obvious problem is that with the default settings, the test suite > run gets about 15% slower. (These figures are reproducible over several > runs.)
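To make the swamping effect concrete, the fuzzy comparison being referred to can be sketched standalone (this is a simplification of PostgreSQL's compare_path_costs_fuzzily(), which also considers startup cost): two costs compare as equal unless one exceeds the other by more than the fuzz factor.

```c
/*
 * Simplified sketch of fuzzy path-cost comparison.  fuzz_factor is
 * e.g. 1.01 for a 1% band; costs within the band are "fuzzily equal".
 */
static int
compare_costs_fuzzily(double cost1, double cost2, double fuzz_factor)
{
	if (cost1 > cost2 * fuzz_factor)
		return +1;				/* cost1 is fuzzily more expensive */
	if (cost2 > cost1 * fuzz_factor)
		return -1;				/* cost2 is fuzzily more expensive */
	return 0;					/* fuzzily equal */
}
```

With raw costs of 100 vs 200 the comparison distinguishes the paths, but once a disable_cost on the order of 1e10 is added to both candidates, a real difference of a few hundred cost units disappears inside the 1% band - the "randomly choosing plans" effect described above.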
Is there some debugging stuff turned on that would explain this? > Or would just loading the jit module in each session cause this? I suspect it's loading the module. There's two pretty easy avenues to improve this: 1) Attempt to load the JIT provider in postmaster, thereby avoiding a lot of redundant dynamic linker work if already installed. That's ~5-10 lines or such. I basically refrained from that because it's convenient to not have to restart the server during development (one can just reconnect and get a newer jit plugin). 2) Don't load the JIT provider until fully needed. Right now jit_compile_expr() will load the jit provider even if not really needed. We should probably move the first two return blocks in llvm_compile_expr() into jit_compile_expr(), to avoid that. > From the other results, we can see that one clearly needs quite a big > database to see a solid benefit from this. Right, until we've got caching this'll only be beneficial for ~1s+ analytics queries. Unfortunately caching requires some larger planner & executor surgery, so I don't want to go there at the same time (I'm already insane enough). > Do you have any information gathered about this so far? Any scripts > to create test databases and test queries? Yes. I've used tpc-h. Not because it's the greatest, but because it's semi conveniently available and a lot of others have experience with it already. Do you mean whether I've run a couple benchmarks? If so, yes. I'll schedule some more later - am on battery power rn. Greetings, Andres Freund
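Option 2 above can be sketched as a toy model; all names here (PGJIT_PERFORM, provider_load) are stand-ins rather than the actual PostgreSQL symbols, and the only point is the ordering: the cheap "is JIT wanted at all?" checks run before the provider shared library is loaded.

```c
#include <stdbool.h>

static bool provider_loaded = false;

/* Stands in for dlopen()ing llvmjit.so and resolving its callbacks. */
static void
provider_load(void)
{
	provider_loaded = true;
}

/* Hypothetical flag meaning "JIT was requested for this query". */
#define PGJIT_PERFORM 0x01

static bool
jit_compile_expr(int jit_flags)
{
	/* bail out before paying any dynamic-linker cost */
	if (!(jit_flags & PGJIT_PERFORM))
		return false;

	provider_load();
	/* ... would hand the expression off to llvm_compile_expr() here ... */
	return true;
}
```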
On 2018-03-06 12:16:01 -0800, Andres Freund wrote: > > I ran some performance assessments: > > > > merge base (0b1d1a038babff4aadf0862c28e7b667f1b12a30) > > > > make installcheck 3.14s user 3.34s system 17% cpu 37.954 total > > > > jit branch default settings > > > > make installcheck 3.17s user 3.30s system 13% cpu 46.596 total > > > > jit_above_cost=0 > > > > make installcheck 3.30s user 3.53s system 5% cpu 1:59.89 total > > > > jit_optimize_above_cost=0 (and jit_above_cost=0) > > > > make installcheck 3.44s user 3.76s system 1% cpu 8:12.42 total > > > > jit_inline_above_cost=0 (and jit_above_cost=0) > > > > make installcheck 3.32s user 3.62s system 2% cpu 5:35.58 total > > > > One can see the CPU savings quite nicely. > > I'm not quite sure what you mean by that. > > > > One obvious problem is that with the default settings, the test suite > > run gets about 15% slower. (These figures are reproducible over several > > runs.) Is there some debugging stuff turned on that would explain this? > > Or would just loading the jit module in each session cause this? > > I suspect it's loading the module. There's also another issue: For a lot of queries in the tests the stats are way way way off because the relevant tables have never been analyzed. There's a few cases where costs are off by like 5-7 orders of magnitude... Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > I'm not too worried about that scenario. If, for a cheap plan, the > planner ends up with a seqscan despite it being disabled, you're pretty > close to randomly choosing plans already, as the pruning doesn't work > well anymore (as the 1% fuzz factor in > compare_path_costs_fuzzily() swamps the actual plan costs). Something I've wanted to do for awhile is to get rid of disable_cost in favor of pruning disabled plans through logic rather than costing. I've looked at this once or twice, and it seems doable but not entirely trivial --- the sticky bits are places where you do need to allow a disabled plan type because there's no other alternative. But if we could get that done, it'd help with this sort of problem. regards, tom lane
On Tue, Mar 6, 2018 at 10:39 PM, Andres Freund <andres@anarazel.de> wrote: > [more commits] + * OSX prefixes all object level symbols with an underscore. But neither "macOS" (see commit da6c4f6c and all mentions since). make check at today's HEAD of your jit branch crashes on my FreeBSD box. The first thing to crash is this query from point.sql: LOG: server process (PID 87060) was terminated by signal 4: Illegal instruction DETAIL: Failed process was running: SELECT '' AS thirtysix, p1.f1 AS point1, p2.f1 AS point2, p1.f1 <-> p2.f1 AS dist FROM POINT_TBL p1, POINT_TBL p2 ORDER BY dist, p1.f1[0], p2.f1[0]; Unfortunately when I tried to load the core file into lldb, the stack is like this: * thread #1, name = 'postgres', stop reason = signal SIGILL * frame #0: 0x0000000800e7c1ea Apparently the generated code is nuking the stack and executing garbage? I don't have time to investigate right now, and this may indicate something busted in my environment, but I thought this might tell you something. 
These variants of that query don't crash (even though I set jit_above_cost = 0 and checked that it's actually JIT-ing), which might be clues: -- no p1.f1 <-> p2.f1 SELECT p1.f1 AS point1, p2.f1 AS point2 FROM POINT_TBL p1, POINT_TBL p2 ORDER BY p1.f1[0], p2.f1[0]; -- no join SELECT p1.f1 <-> p1.f1 AS dist FROM POINT_TBL p1 ORDER BY 1; These variants do crash: -- p1.f1 <-> p2.f1 in order by, but not select list SELECT p1.f1 AS point1, p2.f1 AS point2 FROM POINT_TBL p1, POINT_TBL p2 ORDER BY p1.f1 <-> p2.f1, p1.f1[0], p2.f1[0]; -- p1.f1 <-> p2.f1 in select list, but not in order by SELECT p1.f1 AS point1, p2.f1 AS point2, p1.f1 <-> p2.f1 AS dist FROM POINT_TBL p1, POINT_TBL p2 ORDER BY p1.f1[0], p2.f1[0]; -- simple, with a join SELECT p1.f1 <-> p1.f1 AS dist FROM POINT_TBL p1, POINT_TBL p2 ORDER BY 1; I build it like this: ./configure \ --prefix=$HOME/install/ \ --enable-tap-tests \ --enable-cassert \ --enable-debug \ --enable-depend \ --with-llvm \ CC="ccache cc" CFLAGS="-O0" CXX="ccache c++" CXXFLAGS="-std=c++11" \ CLANG=/usr/local/llvm50/bin/clang \ LLVM_CONFIG=/usr/local/llvm50/bin/llvm-config \ --with-libraries="/usr/local/lib" \ --with-includes="/usr/local/include" -- Thomas Munro http://www.enterprisedb.com
On Wed, Mar 7, 2018 at 3:49 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > make check at today's HEAD of your jit branch crashes on my FreeBSD > box. The first thing to crash is this query from point.sql: > > LOG: server process (PID 87060) was terminated by signal 4: Illegal instruction > DETAIL: Failed process was running: SELECT '' AS thirtysix, p1.f1 AS > point1, p2.f1 AS point2, p1.f1 <-> p2.f1 AS dist > FROM POINT_TBL p1, POINT_TBL p2 > ORDER BY dist, p1.f1[0], p2.f1[0]; Hmm. It's trying to execute an AVX instruction. * thread #1, stop reason = breakpoint 1.1 frame #0: llvmjit.so`ExecRunCompiledExpr(state=0x0000000801de4880, econtext=0x0000000801de3560, isNull="") at llvmjit_expr.c:432 429 430 state->evalfunc = func; 431 -> 432 return func(state, econtext, isNull); 433 } 434 435 static void emit_lifetime_end(ExprState *state, LLVMModuleRef mod, LLVMBuilderRef b); (lldb) s Process 44513 stopped * thread #1, stop reason = signal SIGILL: privileged instruction frame #0: 0x0000000801157193 -> 0x801157193: vmovsd (%rax), %xmm0 ; xmm0 = mem[0],zero 0x801157197: vmovsd 0x8(%rax), %xmm1 ; xmm1 = mem[0],zero 0x80115719c: vsubsd (%rcx), %xmm0, %xmm2 0x8011571a0: vsubsd 0x8(%rcx), %xmm1, %xmm0 (lldb) bt * thread #1, stop reason = signal SIGILL: privileged instruction * frame #0: 0x0000000801157193 This is running on a "Intel(R) Celeron(R) CPU G1610T @ 2.30GHz" with no AVX. I am not sure if that is real though, because the stack is immediately corrupted. So either func is not really a function, or it is but was compiled for the wrong target. I see that you call LLVMCreateTargetMachine() with the result of LLVMGetHostCPUName() as cpu. For me that's "ivybridge", so I tried hard coding "generic" instead and it didn't help. I see that you say "" for features, which is where one would normally put "avx" to turn on AVX instructions, so I think perhaps that theory is entirely bogus. -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-09 00:33:03 +1300, Thomas Munro wrote: > On Wed, Mar 7, 2018 at 3:49 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: > > make check at today's HEAD of your jit branch crashes on my FreeBSD > > box. The first thing to crash is this query from point.sql: > > > > LOG: server process (PID 87060) was terminated by signal 4: Illegal instruction > > DETAIL: Failed process was running: SELECT '' AS thirtysix, p1.f1 AS > > point1, p2.f1 AS point2, p1.f1 <-> p2.f1 AS dist > > FROM POINT_TBL p1, POINT_TBL p2 > > ORDER BY dist, p1.f1[0], p2.f1[0]; > > Hmm. It's trying to execute an AVX instruction. Ah, that's interesting. > I am not sure if that is real though, because the stack is immediately > corrupted. I don't think the stack is corrupted at all, it's just that lldb can't unwind with functions it doesn't know. To add that capability I've a pending LLVM patch. > So either func is not really a function, or it is but was > compiled for the wrong target. I see that you call > LLVMCreateTargetMachine() with the result of LLVMGetHostCPUName() as > cpu. For me that's "ivybridge", so I tried hard coding "generic" > instead and it didn't help. Hm. > I see that you say "" for features, which > is where one would normally put "avx" to turn on AVX instructions, so > I think perhaps that theory is entirely bogus. Could you try a -avx in features and see whether it fixes things? This kinda suggests an LLVM bug or at least an oddity, but I'll try to drill down more into this. Is this a native machine or a VM? I think we can easily fix this by behaving like clang, which uses llvm::sys::getHostCPUFeatures(HostFeatures) to build the feature list: // If -march=native, autodetect the feature list.
if (const Arg *A = Args.getLastArg(clang::driver::options::OPT_march_EQ)) { if (StringRef(A->getValue()) == "native") { llvm::StringMap<bool> HostFeatures; if (llvm::sys::getHostCPUFeatures(HostFeatures)) for (auto &F : HostFeatures) Features.push_back( Args.MakeArgString((F.second ? "+" : "-") + F.first())); } } which seems easy enough. Greetings, Andres Freund
On 2018-03-08 11:58:41 -0800, Andres Freund wrote: > I think we can easily fix this by behaving like clang, which uses > llvm::sys::getHostCPUFeatures(HostFeatures) to build the feature list: > > // If -march=native, autodetect the feature list. > if (const Arg *A = Args.getLastArg(clang::driver::options::OPT_march_EQ)) { > if (StringRef(A->getValue()) == "native") { > llvm::StringMap<bool> HostFeatures; > if (llvm::sys::getHostCPUFeatures(HostFeatures)) > for (auto &F : HostFeatures) > Features.push_back( > Args.MakeArgString((F.second ? "+" : "-") + F.first())); > } > } > > which seems easy enough. Or even in core LLVM, which has this nice comment: // If user asked for the 'native' CPU, we need to autodetect features. // This is necessary for x86 where the CPU might not support all the // features the autodetected CPU name lists in the target. For example, // not all Sandybridge processors support AVX. if (MCPU == "native") { which pretty much describes the issue you're apparently hitting. I've pushed an attempted fix (needs a comment, but works here). Greetings, Andres Freund
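The shape of the fix - handing LLVMCreateTargetMachine() an explicit "+feature,-feature" list instead of an empty string - can be illustrated with a self-contained sketch. The feature table below is invented for the example; in the real code the data would come from LLVM's host-feature detection, as in the clang snippet quoted above.

```c
#include <string.h>

struct host_feature
{
	const char *name;
	int			supported;
};

/*
 * Render detected host features as a "+sse2,-avx" style string, so an
 * over-broad CPU name (e.g. "ivybridge" on a chip without AVX) can't
 * enable instructions the actual hardware lacks.
 */
static void
build_feature_string(const struct host_feature *feats, size_t n,
					 char *out, size_t outlen)
{
	out[0] = '\0';
	for (size_t i = 0; i < n; i++)
	{
		if (i > 0)
			strncat(out, ",", outlen - strlen(out) - 1);
		strncat(out, feats[i].supported ? "+" : "-",
				outlen - strlen(out) - 1);
		strncat(out, feats[i].name, outlen - strlen(out) - 1);
	}
}
```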
On Fri, Mar 9, 2018 at 9:12 AM, Andres Freund <andres@anarazel.de> wrote: > Or even in core LLVM, which has this nice comment: > > // If user asked for the 'native' CPU, we need to autodetect features. > // This is necessary for x86 where the CPU might not support all the > // features the autodetected CPU name lists in the target. For example, > // not all Sandybridge processors support AVX. > if (MCPU == "native") { > > which pretty much describes the issue you're apparently hitting. > > I've pushed an attempted fix (needs a comment, but works here). ======================= All 186 tests passed. ======================= That did the trick. Thanks! -- Thomas Munro http://www.enterprisedb.com
On 3/6/18 15:16, Andres Freund wrote: > 2) Don't load the JIT provider until fully needed. Right now > jit_compile_expr() will load the jit provider even if not really > needed. We should probably move the first two return blocks in > llvm_compile_expr() into jit_compile_expr(), to avoid that. I see that you have implemented that, but it doesn't seem to have helped with my make installcheck times. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3/6/18 10:29, Peter Eisentraut wrote: > I think taking the total cost as the triggering threshold is probably > good enough for a start. The cost modeling can be refined over time. I looked into this a bit more. The default of jit_above_cost = 500000 seems pretty good. I constructed a query that cost about 450000 where the run time with and without JIT were about even. This is obviously very limited testing, but it's a good start. For jit_optimize_above_cost, in my testing, any query where JIT paid off was even faster with optimizing. So right now I don't see a need to make this a separate setting. Maybe just make it an on/off setting for experimenting. For inlining, I haven't been able to get a clear picture. It's a bit faster perhaps, but the optimizing dominates it. I don't have a clear mental model for what kind of returns to expect from this. What I'd quite like is if EXPLAIN or EXPLAIN ANALYZE showed something about what kind of JIT processing was done, if any, to help with this kind of testing. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-09 15:28:19 -0500, Peter Eisentraut wrote: > On 3/6/18 15:16, Andres Freund wrote: > > 2) Don't load the JIT provider until fully needed. Right now > > jit_compile_expr() will load the jit provider even if not really > > needed. We should probably move the first two return blocks in > > llvm_compile_expr() into jit_compile_expr(), to avoid that. > > I see that you have implemented that, but it doesn't seem to have helped > with my make installcheck times. What's the exact comparison you're looking at? I think that's largely that unnecessary trivial queries get JITed and optimized, because the stats are entirely completely off. Greetings, Andres Freund
On 2018-03-09 15:42:24 -0500, Peter Eisentraut wrote: > For jit_optimize_above_cost, in my testing, any query where JIT paid > off was even faster with optimizing. So right now I don't see a need to > make this a separate setting. Maybe just make it an on/off setting for > experimenting. I'd prefer to be more defensive here. The time needed for JITing without optimization is roughly linear, whereas optimization is definitely not linear with input size. > For inlining, I haven't been able to get a clear picture. It's a bit > faster perhaps, but the optimizing dominates it. I don't have a clear > mental model for what kind of returns to expect from this. Yea, you need longrunning queries to benefit significantly. There's a *lot* more potential once some structural issues with the expression format (both with and without JIT) are fixed. > What I'd quite like is if EXPLAIN or EXPLAIN ANALYZE showed something > about what kind of JIT processing was done, if any, to help with this > kind of testing. Yea, I like that. I think we can only show that when timing is on, because otherwise the tests will not be stable depending on --with-jit being specified or not. So I'm thinking of displaying it similar to the "Planning time" piece, i.e. depending on es->summary being enabled. It'd be good to display the inline/optimize/emit times too. I think we can just store it in the JitContext. But the inline/optimize/emission times will only be meaningful when the query is actually executed, I don't see a way around that... Greetings, Andres Freund
On 3/9/18 15:56, Andres Freund wrote: > On 2018-03-09 15:28:19 -0500, Peter Eisentraut wrote: >> On 3/6/18 15:16, Andres Freund wrote: >>> 2) Don't load the JIT provider until fully needed. Right now >>> jit_compile_expr() will load the jit provider even if not really >>> needed. We should probably move the first two return blocks in >>> llvm_compile_expr() into jit_compile_expr(), to avoid that. >> >> I see that you have implemented that, but it doesn't seem to have helped >> with my make installcheck times. > > What's the exact comparison you're looking at? I'm just running `time make installcheck` with default settings, as described in my message from March 6. > I think that's largely that unnecessary trivial queries get JITed and > optimized, because the stats are entirely completely off. Right. I instrumented this a bit, and there are indeed two handfuls of queries that exceed the default JIT thresholds, as well as a few that trigger JIT because they disable some enable_* planner setting, as previously discussed. Should we throw in some ANALYZEs to avoid this? If I set jit_expressions = off, then the timings match again. It's perhaps a bit confusing that some of the jit_* settings take effect at plan time and some at execution time. At the moment, this mainly affects me reading the code ;-), but it would also have some effect on prepared statements and such. Also, jit_tuple_deforming is apparently used only when jit_expressions is on. So, we should work toward more clarity on all these different settings, what they are useful for, when to set them, how they interact. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3/9/18 15:42, Peter Eisentraut wrote: > The default of jit_above_cost = 500000 seems pretty good. I constructed > a query that cost about 450000 where the run time with and without JIT > were about even. This is obviously very limited testing, but it's a > good start. Actually, the default in your latest code is 100000, which per my analysis would be too low. Did you arrive at that setting based on testing? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-11 13:19:57 -0400, Peter Eisentraut wrote: > On 3/9/18 15:56, Andres Freund wrote: > > I think that's largely that unnecessary trivial queries get JITed and > > optimized, because the stats are entirely completely off. > > Right. I instrumented this a bit, and there are indeed two handfuls of > queries that exceed the default JIT thresholds, as well as a few that > trigger JIT because they disable some enable_* planner setting, as > previously discussed. > > Should we throw in some ANALYZEs to avoid this? Hm, I'd actually lean to just leave it as is for now. JITing halfway random queries isn't actually that bad... If we get fed up with the additional time after a while, we can do something then? > It's perhaps a bit confusing that some of the jit_* settings take effect > at plan time and some at execution time. At the moment, this mainly > affects me reading the code ;-), but it would also have some effect on > prepared statements and such. Not quite sure what you mean? > Also, jit_tuple_deforming is apparently used only when jit_expressions > is on. Right. I've not found a good place to hook into that has enough context to do JITed deforming otherwise. I'm inclined to just relegate jit_tuple_deforming to debugging status (i.e. exclude from show all, docs etc) for now. > So, we should work toward more clarity on all these different settings, > what they are useful for, when to set them, how they interact. Yep. Greetings, Andres Freund
On 3/11/18 14:25, Andres Freund wrote: >> It's perhaps a bit confusing that some of the jit_* settings take effect >> at plan time and some at execution time. At the moment, this mainly >> affects me reading the code ;-), but it would also have some effect on >> prepared statements and such. > Not quite sure what you mean? I haven't tested this, but what appears to be the case is that SET jit_above_cost = 0; PREPARE foo AS SELECT ....; SET jit_above_cost = infinity; EXECUTE foo; will use JIT, because jit_above_cost applies at plan time, whereas SET jit_expressions = on; PREPARE foo AS SELECT ....; SET jit_expressions = off; EXECUTE foo; will *not* use JIT, because jit_expressions applies at execution time. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-12 11:21:36 -0400, Peter Eisentraut wrote: > On 3/11/18 14:25, Andres Freund wrote: > >> It's perhaps a bit confusing that some of the jit_* settings take effect > >> at plan time and some at execution time. At the moment, this mainly > >> affects me reading the code ;-), but it would also have some effect on > >> prepared statements and such. > > Not quite sure what you mean? > > I haven't tested this, but what appears to be the case is that > > SET jit_above_cost = 0; > PREPARE foo AS SELECT ....; > SET jit_above_cost = infinity; > EXECUTE foo; > > will use JIT, because jit_above_cost applies at plan time, whereas > > SET jit_expressions = on; > PREPARE foo AS SELECT ....; > SET jit_expressions = off; > EXECUTE foo; > > will *not* use JIT, because jit_expressions applies at execution time. Right. It'd be easy to change that so jit_expressions=off wouldn't have an effect there anymore. But I'm not sure we want that? I don't have a strong feeling about this, except that I think jit_above_cost etc should apply at plan, not execution time. Greetings, Andres Freund
On 3/12/18 13:05, Andres Freund wrote: >> will *not* use JIT, because jit_expressions applies at execution time. > Right. It'd be easy to change that so jit_expressions=off wouldn't have > an effect there anymore. But I'm not sure we want that? I don't have a > strong feeling about this, except that I think jit_above_cost etc should > apply at plan, not execution time. I lean toward making everything apply at plan time. Not only is that easier in the current code structure, but over time we'll probably want to add more detailed planner knobs, e.g., perhaps an alternative cpu_tuple_cost, and all of that would be a planner setting. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-09 13:08:36 -0800, Andres Freund wrote: > On 2018-03-09 15:42:24 -0500, Peter Eisentraut wrote: > > What I'd quite like is if EXPLAIN or EXPLAIN ANALYZE showed something > > about what kind of JIT processing was done, if any, to help with this > > kind of testing. > > Yea, I like that. I think we can only show that when timing is on, > because otherwise the tests will not be stable depending on --with-jit > being specified or not. > > So I'm thinking of displaying it similar to the "Planning time" piece, > i.e. depending on es->summary being enabled. It'd be good to display the > inline/optimize/emit times too. I think we can just store it in the > JitContext. But the inline/optimize/emission times will only be > meaningful when the query is actually executed, I don't see a way around > that... Not yet really happy with how it exactly looks, but here's my current state: tpch_10[20923][1]=# ;explain (format text, analyze, timing off) SELECT relkind, relname FROM pg_class pgc WHERE relkind ='r'; ┌────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────────────────────────────────────────────────────────┤ │ Seq Scan on pg_class pgc (cost=0.00..15.70 rows=77 width=65) (actual rows=77 loops=1) │ │ Filter: (relkind = 'r'::"char") │ │ Rows Removed by Filter: 299 │ │ Planning time: 0.187 ms │ │ JIT: │ │ Functions: 4 │ │ Inlining: false │ │ Optimization: false │ │ Execution time: 72.229 ms │ └────────────────────────────────────────────────────────────────────────────────────────┘ (9 rows) tpch_10[20923][1]=# ;explain (format text, analyze, timing on) SELECT relkind, relname FROM pg_class pgc WHERE relkind ='r'; ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Seq Scan on pg_class pgc 
(cost=0.00..15.70 rows=77 width=65) (actual time=40.570..40.651 rows=77 loops=1) │ │ Filter: (relkind = 'r'::"char") │ │ Rows Removed by Filter: 299 │ │ Planning time: 0.138 ms │ │ JIT: │ │ Functions: 4 │ │ Inlining: false │ │ Inlining Time: 0.000 │ │ Optimization: false │ │ Optimization Time: 5.023 │ │ Emission Time: 34.987 │ │ Execution time: 46.277 ms │ └────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (12 rows) json (excerpt): │ "Triggers": [ ↵│ │ ], ↵│ │ "JIT": { ↵│ │ "Functions": 4, ↵│ │ "Inlining": false, ↵│ │ "Inlining Time": 0.000, ↵│ │ "Optimization": false, ↵│ │ "Optimization Time": 9.701, ↵│ │ "Emission Time": 52.951 ↵│ │ }, ↵│ │ "Execution Time": 70.292 ↵│ I'm not at all wedded to the current format, but I feel like that's the basic functionality needed? Right now the JIT bit will only be displayed if at least one JITed function has been emitted. Otherwise we'll just create noise for everyone. Currently a handful of explain outputs in the regression tests change output when compiled with JITing. Therefore I'm thinking of adding JITINFO or such option, which can be set to false for those tests? Maintaining duplicate output for them seems painful. Better ideas? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > Currently a handful of explain outputs in the regression tests change > output when compiled with JITing. Therefore I'm thinking of adding > JITINFO or such option, which can be set to false for those tests? > Maintaining duplicate output for them seems painful. Better ideas? Why not just suppress that info when COSTS OFF is specified? regards, tom lane
Hi, On 2018-03-12 17:14:00 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > Currently a handful of explain outputs in the regression tests change > > output when compiled with JITing. Therefore I'm thinking of adding > > JITINFO or such option, which can be set to false for those tests? > > Maintaining duplicate output for them seems painful. Better ideas? > > Why not just suppress that info when COSTS OFF is specified? I wondered about that too. But that'd mean it'd be harder to write a test that tests the planning bits of JITing (i.e. decision whether to use optimization & inlining or not) . Not sure if it's worth adding complexity to be able to do so. Greetings, Andres Freund
On 3/12/18 17:04, Andres Freund wrote: > │ JIT: │ > │ Functions: 4 │ > │ Inlining: false │ > │ Inlining Time: 0.000 │ > │ Optimization: false │ > │ Optimization Time: 5.023 │ > │ Emission Time: 34.987 │ The time quantities need some units. > │ Execution time: 46.277 ms │ like this :) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-13 10:25:49 -0400, Peter Eisentraut wrote: > On 3/12/18 17:04, Andres Freund wrote: > > │ JIT: │ > > │ Functions: 4 │ > > │ Inlining: false │ > > │ Inlining Time: 0.000 │ > > │ Optimization: false │ > > │ Optimization Time: 5.023 │ > > │ Emission Time: 34.987 │ > > The time quantities need some units. > > > │ Execution time: 46.277 ms │ > > like this :) Yea, I know. I was planning to start a thread about that. explain.c is littered with code like if (es->format == EXPLAIN_FORMAT_TEXT) appendStringInfo(es->str, "Planning time: %.3f ms\n", 1000.0 * plantime); else ExplainPropertyFloat("Planning Time", 1000.0 * plantime, 3, es); which, to me, is bonkers. I think we should add a 'const char *unit' parameter to at least ExplainProperty{Float,Integer,Long}? Or a *Unit version of them doing so, allowing a bit more gradual change? Greetings, Andres Freund
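A possible shape for such a unit-aware helper, reduced to a plain buffer for illustration (explain_property_float_unit is a hypothetical name; the real explain.c functions write into an ExplainState and must also handle the non-text formats, where the unit would typically be dropped or emitted as a separate field):

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical unit-aware property emitter: the text format gets
 * "Label: 1.234 ms" without callers open-coding appendStringInfo().
 */
static void
explain_property_float_unit(char *buf, size_t buflen, const char *qlabel,
							double value, int ndigits, const char *unit)
{
	snprintf(buf, buflen, "%s: %.*f %s", qlabel, ndigits, value, unit);
}
```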
On Mon, Mar 12, 2018 at 5:04 PM, Andres Freund <andres@anarazel.de> wrote: > Currently a handful of explain outputs in the regression tests change > output when compiled with JITing. Therefore I'm thinking of adding > JITINFO or such option, which can be set to false for those tests? Can we spell that JIT or at least JIT_INFO? I realize that EXPLAIN (JIT OFF) may sound like it's intended to disable JIT itself, but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not disable the use of actual buffers, only the display of buffer-related information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On 2018-03-13 14:36:44 -0400, Robert Haas wrote: > On Mon, Mar 12, 2018 at 5:04 PM, Andres Freund <andres@anarazel.de> wrote: > > Currently a handful of explain outputs in the regression tests change > > output when compiled with JITing. Therefore I'm thinking of adding > > JITINFO or such option, which can be set to false for those tests? > > Can we spell that JIT or at least JIT_INFO? The latter works, I don't have a strong opinion on that. For now I've just tied it to COSTS off. > I realize that EXPLAIN (JIT OFF) may sound like it's intended to > disable JIT itself Yea, that's what I'm concerned about. > , but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not > disable the use of actual buffers, only the display of buffer-related > information. Hm. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-03-13 14:36:44 -0400, Robert Haas wrote: >> I realize that EXPLAIN (JIT OFF) may sound like it's intended to >> disable JIT itself > Yea, that's what I'm concerned about. >> , but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not >> disable the use of actual buffers, only the display of buffer-related >> information. > Hm. FWIW, I agree with Robert's preference for just JIT here. The "info" bit isn't conveying anything. And we've never had any EXPLAIN options that actually change the behavior of the explained command, only ones that change the amount of info displayed. I don't see why we'd consider JIT an exception to that. regards, tom lane
On Thu, Mar 1, 2018 at 9:02 PM, Andres Freund <andres@anarazel.de> wrote: > Biggest changes: > - LLVM 3.9 - master are now supported. This includes a good chunk of > work by Pierre Ducroquet. I decided to try this on a CentOS 7.2 box. It has LLVM 3.9 in the 'epel' package repo, but unfortunately it only has clang 3.4. I suppose it's important to make this work for RHEL7 using only dependencies that can be met by the vendor package repos? Maybe someone who knows more about CentOS/RHEL could tell me if I'm mistaken and there is a way to get a more modern clang from a reputable repo that our packages could depend on, though I realise that clang is only a build dependency, not a runtime one. I'm unsure how that constrains things. clang: "clang version 3.4.2 (tags/RELEASE_34/dot2-final)" gcc and g++: "gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)" llvm: "3.9.1" First problem: clang: error: unknown argument: '-fexcess-precision=standard' clang: error: unknown argument: '-flto=thin' Ok, so I hacked src/Makefile.global.in to remove -flto=thin. It looks like -fexcess-precision=standard is coming from a configure test that was run against ${CC}, not against ${CLANG}, so I hacked the generated src/Makefile.global to remove that too, just to see if I could get past that. I don't know if there was another way to control floating point precision in ancient clang before they adopted the GCC-compatible flag, but it would seem slightly fishy to have .o files and .bc files compiled with different floating point settings because then you could get different answers depending on whether your expression is JITted. Then I could build successfully and make check passed. 
I did see one warning: In file included from execExpr.c:39: ../../../src/include/jit/jit.h:36:3: warning: redefinition of typedef 'JitProviderCallbacks' is a C11 feature [-Wtypedef-redefinition] } JitProviderCallbacks; ^ ../../../src/include/jit/jit.h:22:37: note: previous definition is here typedef struct JitProviderCallbacks JitProviderCallbacks; ^ That's a legit complaint. -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-14 10:32:40 +1300, Thomas Munro wrote: > I decided to try this on a CentOS 7.2 box. It has LLVM 3.9 in the > 'epel' package repo, but unfortunately it only has clang 3.4. That's a bit odd, given llvm and clang really live in the same repo... > clang: error: unknown argument: '-fexcess-precision=standard' > clang: error: unknown argument: '-flto=thin' > > Ok, so I hacked src/Makefile.global.in to remove -flto=thin. I think I can get actually rid of that entirely. > It looks > like -fexcess-precision=standard is coming from a configure test that > was run against ${CC}, not against ${CLANG}, so I hacked the generated > src/Makefile.global to remove that too, just to see if I could get > past that. Yea, I'd hoped we could avoid duplicating all the configure tests, but maybe not :(. > Then I could build successfully and make check passed. I did see one warning: > > In file included from execExpr.c:39: > ../../../src/include/jit/jit.h:36:3: warning: redefinition of typedef > 'JitProviderCallbacks' is a C11 feature [-Wtypedef-redefinition] > } JitProviderCallbacks; > ^ > ../../../src/include/jit/jit.h:22:37: note: previous definition is here > typedef struct JitProviderCallbacks JitProviderCallbacks; > ^ Yep. Removed the second, superfluous typedef. Will push a heavily rebased version in a bit, which will include a fix for this. Greetings, Andres Freund
Hi, I've pushed a revised and rebased version of my JIT patchset. The git tree is at https://git.postgresql.org/git/users/andresfreund/postgres.git in the jit branch https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit There's nothing hugely exciting, mostly lots of cleanups. - added some basic EXPLAIN output, displaying JIT options and time spent jitting (see todo below) JIT: Functions: 9 Generation Time: 4.604 Inlining: false Inlining Time: 0.000 Optimization: false Optimization Time: 0.585 Emission Time: 12.858 - Fixed bugs around alignment computations in tuple deforming. Wasn't able to trigger any bad consequences, but it was clearly wrong. - Worked a lot on making code more pgindent safe. There's still some minor layout damage, but it's mostly ok now. For that I had to add a bunch of helpers that make the code shorter - Freshly emitted functions now have proper attributes indicating architecture, floating point behaviour etc. That's what previously prevented the inliner from doing its job without forcing its hand. That yields a bit of a speedup. - reduced the size of the code a bit by deduplicating, in particular don't "manually" create signatures for function declarations anymore. Besides deduplicating, this also ensures that changed function signatures cause errors at code generation time. - fixed a number of FIXMEs etc - added a lot of comments - portability fixes (OSX, freebsd) Todo: - some build issues with old clang versions pointed out by Thomas Munro - when to take jit_expressions into account (both exec and plan or just latter) - EXPLAIN for queries that are JITed should display units. Starting thread about effort to not duplicate code for that - more explanations of type & function signature syncing - GUC docs (including postgresql.conf.sample) Thanks everyone, particularly Peter in this update, for helping me along! Regards, Andres
Hi, On 2018-03-13 15:29:33 -0700, Andres Freund wrote: > On 2018-03-14 10:32:40 +1300, Thomas Munro wrote: > > I decided to try this on a CentOS 7.2 box. It has LLVM 3.9 in the > > 'epel' package repo, but unfortunately it only has clang 3.4. > > That's a bit odd, given llvm and clang really live in the same repo... I don't really live in the RHEL world, but I wonder if https://developers.redhat.com/blog/2017/10/04/red-hat-adds-go-clangllvm-rust-compiler-toolsets-updates-gcc/ is relevant? Appears to be available on CentOS too https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/ I checked it out and supporting 3.4 would be a bit painful due to not being able to directly emit module summaries. We could support that by building the summaries separately using LLVM, but that'd be either slower for everyone, or we'd need somewhat finicky conditionals. > > clang: error: unknown argument: '-fexcess-precision=standard' > > clang: error: unknown argument: '-flto=thin' > > > > Ok, so I hacked src/Makefile.global.in to remove -flto=thin. > > I think I can get actually rid of that entirely. Err, no, not really. Would increase overhead due to separate module summary generation, so I'd rather not do it. > > It looks > > like -fexcess-precision=standard is coming from a configure test that > > was run against ${CC}, not against ${CLANG}, so I hacked the generated > > src/Makefile.global to remove that too, just to see if I could get > > past that. > > Yea, I'd hoped we could avoid duplicating all the configure tests, but > maybe not :(. I've mostly done that now (not pushed). I've created a new PGAC_PROG_VARCC_VARFLAGS_OPT(compiler variable, flag variable, testflag) function, which now is used to implement PGAC_PROG_CC_CFLAGS_OPT and PGAC_PROG_CC_VAR_OPT (similar for CXX). That makes it reasonable to test the variables clang recognizes separately. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-03-13 15:29:33 -0700, Andres Freund wrote: >> On 2018-03-14 10:32:40 +1300, Thomas Munro wrote: >>> It looks >>> like -fexcess-precision=standard is coming from a configure test that >>> was run against ${CC}, not against ${CLANG}, so I hacked the generated >>> src/Makefile.global to remove that too, just to see if I could get >>> past that. >> Yea, I'd hoped we could avoid duplicating all the configure tests, but >> maybe not :(. > I've mostly done that now (not pushed). I've created a new > PGAC_PROG_VARCC_VARFLAGS_OPT(compiler variable, flag variable, testflag) > function, which now is used to implement PGAC_PROG_CC_CFLAGS_OPT and > PGAC_PROG_CC_VAR_OPT (similar for CXX). That makes it reasonable to > test the variables clang recognizes separately. Meh. I agree with Thomas' concern that it's not clear we can or should just ignore discrepancies between the -f options supported by the C and CLANG compilers. Is it really so necessary to bring a second compiler into the mix for this? Why not just insist that JIT is only supported if the main build is done with clang, too? My experience with mixing results from different compilers is, eh, not positive. regards, tom lane
Hi, On 2018-03-14 22:36:52 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2018-03-13 15:29:33 -0700, Andres Freund wrote: > >> On 2018-03-14 10:32:40 +1300, Thomas Munro wrote: > >>> It looks > >>> like -fexcess-precision=standard is coming from a configure test that > >>> was run against ${CC}, not against ${CLANG}, so I hacked the generated > >>> src/Makefile.global to remove that too, just to see if I could get > >>> past that. > > >> Yea, I'd hoped we could avoid duplicating all the configure tests, but > >> maybe not :(. > > > I've mostly done that now (not pushed). I've created a new > > PGAC_PROG_VARCC_VARFLAGS_OPT(compiler variable, flag variable, testflag) > > function, which now is used to implement PGAC_PROG_CC_CFLAGS_OPT and > > PGAC_PROG_CC_VAR_OPT (similar for CXX). That makes it reasonable to > > test the variables clang recognizes separately. > > Meh. Why? The necessary configure code isn't that large: # Test for behaviour changing compiler flags, to keep compatibility # with compiler used for normal postgres code. XXX expand if test "$with_llvm" = yes ; then PGAC_PROG_VARCC_VARFLAGS_OPT(CLANG, BITCODE_CFLAGS, [-fno-strict-aliasing]) PGAC_PROG_VARCC_VARFLAGS_OPT(CLANG, BITCODE_CFLAGS, [-fwrapv]) PGAC_PROG_VARCC_VARFLAGS_OPT(CLANG, BITCODE_CFLAGS, [-fexcess-precision=standard]) AC_SUBST(BITCODE_CFLAGS, $BITCODE_CFLAGS) fi If the relevant clang version doesn't understand, say -fno-strict-aliasing, then we'd be in trouble already if it's required. After all we do support compiling postgres with clang. > I agree with Thomas' concern that it's not clear we can or should > just ignore discrepancies between the -f options supported by the C > and CLANG compilers. What's the precise concern here? We pass these flags to work around compiler issues / "defining our standard". As I said above, if we do not know the right flags to make clang behave sensibly, we're in trouble already. 
For a good part of the code we already want to be compatible with compiling postgres with one compiler, and linking to libraries compiled with something else. > Is it really so necessary to bring a second compiler into the mix for > this? Why not just insist that JIT is only supported if the main build > is done with clang, too? My experience with mixing results from different > compilers is, eh, not positive. I don't like that option. It doesn't really buy us much, a few lines of config code, and one additional configure option that should normally be autodetected from the environment. Requiring a specific compiler would be terrible on windows, seems out of line with how we do development, requires using clang which still generates a bit slower code, prevents getting gcc warnings etc. Greetings, Andres Freund
On Thu, Mar 15, 2018 at 1:20 AM, Andres Freund <andres@anarazel.de> wrote: > I don't really live in the RHEL world, but I wonder if > https://developers.redhat.com/blog/2017/10/04/red-hat-adds-go-clangllvm-rust-compiler-toolsets-updates-gcc/ > is relevant? Indeed. It might be a bit awkward for packagers to depend on something from Software Collections, for example because they come as separate trees in /opt that are by default not in your path or dynamic loader path - one needs to run everything via a scl wrapper or source the /opt/rh/llvm-toolset-7/enable file to get the appropriate PATH and LD_LIBRARY_PATH settings. But it seems doable. I just installed llvm-toolset-7 (the LLVM version is 4.0.1) on RHEL 7.4 and did a build of your tree at 475b4da439ae397345ab3df509e0e8eb26a8ff39. make installcheck passes for both the default config and a server forced to jit everything (I think) via: jit_above_cost = '0' jit_inline_above_cost = '0' jit_optimize_above_cost = '0' As a side note, this increases the runtime from approx 4 min to 18 min. Disabling jit completely with -1 in all of the above yields 3 min 48s, close to the default config time, maybe raising the question of how much coverage jit gets with the default config. The build was with the newer gcc 7.2.1 from the aforementioned collections, I'll try the system gcc as well. I run a buildfarm animal (katydid) on this RHEL. When JIT gets committed I'll make it use --with-llvm against this Software Collections LLVM. > Appears to be available on centos too > https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/ Indeed they are available for CentOS as well.
Hi, On 2018-03-15 17:19:23 +0100, Catalin Iacob wrote: > On Thu, Mar 15, 2018 at 1:20 AM, Andres Freund <andres@anarazel.de> wrote: > > I don't really live in the RHEL world, but I wonder if > > https://developers.redhat.com/blog/2017/10/04/red-hat-adds-go-clangllvm-rust-compiler-toolsets-updates-gcc/ > > is relevant? > > Indeed. It might be a bit awkward for packagers to depend on something > from Software Collections, for example because they come as separate > trees in /opt that are by default not in your path or dynamic loader > path - one needs to run everything via a scl wrapper or source the > /opt/rh/llvm-toolset-7/enable file to get the appropriate PATH and > LD_LIBRARY_PATH settings. But it seems doable. It'd be just for clang, and they're not *forced* to do it, it's an optional dependency. So I think I'm ok with that. > I just installed llvm-toolset-7 (the LLVM version is 4.0.1) on RHEL > 7.4 and did a build of your tree at > 475b4da439ae397345ab3df509e0e8eb26a8ff39. make installcheck passes for > both the default config and a server forced to jit everything (I > think) via: > jit_above_cost = '0' > jit_inline_above_cost = '0' > jit_optimize_above_cost = '0' > > As a side note, this increases the runtime from approx 4 min to 18 > min. Sure, that jits everything, which is obviously pointless to do for performance reasons. Especially SQL functions play very badly, because they're replanned every execution. But it's good for testing ;) > Disabling jit completely with -1 in all of the above yields 3 min > 48s, close to the default config time, maybe raising the question of how > much coverage jit gets with the default config. A bit, but not hugely so. I'm not too concerned about that. I plan to stand up a few buildfarm animals testing JITing with everything on w/ various LLVM versions. > The build was with the newer gcc 7.2.1 from the aforementioned > collections, I'll try the system gcc as well. I run a buildfarm animal > (katydid) on this RHEL. 
> When JIT gets committed I'll make it use --with-llvm against this Software Collections LLVM. Cool! Thanks for testing! Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-03-15 17:19:23 +0100, Catalin Iacob wrote: >> Indeed. It might be a bit awkward for packagers to depend on something >> from Software Collections, for example because they come as separate >> trees in /opt that are by default not in your path or dynamic loader >> path - one needs to run everything via a scl wrapper or source the >> /opt/rh/llvm-toolset-7/enable file to get the appropriate PATH and >> LD_LIBRARY_PATH settings, But it seems doable. > It'd be just for clang, and they're not *forced* to do it, it's an > optional dependency. So I think I'm ok with that. The "software collections" stuff was still in its infancy when I left Red Hat, so things might've changed, but I'm pretty sure at the time it was verboten for any mainstream package to depend on an SCL one. But they very probably wouldn't want postgresql depending on a compiler package even if the dependency was mainstream, so I rather doubt that you'll ever see an --enable-jit PG build out of there, making this most likely moot as far as the official RH package goes. I don't know what Devrim's opinion might be about PGDG. regards, tom lane
On 2018-03-15 12:33:08 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2018-03-15 17:19:23 +0100, Catalin Iacob wrote: > >> Indeed. It might be a bit awkward for packagers to depend on something > >> from Software Collections, for example because they come as separate > >> trees in /opt that are by default not in your path or dynamic loader > >> path - one needs to run everything via a scl wrapper or source the > >> /opt/rh/llvm-toolset-7/enable file to get the appropriate PATH and > >> LD_LIBRARY_PATH settings, But it seems doable. > > > It'd be just for clang, and they're not *forced* to do it, it's an > > optional dependency. So I think I'm ok with that. > > The "software collections" stuff was still in its infancy when I left > Red Hat, so things might've changed, but I'm pretty sure at the time > it was verboten for any mainstream package to depend on an SCL one. But we won't get PG 11 into RHEL7.x either way, no? > But they very probably wouldn't want postgresql depending on a > compiler package even if the dependency was mainstream, so I rather > doubt that you'll ever see an --enable-jit PG build out of there, > making this most likely moot as far as the official RH package goes. > I don't know what Devrim's opinion might be about PGDG. It'd be a build not runtime dependency, doesn't that change things? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-03-15 12:33:08 -0400, Tom Lane wrote: >> The "software collections" stuff was still in its infancy when I left >> Red Hat, so things might've changed, but I'm pretty sure at the time >> it was verboten for any mainstream package to depend on an SCL one. > But we won't get PG 11 into RHEL7.x either way, no? Well, they've been known to back-port newer releases of PG into older RHEL; I wouldn't necessarily assume it'd happen for 11, but maybe 12 or beyond could be made available for RHEL7 at some point. >> But they very probably wouldn't want postgresql depending on a >> compiler package even if the dependency was mainstream, so I rather >> doubt that you'll ever see an --enable-jit PG build out of there, >> making this most likely moot as far as the official RH package goes. >> I don't know what Devrim's opinion might be about PGDG. > It'd be a build not runtime dependency, doesn't that change things? How could it not be a runtime dependency? You're not proposing that we'd embed all of LLVM into a Postgres package are you? If you are, be assured that Red Hat will *never* ship that. Static linking/embedding of one package in another is forbidden for obvious maintainability reasons. I would think that other distros have similar policies. regards, tom lane
On 2018-03-15 12:42:54 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > It'd be a build not runtime dependency, doesn't that change things? > > How could it not be a runtime dependency? What we were talking about in this subthread was a dependency on clang, not LLVM. And that's just needed at buildtime, to generate the bitcode files (including synchronizing types / function signatures). For the yum.pg.o, which already depends on EPEL, there's a new enough LLVM version. There's a new enough version in RHEL proper, but it appears to only be there for mesa (llvm-private). > You're not proposing that we'd embed all of LLVM into a Postgres > package are you? No. Greetings, Andres Freund
On Thu, Mar 15, 2018 at 6:19 PM, Andres Freund <andres@anarazel.de> wrote: > What we were talking about in this subthread was about a depency on > clang, not LLVM. And that's just needed at buildtime, to generate the > bitcode files (including synchronizing types / function signatures). I was actually thinking of both the buildtime and runtime dependency because I did not realize the PGDG packages already depend on EPEL. > For the yum.pg.o, which already depends on EPEL, there's a new enough > LLVM version. There's a new enough version in RHEL proper, but it > appears to only be there for mesa (llvm-private). Indeed RHEL 7 comes with llvm-private for mesa but that doesn't seem kosher to use for other things. When I said packagers I was only thinking of PGDG. I was thinking the software collections would be the likely solution for the PGDG packages for both buildtime and runtime. But it seems using clang from software collections and LLVM from EPEL is also a possibility, assuming that the newer clang generates IR that the older libraries are guaranteed to be able to load. For RHEL proper, I would guess that PG11 is too late for RHEL8 which, according to history, should be coming soon. For RHEL9 I would really expect RedHat to add llvm and clang to proper RHEL and build/run against those, even if they add it only for Postgres (like they did for mesa). I really don't see them shipping without a major speedup for a major DB, also because in the meantime the JIT in PG will have matured. That's also why I find it important to support gcc and not restrict JIT to clang builds as I expect that RedHat and all other Linux distros want to build everything with gcc and asking them to switch to clang or give up JIT will put them in a hard spot. 
As far as I know clang does promise gcc compatibility in the sense that one can link together .o files compiled with both so I expect the combination not to cause issues (assuming the other compiler flags affecting binary compatibility are aligned).
Hi, On 2018-03-15 19:14:09 +0100, Catalin Iacob wrote: > For RHEL proper, I would guess that PG11 is too late for RHEL8 which, > according to history, should be coming soon. Yea. > For RHEL9 I would really expect RedHat to add llvm and clang to proper > RHEL and build/run against those, even if they add it only for > Postgres (like they did for mesa). By the looks of what's going to come for RHEL8 I think it already contains a suitable LLVM and clang (i.e. >= 3.9)? > As far as I know clang does promise gcc compatibility in > the sense that one can link together .o files compiled with both so I > expect the combination not to cause issues (assuming the other > compiler flags affecting binary compatibility are aligned). Right. But that's not even needed, as we just use plain old C ABI via dlsym(). Nothing needs to be linked together outside of dlsym(), so I'm not too concerned about that aspect. Greetings, Andres Freund
Hi, On 2018-03-13 16:40:32 -0700, Andres Freund wrote: > I've pushed a revised and rebased version of my JIT patchset. > The git tree is at > https://git.postgresql.org/git/users/andresfreund/postgres.git > in the jit branch > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit The biggest change is that this contains docbook docs. Please check it out, I'm not entirely sure the structure is perfect. I'll make a language prettying pass tomorrow, too tired for that now. > Todo: > - some build issues with old clang versions pointed out by Thomas Munro I've added the configure magic to properly detect capabilities of different clang versions. This doesn't resolve the 3.4 issues Thomas had reported however, because we still rely on -flto=thin. If necessary we could support it by adding a 'opt -module-summary $@ -o $@' to the %.bc rules, but that'd require some version specific handling. Given that it doesn't yet look necessary I'm loath to go there. > - when to take jit_expressions into account (both exec and plan or just > latter) It's just plan time now. There's a new 'jit' GUC that works *both* at execution time and plan time, and is documented as such. > - EXPLAIN for queries that are JITed should display units. Starting > thread about effort to not duplicate code for that done. > - GUC docs (including postgresql.conf.sample) done. > - more explanations of type & function signature syncing Still WIP. Greetings, Andres Freund
Hi, I've pushed a revised and rebased version of my JIT patchset. The git tree is at https://git.postgresql.org/git/users/andresfreund/postgres.git in the jit branch https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit There's a lot of tiny changes in here: - doc proofreading, addition of --with-llvm docs - comments - pgindent (jit files, not whole tree) - syncing of order between compiled and interpreted expressions in case statement - line-by-line review of expression compilation - fix of a small memory leak (missing pfree of the JIT context struct itself) - slight simplification of JIT resowner integration (no need to re-associate with parent) My current plan is to push the first few commits relatively soon, give the BF a few cycles to shake out. Set up a few BF animals with each supported LLVM version. Then continue merging. Greetings, Andres Freund
On Tue, Mar 20, 2018 at 11:14 PM, Andres Freund <andres@anarazel.de> wrote: > - doc proofreading, addition of --with-llvm docs The documentation builds and the resulting HTML looks good, and I like what you've written for users and also for developers in the README file. Perhaps it could use something about how to know it's working with EXPLAIN (or any other introspection there might be), but maybe you're still working on that? I did a proof-reading pass and have some minor language and typesetting suggestions. See comments below and attached patch (against current HEAD of your jit branch) which implements all of these changes, which of course you can feel free to take individual hunks from or ignore if you disagree! + <varlistentry> + <term><acronym>JIT</acronym></term> + <listitem> + <para> + <ulink url="https://en.wikipedia.org/wiki/Just-in-time_compilation">Just in Time + Compilation</ulink> + </para> + </listitem> + </varlistentry> The usual typesetting seems to be "just-in-time" (with hyphens), including on Wikipedia, various literature and in dictionaries. Here "compilation" doesn't seem to need a capital letter (it's not part of the acronym, it's not otherwise in a title context where capitalisation is called for). Similar comments apply to all other mentions, also changed (though I won't repeat them here; see patch). + <title>JIT accelerated operations</title> Although the existing documentation is not entirely consistent on this point, it almost always follows the US convention for titles: initial capitals except for a handful of small words ("is", "of", "to", ...), like the New York Times and unlike the (London) Times. So here "JIT Accelerated Operations". Same change elsewhere. + <title>What is <acronym>JIT</acronym></title> Needs a question mark. + As explained in <xref linkend="jit-decision"/> the configuration variables + xref <xref linkend="guc-jit-above-cost"/>, <xref Extra "xref" in the text. 
+ <varlistentry id="guc-jit-above-cost" xreflabel="guc-jit-above-cost"> xreflabel should use underscores not hyphens, and shouldn't have the leading "guc" (this breaks the resulting HTML). + Sets the planner's cutoff after which JIT compilation is used as part ... + Sets the planner's cutoff after which JIT compiled programs (see <xref s/after which/above which/. I see there was some nearby text that used "after which", but that was talking about time. I think writers might do s/JIT compiled/JIT-compiled/ here and some similar places (JIT-generated, JIT-accelerated etc), though I'm not sure about that and I doubt anyone cares so I didn't change it. + available, but no error will be raised. This allows to install JIT + support separately from the main <productname>PostgreSQL</productname> + package. "allows to ..." isn't correct English. You can say things like "allows <object> to <infinitive>", and "allows installation ...", and maybe "to allow installing ..." (though the last sounds a bit clumsy). I rewrote it as "This allows JIT support to be installed separately from ...". Same sort of thing in several places. + Writes the generated <productname>LLVM</productname> IR out to the + filesystem, inside <xref linkend="guc-data-directory"/>. This is only + useful for development of JIT. Seems a little vague... maybe "... for working on the internals of the JIT implementation"? Just to make clear it's not for end users unless curious. + E.g. instead of using a facility that can evaluate arbitrary arbitrary SQL + expressions to evaluate an SQL predicate like <literal>WHERE a.col = I'd write "For example" here. "E.g." seems more appropriate for fitting examples into tight spaces, like a remark in parentheses. YMMV. "arbitrary" is repeated. + Expression evaluation is used to evaluate <literal>WHERE</literal> + clauses, target lists, aggregates and projections. It can be accelerated + by generating code specific to the used expression. How about "... 
by generating code specific to each case." + Tuple deforming is the process of transforming an on-disk tuple (see <xref + linkend="heaptuple"/>) into its in-memory representation. It can be + accelerated by creating a function specific to the table layout and the + number of to be extracted columns. "... number of columns to be extracted." + <productname>LLVM</productname> has support for optimizing generated + code. Some of the optimizations are cheap enough to be performed whenever + <acronym>JIT</acronym> is used, others are only beneficial for more longer + running queries. ", while others are only beneficial for longer running queries." + <productname>PostgreSQL</productname> is very extensible and allows to + extend the set of datatypes, functions, operators, etc.; see <xref + linkend="extend"/>. In fact the builtin ones are implemented using nearly + the same mechanisms. This extensibility implies some overhead, e.g. due + to function calls (see <xref linkend="xfunc"/>). To reduce that overhead + <acronym>JIT</acronym> compilation can inline the body for small functions + into the expression using them. That allows to optimize away a significant + percentage of the overhead. "... and allows new datatypes, functions, operators and other database objects to be defined; ..." "... can inline the body *of* small functions ..." "... That allows a significant percentage of the overhead to be optimized away." + <acronym>JIT</acronym> is beneficial primarily for long-running, CPU bound, + queries. Frequently these will be analytical queries. For short queries I'd lose those commas. + made. Firstly, if the query is more costly than the <xref + linkend="guc-jit-optimize-above-cost"/> GUC expensive optimizations are I'd add a comma (and maybe "then") before "GUC". + For development and debugging purposes a few additional GUCs exist. <xref + linkend="guc-jit-dump-bitcode"/> allows to inspect the generated + bitcode. 
<xref linkend="guc-jit-debugging-support"/> allows GDB to see + generated functions. <xref linkend="guc-jit-profiling-support"/> emits + information so the <productname>perf</productname> profiler can interpret + JIT generated functions sensibly. "... allows the generated bitcode to be inspected." + <programlisting> +struct JitProviderCallbacks +{ + JitProviderResetAfterErrorCB reset_after_error; + JitProviderReleaseContextCB release_context; + JitProviderCompileExprCB compile_expr; +}; +extern void _PG_jit_provider_init(JitProviderCallbacks *cb); + </programlisting> Some weird tabs in here. Changed to spaces. About the README, some of this text is similar to the user-facing docs and the same comments apply, and there are also some nitpicks about apostrophes etc that I won't bother to repeat here. -- Thomas Munro http://www.enterprisedb.com
Hi,

On 2018-03-21 12:07:59 +1300, Thomas Munro wrote:
> The documentation builds and the resulting HTML looks good, and I like
> what you've written for users and also for developers in the README
> file.

Cool.

> Perhaps it could use something about how to know it's working
> with EXPLAIN (or any other introspection there might be), but maybe
> you're still working on that?

I'd not yet seen that as a priority, but I think it'd make sense to
show an example of that. Perhaps showing a select query from a
function, once with that function's cost set to the default, and once
with it set to something high?

> I did a proof-reading pass and have some minor language and
> typesetting suggestions.  See comments below and attached patch
> (against current HEAD of your jit branch) which implements all of
> these changes, which of course you can feel free to take individual
> hunks from or ignore if you disagree!

Yeah!

> + <varlistentry>
> + <term><acronym>JIT</acronym></term>
> + <listitem>
> + <para>
> + <ulink url="https://en.wikipedia.org/wiki/Just-in-time_compilation">Just in Time
> + Compilation</ulink>
> + </para>
> + </listitem>
> + </varlistentry>
>
> The usual typesetting seems to be "just-in-time" (with hyphens),
> including on Wikipedia, various literature and in dictionaries.  Here
> "compilation" doesn't seem to need a capital letter (it's not part of
> the acronym, it's not otherwise in a title context where
> capitalisation is called for).

I wasn't sure about that one, thanks.

> + <varlistentry id="guc-jit-above-cost" xreflabel="guc-jit-above-cost">
>
> xreflabel should use underscores not hyphens, and shouldn't have the
> leading "guc" (this breaks the resulting HTML).

Oops, yea, that's definitely a mistake.

> + Sets the planner's cutoff after which JIT compilation is used as part
> ...
> + Sets the planner's cutoff after which JIT compiled programs (see <xref
>
> s/after which/above which/.  I see there was some nearby text that
> used "after which", but that was talking about time.
>
> I think writers might do s/JIT compiled/JIT-compiled/ here and some
> similar places (JIT-generated, JIT-accelerated etc), though I'm not
> sure about that and I doubt anyone cares so I didn't change it.

I was wondering about that...

Thanks a lot for going through this!

Greetings,

Andres Freund
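The kind of demonstration Andres sketches could look roughly like the
following psql session. This is a hypothetical sketch only: the function
name and costs are made up, `jit_above_cost` is the GUC from the patch,
and the EXPLAIN integration being discussed did not exist yet at this
point in the thread, so no output is shown.

```sql
-- Hypothetical sketch: push the plan's estimated cost above or below
-- the JIT cutoff by changing a function's declared cost.
CREATE FUNCTION f(i int) RETURNS int LANGUAGE sql AS 'SELECT i + 1';

-- With the default function cost the plan should stay cheap, below
-- jit_above_cost, so no JIT would be used.
EXPLAIN SELECT f(i) FROM generate_series(1, 1000) s(i);

-- Inflate the function's cost estimate past the cutoff; the same query
-- should now be planned with JIT.
ALTER FUNCTION f(int) COST 100000;
EXPLAIN SELECT f(i) FROM generate_series(1, 1000) s(i);
```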
Hi,

On 2018-03-20 03:14:55 -0700, Andres Freund wrote:
> My current plan is to push the first few commits relatively soon, give
> the BF a few cycles to shake out. Set up a few BF animals with each
> supported LLVM version. Then continue merging.

I've done that. I'll set up a number of BF animals as soon as I've got
the buildfarm secrets for them.

- Andres
On Wed, Mar 21, 2018 at 1:50 PM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-20 03:14:55 -0700, Andres Freund wrote: >> My current plan is to push the first few commits relatively soon, give >> the BF a few cycles to shake out. Set up a few BF animals with each >> supported LLVM version. Then continue mergin. > > I've done that. I'll set up a number of BF animals as soon as I've got > the buildfarm secrets for them. Somehow your configure test correctly concludes that my $CC (clang 4.0) doesn't support -fexcess-precision=standard but that my $CXX (clang++ 4.0) does, despite producing a nearly identical warning: configure:5489: checking whether ccache cc supports -fexcess-precision=standard, for CFLAGS configure:5511: ccache cc -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard conftest.c >&5 cc: warning: optimization flag '-fexcess-precision=standard' is not supported [-Wignored-optimization-argument] configure:5511: $? = 0 configure: failed program was: ... configure:5521: result: no configure:5528: checking whether ccache c++ supports -fexcess-precision=standard, for CXXFLAGS configure:5556: ccache c++ -c -Wall -Wpointer-arith -fno-strict-aliasing -fwrapv -fexcess-precision=standard conftest.cpp >&5 c++: warning: optimization flag '-fexcess-precision=standard' is not supported [-Wignored-optimization-argument] configure:5556: $? = 0 configure:5572: result: yes So it goes into my $CXXFLAGS and then I get the same warning when compiling the three .cpp files in the tree. GCC also doesn't like that in C++ mode, but it seems to report an error (rather than a warning) so with g++ as your $CXX configure sees $? = 1 and draws the correct conclusion. 
$ gcc -fexcess-precision=standard -c test.c $ g++ -fexcess-precision=standard -c test.cpp cc1plus: sorry, unimplemented: -fexcess-precision=standard for C++ -- Thomas Munro http://www.enterprisedb.com
Hi,

On 2018-03-21 15:22:08 +1300, Thomas Munro wrote:
> Somehow your configure test correctly concludes that my $CC (clang
> 4.0) doesn't support -fexcess-precision=standard but that my $CXX
> (clang++ 4.0) does, despite producing a nearly identical warning:

Yea, there was a copy & pasto (s/ac_c_werror_flag/ac_cxx_werror_flag/),
sorry. If you rebase onto the committed version, it should work? I'll
push a rebased version of the jit tree soon.

Greetings,

Andres Freund
On 2018-03-20 19:29:55 -0700, Andres Freund wrote:
> Hi,
>
> On 2018-03-21 15:22:08 +1300, Thomas Munro wrote:
> > Somehow your configure test correctly concludes that my $CC (clang
> > 4.0) doesn't support -fexcess-precision=standard but that my $CXX
> > (clang++ 4.0) does, despite producing a nearly identical warning:
>
> Yea, there was a copy & pasto (s/ac_c_werror_flag/ac_cxx_werror_flag/),
> sorry. If you rebase onto the committed version, it should work? I'll
> push a version rebased of the jit tree soon.

Well, or not. Seems git.pg.o is down atm:

debug1: Next authentication method: publickey
debug1: Offering public key: RSA SHA256:cMbSa8YBm8AgaIeMtCSFvvPDrrrdadCxzQaFiWFe+7c /home/andres/.ssh/id_rsa
debug1: Server accepts key: pkalg ssh-rsa blen 277
<hang>

Will try tomorrow.

Greetings,

Andres Freund
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-03-20 19:29:55 -0700, Andres Freund wrote:
> > On 2018-03-21 15:22:08 +1300, Thomas Munro wrote:
> > > Somehow your configure test correctly concludes that my $CC (clang
> > > 4.0) doesn't support -fexcess-precision=standard but that my $CXX
> > > (clang++ 4.0) does, despite producing a nearly identical warning:
> >
> > Yea, there was a copy & pasto (s/ac_c_werror_flag/ac_cxx_werror_flag/),
> > sorry. If you rebase onto the committed version, it should work? I'll
> > push a version rebased of the jit tree soon.
>
> Well, or not. Seems git.pg.o is down atm:
>
> debug1: Next authentication method: publickey
> debug1: Offering public key: RSA SHA256:cMbSa8YBm8AgaIeMtCSFvvPDrrrdadCxzQaFiWFe+7c /home/andres/.ssh/id_rsa
> debug1: Server accepts key: pkalg ssh-rsa blen 277
> <hang>
>
> Will try tomorrow.

Andres contacted pginfra over IRC about this, but it seems that it
resolved itself shortly following (per a comment from Andres to that
effect), so, afaik, things are working properly. If anyone has issues
with git.p.o, please let us know, but hopefully all is good now.

Thanks!

Stephen
Hi,

On 2018-03-20 23:03:13 -0400, Stephen Frost wrote:
> Greetings,
>
> * Andres Freund (andres@anarazel.de) wrote:
> > On 2018-03-20 19:29:55 -0700, Andres Freund wrote:
> > > On 2018-03-21 15:22:08 +1300, Thomas Munro wrote:
> > > > Somehow your configure test correctly concludes that my $CC (clang
> > > > 4.0) doesn't support -fexcess-precision=standard but that my $CXX
> > > > (clang++ 4.0) does, despite producing a nearly identical warning:
> > >
> > > Yea, there was a copy & pasto (s/ac_c_werror_flag/ac_cxx_werror_flag/),
> > > sorry. If you rebase onto the committed version, it should work? I'll
> > > push a version rebased of the jit tree soon.
> >
> > Well, or not. Seems git.pg.o is down atm:
> >
> > debug1: Next authentication method: publickey
> > debug1: Offering public key: RSA SHA256:cMbSa8YBm8AgaIeMtCSFvvPDrrrdadCxzQaFiWFe+7c /home/andres/.ssh/id_rsa
> > debug1: Server accepts key: pkalg ssh-rsa blen 277
> > <hang>
> >
> > Will try tomorrow.
>
> Andres contacted pginfra over IRC about this, but it seems that it
> resolved itself shortly following (per a comment from Andres to that
> effect), so, afaik, things are working properly.

Indeed. I've pushed a rebased version now, that basically just fixes
the issue Thomas observed.

Thanks,

Andres Freund
On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote: > Indeed. I've pushed a rebased version now, that basically just fixes the > issue Thomas observed. I set up a 32 bit i386 virtual machine and installed Debian 9.4. Compiler warnings: gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -I/usr/lib/llvm-3.9/include -I../../../../src/include -D_GNU_SOURCE -c -o llvmjit.o llvmjit.c llvmjit.c: In function ‘llvm_get_function’: llvmjit.c:268:10: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] return (void *) addr; ^ llvmjit.c:270:10: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] return (void *) addr; ^ llvmjit.c: In function ‘llvm_resolve_symbol’: llvmjit.c:842:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] addr = (uint64_t) load_external_function(modname, funcname, ^ llvmjit.c:845:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] addr = (uint64_t) LLVMSearchForAddressOfSymbol(symname); ^ Then "make check" bombs: Program terminated with signal SIGSEGV, Segmentation fault. #0 0xac233453 in llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc const&, llvm::EVT, llvm::SDValue) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 (gdb) bt #0 0xac233453 in llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc const&, llvm::EVT, llvm::SDValue) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #1 0xac270c29 in llvm::TargetLowering::SimplifySetCC(llvm::EVT, llvm::SDValue, llvm::SDValue, llvm::ISD::CondCode, bool, llvm::TargetLowering::DAGCombinerInfo&, llvm::SDLoc const&) const () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #2 0xac11d3a8 in ?? 
() from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #3 0xac11ef0b in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #4 0xac12030e in llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AAResults&, llvm::CodeGenOpt::Level) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #5 0xac24ccec in llvm::SelectionDAGISel::CodeGenAndEmitDAG() () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #6 0xac24d239 in llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction const>, llvm::ilist_iterator<llvm::Instruction const>, bool&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #7 0xac25466f in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #8 0xac25773c in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #9 0xad356414 in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #10 0xabf5a019 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #11 0xabdefaeb in llvm::FPPassManager::runOnFunction(llvm::Function&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #12 0xabdefe35 in llvm::FPPassManager::runOnModule(llvm::Module&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #13 0xabdf019a in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #14 0xabdf037f in llvm::legacy::PassManager::run(llvm::Module&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #15 0xacb3c3de in std::_Function_handler<llvm::object::OwningBinary<llvm::object::ObjectFile> (llvm::Module&), llvm::orc::SimpleCompiler>::_M_invoke(std::_Any_data const&, llvm::Module&) () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #16 0xacb37d00 in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #17 0xacb384f8 in ?? 
() from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #18 0xacb388d5 in LLVMOrcAddEagerlyCompiledIR () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #19 0xae7bb3e4 in llvm_compile_module (context=0x20858a0) at llvmjit.c:539 #20 llvm_get_function (context=0x20858a0, funcname=0x21da818 "evalexpr_2_3") at llvmjit.c:244 #21 0xae7c333e in ExecRunCompiledExpr (state=0x2119634, econtext=0x211810c, isNull=0xbfdd138e "\207") at llvmjit_expr.c:2563 #22 0x00745e10 in ExecEvalExprSwitchContext (isNull=0xbfdd138e "\207", econtext=<optimized out>, state=0x2119634) at ../../../src/include/executor/executor.h:305 #23 ExecQual (econtext=<optimized out>, state=0x2119634) at ../../../src/include/executor/executor.h:374 #24 ExecNestLoop (pstate=<optimized out>) at nodeNestloop.c:214 #25 0x00748ddd in ExecProcNode (node=0x2118080) at ../../../src/include/executor/executor.h:239 #26 ExecSort (pstate=0x2117ff4) at nodeSort.c:107 #27 0x0071e9d2 in ExecProcNode (node=0x2117ff4) at ../../../src/include/executor/executor.h:239 #28 ExecutePlan (execute_once=<optimized out>, dest=0x0, direction=NoMovementScanDirection, numberTuples=<optimized out>, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x2117ff4, estate=0x2117ee8) at execMain.c:1729 #29 standard_ExecutorRun (queryDesc=0x207da50, direction=ForwardScanDirection, count=0, execute_once=1 '\001') at execMain.c:365 #30 0x00883e8d in PortalRunSelect (portal=portal@entry=0x20a7f58, forward=forward@entry=1 '\001', count=0, count@entry=2147483647, dest=0x21a8888) at pquery.c:932 #31 0x008856a0 in PortalRun (portal=0x20a7f58, count=2147483647, isTopLevel=1 '\001', run_once=1 '\001', dest=0x21a8888, altdest=0x21a8888, completionTag=0xbfdd1620 "") at pquery.c:773 #32 0x008808a7 in exec_simple_query (query_string=query_string@entry=0x205a628 "SELECT '' AS tf_12_ff_4, BOOLTBL1.*, BOOLTBL2.*\n FROM BOOLTBL1, BOOLTBL2\n WHERE BOOLTBL2.f1 = BOOLTBL1.f1 or BOOLTBL1.f1 = bool 'true'\n ORDER BY BOOLTBL1.f1, 
BOOLTBL2.f1;") at postgres.c:1121 #33 0x0088270e in PostgresMain (argc=1, argv=0x2083c44, dbname=<optimized out>, username=0x2083aa0 "munro") at postgres.c:4147 #34 0x00552cff in BackendRun (port=0x207d518) at postmaster.c:4409 #35 BackendStartup (port=0x207d518) at postmaster.c:4081 #36 ServerLoop () at postmaster.c:1754 #37 0x007fc68f in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1362 #38 0x0055475a in main (argc=<optimized out>, argv=<optimized out>) at main.c:228 (gdb) That's with clang-3.9 and llvm-3.9-dev installed, which configure automagically found. "make -C src/interfaces/ecpg/test check" consistently fails on my macOS machine: test compat_oracle/char_array ... stderr source FAILED *** /Users/munro/projects/postgresql/src/interfaces/ecpg/test/expected/compat_oracle-char_array.stdout 2018-03-21 09:46:33.000000000 +1300 --- /Users/munro/projects/postgresql/src/interfaces/ecpg/test/results/compat_oracle-char_array.stdout 2018-03-21 19:13:43.000000000 +1300 *************** *** 1,10 **** Full Str. : Short Ind. ! " ": " " -1 ! "AB ": "AB " 0 ! "ABCD ": "ABCD" 0 ! "ABCDE ": "ABCD" 5 ! "ABCDEF ": "ABCD" 6 ! "ABCDEFGHIJ": "ABCD" 10 GOOD-BYE!! --- 1,10 ---- Full Str. : Short Ind. ! "": "" 0 ! "AB": "AB" 0 ! "ABCD": "ABCD" 0 ! "ABCDE": "ABCDE" 0 ! "ABCDEF": "ABCDE" 6 ! "ABCDEFGHIJ": "ABCDE" 10 GOOD-BYE!! 
====================================================================== *** /Users/munro/projects/postgresql/src/interfaces/ecpg/test/expected/compat_oracle-char_array.stderr 2018-03-21 16:27:05.000000000 +1300 --- /Users/munro/projects/postgresql/src/interfaces/ecpg/test/results/compat_oracle-char_array.stderr 2018-03-21 19:13:43.000000000 +1300 *************** *** 90,96 **** [NO_PID]: sqlca: code: 0, state: 00000 [NO_PID]: ecpg_get_data on line 50: RESULT: ABCDE offset: -1; array: no [NO_PID]: sqlca: code: 0, state: 00000 - Warning: At least one column was truncated [NO_PID]: ecpg_execute on line 50: query: fetch C; with 0 parameter(s) on connection ecpg1_regression [NO_PID]: sqlca: code: 0, state: 00000 [NO_PID]: ecpg_execute on line 50: using PQexec --- 90,95 ---- ====================================================================== I couldn't immediately see what was going wrong there since I'm not too familiar with ecpg... That's with vendor cc/c++ and LLVM 5.0 and 6.0, using a couple of different clang versions. While trying out many combinations of versions of stuff on different OSes, I found another way to screw up that I wanted to report here. It's obvious that this is doomed if you know what's going on, but I thought the failure mode was interesting enough to report here. There is a hazard for people running systems where the vendor ships some version (possibly a mystery version) of clang in the PATH but you have to get LLVM separately (eg from ports/brew/whatever): 1. If you use macOS High Sierra's current /usr/bin/clang ("9.0.0"), ie the default if you didn't set CLANG to something else when you ran ./configure, and you build against LLVM 3.9, then llvm-lto gives this message during "make install": Invalid summary version 3, 1 expected error: can't create ModuleSummaryIndexObjectFile for buffer: Corrupted bitcode Then it segfaults! Presumably clang "9.0.0" derives from a more recent upstream version (why must they mess with the reported version?!). 
Apple's clang 9.0.0 bitcode works fine with LLVM 5.0. I don't have 4.0 to hand to test. 2. If you use FreeBSD 11's current /usr/bin/clang (4.0) and you build against LLVM 3.9 then it's the same: Invalid summary version 3, 1 expected error: can't create ModuleSummaryIndexObjectFile for buffer: Corrupted bitcode gmake[3]: *** [Makefile:252: install-postgres-bitcode] Segmentation fault (core dumped) It works fine with 4.0 or 5.0, as expected. Neither of these cases should be too surprising, and users of those operating systems can easily get a newer LLVM or an older -- it was just interesting to see exactly what goes wrong and exactly when. I suppose there could be a configure test to see if your $CLANG can play nicely with your $LLVM_CONFIG. -- Thomas Munro http://www.enterprisedb.com
On Wed, Mar 21, 2018 at 4:07 AM, Andres Freund <andres@anarazel.de> wrote: > Indeed. I've pushed a rebased version now, that basically just fixes the > issue Thomas observed. Testing 2d6f2fba from your repository configured --with-llvm I noticed some weird things in the configure output. Without --enable-debug: configure: using compiler=gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16) configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 configure: using CPPFLAGS= -D_GNU_SOURCE configure: using LDFLAGS= -L/opt/rh/llvm-toolset-7/root/usr/lib64 -Wl,--as-needed configure: using CXX=g++ configure: using CXXFLAGS=-Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2 configure: using CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang configure: using BITCODE_CFLAGS= -fno-strict-aliasing -fwrapv -O2 configure: using BITCODE_CXXFLAGS= -fno-strict-aliasing -fwrapv -O2 BITCODE_CXXFLAGS With --enable-debug: configure: using compiler=gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16) configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 configure: using CPPFLAGS= -D_GNU_SOURCE configure: using LDFLAGS= -L/opt/rh/llvm-toolset-7/root/usr/lib64 -Wl,--as-needed configure: using CXX=g++ configure: using CXXFLAGS=-Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -g -O2 configure: using CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang configure: using BITCODE_CFLAGS= -fno-strict-aliasing -fwrapv -O2 configure: using BITCODE_CXXFLAGS= -fno-strict-aliasing -fwrapv -O2 BITCODE_CXXFLAGS So I unconditionally get one -g added to CXXFLAGS regardless of 
whether I specify --enable-debug or not. And --enable-debug results in -g -g in CXXFLAGS. Didn't get to look at the code yet, maybe that comes from: $ llvm-config --cxxflags -I/opt/rh/llvm-toolset-7/root/usr/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -std=c++11 -ffunction-sections -fdata-sections -O2 -g -DNDEBUG -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS But on the other hand there are lots of other flags in there that don't end up in CXXFLAGS. BTW, you should probably specify -std=c++11 (or whatever you need) as various g++ and clang++ versions default to various things. Will the required C++ standard be based on the requirements of the C++ code in the PG tree or will you take it from LLVM's CXXFLAGS? Can --std=c++11 and --std=c++14 compiled .o files be linked together? Or in other words, in case in the future LLVM starts requiring C++14 but the code in the PG tree you wrote still builds with C++11, will PG upgrade it's requirement with LLVM or will it stay with the older standard? Also, my CXXFLAGS did not get -fexcess-precision=standard neither did BITCODE_CFLAGS nor BITCODE_CXXFLAGS. In case it's interesting: $ llvm-config --cflags -I/opt/rh/llvm-toolset-7/root/usr/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-comment -ffunction-sections -fdata-sections -O2 -g -DNDEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 2. 
Unlike all the other *FLAGS, BITCODE_CXXFLAGS includes itself on the right hand side of the equal configure: using BITCODE_CXXFLAGS= -fno-strict-aliasing -fwrapv -O2 BITCODE_CXXFLAGS
On Wed, Mar 21, 2018 at 8:06 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote: >> Indeed. I've pushed a rebased version now, that basically just fixes the >> issue Thomas observed. > > I set up a 32 bit i386 virtual machine and installed Debian 9.4. Next up, I have an arm64 system running Debian 9.4. It bombs in "make check" and in simple tests: postgres=# set jit_above_cost = 0; SET postgres=# select 42; <boom> The stack looks like this: Program received signal SIGABRT, Aborted. __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x0000ffff8f65adf4 in __GI_abort () at abort.c:89 #2 0x0000ffff83e2de40 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #3 0x0000ffff83e2bd4c in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #4 0x0000ffff83e2bd98 in std::terminate() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #5 0x0000ffff83e2c01c in __cxa_throw () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #6 0x0000ffff83e544bc in std::__throw_bad_function_call() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #7 0x0000ffff85176a2c in LLVMOrcCreateInstance () from /usr/lib/aarch64-linux-gnu/libLLVM-3.9.so.1 #8 0x0000ffff865c4db0 in llvm_session_initialize () at llvmjit.c:643 #9 llvm_create_context (jitFlags=9) at llvmjit.c:136 #10 0x0000ffff865cf8c8 in llvm_compile_expr (state=0xaaaaf2300208) at llvmjit_expr.c:132 #11 0x0000aaaab64ca71c in ExecReadyExpr (state=state@entry=0xaaaaf2300208) at execExpr.c:627 #12 0x0000aaaab64cd7b8 in ExecBuildProjectionInfo (targetList=<optimized out>, econtext=<optimized out>, slot=<optimized out>, parent=parent@entry=0xaaaaf22ffde0, inputDesc=inputDesc@entry=0x0) at execExpr.c:471 #13 0x0000aaaab64e0028 in ExecAssignProjectionInfo 
(planstate=planstate@entry=0xaaaaf22ffde0, inputDesc=inputDesc@entry=0x0) at execUtils.c:460 #14 0x0000aaaab64fca28 in ExecInitResult (node=node@entry=0xaaaaf224e1a0, estate=estate@entry=0xaaaaf22ffbc8, eflags=eflags@entry=16) at nodeResult.c:221 #15 0x0000aaaab64db828 in ExecInitNode (node=0xaaaaf224e1a0, node@entry=0xaaaaf227a610, estate=estate@entry=0xaaaaf22ffbc8, eflags=eflags@entry=16) at execProcnode.c:164 #16 0x0000aaaab64d6a70 in InitPlan (eflags=16, queryDesc=0xaaaaf226d808) at execMain.c:1051 #17 standard_ExecutorStart (queryDesc=0xaaaaf226d808, eflags=16) at execMain.c:266 #18 0x0000aaaab662dbec in PortalStart (portal=0x400, portal@entry=0xaaaaf22b04d8, params=0x59004077f060bc65, params@entry=0x0, eflags=43690, eflags@entry=0, snapshot=0xaaaab689df58, snapshot@entry=0x0) at pquery.c:520 #19 0x0000aaaab6628b18 in exec_simple_query (query_string=query_string@entry=0xaaaaf224c3d8 "select 42;") at postgres.c:1082 #20 0x0000aaaab662a6a8 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xaaaaf2278b70, dbname=<optimized out>, username=<optimized out>) at postgres.c:4147 #21 0x0000aaaab631cdd0 in BackendRun (port=0xaaaaf226d410) at postmaster.c:4409 #22 BackendStartup (port=0xaaaaf226d410) at postmaster.c:4081 #23 ServerLoop () at postmaster.c:1754 #24 0x0000aaaab65ab048 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1362 #25 0x0000aaaab631e7cc in main (argc=3, argv=0xaaaaf2246f70) at main.c:228 Taking frame 6 at face value, it appears to be trying to call an empty std::function (that's what the exception std::bad_function_call means). No clue how or why though. With LLVM 5.0 (from backports) it seemed to get further (?): Program terminated with signal SIGABRT, Aborted. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. 
(gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x0000ffffa9642df4 in __GI_abort () at abort.c:89 #2 0x0000ffff9d306e40 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #3 0x0000ffff9d304d4c in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #4 0x0000ffff9d304d98 in std::terminate() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #5 0x0000ffff9d30501c in __cxa_throw () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #6 0x0000ffff9d32d4bc in std::__throw_bad_function_call() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #7 0x0000ffff9eac7dc4 in ?? () from /usr/lib/aarch64-linux-gnu/libLLVM-5.0.so.1 #8 0x0000aaaadd2dced0 in ?? () #9 0x0000000040100401 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) (gdb) Configure was run like this: ./configure \ --prefix=$HOME/install \ --enable-cassert \ --enable-debug \ --with-llvm \ CC="ccache gcc" \ CXX="ccache g++" \ CLANG="ccache /usr/lib/llvm-3.9/bin/clang" \ LLVM_CONFIG="/usr/lib/llvm-3.9/bin/llvm-config" I can provide access to this thing if you think that'd be useful. -- Thomas Munro http://www.enterprisedb.com
Hi,

On 2018-03-21 20:06:49 +1300, Thomas Munro wrote:
> On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote:
> > Indeed. I've pushed a rebased version now, that basically just fixes the
> > issue Thomas observed.
>
> I set up a 32 bit i386 virtual machine and installed Debian 9.4.
> Compiler warnings:

Was that with a 64bit CPU and 32bit OS, or actually a 32bit CPU?

> gcc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
> -fwrapv -fexcess-precision=standard -g -O2 -fPIC
> -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS
> -D_GNU_SOURCE -I/usr/lib/llvm-3.9/include -I../../../../src/include
> -D_GNU_SOURCE -c -o llvmjit.o llvmjit.c
> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:268:10: warning: cast to pointer from integer of different
> size [-Wint-to-pointer-cast]
>   return (void *) addr;
>          ^
> llvmjit.c:270:10: warning: cast to pointer from integer of different
> size [-Wint-to-pointer-cast]
>   return (void *) addr;
>          ^
> llvmjit.c: In function ‘llvm_resolve_symbol’:
> llvmjit.c:842:10: warning: cast from pointer to integer of different
> size [-Wpointer-to-int-cast]
>   addr = (uint64_t) load_external_function(modname, funcname,
>          ^
> llvmjit.c:845:10: warning: cast from pointer to integer of different
> size [-Wpointer-to-int-cast]
>   addr = (uint64_t) LLVMSearchForAddressOfSymbol(symname);
>          ^

Hrmpf, those need to be fixed.

> While trying out many combinations of versions of stuff on different
> OSes, I found another way to screw up that I wanted to report here.
> It's obvious that this is doomed if you know what's going on, but I
> thought the failure mode was interesting enough to report here. There
> is a hazard for people running systems where the vendor ships some
> version (possibly a mystery version) of clang in the PATH but you have
> to get LLVM separately (eg from ports/brew/whatever):
>
> 1. If you use macOS High Sierra's current /usr/bin/clang ("9.0.0"),
> ie the default if you didn't set CLANG to something else when you ran
> ./configure, and you build against LLVM 3.9, then llvm-lto gives this
> message during "make install":
>
> Invalid summary version 3, 1 expected
> error: can't create ModuleSummaryIndexObjectFile for buffer: Corrupted bitcode
>
> Then it segfaults!

Gah, that's not desirable :/. It's fine that it doesn't work, but it'd
be better if it didn't segfault. I guess I could just try by
corrupting the file explicitly...

> Neither of these cases should be too surprising, and users of those
> operating systems can easily get a newer LLVM or an older -- it was
> just interesting to see exactly what goes wrong and exactly when. I
> suppose there could be a configure test to see if your $CLANG can play
> nicely with your $LLVM_CONFIG.

Not precisely sure how. I think suggesting the use of a compatible
clang is going to be sufficient for most cases...

Greetings,

Andres Freund
On Thu, Mar 22, 2018 at 8:47 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-03-21 20:06:49 +1300, Thomas Munro wrote:
>> On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote:
>> > Indeed. I've pushed a rebased version now, that basically just fixes the
>> > issue Thomas observed.
>>
>> I set up a 32 bit i386 virtual machine and installed Debian 9.4.
>> Compiler warnings:
>
> Was that with a 64bit CPU and 32bit OS, or actually a 32bit CPU?

64 bit CPU, 32 bit OS. I didn't try Debian multi-arch i386 support on
an amd64 system, but that's probably an easier way to do this if you
already have one of those...

-- 
Thomas Munro
http://www.enterprisedb.com
Hi, On 2018-03-21 08:26:28 +0100, Catalin Iacob wrote: > On Wed, Mar 21, 2018 at 4:07 AM, Andres Freund <andres@anarazel.de> wrote: > > Indeed. I've pushed a rebased version now, that basically just fixes the > > issue Thomas observed. > > Testing 2d6f2fba from your repository configured --with-llvm I noticed > some weird things in the configure output. Thanks! > Without --enable-debug: > configure: using compiler=gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16) > configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing > -fwrapv -fexcess-precision=standard -O2 > configure: using CPPFLAGS= -D_GNU_SOURCE > configure: using LDFLAGS= -L/opt/rh/llvm-toolset-7/root/usr/lib64 > -Wl,--as-needed > configure: using CXX=g++ > configure: using CXXFLAGS=-Wall -Wpointer-arith -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing > -fwrapv -g -O2 > configure: using CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang > configure: using BITCODE_CFLAGS= -fno-strict-aliasing -fwrapv -O2 > configure: using BITCODE_CXXFLAGS= -fno-strict-aliasing -fwrapv -O2 > BITCODE_CXXFLAGS > > With --enable-debug: > configure: using compiler=gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16) > configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing > -fwrapv -fexcess-precision=standard -g -O2 > configure: using CPPFLAGS= -D_GNU_SOURCE > configure: using LDFLAGS= -L/opt/rh/llvm-toolset-7/root/usr/lib64 > -Wl,--as-needed > configure: using CXX=g++ > configure: using CXXFLAGS=-Wall -Wpointer-arith -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing > -fwrapv -g -g -O2 > configure: using CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang > configure: using BITCODE_CFLAGS= -fno-strict-aliasing -fwrapv -O2 > 
configure: using BITCODE_CXXFLAGS= -fno-strict-aliasing -fwrapv -O2 > BITCODE_CXXFLAGS > > So I unconditionally get one -g added to CXXFLAGS regardless of > whether I specify --enable-debug or not. And --enable-debug results in > -g -g in CXXFLAGS. Aaah, nice catch. I was missing an unset CXXFLAGS. > BTW, you should probably specify -std=c++11 (or whatever you need) as > various g++ and clang++ versions default to various things. Will the > required C++ standard be based on the requirements of the C++ code in > the PG tree or will you take it from LLVM's CXXFLAGS? It's currently already taken from LLVM's CXXFLAGS if present there, but just specified for LLVM wrapping files. Relevant code is in src/backend/jit/llvm/Makefile: # All files in this directory link to LLVM. CFLAGS += $(LLVM_CFLAGS) CXXFLAGS += $(LLVM_CXXFLAGS) override CPPFLAGS := $(LLVM_CPPFLAGS) $(CPPFLAGS) SHLIB_LINK += $(LLVM_LIBS) Since there's no other C++ code, and I don't foresee anything else, I'm not planning to set the global CXXFLAGS differently atm. Would just make it more complicated to use the right flags from LLVM's CXXFLAGS. > Can --std=c++11 and --std=c++14 compiled .o files be linked together? Yes, with some limitations. Since in the PG case all the inter-file calls use the C ABI, there wouldn't be a problem. > Also, my CXXFLAGS did not get -fexcess-precision=standard neither did > BITCODE_CFLAGS nor BITCODE_CXXFLAGS. Yea, that's to be expected, gcc doesn't know that flag for C++ in most versions. Some vendors have patched it in. > 2. Unlike all the other *FLAGS, BITCODE_CXXFLAGS includes itself on > the right hand side of the equal > configure: using BITCODE_CXXFLAGS= -fno-strict-aliasing -fwrapv -O2 > BITCODE_CXXFLAGS Hum, that's definitely a typo bug (missing $ when adding to BITCODE_CXXFLAGS). Greetings, Andres Freund
On 2018-03-21 23:10:27 +1300, Thomas Munro wrote: > On Wed, Mar 21, 2018 at 8:06 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: > > On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote: > >> Indeed. I've pushed a rebased version now, that basically just fixes the > >> issue Thomas observed. > > > > I set up a 32 bit i386 virtual machine and installed Debian 9.4. > > Next up, I have an arm64 system running Debian 9.4. It bombs in > "make check" and in simple tests: Hum. Is it running a 32bit or 64 bit kernel/os? > Program received signal SIGABRT, Aborted. > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. > (gdb) bt > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > #1 0x0000ffff8f65adf4 in __GI_abort () at abort.c:89 > #2 0x0000ffff83e2de40 in __gnu_cxx::__verbose_terminate_handler() () > from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #3 0x0000ffff83e2bd4c in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #4 0x0000ffff83e2bd98 in std::terminate() () from > /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #5 0x0000ffff83e2c01c in __cxa_throw () from > /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #6 0x0000ffff83e544bc in std::__throw_bad_function_call() () from > /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #7 0x0000ffff85176a2c in LLVMOrcCreateInstance () from > /usr/lib/aarch64-linux-gnu/libLLVM-3.9.so.1 > #8 0x0000ffff865c4db0 in llvm_session_initialize () at llvmjit.c:643 > #9 llvm_create_context (jitFlags=9) at llvmjit.c:136 > #10 0x0000ffff865cf8c8 in llvm_compile_expr (state=0xaaaaf2300208) at > llvmjit_expr.c:132 Hm. > With LLVM 5.0 (from backports) it seemed to get further (?): > > Program terminated with signal SIGABRT, Aborted. > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. 
> (gdb) bt > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > #1 0x0000ffffa9642df4 in __GI_abort () at abort.c:89 > #2 0x0000ffff9d306e40 in __gnu_cxx::__verbose_terminate_handler() () > from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #3 0x0000ffff9d304d4c in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #4 0x0000ffff9d304d98 in std::terminate() () from > /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #5 0x0000ffff9d30501c in __cxa_throw () from > /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #6 0x0000ffff9d32d4bc in std::__throw_bad_function_call() () from > /usr/lib/aarch64-linux-gnu/libstdc++.so.6 > #7 0x0000ffff9eac7dc4 in ?? () from /usr/lib/aarch64-linux-gnu/libLLVM-5.0.so.1 > #8 0x0000aaaadd2dced0 in ?? () > #9 0x0000000040100401 in ?? () > Backtrace stopped: previous frame identical to this frame (corrupt stack?) > (gdb) > > Configure was run like this: > > ./configure \ > --prefix=$HOME/install \ > --enable-cassert \ > --enable-debug \ > --with-llvm \ > CC="ccache gcc" \ > CXX="ccache g++" \ > CLANG="ccache /usr/lib/llvm-3.9/bin/clang" \ > LLVM_CONFIG="/usr/lib/llvm-3.9/bin/llvm-config" I guess you'd swapped out 3.9 for 5.0? > I can provide access to this thing if you think that'd be useful. Perhaps that's necessary. Before that though, could you check how the backtrace looks with LLVM debug symbols installed? Greetings, Andres Freund
Hi, On 2018-03-22 09:00:19 +1300, Thomas Munro wrote: > On Thu, Mar 22, 2018 at 8:47 AM, Andres Freund <andres@anarazel.de> wrote: > > On 2018-03-21 20:06:49 +1300, Thomas Munro wrote: > >> On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote: > >> > Indeed. I've pushed a rebased version now, that basically just fixes the > >> > issue Thomas observed. > >> > >> I set up a 32 bit i386 virtual machine and installed Debian 9.4. > >> Compiler warnings: > > > > Was that with a 64bit CPU and 32bit OS, or actually a 32bit CPU? > > 64 bit CPU, 32 bit OS. I didn't try Debian multi-arch i386 support on > an amd64 system, but that's probably an easier way to do this if you > already have one of those... Ah, then I think I might know what happened. Does it start to work if you replace the auto-detected cpu with "x86"? I think what might happen is that it generates 64bit code, because of the detected CPU name. Let me set up a chroot; in this case I should be able to emulate this pretty easily... Greetings, Andres Freund
On Thu, Mar 22, 2018 at 9:06 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-21 23:10:27 +1300, Thomas Munro wrote: >> Next up, I have an arm64 system running Debian 9.4. It bombs in >> "make check" and in simple tests: > > Hum. Is it running a 32bit or 64 bit kernel/os? checking size of void *... 8 >> ./configure \ >> --prefix=$HOME/install \ >> --enable-cassert \ >> --enable-debug \ >> --with-llvm \ >> CC="ccache gcc" \ >> CXX="ccache g++" \ >> CLANG="ccache /usr/lib/llvm-3.9/bin/clang" \ >> LLVM_CONFIG="/usr/lib/llvm-3.9/bin/llvm-config" > > I guess you'd swapped out 3.9 for 5.0? Right, in the second backtrace I showed it was 5.0 (for both clang and llvm-config). >> I can provide access to this thing if you think that'd be useful. > > Perhaps that's necessary. Before that though, could you check how the > backtrace looks with LLVM debug symbols installed? After installing libllvm3.9-dgb: (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x0000ffffa1ae1df4 in __GI_abort () at abort.c:89 #2 0x0000ffff9634ee40 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #3 0x0000ffff9634cd4c in ?? 
() from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #4 0x0000ffff9634cd98 in std::terminate() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #5 0x0000ffff9634d01c in __cxa_throw () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 #6 0x0000ffff963754bc in std::__throw_bad_function_call() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6 warning: Could not find DWO CU CMakeFiles/LLVMOrcJIT.dir/OrcCBindings.cpp.dwo(0x691f70a1d71f901d) referenced by CU at offset 0x11ffa [in module /usr/lib/debug/.build-id/09/04bb3e707305e175216a59bc3598c2b194775a.debug] #7 0x0000ffff97697a2c in LLVMOrcCreateInstance () at /usr/include/c++/6/functional:2126 #8 0x0000ffff98accdb0 in llvm_session_initialize () at llvmjit.c:643 #9 llvm_create_context (jitFlags=9) at llvmjit.c:136 #10 0x0000ffff98ad78c8 in llvm_compile_expr (state=0xaaaafce73208) at llvmjit_expr.c:132 #11 0x0000aaaac1bd671c in ExecReadyExpr (state=state@entry=0xaaaafce73208) at execExpr.c:627 #12 0x0000aaaac1bd97b8 in ExecBuildProjectionInfo (targetList=<optimized out>, econtext=<optimized out>, slot=<optimized out>, parent=parent@entry=0xaaaafce72de0, inputDesc=inputDesc@entry=0x0) at execExpr.c:471 #13 0x0000aaaac1bec028 in ExecAssignProjectionInfo (planstate=planstate@entry=0xaaaafce72de0, inputDesc=inputDesc@entry=0x0) at execUtils.c:460 #14 0x0000aaaac1c08a28 in ExecInitResult (node=node@entry=0xaaaafcdc11a0, estate=estate@entry=0xaaaafce72bc8, eflags=eflags@entry=16) at nodeResult.c:221 #15 0x0000aaaac1be7828 in ExecInitNode (node=0xaaaafcdc11a0, node@entry=0xaaaafcded630, estate=estate@entry=0xaaaafce72bc8, eflags=eflags@entry=16) at execProcnode.c:164 #16 0x0000aaaac1be2a70 in InitPlan (eflags=16, queryDesc=0xaaaafcde0808) at execMain.c:1051 #17 standard_ExecutorStart (queryDesc=0xaaaafcde0808, eflags=16) at execMain.c:266 #18 0x0000aaaac1d39bec in PortalStart (portal=0x400, portal@entry=0xaaaafce234d8, params=0x274580612ce0a285, params@entry=0x0, eflags=43690, eflags@entry=0, snapshot=0xaaaac1fa9f58, 
snapshot@entry=0x0) at pquery.c:520 #19 0x0000aaaac1d34b18 in exec_simple_query (query_string=query_string@entry=0xaaaafcdbf3d8 "select 42;") at postgres.c:1082 #20 0x0000aaaac1d366a8 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xaaaafcdebb90, dbname=<optimized out>, username=<optimized out>) at postgres.c:4147 #21 0x0000aaaac1a28dd0 in BackendRun (port=0xaaaafcde0410) at postmaster.c:4409 #22 BackendStartup (port=0xaaaafcde0410) at postmaster.c:4081 #23 ServerLoop () at postmaster.c:1754 #24 0x0000aaaac1cb7048 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1362 #25 0x0000aaaac1a2a7cc in main (argc=3, argv=0xaaaafcdb9f70) at main.c:228 GDB also printed a ton of messages like this for many LLVM .cpp files: warning: Could not find DWO CU CMakeFiles/LLVMLibDriver.dir/LibDriver.cpp.dwo(0x117022032f862080) referenced by CU at offset 0x187da [in module /usr/lib/debug/.build-id/09/04bb3e707305e175216a59bc3598c2b194775a.debug] ... We can see that it's missing some debug info, any clues about that? Here's LLVM 3.9's LLVMOrcCreateInstance function: https://github.com/llvm-mirror/llvm/blob/6531c3164cb9edbfb9f4b43ca383810a94ca5aa0/lib/ExecutionEngine/Orc/OrcCBindings.cpp#L15 Without digging through more source code I'm not sure which line of that is invoking our uninvocable std::function... The server also prints this: terminate called after throwing an instance of 'std::bad_function_call' what(): bad_function_call Aside from whatever problem is causing this, we can see that there is no top-level handling of exceptions. That's probably fine if we are in a no throw scenario (unless there is something seriously corrupted, as is probably the case here), and it seems that we must be because we're accessing this code via its C API. -- Thomas Munro http://www.enterprisedb.com
Andres Freund <andres@anarazel.de> writes: > Hi, > > On 2018-03-21 20:06:49 +1300, Thomas Munro wrote: >> On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote: >> > Indeed. I've pushed a rebased version now, that basically just fixes the >> > issue Thomas observed. >> >> I set up a 32 bit i386 virtual machine and installed Debian 9.4. >> Compiler warnings: > > Was that with a 64bit CPU and 32bit OS, or actually a 32bit CPU? > > >> gcc -Wall -Wmissing-prototypes -Wpointer-arith >> -Wdeclaration-after-statement -Wendif-labels >> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing >> -fwrapv -fexcess-precision=standard -g -O2 -fPIC >> -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS >> -D_GNU_SOURCE -I/usr/lib/llvm-3.9/include -I../../../../src/include >> -D_GNU_SOURCE -c -o llvmjit.o llvmjit.c >> llvmjit.c: In function ‘llvm_get_function’: >> llvmjit.c:268:10: warning: cast to pointer from integer of different >> size [-Wint-to-pointer-cast] >> return (void *) addr; >> ^ >> llvmjit.c:270:10: warning: cast to pointer from integer of different >> size [-Wint-to-pointer-cast] >> return (void *) addr; >> ^ >> llvmjit.c: In function ‘llvm_resolve_symbol’: >> llvmjit.c:842:10: warning: cast from pointer to integer of different >> size [-Wpointer-to-int-cast] >> addr = (uint64_t) load_external_function(modname, funcname, >> ^ >> llvmjit.c:845:10: warning: cast from pointer to integer of different >> size [-Wpointer-to-int-cast] >> addr = (uint64_t) LLVMSearchForAddressOfSymbol(symname); >> ^ > > Hrmpf, those need to be fixed. How about using uintptr_t for this? I see configure.in includes AC_TYPE_UINTPTR_T, which probes for an existing uintptr_t or defines it as an alias for the appropriate unsigned (long (long)) int type. - ilmari -- - Twitter seems more influential [than blogs] in the 'gets reported in the mainstream press' sense at least. 
- Matt McLeod - That'd be because the content of a tweet is easier to condense down to a mainstream media article. - Calle Dybedahl
On Thu, Mar 22, 2018 at 9:09 AM, Andres Freund <andres@anarazel.de> wrote: > Hi, > > On 2018-03-22 09:00:19 +1300, Thomas Munro wrote: >> 64 bit CPU, 32 bit OS. I didn't try Debian multi-arch i386 support on >> an amd64 system, but that's probably an easier way to do this if you >> already have one of those... > > Ah, then I think I might know what happend. Does it start to work if you > replace the auto-detected cpu with "x86"? I think what might happen is > that it generates 64bit code, because of the detected CPU name. Hah, that makes sense. I tried setting cpu to "x86", and now it fails differently: Program terminated with signal SIGSEGV, Segmentation fault. #0 malloc_printerr (action=3, str=0xb7682d00 "free(): invalid pointer", ptr=0xae75f27b, ar_ptr=0xae700220 <llvm::SystemZ::GRX32BitRegClass>) at malloc.c:5036 5036 malloc.c: No such file or directory. (gdb) bt #0 malloc_printerr (action=3, str=0xb7682d00 "free(): invalid pointer", ptr=0xae75f27b, ar_ptr=0xae700220 <llvm::SystemZ::GRX32BitRegClass>) at malloc.c:5036 #1 0xb7593806 in _int_free (av=0xae700220 <llvm::SystemZ::GRX32BitRegClass>, p=0xae75f273, have_lock=0) at malloc.c:3905 #2 0xabd05cd8 in LLVMDisposeMessage () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 #3 0xae75100b in llvm_session_initialize () at llvmjit.c:636 #4 llvm_create_context (jitFlags=15) at llvmjit.c:136 #5 0xae75d3e9 in llvm_compile_expr (state=0x2616e60) at llvmjit_expr.c:132 #6 0x00650118 in ExecReadyExpr (state=state@entry=0x2616e60) at execExpr.c:627 #7 0x00652dd7 in ExecInitExpr (node=0x2666bb4, parent=0x261693c) at execExpr.c:144 ... -- Thomas Munro http://www.enterprisedb.com
On 2018-03-22 09:51:01 +1300, Thomas Munro wrote: > On Thu, Mar 22, 2018 at 9:09 AM, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > > > On 2018-03-22 09:00:19 +1300, Thomas Munro wrote: > >> 64 bit CPU, 32 bit OS. I didn't try Debian multi-arch i386 support on > >> an amd64 system, but that's probably an easier way to do this if you > >> already have one of those... > > > > Ah, then I think I might know what happend. Does it start to work if you > > replace the auto-detected cpu with "x86"? I think what might happen is > > that it generates 64bit code, because of the detected CPU name. > > Hah, that makes sense. I tried setting cpu to "x86", and now it fails > differently: Did you change the variable, or replace the value that's passed to the LLVMCreateTargetMachine() calls? If you did the former, the error wouldn't be surprising, because LLVMDisposeMessage(cpu); cpu = NULL; will attempt to free the return value of LLVMGetHostCPUName(), which'll obviously not work if you just set to a constant. > #1 0xb7593806 in _int_free (av=0xae700220 > <llvm::SystemZ::GRX32BitRegClass>, p=0xae75f273, have_lock=0) at > malloc.c:3905 > #2 0xabd05cd8 in LLVMDisposeMessage () from > /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1 > #3 0xae75100b in llvm_session_initialize () at llvmjit.c:636 > #4 llvm_create_context (jitFlags=15) at llvmjit.c:136 > #5 0xae75d3e9 in llvm_compile_expr (state=0x2616e60) at llvmjit_expr.c:132 > #6 0x00650118 in ExecReadyExpr (state=state@entry=0x2616e60) at execExpr.c:627 > #7 0x00652dd7 in ExecInitExpr (node=0x2666bb4, parent=0x261693c) at > execExpr.c:144 > ... 
FWIW, a 32bit chroot, on a 64bit kernel works: 2018-03-21 20:57:56.576 UTC [3708] DEBUG: successfully loaded LLVM in current session 2018-03-21 20:57:56.577 UTC [3708] DEBUG: JIT detected CPU "skylake", with features "+sse2,+cx16,-tbm,-avx512ifma,-avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,-pcommit,+xsaves,-avx512er,-clwb,-avx512f,-pku,+smap,+mmx,-xop,+rdseed,+hle,-sse4a,-avx512bw,+clflushopt,+xsave,-avx512vl,+invpcid,-avx512cd,+avx,+rtm,+fma,+bmi,-mwaitx,+rdrnd,+sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,+ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3" 2018-03-21 20:57:56.579 UTC [3708] DEBUG: time to inline: 0.000s, opt: 0.000s, emit: 0.002s that's debian testing though. Greetings, Andres Freund
On Thu, Mar 22, 2018 at 9:59 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-22 09:51:01 +1300, Thomas Munro wrote: >> Hah, that makes sense. I tried setting cpu to "x86", and now it fails >> differently: > > Did you change the variable, or replace the value that's passed to the > LLVMCreateTargetMachine() calls? If you did the former, the error > wouldn't be surprising, because > LLVMDisposeMessage(cpu); > cpu = NULL; > will attempt to free the return value of LLVMGetHostCPUName(), which'll > obviously not work if you just set to a constant. Duh. Right. > FWIW, a 32bit chroot, on a 64bit kernel works: > > 2018-03-21 20:57:56.576 UTC [3708] DEBUG: successfully loaded LLVM in current session > 2018-03-21 20:57:56.577 UTC [3708] DEBUG: JIT detected CPU "skylake", with features "+sse2,+cx16,-tbm,-avx512ifma,-avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,-pcommit,+xsaves,-avx512er,-clwb,-avx512f,-pku,+smap,+mmx,-xop,+rdseed,+hle,-sse4a,-avx512bw,+clflushopt,+xsave,-avx512vl,+invpcid,-avx512cd,+avx,+rtm,+fma,+bmi,-mwaitx,+rdrnd,+sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,+ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3" > 2018-03-21 20:57:56.579 UTC [3708] DEBUG: time to inline: 0.000s, opt: 0.000s, emit: 0.002s > > that's debian testing though. Hmm. 
So now I'm doing this: llvm_opt0_targetmachine = - LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, features, + LLVMCreateTargetMachine(llvm_targetref, llvm_triple, "x86" /*cpu*/, "" /*features*/, LLVMCodeGenLevelNone, LLVMRelocDefault, LLVMCodeModelJITDefault); llvm_opt3_targetmachine = - LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, features, + LLVMCreateTargetMachine(llvm_targetref, llvm_triple, "x86" /*cpu*/, "" /*features*/, LLVMCodeGenLevelAggressive, LLVMRelocDefault, LLVMCodeModelJITDefault); And I'm still getting a segfault: (gdb) #0 0xac22c453 in llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc const&, llvm::EVT, llvm::SDValue) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2898 #1 0xac269c29 in llvm::TargetLowering::SimplifySetCC(llvm::EVT, llvm::SDValue, llvm::SDValue, llvm::ISD::CondCode, bool, llvm::TargetLowering::DAGCombinerInfo&, llvm::SDLoc const&) const () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1480 #2 0xac1163a8 in (anonymous namespace)::DAGCombiner::visit(llvm::SDNode*) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14438 #3 0xac117f0b in (anonymous namespace)::DAGCombiner::combine(llvm::SDNode*) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1449 #4 0xac11930e in llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AAResults&, llvm::CodeGenOpt::Level) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1303 #5 0xac245cec in llvm::SelectionDAGISel::CodeGenAndEmitDAG() () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:755 #6 0xac246239 in llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction const>, llvm::ilist_iterator<llvm::Instruction const>, 
bool&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:679 #7 0xac24d66f in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1482 #8 0xac25073c in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:500 #9 0xad34f414 in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/Target/X86/X86ISelDAGToDAG.cpp:175 #10 0xabf53019 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/CodeGen/MachineFunctionPass.cpp:60 #11 0xabde8aeb in llvm::FPPassManager::runOnFunction(llvm::Function&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/IR/LegacyPassManager.cpp:1526 #12 0xabde8e35 in llvm::FPPassManager::runOnModule(llvm::Module&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/IR/LegacyPassManager.cpp:1547 #13 0xabde919a in llvm::legacy::PassManagerImpl::run(llvm::Module&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/IR/LegacyPassManager.cpp:1603 #14 0xabde937f in llvm::legacy::PassManager::run(llvm::Module&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/IR/LegacyPassManager.cpp:1737 #15 0xacb353de in std::_Function_handler<llvm::object::OwningBinary<llvm::object::ObjectFile> (llvm::Module&), llvm::orc::SimpleCompiler>::_M_invoke(std::_Any_data const&, llvm::Module&) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/include/llvm/ExecutionEngine/Orc/CompileUtils.h:42 #16 0xacb30d00 in std::_List_iterator<std::unique_ptr<llvm::orc::ObjectLinkingLayerBase::LinkedObjectSet, 
std::default_delete<llvm::orc::ObjectLinkingLayerBase::LinkedObjectSet> > > llvm::orc::IRCompileLayer<llvm::orc::ObjectLinkingLayer<llvm::orc::DoNothingOnNotifyLoaded> >::addModuleSet<std::vector<llvm::Module*, std::allocator<llvm::Module*> >, std::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::default_delete<llvm::RuntimeDyld::MemoryManager> >, std::unique_ptr<llvm::RuntimeDyld::SymbolResolver, std::default_delete<llvm::RuntimeDyld::SymbolResolver> > >(std::vector<llvm::Module*, std::allocator<llvm::Module*> >, std::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::default_delete<llvm::RuntimeDyld::MemoryManager> >, std::unique_ptr<llvm::RuntimeDyld::SymbolResolver, std::default_delete<llvm::RuntimeDyld::SymbolResolver> >) () at /usr/include/c++/6/functional:2127 #17 0xacb314f8 in unsigned int llvm::OrcCBindingsStack::addIRModule<llvm::orc::IRCompileLayer<llvm::orc::ObjectLinkingLayer<llvm::orc::DoNothingOnNotifyLoaded> > >(llvm::orc::IRCompileLayer<llvm::orc::ObjectLinkingLayer<llvm::orc::DoNothingOnNotifyLoaded> >&, llvm::Module*, std::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::default_delete<llvm::RuntimeDyld::MemoryManager> >, unsigned long long (*)(char const*, void*), void*) () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/ExecutionEngine/Orc/OrcCBindingsStack.h:190 #18 0xacb318d5 in LLVMOrcAddEagerlyCompiledIR () at /build/llvm-toolchain-3.9-UOOPrK/llvm-toolchain-3.9-3.9.1/lib/ExecutionEngine/Orc/OrcCBindingsStack.h:208 #19 0xae7b43f4 in llvm_compile_module (context=0x2438444) at llvmjit.c:539 #20 llvm_get_function (context=0x2438444, funcname=0x2542b00 "evalexpr_2_3") at llvmjit.c:244 #21 0xae7bc34e in ExecRunCompiledExpr (state=0x247d634, econtext=0x247c10c, isNull=0xbfadf6ae "~") at llvmjit_expr.c:2563 #22 0x006b3e10 in ExecEvalExprSwitchContext (isNull=0xbfadf6ae "~", econtext=<optimized out>, state=0x247d634) at ../../../src/include/executor/executor.h:305 #23 ExecQual (econtext=<optimized out>, state=0x247d634) at 
../../../src/include/executor/executor.h:374 #24 ExecNestLoop (pstate=<optimized out>) at nodeNestloop.c:214 #25 0x006b6ddd in ExecProcNode (node=0x247c080) at ../../../src/include/executor/executor.h:239 #26 ExecSort (pstate=0x247bff4) at nodeSort.c:107 #27 0x0068c9d2 in ExecProcNode (node=0x247bff4) at ../../../src/include/executor/executor.h:239 #28 ExecutePlan (execute_once=<optimized out>, dest=0x0, direction=NoMovementScanDirection, numberTuples=<optimized out>, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x247bff4, estate=0x247bee8) at execMain.c:1729 #29 standard_ExecutorRun (queryDesc=0x23e1a50, direction=ForwardScanDirection, count=0, execute_once=1 '\001') at execMain.c:365 #30 0x007f1e8d in PortalRunSelect (portal=portal@entry=0x240bf58, forward=forward@entry=1 '\001', count=0, count@entry=2147483647, dest=0x25383c0) at pquery.c:932 #31 0x007f36a0 in PortalRun (portal=0x240bf58, count=2147483647, isTopLevel=1 '\001', run_once=1 '\001', dest=0x25383c0, altdest=0x25383c0, completionTag=0xbfadf940 "") at pquery.c:773 #32 0x007ee8a7 in exec_simple_query (query_string=query_string@entry=0x23be628 "SELECT '' AS tf_12_ff_4, BOOLTBL1.*, BOOLTBL2.*\n FROM BOOLTBL1, BOOLTBL2\n WHERE BOOLTBL2.f1 = BOOLTBL1.f1 or BOOLTBL1.f1 = bool 'true'\n ORDER BY BOOLTBL1.f1, BOOLTBL2.f1;") at postgres.c:1121 #33 0x007f070e in PostgresMain (argc=1, argv=0x23e7c44, dbname=<optimized out>, username=0x23e7aa0 "munro") at postgres.c:4147 #34 0x004c0cff in BackendRun (port=0x23e1518) at postmaster.c:4409 #35 BackendStartup (port=0x23e1518) at postmaster.c:4081 #36 ServerLoop () at postmaster.c:1754 #37 0x0076a68f in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1362 #38 0x004c275a in main (argc=<optimized out>, argv=<optimized out>) at main.c:228 I wonder what I'm doing wrong... what you're doing is very similar, right? 
It's a 32 bit user land on a 64 bit kernel whereas mine is a 32 bit user land on a 32 bit kernel (on a 64 bit CPU). -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-22 10:09:23 +1300, Thomas Munro wrote: > > FWIW, a 32bit chroot, on a 64bit kernel works: > > > > 2018-03-21 20:57:56.576 UTC [3708] DEBUG: successfully loaded LLVM in current session > > 2018-03-21 20:57:56.577 UTC [3708] DEBUG: JIT detected CPU "skylake", with features "+sse2,+cx16,-tbm,-avx512ifma,-avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,-pcommit,+xsaves,-avx512er,-clwb,-avx512f,-pku,+smap,+mmx,-xop,+rdseed,+hle,-sse4a,-avx512bw,+clflushopt,+xsave,-avx512vl,+invpcid,-avx512cd,+avx,+rtm,+fma,+bmi,-mwaitx,+rdrnd,+sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,+ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3" > > 2018-03-21 20:57:56.579 UTC [3708] DEBUG: time to inline: 0.000s, opt: 0.000s, emit: 0.002s > > > > that's debian testing though. > > Hmm. So now I'm doing this: I've now reproduced this. It actually only fails for *some* queries, a good number works. Investigating. As a random aside, our costing is fairly ridiculous here: ┌──────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────────────────────────────────────────────────┤ │ Sort (cost=1088314.21..1103119.40 rows=5922075 width=34) │ │ Sort Key: booltbl1.f1, booltbl2.f1 │ │ -> Nested Loop (cost=0.00..118524.73 rows=5922075 width=34) │ │ Join Filter: ((booltbl2.f1 = booltbl1.f1) OR booltbl1.f1) │ │ -> Seq Scan on booltbl1 (cost=0.00..38.10 rows=2810 width=1) │ │ -> Materialize (cost=0.00..52.15 rows=2810 width=1) │ │ -> Seq Scan on booltbl2 (cost=0.00..38.10 rows=2810 width=1) │ │ JIT: │ │ Functions: 6 │ │ Inlining: true │ │ Optimization: true │ └──────────────────────────────────────────────────────────────────────────────┘ > I wonder what I'm doing wrong... what you're doing is very similar, > right? It's a 32 bit user land on a 64 bit kernel whereas mine is a > 32 bit user land on a 32 bit kernel (on a 64 bit CPU). 
I think it's I that did something wrong not you. And the architecture thing is a non-issue, because we're taking the target triple from the right place. I think it's a separate issue. Notably the generated code is apparently corrupt, when reading in the generated bitcode: $ opt-6.0 -O3 -S < /tmp/data/6814.1.bc|less opt-6.0: <stdin>: error: Invalid record (Producer: 'LLVM6.0.0' Reader: 'LLVM 6.0.0') I suspect there's a 32bit vs 64bit confusion in the expression code somewhere, might've accidentally used a 64bit type for Datum somewhere or such. Will compile an LLVM with assertions enabled, to figure this out (which verifies this kinda thing). Greetings, Andres Freund
Hi, On 2018-03-21 14:21:01 -0700, Andres Freund wrote: > I think it's I that did something wrong not you. And the architecture > thing is a non-issue, because we're taking the target triple from the > right place. I think it's a separate issue. Notably the generated code > is apparently corrupt, when reading in the generated bitcode: > > $ opt-6.0 -O3 -S < /tmp/data/6814.1.bc|less > opt-6.0: <stdin>: error: Invalid record (Producer: 'LLVM6.0.0' Reader: 'LLVM 6.0.0') > > I suspect there's a 32bit vs 64bit confusion in the expression code > somewhere, might've accidentally used a 64bit type for Datum somewhere > or such. Will compile an LLVM with assertions enabled, to figure this > out (which verifies this kinda thing). Yup, that's it. Found it by searching for 64bit references, while LLVM was compiling. I've pushed quickfixes (for the 32bit warnings, as well as for the 32bit x86 issue and the configure typos). Passes PGOPTIONS='-c jit_above_cost=0' make -s check now. I'll still run the 32bit build through an LLVM w/ assertions once it's finished (takes ~30min to compile LLVM). Greetings, Andres Freund
Hi, On 2018-03-21 23:10:27 +1300, Thomas Munro wrote: > Next up, I have an arm64 system running Debian 9.4. It bombs in > "make check" and in simple tests: Any chance you could try w/ LLVM 6? It looks like some parts of ORC only got aarch64 in LLVM 6. I didn't *think* those were necessary, but given the backtrace it looks like that still might be relevant. Greetings, Andres Freund
On Thu, Mar 22, 2018 at 10:36 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-21 14:21:01 -0700, Andres Freund wrote: >> I think it's I that did something wrong not you. And the architecture >> thing is a non-issue, because we're taking the target triple from the >> right place. I think it's a separate issue. Notably the generated code >> is apparently corrupt, when reading in the generated bitcode: >> >> $ opt-6.0 -O3 -S < /tmp/data/6814.1.bc|less >> opt-6.0: <stdin>: error: Invalid record (Producer: 'LLVM6.0.0' Reader: 'LLVM 6.0.0') >> >> I suspect there's a 32bit vs 64bit confusion in the expression code >> somewhere, might've accidentally used a 64bit type for Datum somewhere >> or such. Will compile an LLVM with assertions enabled, to figure this >> out (which verifies this kinda thing). > > Yup, that's it. Found it by searching for 64bit references, while LLVM > was compiling. I've pushed quickfixes (for the 32 warnings, as well as > for the 32bit x86 issue, as for configure typos). Looks good here too. -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-22 09:31:12 +1300, Thomas Munro wrote: > Aside from whatever problem is causing this, we can see that there is > no top-level handling of exceptions. That's probably fine if we are > in a no throw scenario (unless there is something seriously corrupted, > as is probably the case here), and it seems that we must be because > we're accessing this code via its C API. Yea, it should only happen in abort() type situations. Notably LLVM doesn't even default to enabling exceptions... Greetings, Andres Freund
On Thu, Mar 22, 2018 at 10:44 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-21 23:10:27 +1300, Thomas Munro wrote: >> Next up, I have an arm64 system running Debian 9.4. It bombs in >> "make check" and in simple tests: > > Any chance you could try w/ LLVM 6? It looks like some parts of ORC > only got aarch64 in LLVM 6. I didn't *think* those were necessary, but > given the backtrace it looks like that still might be relevant. Hmm. There is no LLVM 6 in backports. I'll have to build it, which I'm happy to do if I can wrap my brain around its cmake build system (or for you to build it if you want), but it may take... who knows, a day? on this little thing. If that turns out to be it I guess we'd need to figure out how to detect an LLVM with bits missing and handle it more gracefully? -- Thomas Munro http://www.enterprisedb.com
On Thu, Mar 22, 2018 at 10:50 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Thu, Mar 22, 2018 at 10:44 AM, Andres Freund <andres@anarazel.de> wrote: >> On 2018-03-21 23:10:27 +1300, Thomas Munro wrote: >>> Next up, I have an arm64 system running Debian 9.4. It bombs in >>> "make check" and in simple tests: >> >> Any chance you could try w/ LLVM 6? It looks like some parts of ORC >> only got aarch64 in LLVM 6. I didn't *think* those were necessary, but >> given the backtrace it looks like that still might be relevant. > > Hmm. There is no LLVM 6 in backports. I'll have to build it, which > I'm happy to do if I can wrap my brain around its cmake build system > (or for you to build it if you want), but it may take... who knows, a > day? on this little thing. Actually scratch that, I'll just install buster. More soon. -- Thomas Munro http://www.enterprisedb.com
On 2018-03-22 10:50:52 +1300, Thomas Munro wrote: > On Thu, Mar 22, 2018 at 10:44 AM, Andres Freund <andres@anarazel.de> wrote: > > On 2018-03-21 23:10:27 +1300, Thomas Munro wrote: > >> Next up, I have an arm64 system running Debian 9.4. It bombs in > >> "make check" and in simple tests: > > > > Any chance you could try w/ LLVM 6? It looks like some parts of ORC > > only got aarch64 in LLVM 6. I didn't *think* those were necessary, but > > given the backtrace it looks like that still might be relevant. > > Hmm. There is no LLVM 6 in backports. I think there now is: https://packages.debian.org/search?keywords=llvm&searchon=names&section=all&suite=stretch-backports Package llvm-6.0-dev stretch-backports (devel): Modular compiler and toolchain technologies, libraries and headers 1:6.0-1~bpo9+1: amd64 It's a recent addition: llvm-toolchain-6.0 (1:6.0-1~bpo9+1) stretch-backports; urgency=medium * Team upload * Rebuild for stretch-backports. -- Anton Gladky <gladk@debian.org> Mon, 12 Mar 2018 18:58:43 +0100 Otherwise I think LLVM has a repo with the necessary bits: http://apt.llvm.org/ But if it's not this, I think we're going to have to indeed build LLVM. Without proper debugging symbols it's going to be hard to figure this out otherwise. FWIW, I build it with: mkdir -p ~/build/llvm/debug/vpath cd ~/build/llvm/debug/vpath cmake -G Ninja ~/src/llvm/ -DCMAKE_INSTALL_PREFIX=/home/andres/build/llvm/debug/install -DBUILD_SHARED_LIBS=true -DLLVM_TARGETS_TO_BUILD='X86;BPF' -DLLVM_CCACHE_BUILD=true ninja -j8 install I suspect you'd need to replace X86 with AArch64 (BPF isn't needed, that's for stuff unrelated to PG). > If that turns out to be it I guess we'd need to figure out how to > detect an LLVM with bits missing and handle it more gracefully? Yea :/. Greetings, Andres Freund
On Thu, Mar 22, 2018 at 10:59 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-22 10:50:52 +1300, Thomas Munro wrote: >> Hmm. There is no LLVM 6 in backports. > > I think there now is: > https://packages.debian.org/search?keywords=llvm&searchon=names&section=all&suite=stretch-backports > > Package llvm-6.0-dev > > stretch-backports (devel): Modular compiler and toolchain technologies, libraries and headers > 1:6.0-1~bpo9+1: amd64 > > It's a recent addition: > > llvm-toolchain-6.0 (1:6.0-1~bpo9+1) stretch-backports; urgency=medium > > * Team upload > * Rebuild for stretch-backports. > > -- Anton Gladky <gladk@debian.org> Mon, 12 Mar 2018 18:58:43 +0100 Huh, it hasn't made it to my mirror yet. Anyway, I upgraded and built with LLVM 6 and make check now passes on my arm64 system. Woohoo! Via an off-list exchange I learned that Andres suspects a bug in LLVM 3.9 on arm64 and will investigate/maybe file a bug report with LLVM. Not sure if we'll want to try to actively identify and avoid known buggy versions or not? -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-22 11:36:47 +1300, Thomas Munro wrote: > On Thu, Mar 22, 2018 at 10:59 AM, Andres Freund <andres@anarazel.de> wrote: > > On 2018-03-22 10:50:52 +1300, Thomas Munro wrote: > >> Hmm. There is no LLVM 6 in backports. > > > > I think there now is: > > https://packages.debian.org/search?keywords=llvm&searchon=names&section=all&suite=stretch-backports > > > > Package llvm-6.0-dev > > > > stretch-backports (devel): Modular compiler and toolchain technologies, libraries and headers > > 1:6.0-1~bpo9+1: amd64 > > > > It's a recent addition: > > > > llvm-toolchain-6.0 (1:6.0-1~bpo9+1) stretch-backports; urgency=medium > > > > * Team upload > > * Rebuild for stretch-backports. > > > > -- Anton Gladky <gladk@debian.org> Mon, 12 Mar 2018 18:58:43 +0100 > > Huh, it hasn't made it to my mirror yet. Interesting. > Anyway, I upgraded and built with LLVM 6 and make check now passes on > my arm64 system. Woohoo! Yay, thanks for testing! > Via an off-list exchange I learned that Andres suspects a bug in LLVM > 3.9 on arm64 and will investigate/maybe file a bug report with LLVM. > Not sure if we'll want to try to actively identify and avoid known > buggy versions or not? I'm currently not inclined to invest a lot of effort into it, besides trying to get the bug fixed. A possible testcase would be to call createLocalIndirectStubsManagerBuilder() and report an error if it returns nullptr. But that'd fail once the bug is fixed, because we don't actually *need* that functionality, it's just that LLVM instantiates the stub manager unconditionally for some reason. Greetings, Andres Freund
On Thu, Mar 22, 2018 at 11:46 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-03-22 11:36:47 +1300, Thomas Munro wrote: >> Not sure if we'll want to try to actively identify and avoid known >> buggy versions or not? > > I'm currently not inclined to invest a lot of effort into it, besides > trying to get the bug fixed. > > A possible testcase would be to call > createLocalIndirectStubsManagerBuilder() and report an error if it > returns nullptr. But that'd fail once the bug is fixed, because we don't > actually *need* that functionality, it's just that LLVM instantiates the > stub manager unconditionally for some reason. So how about we test createLocalIndirectStubsManagerBuilder(), and if it's nullptr then we also test the LLVM version number? For each major release (3.9, 4.0, 5.0, ... you can see that they did the same kind of versioning scheme change that we did!) there will eventually be a minor/patch release number where this works even when nullptr is returned here. This problem is going to come up on any architecture not covered in the following code, namely anything but x86, x86_64 and (since 6.0) aarch64 (aka arm64), so we definitely don't want to leave JIT disabled once that bug is fixed: https://github.com/llvm-mirror/llvm/blob/release_39/lib/ExecutionEngine/Orc/IndirectionUtils.cpp#L48 https://github.com/llvm-mirror/llvm/blob/release_60/lib/ExecutionEngine/Orc/IndirectionUtils.cpp#L48 -- Thomas Munro http://www.enterprisedb.com
On Wed, Mar 21, 2018 at 8:06 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > "make -C src/interfaces/ecpg/test check" consistently fails on my macOS machine: > > test compat_oracle/char_array ... stderr source FAILED I can't reproduce this anymore on the tip of your jit branch. I don't know what caused it or which change fixed it... I've now run out of things to complain about for now. Nice work! -- Thomas Munro http://www.enterprisedb.com
On Thu, Mar 22, 2018 at 1:36 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > I've now run out of things to complain about for now. Nice work! I jumped on a POWER8 box. As expected, the same breakage occurs. So I hacked LLVM 6.0 thusly: diff --git a/lib/ExecutionEngine/Orc/IndirectionUtils.cpp b/lib/ExecutionEngine/Orc/IndirectionUtils.cpp index 68397be..08aa3a8 100644 --- a/lib/ExecutionEngine/Orc/IndirectionUtils.cpp +++ b/lib/ExecutionEngine/Orc/IndirectionUtils.cpp @@ -54,7 +54,11 @@ createLocalCompileCallbackManager(const Triple &T, std::function<std::unique_ptr<IndirectStubsManager>()> createLocalIndirectStubsManagerBuilder(const Triple &T) { switch (T.getArch()) { - default: return nullptr; + default: + return [](){ + return llvm::make_unique< + orc::LocalIndirectStubsManager<orc::OrcGenericABI>>(); + }; case Triple::aarch64: return [](){ I am not qualified to have an opinion on whether this is the correct fix for LLVM, but with this change our make check passes, indicating that things are otherwise looking good on this architecture. So I've now tested your branch on various combinations of: FreeBSD/amd64 (including with a weird CPU that lacks AVX), Debian/amd64, Debian/i386, Debian/arm64, RHEL/ppc64le, macOS/amd64, with LLVM 3.9, 4.0, 5.0, 6.0, with GCC and clang as the main compiler, with libstdc++ and libc++ as the C++ standard library. If I had access to one I'd try it on a big endian machine, but I don't. Anyone? The elephant in the room is Windows. I'm not personally in the same room as that particular elephant, however.
FWIW, your branch doesn't build against LLVM master (future 7.0), because the shared module stuff is changing: llvmjit.c: In function ‘llvm_compile_module’: llvmjit.c:544:4: error: unknown type name ‘LLVMSharedModuleRef’ LLVMSharedModuleRef smod; ^ llvmjit.c:546:4: warning: implicit declaration of function ‘LLVMOrcMakeSharedModule’ [-Wimplicit-function-declaration] smod = LLVMOrcMakeSharedModule(context->module); ^ llvmjit.c:548:12: warning: passing argument 3 of ‘LLVMOrcAddEagerlyCompiledIR’ makes pointer from integer without a cast [enabled by default] llvm_resolve_symbol, NULL)) ^ In file included from llvmjit.c:31:0: /home/thomas.munro/build/llvm/debug/install/include/llvm-c/OrcBindings.h:99:1: note: expected ‘LLVMModuleRef’ but argument is of type ‘int’ LLVMOrcAddEagerlyCompiledIR(LLVMOrcJITStackRef JITStack, ^ llvmjit.c:552:4: warning: implicit declaration of function ‘LLVMOrcDisposeSharedModuleRef’ [-Wimplicit-function-declaration] LLVMOrcDisposeSharedModuleRef(smod); ^ -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-22 16:09:51 +1300, Thomas Munro wrote: > On Thu, Mar 22, 2018 at 1:36 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: > > I've now run out of things to complain about for now. Nice work! > > I jumped on a POWER8 box. As expected, the same breakage occurs. So > I hacked LLVM 6.0 thusly: > > diff --git a/lib/ExecutionEngine/Orc/IndirectionUtils.cpp > b/lib/ExecutionEngine/Orc/IndirectionUtils.cpp > index 68397be..08aa3a8 100644 > --- a/lib/ExecutionEngine/Orc/IndirectionUtils.cpp > +++ b/lib/ExecutionEngine/Orc/IndirectionUtils.cpp > @@ -54,7 +54,11 @@ createLocalCompileCallbackManager(const Triple &T, > std::function<std::unique_ptr<IndirectStubsManager>()> > createLocalIndirectStubsManagerBuilder(const Triple &T) { > switch (T.getArch()) { > - default: return nullptr; > + default: > + return [](){ > + return llvm::make_unique< > + orc::LocalIndirectStubsManager<orc::OrcGenericABI>>(); > + }; > > case Triple::aarch64: > return [](){ > I am not qualified to have an opinion on whether this is the correct > fix for LLVM, but with this change our make check passes, indicating > that things are otherwise looking good on this architecture. Yea, that should do the trick, as long as one doesn't rely on indirect stubs, which we don't. Kinda wonder if we could hackfix this by putting your definition of createLocalIndirectStubsManagerBuilder() into something earlier on the search path... > So I've now tested your branch on various combinations of: > FreeBSD/amd64 (including with a weird CPU that lacks AVX), > Debian/amd64, Debian/i386, Debian/arm64, RHEL/ppc64le, macOS/amd64, > with LLVM 3.9, 4,0, 5.0, 6.0, with GCC and clang as the main compiler, > with libstdc++ and libc++ as the C++ standard library. Many thanks again. > If I had access to one I'd try it on a big endian machine, but I > don't. Anyone? The elephant in the room is Windows. I'm not > personally in the same room as that particular elephant, however. Hah. 
I'm not 100% sure I can get the MSVC project stuff done for this release, TBH. Doing it via mingw shouldn't be much trouble. But I'm aiming for fixing the project generation support too. > FWIW, your branch doesn't build against LLVM master (future 7.0), > because the shared module stuff is changing: > > llvmjit.c: In function ‘llvm_compile_module’: > llvmjit.c:544:4: error: unknown type name ‘LLVMSharedModuleRef’ > LLVMSharedModuleRef smod; > ^ > llvmjit.c:546:4: warning: implicit declaration of function > ‘LLVMOrcMakeSharedModule’ [-Wimplicit-function-declaration] > smod = LLVMOrcMakeSharedModule(context->module); > ^ > llvmjit.c:548:12: warning: passing argument 3 of > ‘LLVMOrcAddEagerlyCompiledIR’ makes pointer from integer without a > cast [enabled by default] > llvm_resolve_symbol, NULL)) > ^ > In file included from llvmjit.c:31:0: > /home/thomas.munro/build/llvm/debug/install/include/llvm-c/OrcBindings.h:99:1: > note: expected ‘LLVMModuleRef’ but argument is of type ‘int’ > LLVMOrcAddEagerlyCompiledIR(LLVMOrcJITStackRef JITStack, > ^ > llvmjit.c:552:4: warning: implicit declaration of function > ‘LLVMOrcDisposeSharedModuleRef’ [-Wimplicit-function-declaration] > LLVMOrcDisposeSharedModuleRef(smod); Yea, I guess we could add branches for that, but 7 just branched and is a moving target, so I'm inclined to wait a bit. Greetings, Andres Freund
Hi Andres, I spotted a couple of typos and some very minor coding details -- please see attached. -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> wrote: > typos A dead line. -- Thomas Munro http://www.enterprisedb.com
Hi, On 2018-03-25 00:07:11 +1300, Thomas Munro wrote: > I spotted a couple of typos and some very minor coding details -- see > please see attached. Thanks, applying 0001 in a bit. > From 648e303072c77e781eca2bb06f488f6be9ccac84 Mon Sep 17 00:00:00 2001 > From: Thomas Munro <thomas.munro@enterprisedb.com> > Date: Sat, 24 Mar 2018 23:12:40 +1300 > Subject: [PATCH 2/2] Minor code cleanup for llvmjit_wrap.cpp. > > llvm::sys::getHostCPUName()'s result is a llvm::StringRef. Its data() member > function doesn't guarantee a null-terminated result, so we'd better jump > through an extra hoop to get a C string. Hm, I checked, and it's fine, I'm not enthusiastic about this... > It seems better to use LLVMCreateMessage() rather than strdup() to allocate > the copy returned by LLVMGetHostCPUFeatures() and LLVMGetHostCPUName(), > since the contract is that the caller should free it with > LLVMDisposeMessage(). While we can see that LLVMCreateMessage() and > LLVMDisposeMessage() are currently just wrappers for strdup() and free(), > using them symmetrically seems like a good idea for future Windows support, > where DLLs can be using different heap allocators (the same reason we provide > PQfreemem in libpq). I just kept it similar to nearby functions in the LLVM code. > Fix brace style. I tried to keep this as it's submitted to LLVM, I hope we can get rid of them for newer version soon... I think I'll update them to be exactly the same as soon as the upstream patch is applied. Greetings, Andres Freund
On 3/13/18 19:40, Andres Freund wrote: > I've pushed a revised and rebased version of my JIT patchset. What is the status of this item as far as the commitfest is concerned? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-03-27 10:05:47 -0400, Peter Eisentraut wrote: > On 3/13/18 19:40, Andres Freund wrote: > > I've pushed a revised and rebased version of my JIT patchset. > > What is the status of this item as far as the commitfest is concerned? 7/10 committed. Inlining, Explain, Docs remain. Greetings, Andres Freund
On 2018-03-27 10:34:26 -0700, Andres Freund wrote: > On 2018-03-27 10:05:47 -0400, Peter Eisentraut wrote: > > On 3/13/18 19:40, Andres Freund wrote: > > > I've pushed a revised and rebased version of my JIT patchset. > > > > What is the status of this item as far as the commitfest is concerned? > > 7/10 committed. Inlining, Explain, Docs remain. I've pushed these three. As explained in the inline commit, I've found an edge case where I could hit an assert in LLVM when using a more efficient interaction with on-disk files. That appears to be a spurious assert, but I don't want to ignore it until that's confirmed from the LLVM side of things. For now LLVM is enabled by default when compiled --with-llvm. I'm mildly inclined to leave it like that until shortly before the release, and then disable it by default (i.e. change the default of jit=off). But I think we can make that decision based on experience during the testing window. I'm opening an open items entry for that. Yay. Also: Tired. Greetings, Andres Freund
Hi, On 2018-03-28 14:27:51 -0700, Andres Freund wrote: > > 7/10 committed. Inlining, Explain, Docs remain. > > I've pushed these three. One tiny pending commit I have is to add a few pg_noinline annotations to slow-path functions, to avoid very common spurious inlines. I'll play a little bit more with the set that I think makes sense there, and will send a separate email about that. Greetings, Andres Freund
On 3/28/18 17:27, Andres Freund wrote: > I've pushed these three. Great, now the only thing remaining is to prepare an unconference session explaining all this to the rest of us. ;-) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2018-03-28 18:06:24 -0400, Peter Eisentraut wrote: > On 3/28/18 17:27, Andres Freund wrote: > > I've pushed these three. > > Great, now the only thing remaining is to prepare an unconference > session explaining all this to the rest of us. ;-) Hah! Happy to, if there's enough people interested. I've a talk about it too (state of jit, 2018 edition), but I wasn't planning to go into too low level details. More about what is good, what is bad, and how we make it better ;) Greetings, Andres Freund
On 3/28/18 6:09 PM, Andres Freund wrote: > > On 2018-03-28 18:06:24 -0400, Peter Eisentraut wrote: >> On 3/28/18 17:27, Andres Freund wrote: >>> I've pushed these three. >> >> Great, now the only thing remaining is to prepare an unconference >> session explaining all this to the rest of us. ;-) > > Hah! Happy to, if there's enough people interested. I've a talk about > it too (state of jit, 2018 edition), but I wasn't planning to go into > too low level details. More about what is good, what is bad, and how we > make it better ;) +1 for an unconference session. This is some seriously cool stuff. -- -David david@pgmasters.net
On Wed, Mar 28, 2018 at 06:24:53PM -0400, David Steele wrote: > On 3/28/18 6:09 PM, Andres Freund wrote: >> Hah! Happy to, if there's enough people interested. I've a talk about >> it too (state of jit, 2018 edition), but I wasn't planning to go into >> too low level details. More about what is good, what is bad, and how we >> make it better ;) > > +1 for an unconference session. This is some seriously cool stuff. Take room for two sessions then, with a break in-between to give enough time to people to recover from the damage of the first session :) Jokes apart, an unconference session at PGcon would be great. -- Michael
On 2018/03/29 9:35, Michael Paquier wrote: > On Wed, Mar 28, 2018 at 06:24:53PM -0400, David Steele wrote: >> On 3/28/18 6:09 PM, Andres Freund wrote: >>> Hah! Happy to, if there's enough people interested. I've a talk about >>> it too (state of jit, 2018 edition), but I wasn't planning to go into >>> too low level details. More about what is good, what is bad, and how we >>> make it better ;) >> >> +1 for an unconference session. This is some seriously cool stuff. > > Take room for two sessions then, with a break in-between to give enough > time to people to recover from the damage of the first session :) > > Jokes apart, an unconference session at PGcon would be great. +1 Thanks, Amit
Hi Andres, On 03/28/2018 05:27 PM, Andres Freund wrote: > On 2018-03-27 10:34:26 -0700, Andres Freund wrote: >> On 2018-03-27 10:05:47 -0400, Peter Eisentraut wrote: >>> On 3/13/18 19:40, Andres Freund wrote: >>>> I've pushed a revised and rebased version of my JIT patchset. >>> >>> What is the status of this item as far as the commitfest is concerned? >> >> 7/10 committed. Inlining, Explain, Docs remain. > > I've pushed these three. > It seems that clang is being picked up as the main compiler in certain situations, ala ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O0 -fno-omit-frame-pointer -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o auth-scram.o auth-scram.c -MMD -MP -MF .deps/auth-scram.Po ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O0 -fno-omit-frame-pointer -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o be-secure-openssl.o be-secure-openssl.c -MMD -MP -MF .deps/be-secure-openssl.Po /usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2 -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -flto=thin -emit-llvm -c -o be-fsstubs.bc be-fsstubs.c /usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2 -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -flto=thin -emit-llvm -c -o namespace.bc namespace.c I would expect LLVM to be isolated to the jit/ hierarchy. Using CC="ccache gcc" and --with-llvm. And congrats on getting the feature in ! Best regards, Jesper
Hi Andres, I spent some time poring over the JIT README, and I've attached a patch with some additional corrections as well as some stylistic suggestions. The latter may be debatable, but I'm sure you can take and pick as you see fit. If there are cases where I misunderstood your intent, maybe that's also useful information. :-) -John Naylor
On Thursday, March 29, 2018 2:39:17 PM CEST Jesper Pedersen wrote: > Hi Andres, > > On 03/28/2018 05:27 PM, Andres Freund wrote: > > On 2018-03-27 10:34:26 -0700, Andres Freund wrote: > >> On 2018-03-27 10:05:47 -0400, Peter Eisentraut wrote: > >>> On 3/13/18 19:40, Andres Freund wrote: > >>>> I've pushed a revised and rebased version of my JIT patchset. > >>> > >>> What is the status of this item as far as the commitfest is concerned? > >> > >> 7/10 committed. Inlining, Explain, Docs remain. > > > > I've pushed these three. > > It seems that clang is being picked up as the main compiler in certain > situations, ala > > ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute > -Wformat-security -fno-strict-aliasing -fwrapv > -fexcess-precision=standard -g -O0 -fno-omit-frame-pointer > -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o > auth-scram.o auth-scram.c -MMD -MP -MF .deps/auth-scram.Po > ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute > -Wformat-security -fno-strict-aliasing -fwrapv > -fexcess-precision=standard -g -O0 -fno-omit-frame-pointer > -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o > be-secure-openssl.o be-secure-openssl.c -MMD -MP -MF > .deps/be-secure-openssl.Po > /usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing > -fwrapv -O2 -I../../../src/include -D_GNU_SOURCE > -I/usr/include/libxml2 -flto=thin -emit-llvm -c -o be-fsstubs.bc > be-fsstubs.c > /usr/lib64/ccache/clang -Wno-ignored-attributes -fno-strict-aliasing > -fwrapv -O2 -I../../../src/include -D_GNU_SOURCE > -I/usr/include/libxml2 -flto=thin -emit-llvm -c -o namespace.bc namespace.c > > I would expect LLVM to be isolated to the jit/ hierarchy. Clang is needed to emit the LLVM bitcode required for inlining. The "-emit-llvm" flag is used for that.
A dual compilation is required for inlining to work, one compilation with gcc/clang/msvc/… to build the postgresql binary, one with clang to generate the .bc files for inlining. It can be surprising, but there is little way around that (or we accept only clang to build postgresql, but there would be a riot).
Hi, On 03/29/2018 11:03 AM, Pierre Ducroquet wrote: > Clang is needed to emit the LLVM bitcode required for inlining. The "-emit- > llvm" flag is used for that. A dual compilation is required for inlining to > work, one compilation with gcc/clang/msvc/… to build the postgresql binary, > one with clang to generate the .bc files for inlining. > It can be surprising, but there is little way around that (or we accept only > clang to build postgresql, but there would be a riot). > Thanks Pierre. Best regards, Jesper
Hi, On 2018-03-29 19:57:42 +0700, John Naylor wrote: > Hi Andres, > I spent some time poring over the JIT README, and I've attached > a patch with some additional corrections as well as some stylistic > suggestions. The latter may be debatable, but I'm sure you can take > and pick as you see fit. If there are cases where I misunderstood your > intent, maybe that's also useful information. :-) I've picked most of them, and pushed a change including some additional changes. Thanks! - Andres
On 30.03.2018 02:14, Andres Freund wrote:
> Hi, On 2018-03-29 19:57:42 +0700, John Naylor wrote:
>> Hi Andres, I spent some time poring over the JIT README, and I've attached a patch with some additional corrections as well as some stylistic suggestions. The latter may be debatable, but I'm sure you can take and pick as you see fit. If there are cases where I misunderstood your intent, maybe that's also useful information. :-)
> I've picked most of them, and pushed a change including some additional changes. Thanks! - Andres

I have repeated the performance tests on my computer and found some regression compared with the previous JIT version. Previously JIT provided about a 2x improvement on TPC-H Q1. Now the difference is reduced to 1.4x without parallel execution and 1.3x with parallel execution:
        | max_parallel_workers_per_gather=0 | max_parallel_workers_per_gather=4
jit=on  | 17500                             | 5730
jit=off | 25100                             | 7550
My previous result with JIT was 13440 for sequential execution.
I know that performance is not the highest priority now; it is more important to commit the infrastructure.
I just want to report that such a regression takes place.
It would be nice if you could outline future directions for improving JIT performance...
postgres=# explain (analyze,buffers) select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= '1998-12-01'
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate (cost=2064556.89..2064560.47 rows=6 width=60) (actual time=6573.905..6573.915 rows=4 loops=1)
Group Key: l_returnflag, l_linestatus
Buffers: shared hit=240472
-> Gather Merge (cost=2064556.89..2064559.76 rows=24 width=132) (actual time=6573.888..6573.897 rows=20 loops=1)
Workers Planned: 4
Workers Launched: 4
Buffers: shared hit=240472
-> Sort (cost=2063556.83..2063556.85 rows=6 width=132) (actual time=6562.256..6562.256 rows=4 loops=5)
Sort Key: l_returnflag, l_linestatus
Sort Method: quicksort Memory: 26kB
Worker 0: Sort Method: quicksort Memory: 26kB
Worker 1: Sort Method: quicksort Memory: 26kB
Worker 2: Sort Method: quicksort Memory: 26kB
Worker 3: Sort Method: quicksort Memory: 26kB
Buffers: shared hit=1276327
-> Partial HashAggregate (cost=2063556.69..2063556.75 rows=6 width=132) (actual time=6562.222..6562.224 rows=4 loops=5)
Group Key: l_returnflag, l_linestatus
Buffers: shared hit=1276299
-> Parallel Seq Scan on lineitem (cost=0.00..1463755.41 rows=14995032 width=20) (actual time=312.454..2520.753 rows=11997210 loops=5)
Filter: (l_shipdate <= '1998-12-01'::date)
Buffers: shared hit=1276299
Planning Time: 0.130 ms
JIT:
Functions: 18
Generation Time: 2.344 ms
Inlining: true
Inlining Time: 15.364 ms
Optimization: true
Optimization Time: 298.833 ms
Emission Time: 155.257 ms
Execution Time: 6807.751 ms
(31 rows)
Time: 6808.216 ms (00:06.808)
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 2018-03-30 15:12:05 +0300, Konstantin Knizhnik wrote: > I have repeated performance tests at my computer and find out some > regression comparing with previous JIT version. > Previously JIT provides about 2 times improvement at TPC-H Q1. Now the > difference is reduced to 1.4 without parallel execution and 1.3 with > parallel execution: Huh. That's the same computer you did the tests on? There shouldn't have been any, I'll check it out. - Andres
On 30.03.2018 18:54, Andres Freund wrote: > On 2018-03-30 15:12:05 +0300, Konstantin Knizhnik wrote: >> I have repeated performance tests at my computer and find out some >> regression comparing with previous JIT version. >> Previously JIT provides about 2 times improvement at TPC-H Q1. Now the >> difference is reduced to 1.4 without parallel execution and 1.3 with >> parallel execution: > Huh. That's the same computer you did the tests on? > > There shouldn't have been any, I'll check it out. > > - Andres Yes, it is the same computer. But sorry, maybe it is a false alarm. I noticed that the time of normal (non-jit) query execution was also faster in the past: for parallel execution 6549 vs. 7550 now, for non-parallel execution 20075 vs. 25100. I do not know whether this difference is caused by some changes in Postgres committed since then (end of January) or just because of a different layout of data in memory. But the JIT performance improvement is almost the same in both cases: 1.493 then vs. 1.434 now. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On March 30, 2018 10:04:25 AM PDT, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > > >On 30.03.2018 18:54, Andres Freund wrote: >> On 2018-03-30 15:12:05 +0300, Konstantin Knizhnik wrote: >>> I have repeated performance tests at my computer and find out some >>> regression comparing with previous JIT version. >>> Previously JIT provides about 2 times improvement at TPC-H Q1. Now >the >>> difference is reduced to 1.4 without parallel execution and 1.3 with >>> parallel execution: >> Huh. That's the same computer you did the tests on? >> >> There shouldn't have been any, I'll check it out. >> >> - Andres > >Yes, it is the same computer. >But sorry, may be it is false alarm. >I noticed that the time of normal (non-jit) query execution was also >faster in the past: for parallel execution 6549 vs. 7550 now, for >non-parallel execution 20075 vs. 25100. >I do not know whether this difference is caused by some changes in >Postgres committed since this time (end of January) or just because of >different layout of data in memory. A brief attempt at bisecting would be good. That's quite the regression. Possibly it's OS related though. Meltdown / Spectre? Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Wed, Mar 28, 2018 at 02:27:51PM -0700, Andres Freund wrote: > For now LLVM is enabled by default when compiled --with-llvm. I'm mildly > inclined to leave it like that until shortly before the release, and > then disable it by default (i.e. change the default of jit=off). But I > think we can make that decision based on experience during the testing > window. I'm opening an open items entry for that. I'll vote for jit=on and letting any bugs shake out earlier, but it's not a strong preference. I see jit slows the regression tests considerably: # x86_64, non-assert, w/o llvm $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed 7.64user 4.24system 0:36.40elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k 8.09user 4.50system 0:37.71elapsed 33%CPU (0avgtext+0avgdata 36712maxresident)k 7.53user 4.18system 0:36.54elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k # x86_64, non-assert, w/ llvm trunk $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed 9.58user 5.79system 0:49.61elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k 9.47user 5.92system 0:47.84elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k 9.09user 5.51system 0:47.94elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k # mips32el, assert, w/o llvm (buildfarm member topminnow) [1] 28min install-check-* 35min check-pg_upgrade # mips32el, assert, w/ llvm 6.0.1 [1] 63min install-check-* 166min check-pg_upgrade Regardless of the choice of jit={on|off} default, these numbers tell me that some or all of jit_*_cost defaults are too low. [1] The mips32el runs used "nice -+20" and ran on a shared machine. I include them to show the trend, but exact figures may be non-reproducible.
On 2018-08-22 06:20:21 +0000, Noah Misch wrote: > On Wed, Mar 28, 2018 at 02:27:51PM -0700, Andres Freund wrote: > > For now LLVM is enabled by default when compiled --with-llvm. I'm mildly > > inclined to leave it like that until shortly before the release, and > > then disable it by default (i.e. change the default of jit=off). But I > > think we can make that decision based on experience during the testing > > window. I'm opening an open items entry for that. > > I'll vote for jit=on and letting any bugs shake out earlier, but it's not a > strong preference. Similar. > I see jit slows the regression tests considerably: > > # x86_64, non-assert, w/o llvm > $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed > 7.64user 4.24system 0:36.40elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k > 8.09user 4.50system 0:37.71elapsed 33%CPU (0avgtext+0avgdata 36712maxresident)k > 7.53user 4.18system 0:36.54elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k > > # x86_64, non-assert, w/ llvm trunk > $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed > 9.58user 5.79system 0:49.61elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k > 9.47user 5.92system 0:47.84elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k > 9.09user 5.51system 0:47.94elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k > > # mips32el, assert, w/o llvm (buildfarm member topminnow) [1] > 28min install-check-* > 35min check-pg_upgrade > > # mips32el, assert, w/ llvm 6.0.1 [1] > 63min install-check-* > 166min check-pg_upgrade > > Regardless of the choice of jit={on|off} default, these numbers tell me that > some or all of jit_*_cost defaults are too low. I don't think it really shows that. The reason that JITing gets started there is that the tables aren't analyzed and we end up with crazy ass estimates about the cost of the queries. No useful setting of the cost limits will protect against that... :( Greetings, Andres Freund
On 22/08/2018 08:20, Noah Misch wrote: > Regardless of the choice of jit={on|off} default, these numbers tell me that > some or all of jit_*_cost defaults are too low. That was also my earlier analysis. I'm suspicious that we haven't had much feedback about this. We've heard of one or two cases where LLVM broke a query outright, and that was fixed and that was a good result. But we haven't heard anything about performance regressions. Surely there must be some. There hasn't been any discussion or further analysis of the default cost settings either. I feel that we don't have enough information. Another problem is that LLVM is only enabled in some versions of packages. For example, in the PGDG RPMs, it's enabled for RHEL 7 but not RHEL 6. So you could be in for a surprise if you upgrade your operating system at some point. I would like, however, that we make a decision one way or the other before the next beta. I've been handwaving a bit to users not to rely on the current betas for performance testing because the defaults might change later. That's bad either way. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2018-08-22 16:36:00 +0200, Peter Eisentraut wrote: > I'm suspicious that we haven't had much feedback about this. We've > heard of one or two cases where LLVM broke a query outright, and that > was fixed and that was a good result. But we haven't heard anything > about performance regressions. Surely there must be some. There hasn't > been any discussion or further analysis of the default cost settings > either. I feel that we don't have enough information. Yea. I don't think we'll get really good feedback before production unfortunately :( > I would like, however, that we make a decision one way or the other > before the next beta. I've been handwaving a bit to users not to rely > on the current betas for performance testing because the defaults might > change later. That's bad either way. I don't see particularly much benefit in deciding before beta, personally. What's making you think it'd be important to decide before? Pretty fundamentally, it'll be a setting you don't know is effectively on, for the foreseeable future anyway? Greetings, Andres Freund
On Wednesday, 22 August 2018 at 16:36:00, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 22/08/2018 08:20, Noah Misch wrote:
> Regardless of the choice of jit={on|off} default, these numbers tell me that
> some or all of jit_*_cost defaults are too low.
That was also my earlier analysis.
I'm suspicious that we haven't had much feedback about this. We've
heard of one or two cases where LLVM broke a query outright, and that
was fixed and that was a good result. But we haven't heard anything
about performance regressions. Surely there must be some. There hasn't
been any discussion or further analysis of the default cost settings
either. I feel that we don't have enough information.
Another problem is that LLVM is only enabled in some versions of
packages. For example, in the PGDG RPMs, it's enabled for RHEL 7 but
not RHEL 6. So you could be in for a surprise if you upgrade your
operating system at some point.
I would like, however, that we make a decision one way or the other
before the next beta. I've been handwaving a bit to users not to rely
on the current betas for performance testing because the defaults might
change later. That's bad either way.
FWIW: Our largest report-queries perform worse (than v10) with jit=on; https://www.postgresql.org/message-id/VisenaEmail.24.e60072a07f006130.162d95c3e17%40tc7-visena
Disabling JIT makes them perform slightly better than v10.
--
Andreas Joseph Krogh
On 22/08/2018 16:54, Andres Freund wrote: > I don't see particularly much benefit in deciding before beta, > personally. What's making you think it'd be important to decide before? > Pretty fundamentally, it'll be a setting you don't know is effectively > on, for the forseeable future anyway? Users are evaluating PostgreSQL 11 beta in their environments, including its performance. I have to tell them, whatever performance test results you get now might not be what you'll get with the final 11.0. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andres Freund <andres@anarazel.de> writes: > On 2018-08-22 06:20:21 +0000, Noah Misch wrote: >> Regardless of the choice of jit={on|off} default, these numbers tell me that >> some or all of jit_*_cost defaults are too low. > I don't think it really shows that. The reason that JITing gets started > there is that the tables aren't analyzed and we end up with crazy ass > estimates about the cost of the queries. No useful setting of the cost > limits will protect against that... :( I don't buy that line of argument one bit. No, we generally don't analyze most of the regression test tables, but the planner still knows that they're not very large. If JIT is kicking in for those queries, the defaults are set wrong. Additional evidence for the defaults being wrong is the number of reports we've had of JIT making things slower. I was OK with that happening during early beta, on the grounds of getting more testing for the JIT code; but it's time to fix the numbers. regards, tom lane
Hi, On 2018-08-22 18:15:29 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2018-08-22 06:20:21 +0000, Noah Misch wrote: > >> Regardless of the choice of jit={on|off} default, these numbers tell me that > >> some or all of jit_*_cost defaults are too low. > > > I don't think it really shows that. The reason that JITing gets started > > there is that the tables aren't analyzed and we end up with crazy ass > > estimates about the cost of the queries. No useful setting of the cost > > limits will protect against that... :( > > I don't buy that line of argument one bit. No, we generally don't > analyze most of the regression test tables, but the planner still > knows that they're not very large. If JIT is kicking in for those > queries, the defaults are set wrong. I looked at the queries that get JITed, I didn't just make that claim up out of thin air. The first query that's JITed e.g. is: +explain analyze SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.* + FROM BOOLTBL1, BOOLTBL2 + WHERE BOOLTBL2.f1 <> BOOLTBL1.f1; + QUERY PLAN +------------------------------------------------------------------------------------------------------------------ + Nested Loop (cost=0.00..118524.73 rows=3948050 width=34) (actual time=8.376..8.390 rows=12 loops=1) + Join Filter: (booltbl2.f1 <> booltbl1.f1) + Rows Removed by Join Filter: 4 + -> Seq Scan on booltbl1 (cost=0.00..38.10 rows=2810 width=1) (actual time=0.018..0.019 rows=4 loops=1) + -> Materialize (cost=0.00..52.15 rows=2810 width=1) (actual time=0.004..0.005 rows=4 loops=4) + -> Seq Scan on booltbl2 (cost=0.00..38.10 rows=2810 width=1) (actual time=0.007..0.009 rows=4 loops=1) + Planning Time: 0.074 ms + JIT: + Functions: 6 + Generation Time: 0.935 ms + Inlining: false + Inlining Time: 0.000 ms + Optimization: false + Optimization Time: 0.451 ms + Emission Time: 7.716 ms + Execution Time: 43.466 ms +(16 rows) Now you can say that'd be solved by bumping the cost up, sure. 
But obviously the row / cost model is pretty much out of whack here, I don't see how we can make reasonable decisions in a trivial query that has a misestimation by five orders of magnitude. Another subsequent case is: set enable_sort = off; -- try to make it pick a hash setop implementation select '(2,5)'::cashrange except select '(5,6)'::cashrange; which is expensive because a sort is chosen even though sort is disabled (yes, this might be a bug in the test): EXPLAIN select '(2,5)'::cashrange except select '(5,6)'::cashrange; ┌────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────────────────────────────────────────────────────┤ │ SetOp Except (cost=10000000000.06..10000000000.07 rows=1 width=36) │ │ -> Sort (cost=10000000000.06..10000000000.06 rows=2 width=36) │ │ Sort Key: ('($2.00,$5.00)'::cashrange) │ │ -> Append (cost=0.00..0.05 rows=2 width=36) │ │ -> Subquery Scan on "*SELECT* 1" (cost=0.00..0.02 rows=1 width=36) │ │ -> Result (cost=0.00..0.01 rows=1 width=32) │ │ -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=36) │ │ -> Result (cost=0.00..0.01 rows=1 width=32) │ │ JIT: │ │ Functions: 7 │ │ Inlining: true │ │ Optimization: true │ └────────────────────────────────────────────────────────────────────────────────────┘ (12 rows) Obviously the high costing here distorts things. Many of the other cases here are along similar lines as the two cases before. > Additional evidence for the > defaults being wrong is the number of reports we've had of JIT making > things slower. Maybe. Greetings, Andres Freund
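The two EXPLAIN fragments Andres cites line up with the plan-level thresholds v11 ships: jit_above_cost = 100000 triggers JIT alone, while jit_inline_above_cost and jit_optimize_above_cost (both 500000) gate inlining and optimization. A minimal sketch of that decision follows; the thresholds are the shipped GUC defaults, but the function name and shape are illustrative, not PostgreSQL's internals:

```python
# Simplified model of PostgreSQL 11's plan-level JIT decision; the
# thresholds are the shipped GUC defaults, the function is illustrative.
JIT_ABOVE_COST = 100_000            # jit_above_cost
JIT_INLINE_ABOVE_COST = 500_000     # jit_inline_above_cost
JIT_OPTIMIZE_ABOVE_COST = 500_000   # jit_optimize_above_cost

def jit_decision(total_cost):
    """Return (jit, inline, optimize) for a plan's total cost estimate."""
    return (total_cost > JIT_ABOVE_COST,
            total_cost > JIT_INLINE_ABOVE_COST,
            total_cost > JIT_OPTIMIZE_ABOVE_COST)

# Misestimated nested loop above: cost 118524.73, 12 actual rows.
print(jit_decision(118_524.73))        # (True, False, False)
# disable_cost-inflated sort: cost ~1e10, so everything switches on.
print(jit_decision(10_000_000_000.07)) # (True, True, True)
```

On these numbers the nested-loop query gets bare JIT, matching "Inlining: false, Optimization: false" in the first plan, while the disable_cost-inflated sort trips all three thresholds, matching "Inlining: true, Optimization: true" in the second.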
On Wed, Aug 22, 2018 at 06:20:21AM +0000, Noah Misch wrote: > On Wed, Mar 28, 2018 at 02:27:51PM -0700, Andres Freund wrote: > > For now LLVM is enabled by default when compiled --with-llvm. I'm mildly > > inclined to leave it like that until shortly before the release, and > > then disable it by default (i.e. change the default of jit=off). But I > > think we can make that decision based on experience during the testing > > window. I'm opening an open items entry for that. > > I'll vote for jit=on and letting any bugs shake out earlier, but it's not a > strong preference. In light of later discussion on this thread, my preference bore undue optimism. I maintain that vote philosophically, but that course of action probably entails too much short-term development work. jit=off is reasonable, along with documentation changes to set expectations.
On Wed, Aug 22, 2018 at 6:43 PM, Andres Freund <andres@anarazel.de> wrote: > Now you can say that'd be solved by bumping the cost up, sure. But > obviously the row / cost model is pretty much out of whack here, I don't > see how we can make reasonable decisions in a trivial query that has a > misestimation by five orders of magnitude. Before JIT, it didn't matter whether the costing was wrong, provided that the path with the lowest cost was the cheapest path (or at least close enough to the cheapest path not to bother anyone). Now it does. If the intended path is chosen but the costing is higher than it should be, JIT will erroneously activate. If you had designed this in such a way that we added separate paths for the JIT and non-JIT versions and the JIT version had a bigger startup cost but a reduced runtime cost, then you probably would not have run into this issue, or at least not to the same degree. But as it is, JIT activates when the plan looks expensive, regardless of whether activating JIT will do anything to make it cheaper. As a blindingly obvious example, turning on JIT to mitigate the effects of disable_cost is senseless, but as you point out, that's exactly what happens right now. I'd guess that, as you read this, you're thinking, well, but if I'd added JIT and non-JIT paths for every option, it would have doubled the number of paths, and that would have slowed the planner down way too much. That's certainly true, but my point is just that the problem is probably not as simple as "the defaults are too low". I think the problem is more fundamentally that the model you've chosen is kinda broken. I'm not saying I know how you could have done any better, but I do think we're going to have to try to figure out something to do about it, because saying, "check-pg_upgrade is 4x slower, but that's just because of all those bad estimates" is not going to fly. 
Those bad estimates were harmlessly bad before, and now they are harmfully bad, and similar bad estimates are going to exist in real-world queries, and those are going to be harmful now too. Blaming the bad costing is a red herring. The problem is that you've made the costing matter in a way that it previously didn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
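The alternative Robert describes can be sketched as ordinary path costing: a JIT path pays a fixed startup charge for code generation and wins it back through a cheaper per-tuple cost, so it only beats the non-JIT path past a break-even row count. All names and constants below are made up for illustration; nothing here is what PostgreSQL implements:

```python
# Illustrative path-based costing for JIT vs. non-JIT variants of the
# same plan; the constants are invented to show the break-even point.
def path_cost(startup, per_tuple, rows):
    return startup + per_tuple * rows

def pick_path(rows):
    nonjit = path_cost(startup=0.0, per_tuple=0.010, rows=rows)
    jit = path_cost(startup=500.0, per_tuple=0.002, rows=rows)  # codegen paid up front
    return "jit" if jit < nonjit else "non-jit"

print(pick_path(1_000))      # "non-jit": 502.0 vs 10.0
print(pick_path(1_000_000))  # "jit": 2500.0 vs 10000.0
```

With these assumed constants the break-even sits at 62500 rows; under such a model a small plan would never pick the JIT variant, regardless of how inflated its cost estimate is by disable_cost.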
>> Now you can say that'd be solved by bumping the cost up, sure. But >> obviously the row / cost model is pretty much out of whack here, I don't >> see how we can make reasonable decisions in a trivial query that has a >> misestimation by five orders of magnitude. > > Before JIT, it didn't matter whether the costing was wrong, provided > that the path with the lowest cost was the cheapest path (or at least > close enough to the cheapest path not to bother anyone). Now it does. > If the intended path is chosen but the costing is higher than it > should be, JIT will erroneously activate. If you had designed this in > such a way that we added separate paths for the JIT and non-JIT > versions and the JIT version had a bigger startup cost but a reduced > runtime cost, then you probably would not have run into this issue, or > at least not to the same degree. But as it is, JIT activates when the > plan looks expensive, regardless of whether activating JIT will do > anything to make it cheaper. As a blindingly obvious example, turning > on JIT to mitigate the effects of disable_cost is senseless, but as > you point out, that's exactly what happens right now. > > I'd guess that, as you read this, you're thinking, well, but if I'd > added JIT and non-JIT paths for every option, it would have doubled > the number of paths, and that would have slowed the planner down way > too much. That's certainly true, but my point is just that the > problem is probably not as simple as "the defaults are too low". I > think the problem is more fundamentally that the model you've chosen > is kinda broken. I'm not saying I know how you could have done any > better, but I do think we're going to have to try to figure out > something to do about it, because saying, "check-pg_upgrade is 4x > slower, but that's just because of all those bad estimates" is not > going to fly. 
Those bad estimates were harmlessly > bad before, and now > they are harmfully bad, and similar bad estimates are going to exist > in real-world queries, and those are going to be harmful now too. > > Blaming the bad costing is a red herring. The problem is that you've > made the costing matter in a way that it previously didn't. My 0.02€ on this interesting subject. Historically, external IOs, aka rotating disk accesses, have been the main cost (by several orders of magnitude) of executing database queries, and cpu costs are relatively very low in most queries. The point of the query planner is mostly to avoid very bad paths wrt IOs. Now, even with significantly faster IOs, eg SSD's, IOs are still a few orders of magnitude slower, but less so, so cpu may matter more. Now again, for small databases data are often in memory and stay there, in which case CPU is the only cost. This would suggest the following approach to evaluating costs in the planner: (1) are the needed data already in memory? if so use cpu only costs this implies that the planner would know about it... which is probably not the case. (2) if not, then optimise for IOs first, because they are likely to be the main cost driver anyway. (3) once an "IO-optimal" (eg not too bad) plan is selected, consider whether to apply JIT to part of it: if cpu costs are significant and some parts are likely to be executed a lot, with a significantly high margin, because JIT costs. Basically, I'm suggesting to reevaluate the selected plan, without changing it, with a JIT cost to improve it, as a second stage. -- Fabien.
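Fabien's two-stage idea, roughly: keep plan selection unchanged, then re-cost the chosen plan to ask whether JIT's expected CPU saving beats its fixed overhead. A toy model, with every constant an assumption made for illustration:

```python
# Toy second-stage reevaluation: the plan is already chosen; decide JIT
# from its CPU share alone. Overhead and speedup factors are assumed.
JIT_FIXED_COST = 300.0      # assumed cost of codegen + optimization
CPU_SAVING_FRACTION = 0.5   # assumed share of CPU cost JIT eliminates

def consider_jit(plan_nodes):
    """plan_nodes: (io_cost, cpu_cost) pairs for the chosen plan's nodes."""
    cpu_total = sum(cpu for _io, cpu in plan_nodes)
    return cpu_total * CPU_SAVING_FRACTION > JIT_FIXED_COST

# CPU-bound aggregation: expression evaluation dominates, JIT pays off.
print(consider_jit([(10.0, 5000.0)]))   # True
# I/O-bound scan: JIT cannot speed up page fetches, so skip it.
print(consider_jit([(5000.0, 100.0)]))  # False
```

The appeal of this shape is that an inflated I/O cost (or disable_cost) no longer triggers JIT, because only the CPU component enters the decision.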
Moin, On Sat, August 25, 2018 9:34 pm, Robert Haas wrote: > On Wed, Aug 22, 2018 at 6:43 PM, Andres Freund <andres@anarazel.de> wrote: >> Now you can say that'd be solved by bumping the cost up, sure. But >> obviously the row / cost model is pretty much out of whack here, I don't >> see how we can make reasonable decisions in a trivial query that has a >> misestimation by five orders of magnitude. > > Before JIT, it didn't matter whether the costing was wrong, provided > that the path with the lowest cost was the cheapest path (or at least > close enough to the cheapest path not to bother anyone). Now it does. > If the intended path is chosen but the costing is higher than it > should be, JIT will erroneously activate. If you had designed this in > such a way that we added separate paths for the JIT and non-JIT > versions and the JIT version had a bigger startup cost but a reduced > runtime cost, then you probably would not have run into this issue, or > at least not to the same degree. But as it is, JIT activates when the > plan looks expensive, regardless of whether activating JIT will do > anything to make it cheaper. As a blindingly obvious example, turning > on JIT to mitigate the effects of disable_cost is senseless, but as > you point out, that's exactly what happens right now. > > I'd guess that, as you read this, you're thinking, well, but if I'd > added JIT and non-JIT paths for every option, it would have doubled > the number of paths, and that would have slowed the planner down way > too much. That's certainly true, but my point is just that the > problem is probably not as simple as "the defaults are too low". I > think the problem is more fundamentally that the model you've chosen > is kinda broken. 
I'm not saying I know how you could have done any > better, but I do think we're going to have to try to figure out > something to do about it, because saying, "check-pg_upgrade is 4x > slower, but that's just because of all those bad estimates" is not > going to fly. Those bad estimates were harmlessly bad before, and now > they are harmfully bad, and similar bad estimates are going to exist > in real-world queries, and those are going to be harmful now too. > > Blaming the bad costing is a red herring. The problem is that you've > made the costing matter in a way that it previously didn't. Hm, no, I don't quite follow this argument. Isn't trying to avoid "bad costing having bad consequences" just hiding the symptoms instead of curing them? It would have a high development cost, and still bad estimates could ruin your day in other places. Wouldn't it be much smarter to look at why and how the bad costing appears and try to fix this? If a query that returns 12 rows was estimated to return about 4 million, something is wrong on a ridiculous scale. If the costing didn't produce such "to the moon" values, then it wouldn't matter so much what later decisions do depending on it. I mean, JIT is not the only thing here, even choosing the wrong plan can lead to large runtime differences (think of a sort that spills to disk etc.) So, is there a limit on how many rows can be estimated? Maybe based on things like: * how big the table is? E.g. a table with 2 pages can't have a million rows. * what the column types are? E.g. if you do: SELECT * FROM table WHERE id >= 100 AND id < 200; you cannot have more than 100 rows as a result if "id" is a unique integer column. * Index size: You can't pull out more rows from an index than it contains, maybe this helps limit the "worst estimate"? These things might also be cheaper to implement than rewriting the entire JIT model. 
Also, why does PG allow the stats to be that outdated - or missing, I'm not sure which case it is in this example. Shouldn't the system aim to have at least some basic stats, even if the user never runs ANALYZE? Or is this on purpose for these tests to see what happens? Best regards, Tels
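Tels' first suggestion, a physical-size sanity clamp on row estimates, might look like the sketch below. PostgreSQL's estimate_rel_size() already derives estimates from relpages, so this only illustrates the idea; the 32-byte minimum tuple size is an assumed floor, not PostgreSQL's exact figure:

```python
# Sketch of a physical-size clamp on row estimates: a relation of N pages
# cannot yield more tuples than physically fit in N pages.
BLOCK_SIZE = 8192  # default PostgreSQL page size

def clamp_rows(estimated_rows, relpages, min_tuple_bytes=32):
    # Upper bound: pages times the most tuples one page could hold.
    max_possible = relpages * (BLOCK_SIZE // min_tuple_bytes)
    return min(estimated_rows, max_possible)

# A 2-page table estimated at 4 million rows gets clamped hard:
print(clamp_rows(4_000_000, relpages=2))  # 512
```

Such a clamp would not fix the underlying statistics, but it would bound how far a never-analyzed table can distort downstream decisions like the JIT thresholds.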
Hi, On 2018-08-25 21:34:22 -0400, Robert Haas wrote: > On Wed, Aug 22, 2018 at 6:43 PM, Andres Freund <andres@anarazel.de> wrote: > > Now you can say that'd be solved by bumping the cost up, sure. But > > obviously the row / cost model is pretty much out of whack here, I don't > > see how we can make reasonable decisions in a trivial query that has a > > misestimation by five orders of magnitude. > > Before JIT, it didn't matter whether the costing was wrong, provided > that the path with the lowest cost was the cheapest path (or at least > close enough to the cheapest path not to bother anyone). I don't think that's really true. Due to cost fuzzing, absurdly high costs very commonly lead to the actually different planning choices not having a large enough influence to matter. > I'd guess that, as you read this, you're thinking, well, but if I'd > added JIT and non-JIT paths for every option, it would have doubled > the number of paths, and that would have slowed the planner down way > too much. That's certainly true, but my point is just that the > problem is probably not as simple as "the defaults are too low". I > think the problem is more fundamentally that the model you've chosen > is kinda broken. Right. And that's why I repeatedly brought up this part in discussions... I still think it's a reasonable compromise, but it certainly has costs. I'm also doubtful that just adding a separate path for JIT (with a significantly smaller cpu_*_cost or such) would really have helped in the cases with borked estimations - we'd *still* end up choosing JITing if the loop count is absurd, just because the cost is high. There *are* cases where it helps - if all the cost is incurred, say, due to random page fetches, then JITing isn't going to help that much. 
> I'm not saying I know how you could have done any better, but I do > think we're going to have to try to figure out something to do about > it, because saying, "check-pg_upgrade is 4x slower, but that's just > because of all those bad estimates" is not going to fly. That I'm unconvinced by however. This was on some quite slow machine and/or with LLVM assertions enabled - the performance difference on a normal machine is smaller: $ PGOPTIONS='-cjit=0' time make -s check ... 5.21user 2.11system 0:24.95elapsed 29%CPU (0avgtext+0avgdata 54212maxresident)k 20976inputs+340848outputs (14major+342228minor)pagefaults 0swaps $ PGOPTIONS='-cjit=1' time make -s check ... 5.33user 2.01system 0:30.49elapsed 24%CPU (0avgtext+0avgdata 54236maxresident)k 0inputs+340856outputs (0major+342616minor)pagefaults 0swaps But also importantly, I think there's actual advantages in triggering JIT in some places in the regression tests. There's buildfarm animals exercising the path that everything is JITed, but that's not really helpful during development. > Those bad estimates were harmlessly bad before, I think that's not true. Greetings, Andres Freund
Hi, On 2018-08-22 06:20:21 +0000, Noah Misch wrote: > I see jit slows the regression tests considerably: Is this with LLVM assertions enabled or not? The differences seem bigger than what I'm observing, especially on the mips animal - which I observe uses a separately installed LLVM build. On my machine, with master, on a postgres assert build w/ non-debug llvm build: PGOPTIONS='-c jit=0' time make -Otarget -j10 -s check-world && echo success || echo f 240.37user 55.55system 2:08.17elapsed 230%CPU (0avgtext+0avgdata 66264maxresident)k PGOPTIONS='-c jit=1' time make -Otarget -j10 -s check-world && echo success || echo f 253.02user 55.77system 2:16.22elapsed 226%CPU (0avgtext+0avgdata 54756maxresident)k Using your command, on a postgres optimized build w/ non-debug llvm build: > # x86_64, non-assert, w/o llvm > $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed > 7.64user 4.24system 0:36.40elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k > 8.09user 4.50system 0:37.71elapsed 33%CPU (0avgtext+0avgdata 36712maxresident)k > 7.53user 4.18system 0:36.54elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k > > # x86_64, non-assert, w/ llvm trunk > $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed > 9.58user 5.79system 0:49.61elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k > 9.47user 5.92system 0:47.84elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k > 9.09user 5.51system 0:47.94elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k > andres@alap4:~/build/postgres/master-optimize/vpath$ for n in 1 2 3; do PGOPTIONS='-cjit=0' env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed 8.01user 3.63system 0:39.88elapsed 29%CPU (0avgtext+0avgdata 50196maxresident)k 7.96user 3.86system 0:39.70elapsed 29%CPU (0avgtext+0avgdata 50064maxresident)k 7.96user 3.80system 0:37.17elapsed 31%CPU (0avgtext+0avgdata 50148maxresident)k andres@alap4:~/build/postgres/master-optimize/vpath$ for n in 1 2
3; do PGOPTIONS='-cjit=1' env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed 7.88user 3.76system 0:44.98elapsed 25%CPU (0avgtext+0avgdata 50092maxresident)k 7.99user 3.72system 0:46.53elapsed 25%CPU (0avgtext+0avgdata 50036maxresident)k 7.88user 3.87system 0:45.26elapsed 25%CPU (0avgtext+0avgdata 50132maxresident)k So here the difference is smaller, but not hugely so. > # mips32el, assert, w/o llvm (buildfarm member topminnow) [1] > 28min install-check-* > 35min check-pg_upgrade > > # mips32el, assert, w/ llvm 6.0.1 [1] > 63min install-check-* > 166min check-pg_upgrade But this seems so absurdly large of a difference that I kinda think LLVM assertions (which are really expensive and add O(N) operations in a bunch of places) might be to blame. Greetings, Andres Freund
On Wed, Sep 05, 2018 at 11:55:39AM -0700, Andres Freund wrote: > On 2018-08-22 06:20:21 +0000, Noah Misch wrote: > > I see jit slows the regression tests considerably: > > Is this with LLVM assertions enabled or not? Without, I think. I configured them like this: cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/sw/nopath/llvm -DCMAKE_BUILD_TYPE=MinSizeRel -DLLVM_USE_LINKER=gold -DLLVM_TARGETS_TO_BUILD=X86 ../llvm cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/sw/nopath/llvm-el32 -DCMAKE_BUILD_TYPE=MinSizeRel -DLLVM_USE_LINKER=gold -DLLVM_PARALLEL_LINK_JOBS=1 ../llvm > > # mips32el, assert, w/o llvm (buildfarm member topminnow) [1] > > 28min install-check-* > > 35min check-pg_upgrade > > > > # mips32el, assert, w/ llvm 6.0.1 [1] > > 63min install-check-* > > 166min check-pg_upgrade > > But this seems so absurdly large of a difference that I kinda think LLVM > assertions (which are really expensive and add O(N) operations in a bunch > of places) might be to blame. The 2018-08-25 and 2018-09-01 published runs were far less bad. Most of the blame goes to the reason given in the footnote (competing load on a shared machine), not to JIT.