Thread: meson vs. llvm bitcode files
The meson build currently does not produce llvm bitcode (.bc) files. AFAIK, this is the last major regression for using meson for production builds. Is anyone working on that? I vaguely recall that some in-progress code was shared a couple of years ago, but I haven't seen anything since. It would be great if we could collect any existing code and notes to maybe get this moving again.
Hi,
On Thu, 5 Sept 2024 at 11:56, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> The meson build currently does not produce llvm bitcode (.bc) files.
> AFAIK, this is the last major regression for using meson for production
> builds.
>
> Is anyone working on that? I vaguely recall that some in-progress code
> was shared a couple of years ago, but I haven't seen anything since. It
> would be great if we could collect any existing code and notes to maybe
> get this moving again.
I found that Andres shared a patch
(v17-0021-meson-Add-LLVM-bitcode-emission.patch) a while ago [1].
[1] https://www.postgresql.org/message-id/20220927011951.j3h4o7n6bhf7dwau%40awork3.anarazel.de
--
Regards,
Nazir Bilal Yavuz
Microsoft
Hi,
On Thu, 5 Sept 2024 at 12:24, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> I found that Andres shared a patch
> (v17-0021-meson-Add-LLVM-bitcode-emission.patch) a while ago [1].
Andres and I continued to work on that. I think the patches are in a
shareable state now and I wanted to hear opinions before proceeding
further. After applying the patches, bitcode files should be installed
into the $pkglibdir/bitcode/ directory if LLVM is found.
There are 6 patches attached:
v1-0001-meson-Add-generated-header-stamps:
This patch is trivial. Instead of having targets depend directly on
the generated headers, have them depend on a stamp file. The benefit
of using a stamp file is that it makes build.ninja smaller and meson
setup faster.
----------
v1-0002-meson-Add-postgresql-extension.pc-for-building-extension-libraries:
This patch generates the postgresql-extension.pc file, which can be
used for building extension libraries.
Normally, there is no need to use this .pc file for generating bitcode
files. However, since there is no clear way to get all include paths
for building bitcode files, this .pc file is later used for this
purpose (by running pkg-config --cflags-only-I
postgresql-extension-uninstalled.pc) [1].
----------
v1-0003-meson-Test-building-extensions-by-using-postgresql-extension.pc:
[Not needed for generating bitcode files]
This patch tests whether extensions can be built by using
postgresql-extension.pc. I added that commit as an example of using
postgresql-extension.pc to build extensions.
----------
v1-0004-meson-WIP-Add-docs-for-postgresql-extension.pc:
[Not needed for generating bitcode files]
I added this patch in case we recommend that people use
postgresql-extension.pc to build extension libraries. I am not sure if
we want to do that because there are still TODOs about
postgresql-extension.pc, like running test suites. I just wanted to
show my plan, dividing 'Extension Building Infrastructure' into two,
'PGXS' and 'postgresql-extension.pc'.
----------
v1-0005-meson-Add-LLVM-bitcode-emission:
This patch adds the required infrastructure to generate bitcode files and
uses postgresql-extension-uninstalled.pc to get include paths for
generating bitcode files [1].
----------
v1-0006-meson-Generate-bitcode-files-of-contrib-extension.patch:
This patch adds manually selected contrib libraries to generate their
bitcode files. These libraries are selected manually, depending on
- If they have SQL callable functions
- If the library functions are short enough (for long-running functions
the performance gain from bitcode files is minimal compared to the
function's run time, so that type of library is omitted).
Any kind of feedback would be appreciated.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment
- v1-0001-meson-Add-generated-header-stamps.patch
- v1-0002-meson-Add-postgresql-extension.pc-for-building-ex.patch
- v1-0003-meson-Test-building-extensions-by-using-postgresq.patch
- v1-0004-meson-WIP-Add-docs-for-postgresql-extension.pc.patch
- v1-0005-meson-Add-LLVM-bitcode-emission.patch
- v1-0006-meson-Generate-bitcode-files-of-contrib-extension.patch
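The stamp-file idea described for v1-0001 can be sketched roughly as follows. This is a minimal, hypothetical meson.build and not the code from the patch: the project name, the demo header names, the write_empty helper command and the generated_headers_dep variable are all assumptions made for illustration.

```meson
# Hypothetical sketch; all target and variable names are illustrative.
project('stamp-demo')

python = find_program('python3', native: true)

# Command that creates an empty file at the target's output path.
write_empty = [python, '-c', 'import sys; open(sys.argv[1], "w").close()', '@OUTPUT@']

# Stand-ins for the real generated headers.
generated_headers = []
foreach name : ['demo_a.h', 'demo_b.h']
  generated_headers += custom_target(name, output: name, command: write_empty)
endforeach

# A single stamp file whose rule depends on every generated header.
generated_headers_stamp = custom_target('generated-headers-stamp',
  output: 'generated-headers-stamp',
  depends: generated_headers,
  command: write_empty,
)

# Targets would use this dependency, so they reference one file instead
# of listing every generated header individually.
generated_headers_dep = declare_dependency(sources: generated_headers_stamp)
```

The intent is that each consumer carries a single edge to the stamp rather than one edge per header, which is where the smaller build.ninja mentioned above would come from.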
Hello,
I did a full review of the provided patches plus some tests. I was able to validate that loading of bitcode modules works, and that JIT works for both backend and contrib modules.
To test JIT on contrib modules I just lowered the costs for all jit settings and used the intarray extension, loading data/test__int.data:
CREATE EXTENSION intarray;
CREATE TABLE test__int( a int[] );
\copy test__int from 'data/test__int.data'
As for queries, any from line 98 onward in contrib/intarray/sql/_int.sql will work.
Then I added extra debug messages to add_module_to_inline_search_path() and llvm_build_inline_plan() in llvmjit_inline.cpp, and I was able to see many functions in this module being successfully inlined.
I'm attaching a new patch, based on your original work, which adds further support for generating bitcode from:
- Generated backend sources: processed by flex, bison, etc.
- Generated contrib module sources.
In this patch I included only fmgrtab.c and src/backend/parser for the backend generated code.
For contrib generated sources I added contrib/cube as an example.
All relevant details about the changes are included in the patch itself.
As you may already know, I also created a PR focused on LLVM bitcode emission with meson. It generates bitcode for all backend and contrib modules and is currently under review by some colleagues at Percona: https://github.com/percona/postgres/pull/103
I'm curious if we should get all or some of the generated backend sources compiled to bitcode, similar to contrib modules.
Please let me know your thoughts and how we can proceed to get this feature included, thank you.
Regards,
Diego Fronza
Percona
On Fri, Mar 7, 2025 at 7:52 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> [...]
Attachment
Hi,
On Tue, 11 Mar 2025 at 01:04, Diego Fronza <diego.fronza@percona.com> wrote:
> I did a full review on the provided patches plus some tests, I was able to validate that the loading of bitcode modules is working also JIT works for both backend and contrib modules.
Thank you!
> To test JIT on contrib modules I just lowered the costs for all jit settings and used the intarray extension, using the data/test__int.data:
> CREATE EXTENSION intarray;
> CREATE TABLE test__int( a int[] );
> \copy test__int from 'data/test__int.data'
>
> For queries any from line 98+ on contrib/intarray/sql/_int.sql will work.
>
> Then I added extra debug messages to llvmjit_inline.cpp on add_module_to_inline_search_path() function, also on llvm_build_inline_plan(), I was able to see many functions in this module being successfully inlined.
>
> I'm attaching a new patch based on your original work which add further support for generating bitcode from:
Thanks for doing that!
> - Generated backend sources: processed by flex, bison, etc.
> - Generated contrib module sources,
I think we do not need to separate these two.
foreach srcfile : bitcode_module['srcfiles']
- if meson.version().version_compare('>=0.59')
+ srcfilename = '@0@'.format(srcfile)
+ if srcfilename.startswith('<CustomTarget')
+ srcfilename = srcfile.full_path().split(meson.build_root() + '/')[1]
+ elif meson.version().version_compare('>=0.59')
Also, checking if the string starts with '<CustomTarget' is a bit
hacky, and 'srcfilename = '@0@'.format(srcfile)' causes a deprecation
warning. So, instead of this, we can process all generated sources the
same way the generated backend sources are processed. I updated the
patch with that.
> On this patch I just included fmgrtab.c and src/backend/parser for the backend generated code.
> For contrib generated sources I added contrib/cube as an example.
I applied your contrib/cube example and did the same thing for contrib/seg.
> All relevant details about the changes are included in the patch itself.
>
> As you may know already I also created a PR focused on llvm bitcode emission on meson, it generates bitcode for all backend and contribution modules, currently under review by some colleagues at Percona: https://github.com/percona/postgres/pull/103
> I'm curious if we should get all or some of the generated backend sources compiled to bitcode, similar to contrib modules.
I think we can do this. I added other backend sources like you did in
the PR but attached it as another patch (0007) because I wanted to
hear other people's opinions on that first.
v3 is attached.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment
- v3-0001-meson-Add-generated-header-stamps.patch
- v3-0002-meson-Add-postgresql-extension.pc-for-building-ex.patch
- v3-0003-meson-Test-building-extensions-by-using-postgresq.patch
- v3-0004-meson-WIP-Add-docs-for-postgresql-extension.pc.patch
- v3-0005-meson-Add-architecture-for-LLVM-bitcode-emission.patch
- v3-0006-meson-Add-LLVM-bitcode-emissions-for-contrib-libr.patch
- v3-0007-meson-Add-LLVM-bitcode-emission-for-backend-sourc.patch
Hi,
The v7 patch looks good to me: it handles the bitcode modules in a uniform way and also avoids the hacky code and warnings. Much better now.
A small note about the bitcode emission for generated sources in contrib: using cube as an example, it currently creates two dict entries in a list:
bc_seg_gen_sources = [{'srcfiles': [seg_scan]}]
bc_seg_gen_sources += {'srcfiles': [seg_parse[0]]}
Then pass it to the bitcode_modules:
bitcode_modules += {
...
'gen_srcfiles': bc_seg_gen_sources,
}
It could be passed as a list with a single dict, since both generated sources share the same compilation flags:
bitcode_modules += {
...
'gen_srcfiles': [
{ 'srcfiles': [cube_scan, cube_parse[0]] },
]
}
Both approaches work; the first one has the advantage of allowing separate additional_flags to be passed per generated source.
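To make the difference between the two layouts concrete, here is a small stand-alone meson.build sketch. It is only an illustration under stated assumptions: the project name, the demo file names and the -DDEMO_SCAN flag are invented, and while the 'srcfiles' and 'additional_flags' keys are taken from the messages in this thread, how the actual patch consumes them is not shown here.

```meson
# Hypothetical sketch; file names and the flag value are made up.
project('gen-sources-demo')

# Layout 1: one dict per generated source, so flags can differ per file.
per_file = [
  {'srcfiles': ['demo_scan.c'], 'additional_flags': ['-DDEMO_SCAN']},
  {'srcfiles': ['demo_parse.c']},
]

# Layout 2: a single dict for sources that share the same flags.
shared = [
  {'srcfiles': ['demo_scan.c', 'demo_parse.c']},
]

# Walk both layouts the same way and print the flags each file would get.
foreach entry : per_file + shared
  flags = entry.get('additional_flags', [])
  foreach src : entry['srcfiles']
    message('@0@: @1@'.format(src, ' '.join(flags)))
  endforeach
endforeach
```

Because both layouts are lists of dicts, the same loop can handle either one; only the first layout lets demo_scan.c carry a flag that demo_parse.c does not.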
Thanks for your reply, Nazir. I'm also waiting for more opinions on this.
Regards,
Diego
On Wed, Mar 12, 2025 at 7:27 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> [...]
Hi,
On Wed, 12 Mar 2025 at 16:39, Diego Fronza <diego.fronza@percona.com> wrote:
>
> Hi,
>
> The v7 patch looks good to me, handling the bitcode modules in a uniform way and also avoiding the hacky code and warnings, much better now.
>
> A small note about the bitcode emission for generated sources in contrib, using cube as example, currently it creates two dict entries in a list:
> bc_seg_gen_sources = [{'srcfiles': [seg_scan]}]
> bc_seg_gen_sources += {'srcfiles': [seg_parse[0]]}
>
> Then pass it to the bitcode_modules:
> bitcode_modules += {
> ...
> 'gen_srcfiles': bc_seg_gen_sources,
> }
>
> It could be passed as a list with a single dict, since both generated sources share the same compilation flags:
> bitcode_modules += {
> ...
> 'gen_srcfiles': [
> { 'srcfiles': [cube_scan, cube_parse[0]] },
> ]
> }
>
> Both approaches work, the first one has the advantage of being able to pass separate additional_flags per generated source.
I liked the current approach as it makes bitcode_modules easier to
understand, but both approaches work for me as well.
One thing I noticed is that gen_srcfiles['srcfiles'] seems wrong.
gen_sources is a better name compared to gen_srcfiles. So, I changed
it to gen_sources in v4.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment
- v4-0001-meson-Add-generated-header-stamps.patch
- v4-0002-meson-Add-postgresql-extension.pc-for-building-ex.patch
- v4-0003-meson-Test-building-extensions-by-using-postgresq.patch
- v4-0004-meson-WIP-Add-docs-for-postgresql-extension.pc.patch
- v4-0005-meson-Add-architecture-for-LLVM-bitcode-emission.patch
- v4-0006-meson-Add-LLVM-bitcode-emissions-for-contrib-libr.patch
- v4-0007-meson-Add-LLVM-bitcode-emission-for-backend-sourc.patch
Hi,
On Thu, 13 Mar 2025 at 13:11, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> One thing I noticed is that gen_srcfiles['srcfiles'] seems wrong.
> gen_sources is a better name compared to gen_srcfiles. So, I changed
> it to gen_sources in v4.
Rebase is needed due to b1720fe63f, v5 is attached.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment
- v5-0001-meson-Add-generated-header-stamps.patch
- v5-0002-meson-Add-postgresql-extension.pc-for-building-ex.patch
- v5-0003-meson-Test-building-extensions-by-using-postgresq.patch
- v5-0004-meson-WIP-Add-docs-for-postgresql-extension.pc.patch
- v5-0005-meson-Add-architecture-for-LLVM-bitcode-emission.patch
- v5-0006-meson-Add-LLVM-bitcode-emissions-for-contrib-libr.patch
- v5-0007-meson-Add-LLVM-bitcode-emission-for-backend-sourc.patch