Thread: Re: Adding NetBSD and OpenBSD to Postgres CI
Hi, Thanks for the patch! On 2024-11-01 12:17:00 +0300, Nazir Bilal Yavuz wrote: > I made these tasks triggered manually like MinGW task to save CI credits > but a related line is commented out for now to trigger CFBot. Oh, I need to pick my patch which allows repo-level config of which tasks run back up. Then we can enable these by default for cfbot, but not for individual repos... > +task: > + depends_on: SanityCheck > + # trigger_type: manual > + > + env: > + # Below are experimentally derived to be a decent choice. > + CPUS: 2 > + BUILD_JOBS: 8 > + TEST_JOBS: 8 > + > + CIRRUS_WORKING_DIR: /home/postgres/postgres Why do you need to set that? > + CCACHE_DIR: /tmp/ccache_dir > + > + PATH: /usr/sbin:$PATH > + > + # Postgres interprets LANG as a 'en_US.UTF-8' but it is 'C', then What does "Postgres interprets LANG as a 'en_US.UTF-8'" mean? > + # Postgres tries to set 'LC_COLLATE' to 'en_US.UTF-8' but it is not > + # changeable. Initdb fails because of that. So, LANG is forced to be 'C'. > + LANG: "C" > + LC_ALL: "C" > + matrix: > + - name: NetBSD - 10 - Meson > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*netbsd.*' > + env: > + IMAGE_FAMILY: pg-ci-netbsd-postgres > + INCLUDE_DIRS: -Dextra_lib_dirs=/usr/pkg/lib -Dextra_include_dirs=/usr/pkg/include > + <<: *netbsd_task_template > + > + - name: OpenBSD - 7 - Meson > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*openbsd.*' > + env: > + IMAGE_FAMILY: pg-ci-openbsd-postgres > + INCLUDE_DIRS: -Dextra_include_dirs=/usr/local/include -Dextra_lib_dirs=/usr/local/lib > + UUID: -Duuid=e2fs Shouldn't something be added to PKG_CONFIG_PATH / --pkg-config-path? For other OSs we have stanzas like setup_additional_packages_script: | #apt-get update #DEBIAN_FRONTEND=noninteractive apt-get -y install ... it'd be good to have something similar for openbsd/netbsd, given that most won't be as familiar with that. > + <<: *openbsd_task_template > + sysinfo_script: | > + locale > + id > + uname -a > + ulimit -a -H && ulimit -a -S > + env > + > + ccache_cache: > + folder: $CCACHE_DIR > + > + create_user_script: | > + useradd postgres > + chown -R postgres:users /home/postgres > + mkdir -p ${CCACHE_DIR} > + chown -R postgres:users ${CCACHE_DIR} > + > + # -Duuid=bsd is not set since 'bsd' uuid option > + # is not working on NetBSD & OpenBSD. See > + # https://www.postgresql.org/message-id/17358-89806e7420797025@postgresql.org > + # And other uuid options are not available on NetBSD. > + configure_script: | > + su postgres <<-EOF > + meson setup \ > + --buildtype debug \ I suspect it'd be good to either add -Og to cflags (as done in a bunch of other tasks) or to use debugoptimized, given that the tests on these machines are fairly slow. > + -Dcassert=true -Dssl=openssl ${UUID} \ > + -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \ > + ${INCLUDE_DIRS} \ > + build > + EOF Should probably enable injection points. > + build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}' > + upload_caches: ccache > + > + test_world_script: | > + su postgres <<-EOF > + ulimit -c unlimited > + # Otherwise tests will fail on OpenBSD, due to the lack of enough processes. > + ulimit -p 256 > + meson test $MTEST_ARGS --num-processes ${TEST_JOBS} > + EOF > + > + on_failure: > + <<: *on_failure_meson > + > + Right now you don't seem to be collecting core files - but you're still enabling them via ulimit -c unlimited. At least we shouldn't use ulimit -c unlimited without collecting core files, but it'd probably be better to add support for collecting core files. Shouldn't be too hard. Greetings, Andres Freund
Hi, On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote: > On Fri, 1 Nov 2024 at 21:44, Andres Freund <andres@anarazel.de> wrote: > > > + CCACHE_DIR: /tmp/ccache_dir > > > + > > > + PATH: /usr/sbin:$PATH > > > + > > > + # Postgres interprets LANG as a 'en_US.UTF-8' but it is 'C', then > > > > What does "Postgres interprets LANG as a 'en_US.UTF-8'" mean? > > It was because initdb was failing on NetBSD when the LANG and LC_ALL > is not set to C. I rephrased the comment and moved this under NetBSD > task. Do you happen to have a reference to the failure? The environment variables + the exact error message would be good. Kinda feels like that shouldn't happen with a default netbsd install. > > > + matrix: > > > + - name: NetBSD - 10 - Meson > > > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*netbsd.*' > > > + env: > > > + IMAGE_FAMILY: pg-ci-netbsd-postgres > > > + INCLUDE_DIRS: -Dextra_lib_dirs=/usr/pkg/lib -Dextra_include_dirs=/usr/pkg/include > > > + <<: *netbsd_task_template > > > + > > > + - name: OpenBSD - 7 - Meson > > > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*openbsd.*' > > > + env: > > > + IMAGE_FAMILY: pg-ci-openbsd-postgres > > > + INCLUDE_DIRS: -Dextra_include_dirs=/usr/local/include -Dextra_lib_dirs=/usr/local/lib > > > + UUID: -Duuid=e2fs > > > > Shouldn't something be added to PKG_CONFIG_PATH / --pkg-config-path? > > I don't think so. Both OSes are able to find pkgconfig at > '/usr/pkg/bin/pkg-config'. Am I missing something? --pkg-config-path is about the the path to pkg-config files, not the path to the pkg-config binary. If set we shouldn't need the extra_lib_dirs/extra_include_dirs, I think. > > Right now you don't seem to be collecting core files - but you're still > > enabling them via ulimit -c unlimited. At least we shouldn't use ulimit -c > > unlimited without collecting core files, but it'd probably be better to add > > support for collecting core files. Shouldn't be too hard. > > Done. I separated this patch to make review easier. +1 > From cbea598b11e85b5c7090ca8e9cc05c35f0359f54 Mon Sep 17 00:00:00 2001 > From: Nazir Bilal Yavuz <byavuz81@gmail.com> > Date: Thu, 7 Nov 2024 15:48:31 +0300 > Subject: [PATCH v2 1/3] Fix meson could not find bsd_auth.h > > bsd_auth.h file needs to be compiled together with the 'sys/types.h' as > it has missing type definitions. > > See synopsis at https://man.openbsd.org/authenticate.3 > --- > meson.build | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/meson.build b/meson.build > index 5b0510cef78..84107955d5d 100644 > --- a/meson.build > +++ b/meson.build > @@ -551,7 +551,8 @@ test_c_args = cppflags + cflags > bsd_authopt = get_option('bsd_auth') > bsd_auth = not_found_dep > if cc.check_header('bsd_auth.h', required: bsd_authopt, > - args: test_c_args, include_directories: postgres_inc) > + args: test_c_args, prefix: '#include <sys/types.h>', > + include_directories: postgres_inc) > cdata.set('USE_BSD_AUTH', 1) > bsd_auth = declare_dependency() > endif > -- Was about to apply that, but then started to wonder if we don't need the same for configure? And it sure looks like that has the same problem? Which also confuses me some, at some point this presumably worked? > From cd5bb66a55e5226b543f5b7db9128cfa48e338e3 Mon Sep 17 00:00:00 2001 > From: Nazir Bilal Yavuz <byavuz81@gmail.com> > Date: Mon, 11 Nov 2024 13:23:22 +0300 > Subject: [PATCH v2 2/3] Add NetBSD and OpenBSD tasks to the Postgres CI > > NetBSD and OpenBSD Postgres CI images are generated [1] but their tasks > are not added to the upstream Postgres yet. This patch adds them. > > Note: These tasks will be triggered manually to save CI credits but a > related line is commented out for now to trigger CFBot. > > Author: Nazir Bilal Yavuz <byavuz81@gmail.com> > > [1] https://github.com/anarazel/pg-vm-images > --- > .cirrus.tasks.yml | 84 +++++++++++++++++++++++++++++++++++++++++++++++ > .cirrus.yml | 10 ++++++ > 2 files changed, 94 insertions(+) > > diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml > index fc413eb11ef..f338af902aa 100644 > --- a/.cirrus.tasks.yml > +++ b/.cirrus.tasks.yml > @@ -213,6 +213,90 @@ task: > cores_script: src/tools/ci/cores_backtrace.sh freebsd /tmp/cores > > > +task: > + depends_on: SanityCheck > + # trigger_type: manual > + > + env: > + # Below are experimentally derived to be a decent choice. > + CPUS: 2 For cfbot it turns out to be easier to just use 4 cpus for everything. What time difference do you get from 2 vs 4 CPUs? > + BUILD_JOBS: 8 > + TEST_JOBS: 8 8 build/test jobs for 2 cpus sounds unlikely to be close to optimal. A bit higher makes sense, but 4x? > + CIRRUS_WORKING_DIR: /home/postgres/postgres I'd add a comment explaining why we're setting this. > + CCACHE_DIR: /tmp/ccache_dir And is it ok to put the ccache dir here, given the limited size of /tmp? > + PATH: /usr/sbin:$PATH > + > + matrix: > + - name: NetBSD - 10 - Meson Given the image name doesn't include the version it seems we shouldn't include it here either... > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*netbsd.*' > + env: > + IMAGE_FAMILY: pg-ci-netbsd-postgres > + INCLUDE_DIRS: -Dextra_include_dirs=/usr/pkg/include -Dextra_lib_dirs=/usr/pkg/lib > + # initdb fails with: 'invalid locale settings' error on NetBSD. > + # Force 'LANG' and 'LC_*' variables to be 'C'. > + LANG: "C" > + LC_ALL: "C" > + setup_additional_packages_script: | > + #pkgin -y install ... > + <<: *netbsd_task_template > + > + - name: OpenBSD - 7 - Meson > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*openbsd.*' > + env: > + IMAGE_FAMILY: pg-ci-openbsd-postgres > + INCLUDE_DIRS: -Dextra_include_dirs=/usr/local/include -Dextra_lib_dirs=/usr/local/lib > + UUID: -Duuid=e2fs So for netbsd we're not using any uuid support? Ah, I see, it's documented further down. Maybe reference that, or move the comment around? > + # -Duuid=bsd is not set since 'bsd' uuid option > + # is not working on NetBSD & OpenBSD. See > + # https://www.postgresql.org/message-id/17358-89806e7420797025@postgresql.org > + # And other uuid options are not available on NetBSD. > + configure_script: | > + su postgres <<-EOF > + meson setup \ > + --buildtype=debugoptimized \ > + -Dcassert=true -Dinjection_points=true \ > + -Dssl=openssl ${UUID} \ > + -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \ > + ${INCLUDE_DIRS} \ > + build > + EOF https://cirrus-ci.com/task/4879081273032704?logs=configure#L320 [14:48:50.729] Run-time dependency krb5-gssapi found: NO (tried pkgconfig and cmake) [14:48:50.729] Library gssapi_krb5 found: NO https://cirrus-ci.com/task/6286456156585984?logs=configure#L109 [14:49:28.049] Run-time dependency krb5-gssapi found: NO (tried pkgconfig and cmake) [14:49:28.049] Library gssapi_krb5 found: NO Is gss really not available / installed on either? https://cirrus-ci.com/task/4879081273032704?logs=configure#L49-L51 [14:48:50.729] Run-time dependency tcl found: NO (tried pkgconfig and cmake) [14:48:50.729] Library tcl found: NO [14:48:50.729] Has header "tcl.h" with dependency -ltcl: NO Is TCL not available on openbsd? > + build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}' > + upload_caches: ccache > + > + test_world_script: | > + su postgres <<-EOF > + # Otherwise tests will fail on OpenBSD, due to the lack of enough processes. > + ulimit -p 256 Should we also increase the number of semaphores? https://postgr.es/m/db2773a2-aca0-43d0-99c1-060efcd9954e%40gmail.com Perhaps it'd be better to update the system config earlier in the openbsd specific portion of the test? > build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop || true > EOF > <<: *on_failure_meson > - cores_script: src/tools/ci/cores_backtrace.sh freebsd /tmp/cores > + cores_script: src/tools/ci/cores_backtrace.sh bsd /tmp/cores Hm, what's the deal with this change? > on_failure: > <<: *on_failure_meson > + cores_script: | > + # Although OSes are forced to core dump inside ${CORE_DUMP_DIR}, they may > + # not obey this. So, move core files to the ${CORE_DUMP_DIR} directory. > + find build/ -maxdepth 1 -type f -name '*.core' -exec mv '{}' ${CORE_DUMP_DIR} \; > + src/tools/ci/cores_backtrace.sh ${OS_NAME} ${CORE_DUMP_DIR} s/OSs are forced/we try to configure the OS/ Is -maxdepth 1 really sufficient? The tests run in subdirectories, don't they? Greetings, Andres Freund
Hi, Missed something until just after I hit send: On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote: > + matrix: > + - name: NetBSD - 10 - Meson > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*netbsd.*' ... > + - name: OpenBSD - 7 - Meson > + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*openbsd.*' Think these probably should be added to src/tools/ci/README Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote: >> It was because initdb was failing on NetBSD when the LANG and LC_ALL >> is not set to C. I rephrased the comment and moved this under NetBSD >> task. > Do you happen to have a reference to the failure? The environment variables + > the exact error message would be good. Kinda feels like that shouldn't > happen with a default netbsd install. On mamba's host (pretty vanilla NetBSD 10.0): $ env | grep ^L LOGNAME=tgl $ LANG=C initdb ... works fine ... $ rm -rf $PGDATA $ locale -a | grep en_US en_US.ISO8859-1 en_US.ISO8859-15 en_US.US-ASCII en_US.UTF-8 $ LANG=en_US.UTF-8 initdb The files belonging to this database system will be owned by user "tgl". This user must also own the server process. initdb: error: invalid locale settings; check LANG and LC_* environment variables I seem to recall noticing this when I was setting up mamba, but I didn't feel like pursuing it so I just set the animal to test only C locale. > Was about to apply that, but then started to wonder if we don't need the same > for configure? And it sure looks like that has the same problem? > Which also confuses me some, at some point this presumably worked? The configure path does work on NetBSD 10, and has for some time. I've never tested it with meson though. regards, tom lane
I wrote: > Andres Freund <andres@anarazel.de> writes: >> Was about to apply that, but then started to wonder if we don't need the same >> for configure? And it sure looks like that has the same problem? >> Which also confuses me some, at some point this presumably worked? > The configure path does work on NetBSD 10, and has for some time. Ah, sorry, I'd forgotten that bsd_auth.h only exists on OpenBSD. We correctly detect that the header is not there on NetBSD. I tried it on OpenBSD 7.0, which is the only OpenBSD image I have laying about at the moment, and configure correctly finds that the "AC_CHECK_HEADER(bsd_auth.h, [], ...)" test succeeds. So it looks to me like we should not need the sys/types.h include --- unless they broke it in some more-recent release? (It looks like none of our OpenBSD BF animals are testing --with-bsd-auth, which is surely bad.) regards, tom lane
Hi, On 2024-12-17 12:44:46 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote: > >> It was because initdb was failing on NetBSD when the LANG and LC_ALL > >> is not set to C. I rephrased the comment and moved this under NetBSD > >> task. > > > Do you happen to have a reference to the failure? The environment variables + > > the exact error message would be good. Kinda feels like that shouldn't > > happen with a default netbsd install. > > On mamba's host (pretty vanilla NetBSD 10.0): > > $ env | grep ^L > LOGNAME=tgl > $ LANG=C initdb > ... works fine ... > $ rm -rf $PGDATA > $ locale -a | grep en_US > en_US.ISO8859-1 > en_US.ISO8859-15 > en_US.US-ASCII > en_US.UTF-8 > $ LANG=en_US.UTF-8 initdb > The files belonging to this database system will be owned by user "tgl". > This user must also own the server process. > > initdb: error: invalid locale settings; check LANG and LC_* environment variables > > > I seem to recall noticing this when I was setting up mamba, but > I didn't feel like pursuing it so I just set the animal to test > only C locale. Heh, I guess that's good enough for CI then too :) It'd be helpful if the error message ought to at least include the category being tested, perhaps it's just one category failing or such? Greetings, Andres Freund
Hi, On 2024-12-17 13:09:49 -0500, Tom Lane wrote: > I wrote: > > Andres Freund <andres@anarazel.de> writes: > >> Was about to apply that, but then started to wonder if we don't need the same > >> for configure? And it sure looks like that has the same problem? > >> Which also confuses me some, at some point this presumably worked? > > > The configure path does work on NetBSD 10, and has for some time. > > Ah, sorry, I'd forgotten that bsd_auth.h only exists on OpenBSD. > We correctly detect that the header is not there on NetBSD. > > I tried it on OpenBSD 7.0, which is the only OpenBSD image I have > laying about at the moment, and configure correctly finds that the > "AC_CHECK_HEADER(bsd_auth.h, [], ...)" test succeeds. So it looks > to me like we should not need the sys/types.h include --- unless > they broke it in some more-recent release? It does look like it was recently broken: https://www.postgresql.org/message-id/CAN55FZ0czTmfnfF%3DWOHJUZ0iZRiMz6Yf3FSMbPh4%3DZ5a_TDjKw%40mail.gmail.com /usr/include/bsd_auth.h:93:1: error: unknown type name 'quad_t' I'm pretty sure that it *did* work not too long ago with meson, because I had tested the meson stuff on openbsd when I wrote it. And as Peter noted: > The synopsis in https://man.openbsd.org/authenticate.3 is: > > #include <sys/types.h> > #include <login_cap.h> > #include <bsd_auth.h> Which does look a bit like they expect sys/types.h to be included first. Looking at their git mirror, I do indeed not see anything that'd include sys/types.h: https://github.com/openbsd/src/blob/master/include/bsd_auth.h first includes machine/_types.h, which I think maps to: https://github.com/openbsd/src/blob/master/sys/arch/amd64/include/_types.h#L4 and then sys/cdefs.h: https://github.com/openbsd/src/blob/master/sys/sys/cdefs.h which I think in turn includes https://github.com/openbsd/src/blob/master/sys/arch/amd64/include/cdefs.h None of those define quad_t. But none seems to have changed in a relevant way recently either. I wonder if this is due to a newer compiler having different implicit includes or something? > (It looks like none of our OpenBSD BF animals are testing > --with-bsd-auth, which is surely bad.) Agreed. Greetings, Andres Freund
Hi, On 2024-12-17 14:25:01 -0500, Andres Freund wrote: > On 2024-12-17 13:09:49 -0500, Tom Lane wrote: > > I wrote: > > > Andres Freund <andres@anarazel.de> writes: > > >> Was about to apply that, but then started to wonder if we don't need the same > > >> for configure? And it sure looks like that has the same problem? > > >> Which also confuses me some, at some point this presumably worked? > > > > > The configure path does work on NetBSD 10, and has for some time. > > > > Ah, sorry, I'd forgotten that bsd_auth.h only exists on OpenBSD. > > We correctly detect that the header is not there on NetBSD. > > > > I tried it on OpenBSD 7.0, which is the only OpenBSD image I have > > laying about at the moment, and configure correctly finds that the > > "AC_CHECK_HEADER(bsd_auth.h, [], ...)" test succeeds. So it looks > > to me like we should not need the sys/types.h include --- unless > > they broke it in some more-recent release? > > It does look like it was recently broken: > https://www.postgresql.org/message-id/CAN55FZ0czTmfnfF%3DWOHJUZ0iZRiMz6Yf3FSMbPh4%3DZ5a_TDjKw%40mail.gmail.com > /usr/include/bsd_auth.h:93:1: error: unknown type name 'quad_t' > > > I'm pretty sure that it *did* work not too long ago with meson, because I had > tested the meson stuff on openbsd when I wrote it. > > > And as Peter noted: > > > The synopsis in https://man.openbsd.org/authenticate.3 is: > > > > #include <sys/types.h> > > #include <login_cap.h> > > #include <bsd_auth.h> > > Which does look a bit like they expect sys/types.h to be included first. Gah, configure does pass - because AC_CHECK_HEADER(), if includes is not passed in, first includes what's defined in AC_INCLUDES_DEFAULT. https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Default-Includes.html#Default-Includes Expand to include-directives if defined, otherwise to: #include <stdio.h> #ifdef HAVE_SYS_TYPES_H # include <sys/types.h> #endif ... So maybe my memory is just faulty and the bsd_auth.h path never worked with openbsd + meson. Anyway, it looks like Bilal's patch to just add this to meson seems to suffice. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2024-12-17 12:44:46 -0500, Tom Lane wrote: >> On mamba's host (pretty vanilla NetBSD 10.0): >> initdb: error: invalid locale settings; check LANG and LC_* environment variables > It'd be helpful if the error message ought to at least include the category > being tested, perhaps it's just one category failing or such? Yeah, I will poke into this a little harder to see what's going on. (Good sleuthing on the bsd_auth.h question, BTW.) regards, tom lane
I wrote: > Andres Freund <andres@anarazel.de> writes: >> It'd be helpful if the error message ought to at least include the category >> being tested, perhaps it's just one category failing or such? > Yeah, I will poke into this a little harder to see what's going on. So after a little testing, setlocale(LC_COLLATE, ...) fails to set the locale to (apparently) anything other than C, but it works for other categories. The reason is explained by "man setlocale": Currently, setlocale() returns NULL and fails to change the locale when LC_COLLATE is modified independently of other values. It seems that on current NetBSD you have to use setlocale(LC_ALL, ...) if you want to set the collation category. That's quite annoying from our perspective, but it's not something I'm excited about fixing. Given other work going on, we might abandon all this logic soon anyway. regards, tom lane