Thread: Re: Adding NetBSD and OpenBSD to Postgres CI

Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

Thanks for the patch!


On 2024-11-01 12:17:00 +0300, Nazir Bilal Yavuz wrote:
> I made these tasks triggered manually like MinGW task to save CI credits
> but a related line is commented out for now to trigger CFBot.

Oh, I need to pick my patch which allows repo-level config of which tasks run
back up. Then we can enable these by default for cfbot, but not for individual
repos...


> +task:
> +  depends_on: SanityCheck
> +  # trigger_type: manual
> +
> +  env:
> +    # Below are experimentally derived to be a decent choice.
> +    CPUS: 2
> +    BUILD_JOBS: 8
> +    TEST_JOBS: 8
> +
> +    CIRRUS_WORKING_DIR: /home/postgres/postgres

Why do you need to set that?


> +    CCACHE_DIR: /tmp/ccache_dir
> +
> +    PATH: /usr/sbin:$PATH
> +
> +    # Postgres interprets LANG as a 'en_US.UTF-8' but it is 'C', then

What does "Postgres interprets LANG as a 'en_US.UTF-8'" mean?


> +    # Postgres tries to set 'LC_COLLATE' to 'en_US.UTF-8' but it is not
> +    # changeable. Initdb fails because of that. So, LANG is forced to be 'C'.
> +    LANG: "C"
> +    LC_ALL: "C"

> +  matrix:
> +    - name: NetBSD - 10 - Meson
> +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*netbsd.*'
> +      env:
> +        IMAGE_FAMILY: pg-ci-netbsd-postgres
> +        INCLUDE_DIRS: -Dextra_lib_dirs=/usr/pkg/lib -Dextra_include_dirs=/usr/pkg/include
> +      <<: *netbsd_task_template
> +
> +    - name: OpenBSD - 7 - Meson
> +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*openbsd.*'
> +      env:
> +        IMAGE_FAMILY: pg-ci-openbsd-postgres
> +        INCLUDE_DIRS: -Dextra_include_dirs=/usr/local/include -Dextra_lib_dirs=/usr/local/lib
> +        UUID: -Duuid=e2fs

Shouldn't something be added to PKG_CONFIG_PATH / --pkg-config-path?


For other OSs we have stanzas like

  setup_additional_packages_script: |
    #apt-get update
    #DEBIAN_FRONTEND=noninteractive apt-get -y install ...

it'd be good to have something similar for openbsd/netbsd, given that most
won't be as familiar with that.



> +      <<: *openbsd_task_template
> +  sysinfo_script: |
> +    locale
> +    id
> +    uname -a
> +    ulimit -a -H && ulimit -a -S
> +    env
> +
> +  ccache_cache:
> +    folder: $CCACHE_DIR
> +
> +  create_user_script: |
> +    useradd postgres
> +    chown -R postgres:users /home/postgres
> +    mkdir -p ${CCACHE_DIR}
> +    chown -R postgres:users ${CCACHE_DIR}
> +
> +  # -Duuid=bsd is not set since 'bsd' uuid option
> +  # is not working on NetBSD & OpenBSD. See
> +  # https://www.postgresql.org/message-id/17358-89806e7420797025@postgresql.org
> +  # And other uuid options are not available on NetBSD.
> +  configure_script: |
> +    su postgres <<-EOF
> +      meson setup \
> +        --buildtype debug \

I suspect it'd be good to either add -Og to cflags (as done in a bunch of
other tasks) or to use debugoptimized, given that the tests on these machines
are fairly slow.


> +        -Dcassert=true -Dssl=openssl ${UUID} \
> +        -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
> +        ${INCLUDE_DIRS} \
> +        build
> +    EOF

Should probably enable injection points.


> +  build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}'
> +  upload_caches: ccache
> +
> +  test_world_script: |
> +    su postgres <<-EOF
> +      ulimit -c unlimited
> +      # Otherwise tests will fail on OpenBSD, due to the lack of enough processes.
> +      ulimit -p 256
> +      meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
> +    EOF
> +
> +  on_failure:
> +    <<: *on_failure_meson
> +
> +

Right now you don't seem to be collecting core files - but you're still
enabling them via ulimit -c unlimited.  At least we shouldn't use ulimit -c
unlimited without collecting core files, but it'd probably be better to add
support for collecting core files. Shouldn't be too hard.


Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote:
> On Fri, 1 Nov 2024 at 21:44, Andres Freund <andres@anarazel.de> wrote:
> > > +    CCACHE_DIR: /tmp/ccache_dir
> > > +
> > > +    PATH: /usr/sbin:$PATH
> > > +
> > > +    # Postgres interprets LANG as a 'en_US.UTF-8' but it is 'C', then
> >
> > What does "Postgres interprets LANG as a 'en_US.UTF-8'" mean?
>
> It was because initdb was failing on NetBSD when the LANG and LC_ALL
> is not set to C. I rephrased the comment and moved this under NetBSD
> task.

Do you happen to have a reference to the failure?  The environment variables +
the exact error message would be good.  Kinda feels like that shouldn't
happen with a default netbsd install.



> > > +  matrix:
> > > +    - name: NetBSD - 10 - Meson
> > > +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*netbsd.*'
> > > +      env:
> > > +        IMAGE_FAMILY: pg-ci-netbsd-postgres
> > > +        INCLUDE_DIRS: -Dextra_lib_dirs=/usr/pkg/lib -Dextra_include_dirs=/usr/pkg/include
> > > +      <<: *netbsd_task_template
> > > +
> > > +    - name: OpenBSD - 7 - Meson
> > > +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*openbsd.*'
> > > +      env:
> > > +        IMAGE_FAMILY: pg-ci-openbsd-postgres
> > > +        INCLUDE_DIRS: -Dextra_include_dirs=/usr/local/include -Dextra_lib_dirs=/usr/local/lib
> > > +        UUID: -Duuid=e2fs
> >
> > Shouldn't something be added to PKG_CONFIG_PATH / --pkg-config-path?
>
> I don't think so. Both OSes are able to find pkgconfig at
> '/usr/pkg/bin/pkg-config'. Am I missing something?

--pkg-config-path is about the the path to pkg-config files, not the path to
the pkg-config binary. If set we shouldn't need the
extra_lib_dirs/extra_include_dirs, I think.



> > Right now you don't seem to be collecting core files - but you're still
> > enabling them via ulimit -c unlimited.  At least we shouldn't use ulimit -c
> > unlimited without collecting core files, but it'd probably be better to add
> > support for collecting core files. Shouldn't be too hard.
>
> Done. I separated this patch to make review easier.

+1



> From cbea598b11e85b5c7090ca8e9cc05c35f0359f54 Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> Date: Thu, 7 Nov 2024 15:48:31 +0300
> Subject: [PATCH v2 1/3] Fix meson could not find bsd_auth.h
>
> bsd_auth.h file needs to be compiled together with the 'sys/types.h' as
> it has missing type definitions.
>
> See synopsis at https://man.openbsd.org/authenticate.3
> ---
>  meson.build | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/meson.build b/meson.build
> index 5b0510cef78..84107955d5d 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -551,7 +551,8 @@ test_c_args = cppflags + cflags
>  bsd_authopt = get_option('bsd_auth')
>  bsd_auth = not_found_dep
>  if cc.check_header('bsd_auth.h', required: bsd_authopt,
> -    args: test_c_args, include_directories: postgres_inc)
> +    args: test_c_args, prefix: '#include <sys/types.h>',
> +    include_directories: postgres_inc)
>    cdata.set('USE_BSD_AUTH', 1)
>    bsd_auth = declare_dependency()
>  endif
> --

Was about to apply that, but then started to wonder if we don't need the same
for configure? And it sure looks like that has the same problem?

Which also confuses me some, at some point this presumably worked?


> From cd5bb66a55e5226b543f5b7db9128cfa48e338e3 Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81@gmail.com>
> Date: Mon, 11 Nov 2024 13:23:22 +0300
> Subject: [PATCH v2 2/3] Add NetBSD and OpenBSD tasks to the Postgres CI
>
> NetBSD and OpenBSD Postgres CI images are generated [1] but their tasks
> are not added to the upstream Postgres yet. This patch adds them.
>
> Note: These tasks will be triggered manually to save CI credits but a
> related line is commented out for now to trigger CFBot.
>
> Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
>
> [1] https://github.com/anarazel/pg-vm-images
> ---
>  .cirrus.tasks.yml | 84 +++++++++++++++++++++++++++++++++++++++++++++++
>  .cirrus.yml       | 10 ++++++
>  2 files changed, 94 insertions(+)
>
> diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
> index fc413eb11ef..f338af902aa 100644
> --- a/.cirrus.tasks.yml
> +++ b/.cirrus.tasks.yml
> @@ -213,6 +213,90 @@ task:
>      cores_script: src/tools/ci/cores_backtrace.sh freebsd /tmp/cores
>
>
> +task:
> +  depends_on: SanityCheck
> +  # trigger_type: manual
> +
> +  env:
> +    # Below are experimentally derived to be a decent choice.
> +    CPUS: 2

For cfbot it turns out to be easier to just use 4 cpus for everything. What
time difference do you get from 2 vs 4 CPUs?


> +    BUILD_JOBS: 8
> +    TEST_JOBS: 8

8 build/test jobs for 2 cpus sounds unlikely to be close to optimal. A bit
higher makes sense, but 4x?


> +    CIRRUS_WORKING_DIR: /home/postgres/postgres

I'd add a comment explaining why we're setting this.

> +    CCACHE_DIR: /tmp/ccache_dir

And is it ok to put the ccache dir here, given the limited size of /tmp?


> +    PATH: /usr/sbin:$PATH
> +
> +  matrix:
> +    - name: NetBSD - 10 - Meson

Given the image name doesn't include the version it seems we shouldn't include
it here either...


> +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*netbsd.*'
> +      env:
> +        IMAGE_FAMILY: pg-ci-netbsd-postgres
> +        INCLUDE_DIRS: -Dextra_include_dirs=/usr/pkg/include -Dextra_lib_dirs=/usr/pkg/lib
> +        # initdb fails with: 'invalid locale settings' error on NetBSD.
> +        # Force 'LANG' and 'LC_*' variables to be 'C'.
> +        LANG: "C"
> +        LC_ALL: "C"
> +      setup_additional_packages_script: |
> +        #pkgin -y install ...
> +      <<: *netbsd_task_template
> +
> +    - name: OpenBSD - 7 - Meson
> +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*openbsd.*'
> +      env:
> +        IMAGE_FAMILY: pg-ci-openbsd-postgres
> +        INCLUDE_DIRS: -Dextra_include_dirs=/usr/local/include -Dextra_lib_dirs=/usr/local/lib
> +        UUID: -Duuid=e2fs

So for netbsd we're not using any uuid support?  Ah, I see, it's documented
further down. Maybe reference that, or move the comment around?




> +  # -Duuid=bsd is not set since 'bsd' uuid option
> +  # is not working on NetBSD & OpenBSD. See
> +  # https://www.postgresql.org/message-id/17358-89806e7420797025@postgresql.org
> +  # And other uuid options are not available on NetBSD.
> +  configure_script: |
> +    su postgres <<-EOF
> +      meson setup \
> +        --buildtype=debugoptimized \
> +        -Dcassert=true -Dinjection_points=true \
> +        -Dssl=openssl ${UUID} \
> +        -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
> +        ${INCLUDE_DIRS} \
> +        build
> +    EOF

https://cirrus-ci.com/task/4879081273032704?logs=configure#L320

[14:48:50.729] Run-time dependency krb5-gssapi found: NO (tried pkgconfig and cmake)
[14:48:50.729] Library gssapi_krb5 found: NO

https://cirrus-ci.com/task/6286456156585984?logs=configure#L109

[14:49:28.049] Run-time dependency krb5-gssapi found: NO (tried pkgconfig and cmake)
[14:49:28.049] Library gssapi_krb5 found: NO

Is gss really not available / installed on either?


https://cirrus-ci.com/task/4879081273032704?logs=configure#L49-L51
[14:48:50.729] Run-time dependency tcl found: NO (tried pkgconfig and cmake)
[14:48:50.729] Library tcl found: NO
[14:48:50.729] Has header "tcl.h" with dependency -ltcl: NO

Is TCL not available on openbsd?


> +  build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}'
> +  upload_caches: ccache
> +
> +  test_world_script: |
> +    su postgres <<-EOF
> +      # Otherwise tests will fail on OpenBSD, due to the lack of enough processes.
> +      ulimit -p 256

Should we also increase the number of semaphores?

https://postgr.es/m/db2773a2-aca0-43d0-99c1-060efcd9954e%40gmail.com

Perhaps it'd be better to update the system config earlier in the openbsd
specific portion of the test?




>          build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop || true
>        EOF
>      <<: *on_failure_meson
> -    cores_script: src/tools/ci/cores_backtrace.sh freebsd /tmp/cores
> +    cores_script: src/tools/ci/cores_backtrace.sh bsd /tmp/cores

Hm, what's the deal with this change?


>    on_failure:
>      <<: *on_failure_meson
> +    cores_script: |
> +      # Although OSes are forced to core dump inside ${CORE_DUMP_DIR}, they may
> +      # not obey this. So, move core files to the ${CORE_DUMP_DIR} directory.
> +      find build/ -maxdepth 1 -type f -name '*.core' -exec mv '{}' ${CORE_DUMP_DIR} \;
> +      src/tools/ci/cores_backtrace.sh ${OS_NAME} ${CORE_DUMP_DIR}

s/OSs are forced/we try to configure the OS/

Is -maxdepth 1 really sufficient? The tests run in subdirectories, don't they?

Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

Missed something until just after I hit send:

On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote:
> +  matrix:
> +    - name: NetBSD - 10 - Meson
> +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*netbsd.*'
...
> +    - name: OpenBSD - 7 - Meson
> +      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*openbsd.*'

Think these probably should be added to src/tools/ci/README

Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote:
>> It was because initdb was failing on NetBSD when the LANG and LC_ALL
>> is not set to C. I rephrased the comment and moved this under NetBSD
>> task.

> Do you happen to have a reference to the failure?  The environment variables +
> the exact error message would be good.  Kinda feels like that shouldn't
> happen with a default netbsd install.

On mamba's host (pretty vanilla NetBSD 10.0):

$ env | grep ^L
LOGNAME=tgl
$ LANG=C initdb
... works fine ...
$ rm -rf $PGDATA
$ locale -a | grep en_US
en_US.ISO8859-1
en_US.ISO8859-15
en_US.US-ASCII
en_US.UTF-8
$ LANG=en_US.UTF-8 initdb
The files belonging to this database system will be owned by user "tgl".
This user must also own the server process.

initdb: error: invalid locale settings; check LANG and LC_* environment variables


I seem to recall noticing this when I was setting up mamba, but
I didn't feel like pursuing it so I just set the animal to test
only C locale.

> Was about to apply that, but then started to wonder if we don't need the same
> for configure? And it sure looks like that has the same problem?
> Which also confuses me some, at some point this presumably worked?

The configure path does work on NetBSD 10, and has for some time.
I've never tested it with meson though.

            regards, tom lane



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Was about to apply that, but then started to wonder if we don't need the same
>> for configure? And it sure looks like that has the same problem?
>> Which also confuses me some, at some point this presumably worked?

> The configure path does work on NetBSD 10, and has for some time.

Ah, sorry, I'd forgotten that bsd_auth.h only exists on OpenBSD.
We correctly detect that the header is not there on NetBSD.

I tried it on OpenBSD 7.0, which is the only OpenBSD image I have
laying about at the moment, and configure correctly finds that the
"AC_CHECK_HEADER(bsd_auth.h, [], ...)" test succeeds.  So it looks
to me like we should not need the sys/types.h include --- unless
they broke it in some more-recent release?

(It looks like none of our OpenBSD BF animals are testing
--with-bsd-auth, which is surely bad.)

            regards, tom lane



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

On 2024-12-17 12:44:46 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2024-11-12 11:38:11 +0300, Nazir Bilal Yavuz wrote:
> >> It was because initdb was failing on NetBSD when the LANG and LC_ALL
> >> is not set to C. I rephrased the comment and moved this under NetBSD
> >> task.
> 
> > Do you happen to have a reference to the failure?  The environment variables +
> > the exact error message would be good.  Kinda feels like that shouldn't
> > happen with a default netbsd install.
> 
> On mamba's host (pretty vanilla NetBSD 10.0):
> 
> $ env | grep ^L
> LOGNAME=tgl
> $ LANG=C initdb
> ... works fine ...
> $ rm -rf $PGDATA
> $ locale -a | grep en_US
> en_US.ISO8859-1
> en_US.ISO8859-15
> en_US.US-ASCII
> en_US.UTF-8
> $ LANG=en_US.UTF-8 initdb
> The files belonging to this database system will be owned by user "tgl".
> This user must also own the server process.
> 
> initdb: error: invalid locale settings; check LANG and LC_* environment variables
> 
> 
> I seem to recall noticing this when I was setting up mamba, but
> I didn't feel like pursuing it so I just set the animal to test
> only C locale.

Heh, I guess that's good enough for CI then too :)

It'd be helpful if the error message ought to at least include the category
being tested, perhaps it's just one category failing or such?

Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

On 2024-12-17 13:09:49 -0500, Tom Lane wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> Was about to apply that, but then started to wonder if we don't need the same
> >> for configure? And it sure looks like that has the same problem?
> >> Which also confuses me some, at some point this presumably worked?
> 
> > The configure path does work on NetBSD 10, and has for some time.
> 
> Ah, sorry, I'd forgotten that bsd_auth.h only exists on OpenBSD.
> We correctly detect that the header is not there on NetBSD.
> 
> I tried it on OpenBSD 7.0, which is the only OpenBSD image I have
> laying about at the moment, and configure correctly finds that the
> "AC_CHECK_HEADER(bsd_auth.h, [], ...)" test succeeds.  So it looks
> to me like we should not need the sys/types.h include --- unless
> they broke it in some more-recent release?

It does look like it was recently broken:
https://www.postgresql.org/message-id/CAN55FZ0czTmfnfF%3DWOHJUZ0iZRiMz6Yf3FSMbPh4%3DZ5a_TDjKw%40mail.gmail.com
/usr/include/bsd_auth.h:93:1: error: unknown type name 'quad_t'


I'm pretty sure that it *did* work not too long ago with meson, because I had
tested the meson stuff on openbsd when I wrote it.


And as Peter noted:

> The synopsis in https://man.openbsd.org/authenticate.3 is:
> 
> #include <sys/types.h>
> #include <login_cap.h>
> #include <bsd_auth.h>

Which does look a bit like they expect sys/types.h to be included first.


Looking at their git mirror, I do indeed not see anything that'd include
sys/types.h:
https://github.com/openbsd/src/blob/master/include/bsd_auth.h
first includes machine/_types.h, which I think maps to:
https://github.com/openbsd/src/blob/master/sys/arch/amd64/include/_types.h#L4

and then sys/cdefs.h:
https://github.com/openbsd/src/blob/master/sys/sys/cdefs.h
which I think in turn includes
https://github.com/openbsd/src/blob/master/sys/arch/amd64/include/cdefs.h

None of those define quad_t.  But none seems to have changed in a relevant way
recently either.


I wonder if this is due to a newer compiler having different implicit includes
or something?


> (It looks like none of our OpenBSD BF animals are testing
> --with-bsd-auth, which is surely bad.)

Agreed.

Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

On 2024-12-17 14:25:01 -0500, Andres Freund wrote:
> On 2024-12-17 13:09:49 -0500, Tom Lane wrote:
> > I wrote:
> > > Andres Freund <andres@anarazel.de> writes:
> > >> Was about to apply that, but then started to wonder if we don't need the same
> > >> for configure? And it sure looks like that has the same problem?
> > >> Which also confuses me some, at some point this presumably worked?
> > 
> > > The configure path does work on NetBSD 10, and has for some time.
> > 
> > Ah, sorry, I'd forgotten that bsd_auth.h only exists on OpenBSD.
> > We correctly detect that the header is not there on NetBSD.
> > 
> > I tried it on OpenBSD 7.0, which is the only OpenBSD image I have
> > laying about at the moment, and configure correctly finds that the
> > "AC_CHECK_HEADER(bsd_auth.h, [], ...)" test succeeds.  So it looks
> > to me like we should not need the sys/types.h include --- unless
> > they broke it in some more-recent release?
> 
> It does look like it was recently broken:
> https://www.postgresql.org/message-id/CAN55FZ0czTmfnfF%3DWOHJUZ0iZRiMz6Yf3FSMbPh4%3DZ5a_TDjKw%40mail.gmail.com
> /usr/include/bsd_auth.h:93:1: error: unknown type name 'quad_t'
> 
> 
> I'm pretty sure that it *did* work not too long ago with meson, because I had
> tested the meson stuff on openbsd when I wrote it.
> 
> 
> And as Peter noted:
> 
> > The synopsis in https://man.openbsd.org/authenticate.3 is:
> > 
> > #include <sys/types.h>
> > #include <login_cap.h>
> > #include <bsd_auth.h>
> 
> Which does look a bit like they expect sys/types.h to be included first.

Gah, configure does pass - because AC_CHECK_HEADER(), if includes is not passed
in, first includes what's defined in AC_INCLUDES_DEFAULT.

https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Default-Includes.html#Default-Includes
 Expand to include-directives if defined, otherwise to:

          #include <stdio.h>
          #ifdef HAVE_SYS_TYPES_H
          # include <sys/types.h>
          #endif
...


So maybe my memory is just faulty and the bsd_auth.h path never worked with
openbsd + meson. Anyway, it looks like Bilal's patch to just add this to meson
seems to suffice.

Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2024-12-17 12:44:46 -0500, Tom Lane wrote:
>> On mamba's host (pretty vanilla NetBSD 10.0):
>> initdb: error: invalid locale settings; check LANG and LC_* environment variables

> It'd be helpful if the error message ought to at least include the category
> being tested, perhaps it's just one category failing or such?

Yeah, I will poke into this a little harder to see what's going on.

(Good sleuthing on the bsd_auth.h question, BTW.)

            regards, tom lane



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> It'd be helpful if the error message ought to at least include the category
>> being tested, perhaps it's just one category failing or such?

> Yeah, I will poke into this a little harder to see what's going on.

So after a little testing, setlocale(LC_COLLATE, ...) fails to set the
locale to (apparently) anything other than C, but it works for other
categories.  The reason is explained by "man setlocale":

     Currently, setlocale() returns NULL and fails to change the locale when
     LC_COLLATE is modified independently of other values.

It seems that on current NetBSD you have to use setlocale(LC_ALL, ...)
if you want to set the collation category.  That's quite annoying
from our perspective, but it's not something I'm excited about fixing.
Given other work going on, we might abandon all this logic soon
anyway.

            regards, tom lane



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

I finally pushed this. The meson fix backpatched to 16.

I did some very minor polishing, reordering the OS lists to stay alphabetical,
instead of adding netbsd/openbsd somewhere to the front of lists.

Thanks for the patches!


Obviously not your fault, but I do think it's pretty crazy that with the same
available resources, netbsd and openbsd take considerably longer than linux
and freebsd, which both do a lot more (linux tests 32 and 64 bit with ubsan
with autoconf and asan with meson, freebsd tests a bunch of debugging
options).  I wonder what is going wrong.  I suspect we might be waiting for
the filesystem a lot, according to cirrus-ci's CPU usage graph we're not CPU
bound during the test phase.  Or maybe they just scale badly.



Any chance you're interested in rebasing and expanding
https://postgr.es/m/20240413021221.hg53rvqlvldqh57i%40awork3.anarazel.de

It'd be nice if we could enable these tasks on cfbot, where we bring our own
compute, while leaving them on manual for everyone else.

Greetings,

Andres Freund



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Nazir Bilal Yavuz
Date:
Hi,

On Wed, 12 Feb 2025 at 17:49, Andres Freund <andres@anarazel.de> wrote:
> I finally pushed this. The meson fix backpatched to 16.
>
> I did some very minor polishing, reordering the OS lists to stay alphabetical,
> instead of adding netbsd/openbsd somewhere to the front of lists.

Thanks!

> Obviously not your fault, but I do think it's pretty crazy that with the same
> available resources, netbsd and openbsd take considerably longer than linux
> and freebsd, which both do a lot more (linux tests 32 and 64 bit with ubsan
> with autoconf and asan with meson, freebsd tests a bunch of debugging
> options).  I wonder what is going wrong.  I suspect we might be waiting for
> the filesystem a lot, according to cirrus-ci's CPU usage graph we're not CPU
> bound during the test phase.  Or maybe they just scale badly.

Yes, I could not find the reason why either.

> Any chance you're interested in rebasing and expanding
> https://postgr.es/m/20240413021221.hg53rvqlvldqh57i%40awork3.anarazel.de
>
> It'd be nice if we could enable these tasks on cfbot, where we bring our own
> compute, while leaving them on manual for everyone else.

Sure, I will work on this. Thank you for mentioning it.

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Thomas Munro
Date:
On Fri, Feb 14, 2025 at 2:02 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> On Wed, 12 Feb 2025 at 17:49, Andres Freund <andres@anarazel.de> wrote:
> > Obviously not your fault, but I do think it's pretty crazy that with the same
> > available resources, netbsd and openbsd take considerably longer than linux
> > and freebsd, which both do a lot more (linux tests 32 and 64 bit with ubsan
> > with autoconf and asan with meson, freebsd tests a bunch of debugging
> > options).  I wonder what is going wrong.  I suspect we might be waiting for
> > the filesystem a lot, according to cirrus-ci's CPU usage graph we're not CPU
> > bound during the test phase.  Or maybe they just scale badly.
>
> Yes, I could not find the reason why either.

I don't know much about NetBSD and OpenBSD, but my guess would be that
its UFS is serialising on synchronous I/O for some file/directory
operations that we do a lot of.  Perhaps something like commit
0265e5c1 could be ported to those guys too?  I suspect that Windows
has the same problem, I just don't know how to script a RAM disk as
all the web instructions start with "open the blah GUI and click on
the ...".  I think the Linux file systems are just much better at that
stuff due to journaling and deferring I/O, but these other systems
probably issue synchronous writes, possibly with drive cache flushes
in some cases that you might not be able to turn off*.   As for macOS,
I don't know much except that it doesn't seem too interested in
flushing the write cache generally, and is currently our 2nd fastest
OS on CI anyway.

Maybe we could try this?

https://man.netbsd.org/mount_tmpfs.8
https://man.openbsd.org/mount_tmpfs.8

FreeBSD has the same thing too, but I haven't looked into how that
compares to mounting UFS on a RAM disk device as I did it.  Linux has
a tmpfs too, but we know that it refuses O_DIRECT so I wouldn't
propose that, it's essential to test those code paths.  OpenBSD is the
only system in our universe without O_DIRECT, and I don't know what
NetBSD's policy is on O_DIRECT on tmpfs and wouldn't worry about it
too much anyway.  (This is like a Zen riddle: what does it mean to
skip the page cache if the filesystem is the page cache?  Mu.  The
Linux answer is: no, go away.)

Thanks for working on these!  I'll have another go at my workaround
for the SysV IPC resource limits on those two OSes, which is relevant
to the I/O worker stuff.

*In my hobbyist OS hacking I had an on/off switch for that for
throwaway FreeBSD CI systems where you don't care about durability,
though I didn't go anywhere with it after the RAM disk did the trick.



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Thomas Munro
Date:
On Tue, Feb 18, 2025 at 11:33 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Maybe we could try this?
>
> https://man.netbsd.org/mount_tmpfs.8
> https://man.openbsd.org/mount_tmpfs.8

NetBSD's test_world: 10:30 -> 3:23
OpenBSD test_world: 15:45 - >9:10

I think NetBSD would finish around 2nd place if turned on by default
like that.  The OpenBSD part needs more work though, see attached...

Attachment

Re: Adding NetBSD and OpenBSD to Postgres CI

From
Nazir Bilal Yavuz
Date:
Hi,

On Fri, 28 Feb 2025 at 19:26, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Tue, Feb 18, 2025 at 11:33 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > Maybe we could try this?
> >
> > https://man.netbsd.org/mount_tmpfs.8
> > https://man.openbsd.org/mount_tmpfs.8
>
> NetBSD's test_world: 10:30 -> 3:23
> OpenBSD test_world: 15:45 - >9:10

Thank you for working on this! These are very nice speedups.

> I think NetBSD would finish around 2nd place if turned on by default
> like that.  The OpenBSD part needs more work though, see attached...

+    #umount /usr/obj  # unused 5G on /dev/sd0j
+    #swapon /dev/sd0j # fails with ENXIO... what am I missing?

I think I found the problem, sd0j's fstype is not a swap. It worked like that:

$ disklabel -E sd0
$ umount /usr/obj
$ disklabel -E sd0 # prompts are: m -> j -> \n -> \n -> swap -> w -> q
$ disklabel -E sd0
$ swapon /dev/sd0j # runs successfully

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Thomas Munro
Date:
On Sat, Mar 1, 2025 at 6:24 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> I think I found the problem, sd0j's fstype is not a swap. It worked like that:
>
> $ disklabel -E sd0
> $ umount /usr/obj
> $ disklabel -E sd0 # prompts are: m -> j -> \n -> \n -> swap -> w -> q
> $ disklabel -E sd0
> $ swapon /dev/sd0j # runs successfully

Thanks!  I just piped those characters in and it worked, shaving
another couple of minutes off.  The script is probably now about as
short and sweet as it needs to be?  And the times are pretty good.

https://cirrus-ci.com/build/5275349266726912

Attachment

Re: Adding NetBSD and OpenBSD to Postgres CI

From
Thomas Munro
Date:
On Sun, Mar 2, 2025 at 1:37 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sat, Mar 1, 2025 at 6:24 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> > I think I found the problem, sd0j's fstype is not a swap. It worked like that:
> >
> > $ disklabel -E sd0
> > $ umount /usr/obj
> > $ disklabel -E sd0 # prompts are: m -> j -> \n -> \n -> swap -> w -> q
> > $ disklabel -E sd0
> > $ swapon /dev/sd0j # runs successfully
>
> Thanks!  I just piped those characters in and it worked, shaving
> another couple of minutes off.  The script is probably now about as
> short and sweet as it needs to be?  And the times are pretty good.
>
> https://cirrus-ci.com/build/5275349266726912

And pushed.

I propose to turn these on by default tomorrow, unless someone speaks
up with reasons why we shouldn't.  One non-reason to hold off is a
one-off report[1] from Jacob about an unexplained failure of the oauth
tests on NetBSD, but I'd prefer to see it flap if it's going to so we
can collect clues and tackle it.  Oauth test coverage is coming on in
the build farm slowly anyway, no reason to hide it from our first line
of defence...

[1] https://www.postgresql.org/message-id/CAOYmi%2Bkr-Eh7utEDZBFq8Mc9byLLRF3ZNc0TaHt78SKwB4T5KA%40mail.gmail.com



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Jacob Champion
Date:
On Mon, Mar 3, 2025 at 2:36 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> One non-reason to hold off is a
> one-off report[1] from Jacob about an unexplained failure of the oauth
> tests on NetBSD, but I'd prefer to see it flap if it's going to so we
> can collect clues and tackle it.

Agreed.

--Jacob



Re: Adding NetBSD and OpenBSD to Postgres CI

From
Andres Freund
Date:
Hi,

On 2025-03-04 11:35:30 +1300, Thomas Munro wrote:
> On Sun, Mar 2, 2025 at 1:37 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Sat, Mar 1, 2025 at 6:24 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> > > I think I found the problem, sd0j's fstype is not a swap. It worked like that:
> > >
> > > $ disklabel -E sd0
> > > $ umount /usr/obj
> > > $ disklabel -E sd0 # prompts are: m -> j -> \n -> \n -> swap -> w -> q
> > > $ disklabel -E sd0
> > > $ swapon /dev/sd0j # runs successfully
> >
> > Thanks!  I just piped those characters in and it worked, shaving
> > another couple of minutes off.  The script is probably now about as
> > short and sweet as it needs to be?  And the times are pretty good.
> >
> > https://cirrus-ci.com/build/5275349266726912
> 
> And pushed.
> 
> I propose to turn these on by default tomorrow, unless someone speaks
> up with reasons why we shouldn't.  One non-reason to hold off is a
> one-off report[1] from Jacob about an unexplained failure of the oauth
> tests on NetBSD, but I'd prefer to see it flap if it's going to so we
> can collect clues and tackle it.  Oauth test coverage is coming on in
> the build farm slowly anyway, no reason to hide it from our first line
> of defence...

I'm worried about enabling them by default eating up CI credits even
faster. Not for cfbot, where we bring our own compute resources, but for folks
own repos.

I quite evidently lost track of that patch, but I think we should instead pick
https://postgr.es/m/20240413021221.hg53rvqlvldqh57i%40awork3.anarazel.de back
up, with that we could enable these tasks by default for cfbot, without
enabling them for everyone.

Greetings,

Andres Freund