Re: CI and test improvements - Mailing list pgsql-hackers

From Andres Freund
Subject Re: CI and test improvements
Date
Msg-id 20221105015946.yrxijqb7h4rqhp6d@awork3.anarazel.de
Whole thread Raw
In response to Re: CI and test improvements  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: CI and test improvements
List pgsql-hackers
Hi,

On 2022-11-04 18:54:12 -0500, Justin Pryzby wrote:
> Subject: [PATCH 1/8] meson: PROVE is not required
> Subject: [PATCH 3/8] meson: rename 'main' tasks to 'regress' and 'isolation'

Pushed, thanks for the patches.


> Subject: [PATCH 2/8] meson: other fixes for cygwin
>
> XXX: what about HAVE_BUGGY_STRTOF ?

What are you wondering about here? Shouldn't that continue to be set via
src/include/port/cygwin.h?


> diff --git a/src/test/regress/meson.build b/src/test/regress/meson.build
> index 3dcfc11278f..6ec3c77af53 100644
> --- a/src/test/regress/meson.build
> +++ b/src/test/regress/meson.build
> @@ -10,7 +10,7 @@ regress_sources = pg_regress_c + files(
>  # patterns like ".*-.*-mingw.*". We probably can do better, but for now just
>  # replace 'gcc' with 'mingw' on windows.
>  host_tuple_cc = cc.get_id()
> -if host_system == 'windows' and host_tuple_cc == 'gcc'
> +if host_system in ['windows', 'cygwin'] and host_tuple_cc == 'gcc'
>    host_tuple_cc = 'mingw'
>  endif

This doesn't quite seem right - shouldn't it say cywin? Not that it makes a
difference right now, given the contents of resultmap:
float4:out:.*-.*-cygwin.*=float4-misrounded-input.out
float4:out:.*-.*-mingw.*=float4-misrounded-input.out


> From 0acbbd2fdd97bbafc5c4552e26f92d52c597eea9 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj@telsasoft.com>
> Date: Wed, 25 May 2022 21:53:22 -0500
> Subject: [PATCH 4/8] cirrus/windows: add compiler_warnings_script
>
> I'm not sure how to write this test in windows shell; it's also not easy to
> write it in posix sh, since windows shell is somehow interpretting && and ||...
>
> https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de
>
> See also:
> 8a1ce5e54f6d144e4f8e19af7c767b026ee0c956
> https://cirrus-ci.com/task/6241060062494720
> https://cirrus-ci.com/task/6496366607204352
>
> ci-os-only: windows
> ---
>  .cirrus.yml                            | 10 +++++++++-
>  src/tools/ci/windows-compiler-warnings | 24 ++++++++++++++++++++++++
>  2 files changed, 33 insertions(+), 1 deletion(-)
>  create mode 100755 src/tools/ci/windows-compiler-warnings
>
> diff --git a/.cirrus.yml b/.cirrus.yml
> index 9f2282471a9..99ac09dc679 100644
> --- a/.cirrus.yml
> +++ b/.cirrus.yml
> @@ -451,12 +451,20 @@ task:
>
>    build_script: |
>      vcvarsall x64
> -    ninja -C build
> +    ninja -C build |tee build/meson-logs/build.txt
> +    REM Since pipes lose exit status of the preceding command, rerun compilation,
> +    REM without the pipe exiting now if it fails, rather than trying to run checks
> +    ninja -C build > nul

This seems mighty grotty :(. but I guess it's quick enough not worry about,
and I can't come up with a better plan.

It doesn't seem quite right to redirect into meson-logs/ to me, my
interpretation is that that's "meson's namespace".  Why not just store it in
build/?



> From e85fe83fd8a4b4c79a96d2bf66cd6a5e1bdfcd1e Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj@telsasoft.com>
> Date: Sat, 26 Feb 2022 19:34:35 -0600
> Subject: [PATCH 5/8] cirrus: build docs as a separate task..
>
> This will run the doc build if any docs have changed, even if Linux
> fails, to allow catch doc build failures.
>
> This'll automatically show up as a separate "column" on cfbot.
>
> Also, in the future, this will hopefully upload each patch's changed HTML docs
> as an artifact, for easy review.
>
> Note that this is currently building docs with both autoconf and meson.
>
> ci-os-only: html
> ---
>  .cirrus.yml | 62 +++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 44 insertions(+), 18 deletions(-)
>
> diff --git a/.cirrus.yml b/.cirrus.yml
> index 99ac09dc679..37fd79e5b77 100644
> --- a/.cirrus.yml
> +++ b/.cirrus.yml
> @@ -472,6 +472,9 @@ task:
>        type: text/plain
>
>
> +###
> +# Test that code can be built with gcc/clang without warnings
> +###
>  task:
>    name: CompilerWarnings
>
> @@ -515,10 +518,6 @@ task:
>      #apt-get update
>      #DEBIAN_FRONTEND=noninteractive apt-get -y install ...
>
> -  ###
> -  # Test that code can be built with gcc/clang without warnings
> -  ###
> -

Why remove this?


>    setup_script: echo "COPT=-Werror" > src/Makefile.custom
>
>    # Trace probes have a history of getting accidentally broken. Use the
> @@ -580,20 +579,6 @@ task:
>        make -s -j${BUILD_JOBS} clean
>        time make -s -j${BUILD_JOBS} world-bin
>
> -  ###
> -  # Verify docs can be built
> -  ###
> -  # XXX: Only do this if there have been changes in doc/ since last build
> -  always:
> -    docs_build_script: |
> -      time ./configure \
> -        --cache gcc.cache \
> -        CC="ccache gcc" \
> -        CXX="ccache g++" \
> -        CLANG="ccache clang"
> -      make -s -j${BUILD_JOBS} clean
> -      time make -s -j${BUILD_JOBS} -C doc
> -
>    ###
>    # Verify headerscheck / cpluspluscheck succeed
>    #
> @@ -617,3 +602,44 @@ task:
>
>    always:
>      upload_caches: ccache
> +
> +
> +###
> +# Verify docs can be built
> +# changesInclude() will skip this task if none of the commits since
> +# CIRRUS_LAST_GREEN_CHANGE touched any relevant files. The comparison appears
> +# to be like "git log a..b -- ./file", not "git diff a..b -- ./file"
> +###
> +
> +task:
> +  name: Documentation
> +
> +  env:
> +    CPUS: 1
> +    BUILD_JOBS: 1
> +
> +  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~
'.*\nci-os-only:[^\n]*(docs|html).*'
> +  skip: "!changesInclude('.cirrus.yml', 'doc/**')"

Perhaps we should introduce something other than ci-os-only if we want that to
include things like "docs and html".  At least this should update
src/tools/ci/README.


> +  sysinfo_script: |
> +    id
> +    uname -a
> +    cat /proc/cmdline
> +    ulimit -a -H && ulimit -a -S
> +    export

I think we can skip this here.


> +  # Exercise HTML and other docs:
> +  ninja_docs_build_script: |
> +    mkdir build.ninja
> +    cd build.ninja

Perhaps build-ninja instead? build.ninja is the filename for ninja's build
instructions, so it seems a bit confusing.


> From adebe93a4409990e929f2775d45c6613134a4243 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj@telsasoft.com>
> Date: Tue, 26 Jul 2022 20:30:02 -0500
> Subject: [PATCH 6/8] cirrus/ccache: add explicit cache keys..
>
> Since otherwise, building with ci-os-only will probably fail to use the
> normal cache, since the cache key is computed using both the task name
> and its *index* in the list of caches (internal/executor/cache.go:184).

Seems like this would potentially better addressed by reporting a bug to the
cirrus folks?


>    ccache_cache:
>      folder: ${CCACHE_DIR}
> +    fingerprint_key: ccache/linux
> +    reupload_on_changes: true

There's enough copies of this to make it worth deduplicating. If we use
something like
   fingerprint_script: echo ccache/$CIRRUS_BRANCH/$CIRRUS_OS
we can use a yaml ref?


I think you experimented with creating a 'base' ccache dir (e.g. on the master
branch) and then using branch specific secondar caches? How did that turn out?
I think cfbot's caches constantly get removed due to overrunning the global
space.


> From f16739bc5d2087847129baf663aa25fa9edb8449 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj@telsasoft.com>
> Date: Sun, 3 Apr 2022 00:10:20 -0500
> Subject: [PATCH 7/8] cirrus/ccache: disable compression and show stats

> Since v4.0, ccache enables zstd compression by default, saving roughly
> 2x-3x.  But, cirrus caches are compressed as tar.gz, so we could disable
> ccache compression, allowing cirrus to gzip the uncompressed data
> (better than ccache's default of zstd-1).
>
> With default compression enabled (https://cirrus-ci.com/task/6692342840164352):
> linux/debian/bullseye has 4.2; 99MB cirrus cache; cache_size_kibibyte    109616
> macos has 4.5.1: 47MB cirrus cache; cache_size_kibibyte    52500
> freebsd has 3.7.12: 42MB cirrus cache; cache_size_kibibyte    134064
> XXX windows has 4.7.2; 180MB cirrus cache; cache_size_kibibyte    51179
> todo: compiler warnings
>
> With compression disabled (https://cirrus-ci.com/task/4614182514458624):
> linux: 91MB cirrus cache; cache_size_kibibyte    316136
> macos: 41MB cirrus cache; cache_size_kibibyte    118068
> windows: 42MB cirrus cache; cache_size_kibibyte    134064
> freebsd is the same

I'm still somewhat doubtful this is a good idea. The mingw cache is huge, for
example, and all that additional IO and memory usage is bound to show up.


> The stats should be shown and/or logged.
> ccache --show-stats shows the *cumulative* stats (including prior
> compilations)
> ccache --zero-stats clears out not only the global stats, but the
> per-file cache stats (from which the global stats are derived) - which
> obviously makes the cache work poorly.
>
> Note that ccache 4.4 supports CCACHE_STATSLOG, which seems ideal.
> The log should be written *outside* the ccache folder - it shouldn't be
> preserved across cirrusci task invocations.

I assume we don't have a new enough ccache everywhere yet?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Direct I/O
Next
From: Andres Freund
Date:
Subject: Re: Add tracking of backend memory allocated to pg_stat_activity