Thread: Windows CFBot is broken because ecpg dec_test.c error
Since about ~11 hours ago the ecpg test is consistently failing on Window with this error[1]: > Could not open file C:/cirrus/build/src/interfaces/ecpg/test/compat_informix/dec_test.c for reading I took a quick look at possible causes but couldn't find a clear winner. My current guess is that there's some dependency rule missing in the meson file and due to some infra changes files now get compiled in the wrong order. One recent suspicious commit seems to be: 7819a25cd101b574f5422edb00fe3376fbb646da But there are a bunch of successful changes that include that commit, so it seems to be a red herring. (CC-ed Noah anyway) [1]: https://cirrus-ci.com/task/6305422665056256?logs=check_world#L143
Hi, On January 28, 2025 7:13:16 AM EST, Jelte Fennema-Nio <postgres@jeltef.nl> wrote: >Since about ~11 hours ago the ecpg test is consistently failing on >Window with this error[1]: > >> Could not open file C:/cirrus/build/src/interfaces/ecpg/test/compat_informix/dec_test.c for reading > >I took a quick look at possible causes but couldn't find a clear >winner. My current guess is that there's some dependency rule missing >in the meson file and due to some infra changes files now get compiled >in the wrong order. > >One recent suspicious commit seems to be: >7819a25cd101b574f5422edb00fe3376fbb646da >But there are a bunch of successful changes that include that commit, >so it seems to be a red herring. (CC-ed Noah anyway) I think it's due to a new version of meson. Seems we under specified test dependencies. I'll write up a patch. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Hi, On Tue, 28 Jan 2025 at 17:02, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On January 28, 2025 7:13:16 AM EST, Jelte Fennema-Nio <postgres@jeltef.nl> wrote: > >Since about ~11 hours ago the ecpg test is consistently failing on > >Window with this error[1]: > > > >> Could not open file C:/cirrus/build/src/interfaces/ecpg/test/compat_informix/dec_test.c for reading > > > >I took a quick look at possible causes but couldn't find a clear > >winner. My current guess is that there's some dependency rule missing > >in the meson file and due to some infra changes files now get compiled > >in the wrong order. > > > >One recent suspicious commit seems to be: > >7819a25cd101b574f5422edb00fe3376fbb646da > >But there are a bunch of successful changes that include that commit, > >so it seems to be a red herring. (CC-ed Noah anyway) > > I think it's due to a new version of meson. Seems we under specified test dependencies. I'll write up a patch. The cause is that meson fixed a bug [1] in v.1.7.0. Before meson v1.7.0; although --no-rebuild is used while running tests, meson was building all targets. This is fixed with v.1.7.0. The change below fixes the problem: diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml index 18e944ca89d..c7a94ff6471 100644 --- a/.cirrus.tasks.yml +++ b/.cirrus.tasks.yml @@ -17,7 +17,7 @@ env: CHECK: check-world PROVE_FLAGS=$PROVE_FLAGS CHECKFLAGS: -Otarget PROVE_FLAGS: --timer - MTEST_ARGS: --print-errorlogs --no-rebuild -C build + MTEST_ARGS: --print-errorlogs -C build PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance And I think this is the correct approach. It builds all of the not-yet-built targets before running the tests. Another solution might be manually building ecpg target before running tests but I think the former approach is more suitable for the CI. CI run after this change applied: https://cirrus-ci.com/build/6264369203380224 [1] https://mesonbuild.com/Release-notes-for-1-7-0.html#test-targets-no-longer-built-by-default -- Regards, Nazir Bilal Yavuz Microsoft
Hi, On 2025-01-29 18:24:45 +0300, Nazir Bilal Yavuz wrote: > On Tue, 28 Jan 2025 at 17:02, Andres Freund <andres@anarazel.de> wrote: > > > > Hi, > > > > On January 28, 2025 7:13:16 AM EST, Jelte Fennema-Nio <postgres@jeltef.nl> wrote: > > >Since about ~11 hours ago the ecpg test is consistently failing on > > >Window with this error[1]: > > > > > >> Could not open file C:/cirrus/build/src/interfaces/ecpg/test/compat_informix/dec_test.c for reading > > > > > >I took a quick look at possible causes but couldn't find a clear > > >winner. My current guess is that there's some dependency rule missing > > >in the meson file and due to some infra changes files now get compiled > > >in the wrong order. > > > > > >One recent suspicious commit seems to be: > > >7819a25cd101b574f5422edb00fe3376fbb646da > > >But there are a bunch of successful changes that include that commit, > > >so it seems to be a red herring. (CC-ed Noah anyway) > > > > I think it's due to a new version of meson. Seems we under specified test dependencies. I'll write up a patch. Sorry, got distracted with somewhat pressing matters. > The cause is that meson fixed a bug [1] in v.1.7.0. Before meson > v1.7.0; although --no-rebuild is used while running tests, meson was > building all targets. This is fixed with v.1.7.0. > > [1] https://mesonbuild.com/Release-notes-for-1-7-0.html#test-targets-no-longer-built-by-default That's not *quite* right - it wasn't that the targets were built when --no-rebuild was specified, it's that the default build target (for just 'ninja'), built all test dependencies. > The change below fixes the problem: > > diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml > index 18e944ca89d..c7a94ff6471 100644 > --- a/.cirrus.tasks.yml > +++ b/.cirrus.tasks.yml > @@ -17,7 +17,7 @@ env: > CHECK: check-world PROVE_FLAGS=$PROVE_FLAGS > CHECKFLAGS: -Otarget > PROVE_FLAGS: --timer > - MTEST_ARGS: --print-errorlogs --no-rebuild -C build > + MTEST_ARGS: --print-errorlogs -C build > PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests > TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf > PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance > > And I think this is the correct approach. It builds all of the > not-yet-built targets before running the tests. Another solution might > be manually building ecpg target before running tests but I think the > former approach is more suitable for the CI. > > CI run after this change applied: https://cirrus-ci.com/build/6264369203380224 I don't think that's the entirety of the issue. Our dependencies aren't quite airtight enough. With a sufficiently modern meson, try doing e.g. rm -rf tmp_install/ && ninja clean && meson test --suite setup --suite ecpg It'll fail, because the dependencies of the tests are insufficient. See the set of patches at https://www.postgresql.org/message-id/qh4c5tvkgjef7jikjig56rclbcdrrotngnwpycukd2n3k25zi2%4044hxxvtwmgum I think the only reason your patch on its own suffices, is that the "all" target, that we ran separately beforehand, actually has sufficient dependencies to make things work. The nice thing is that with this meson improvement we should be able to get rid of the "setup" test suite and instead generate the test install via dependencies. Obviously we either have to wait a fair bit or do it depending on the meson version... Greetings, Andres Freund
Hi, On Wed, 29 Jan 2025 at 19:50, Andres Freund <andres@anarazel.de> wrote: > > On 2025-01-29 18:24:45 +0300, Nazir Bilal Yavuz wrote: > > > The cause is that meson fixed a bug [1] in v.1.7.0. Before meson > > v1.7.0; although --no-rebuild is used while running tests, meson was > > building all targets. This is fixed with v.1.7.0. > > > > [1] https://mesonbuild.com/Release-notes-for-1-7-0.html#test-targets-no-longer-built-by-default > > That's not *quite* right - it wasn't that the targets were built when > --no-rebuild was specified, it's that the default build target (for just > 'ninja'), built all test dependencies. > > > > The change below fixes the problem: > > > > diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml > > index 18e944ca89d..c7a94ff6471 100644 > > --- a/.cirrus.tasks.yml > > +++ b/.cirrus.tasks.yml > > @@ -17,7 +17,7 @@ env: > > CHECK: check-world PROVE_FLAGS=$PROVE_FLAGS > > CHECKFLAGS: -Otarget > > PROVE_FLAGS: --timer > > - MTEST_ARGS: --print-errorlogs --no-rebuild -C build > > + MTEST_ARGS: --print-errorlogs -C build > > PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests > > TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf > > PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance > > > > And I think this is the correct approach. It builds all of the > > not-yet-built targets before running the tests. Another solution might > > be manually building ecpg target before running tests but I think the > > former approach is more suitable for the CI. > > > > CI run after this change applied: https://cirrus-ci.com/build/6264369203380224 > > I don't think that's the entirety of the issue. > > Our dependencies aren't quite airtight enough. With a sufficiently modern > meson, try doing e.g. > > rm -rf tmp_install/ && ninja clean && meson test --suite setup --suite ecpg > > It'll fail, because the dependencies of the tests are insufficient. > > See the set of patches at > https://www.postgresql.org/message-id/qh4c5tvkgjef7jikjig56rclbcdrrotngnwpycukd2n3k25zi2%4044hxxvtwmgum > > > I think the only reason your patch on its own suffices, is that the "all" > target, that we ran separately beforehand, actually has sufficient > dependencies to make things work. Yes, you are right. I agree that what you said is the correct solution and that should be the ultimate goal. What I shared could be a band-aid fix to make the Windows CI task happy until the patches you shared get committed. Another solution might be to downgrade the meson version in the Windows images at the CI repository [1], that would be better for the commit history. [1] https://github.com/anarazel/pg-vm-images -- Regards, Nazir Bilal Yavuz Microsoft