Thread: test/isolation/expected/stats_1.out broken for me
Running installcheck-world on an unrelated patch, I noticed a failure here in test/isolation/expected/stats_1.out (this is line 3102): step s1_slru_check_stats: SELECT current.blks_zeroed > before.value FROM test_slru_stats before INNER JOIN pg_stat_slru current ON before.slru = current.name WHERE before.stat = 'blks_zeroed'; ?column? -------- t (1 row) This is built from bab588c. On my amd64/linux box the result is f. The same mismatch is present if I build from 6392f2a (i.e., just before a2f433f pgstat: add alternate output for stats.spec), along with a bunch of others. So a2f433f seems to have silenced all the rest of those, but not this one. If I build from ad40166, installcheck-world passes. That's as far as I have pursued it. Regards, -Chap
chap@anastigmatix.net writes: > Running installcheck-world on an unrelated patch, I noticed a failure > here in test/isolation/expected/stats_1.out (this is line 3102): So what non-default build options are you using? The only isolationcheck failure remaining in the buildfarm is prion's, which I can reproduce here by building with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE as it does. Looking at the nature of the diffs, this is not too surprising; the expected output appears to rely on a cache flush not happening quickly in s2. regards, tom lane
Hi, On 2022-04-07 12:49:07 -0400, Tom Lane wrote: > chap@anastigmatix.net writes: > > Running installcheck-world on an unrelated patch, I noticed a failure > > here in test/isolation/expected/stats_1.out (this is line 3102): > > So what non-default build options are you using? > > The only isolationcheck failure remaining in the buildfarm is > prion's, which I can reproduce here by building with > -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE as it does. > Looking at the nature of the diffs, this is not too surprising; > the expected output appears to rely on a cache flush not happening > quickly in s2. Yea :(. I tested debug_discard_caches, but not -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE. Not quite sure what to do about it - it's intentionally trying to test the case of no invalidations being processed, as that's an annoying edge case with functions. Perhaps wrapping the function call of the "already dropped" function in another function that catches the error would do the trick? It'd be more easily silently broken, but still be better than not having the test. Greetings, Andres Freund
On 2022-04-07 12:49, Tom Lane wrote: > So what non-default build options are you using? The command that I've just been reusing from my bash_history without thinking about it for some years is: configure --enable-cassert --enable-tap-tests \ --with-libxml --enable-debug \ CFLAGS='-ggdb -Og -g3 -fno-omit-frame-pointer' Regards, -Chap
Hi, On 2022-04-07 13:16:53 -0400, chap@anastigmatix.net wrote: > The command that I've just been reusing from my bash_history without > thinking about it for some years is: > > configure --enable-cassert --enable-tap-tests \ > --with-libxml --enable-debug \ > CFLAGS='-ggdb -Og -g3 -fno-omit-frame-pointer' Hm, that's similar to what I use without seeing the problem. IIUC you ran installcheck - did you set any non-default config options in the postgres instance that runs against? Is anything else running on that instance? Do you use any -j setting when running installcheck-world? Is the failure reproducible, or a one-off? Greetings, Andres Freund
Hi, On 2022-04-07 10:29:10 -0700, Andres Freund wrote: > On 2022-04-07 13:16:53 -0400, chap@anastigmatix.net wrote: > > The command that I've just been reusing from my bash_history without > > thinking about it for some years is: > > > > configure --enable-cassert --enable-tap-tests \ > > --with-libxml --enable-debug \ > > CFLAGS='-ggdb -Og -g3 -fno-omit-frame-pointer' > > Hm, that's similar to what I use without seeing the problem. > > IIUC you ran installcheck - did you set any non-default config options in the > postgres instance that runs against? Is anything else running on that > instance? Do you use any -j setting when running installcheck-world? > > Is the failure reproducible, or a one-off? I've now reproduced this, albeit not reliably yet. Looking. Greetings, Andres Freund
Hi, On 2022-04-07 11:02:41 -0700, Andres Freund wrote: > I've now reproduced this, albeit not reliably yet. Looking. Caused by me misremembering when deduplication happens - somehow recalled that deduplication didn't happen when payloads. So the statement that was supposed to guarantee needing more than one page: SELECT pg_notify('stats_test_use', repeat('0', current_setting('block_size')::int / 2)) FROM generate_series(1, 3); didn't actually guarantee that. It just failed to fail by chance. When regression tests and isolation test run in sequence against the same freshly started cluster, the offsets when starting out are just right to not need another page in the first test. I'll change it to use distinct payloads.. Greetings, Andres Freund
Hi, On 2022-04-07 11:54:08 -0700, Andres Freund wrote: > I'll change it to use distinct payloads.. And done. Chap, could you confirm this fixes the issue for you? Greetings, Andres Freund
On 2022-04-07 15:04, Andres Freund wrote: > And done. Chap, could you confirm this fixes the issue for you? Looks good from here. One installcheck-world with no failures; previously, it failed for me every time. Regards, -Chap
Hi, On 2022-04-07 09:57:09 -0700, Andres Freund wrote: > Yea :(. I tested debug_discard_caches, but not -DRELCACHE_FORCE_RELEASE > -DCATCACHE_FORCE_RELEASE. > > Not quite sure what to do about it - it's intentionally trying to test the > case of no invalidations being processed, as that's an annoying edge case with > functions. Perhaps wrapping the function call of the "already dropped" > function in another function that catches the error would do the trick? It'd > be more easily silently broken, but still be better than not having the test. Anybody got a better idea? - Andres
Andres Freund <andres@anarazel.de> writes: > On 2022-04-07 09:57:09 -0700, Andres Freund wrote: >> Yea :(. I tested debug_discard_caches, but not -DRELCACHE_FORCE_RELEASE >> -DCATCACHE_FORCE_RELEASE. >> >> Not quite sure what to do about it - it's intentionally trying to test the >> case of no invalidations being processed, as that's an annoying edge case with >> functions. Perhaps wrapping the function call of the "already dropped" >> function in another function that catches the error would do the trick? It'd >> be more easily silently broken, but still be better than not having the test. > Anybody got a better idea? Maybe if the wrapper function checks for exactly the two expected behaviors, it'd be robust enough? regards, tom lane
Hi, On 2022-04-07 18:31:35 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-04-07 09:57:09 -0700, Andres Freund wrote: > >> Yea :(. I tested debug_discard_caches, but not -DRELCACHE_FORCE_RELEASE > >> -DCATCACHE_FORCE_RELEASE. > >> > >> Not quite sure what to do about it - it's intentionally trying to test the > >> case of no invalidations being processed, as that's an annoying edge case with > >> functions. Perhaps wrapping the function call of the "already dropped" > >> function in another function that catches the error would do the trick? It'd > >> be more easily silently broken, but still be better than not having the test. > > > Anybody got a better idea? > > Maybe if the wrapper function checks for exactly the two expected > behaviors, it'd be robust enough? Seems to work. If I break the code it's trying to test, it still fails... Of course only when none of debug_discard_caches, RELCACHE_FORCE_RELEASE, CATCACHE_FORCE_RELEASE are used, but that seems unavoidable / harmless. Let's see what the buildfarm thinks. Greetings, Andres Freund