Hi,
On Thu, 17 Jul 2025 at 03:08, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jul 16, 2025 at 02:32:53PM +0300, Nazir Bilal Yavuz wrote:
> > On Wed, 16 Jul 2025 at 04:39, Michael Paquier <michael@paquier.xyz> wrote:
> >> Hmm. Is that actually useful as we know that the standby has been
> >> stalen down when running the test? Even if we report something, we
> >> could always trim the output, like the standby case.
>
> My fingers and brain have flipped here. I meant likely
> s/stalen/taken/.
>
> > I guess you are right. Also, I tried killing standby while the
> > 027_stream_regress test was running and the test did not fail; instead
> > it continued until timeout. If that is always the case, then it makes
> > sense to report if and only if both primary and standby are alive.
>
> Okay. Understood. I'm pretty sure that somebody around you has also
> run the test and crashed the standby on replay, to note that the
> pattern on HEAD was bad.
Updated, now error reporting happens only if regression tests are
failed and both primary and standby are alive.
> > system_log() logs which command is run. I actually do not want to log
> > the command because it needs to be run after the regression tests and
> > before the 'regression tests pass'. If I log the command it looks like
> > this:
> >
> > # All 228 tests passed.
> > # Running: pg_isready -h /tmp/EXCcSldGZE -p 10493 >/dev/null 2>&1
> > # Running: pg_isready -h /tmp/EXCcSldGZE -p 10494 >/dev/null 2>&1
> > (21.939s) ok 2 - regression tests pass
> > (0.000s) ok 3 - primary is alive after the regression tests
> > (0.000s) ok 4 - standby is alive after the regression tests
> >
> > I think it looks like these pg_isready commands run randomly.
>
> FWIW, I'm usually in favor of a bit more logging, to understand what
> happens in the test script under-the-hood. That's less guessing when
> calling these routines. If I'm outvoted, that's fine.
I don't have a strong opinion on this. My point was 'pg_isready' being
on top of 'regression test pass' and not being on top of
'primary|standby is alive after the regression tests' gives a false
impression to me. However, if you say this is okay; then I can update
it.
> >> We could use that as a routine of its own in Utils.pm? It's not
> >> really something only for diff files, more something that we can use
> >> to trim the contents of a file, which could be the tail of a server
> >> log file as well.. It's always difficult to measure what a "good"
> >> number of lines is, but perhaps we could use an environment variable
> >> to make that configurable rather than enforce a policy of 50 lines
> >> because we think it's good enough?
> >
> > I think this is a good idea. How about a function called file_trim()
> > with three arguments: the file name, a mode ('head' or 'tail'), and
> > the number of lines to trim from that end. This approach may require
> > reading the file more than once, but it is easy to use.
>
> Sounds fine to me. Thanks!
I added that as 0001. I used a shifting method for the 'tail'
direction to not use too much memory. I found that there is
'File::ReadBackwards' in Perl but you need to install it, so I didn't
use it.
Now patch is divided into three:
0001: 'Add trim_file() helper to Utils.pm' -> Which effectively does
nothing, just adds a function to be used for a subsequent patch. This
function accepts 'line_count' as an argument but
'PG_TEST_FILE_TRIM_LINES' environment variable overrides it. Should I
document 'PG_TEST_FILE_TRIM_LINES' somewhere?
0002: 'Improve error reporting in 027_stream_regress test' -> Uses
trim_file() function to improve error reporting by including the head
and tail of regression.diffs directly in the failure message.
0003: 'Check primary and standby are alive after regression tests in
027_stream_regress' -> Add test for checking status of primary and
standby after the regression tests in 027_stream_regress. Also, it
makes error reporting happen only if regression tests are failed and
both primary and standby are alive.
--
Regards,
Nazir Bilal Yavuz
Microsoft