Re: pgbench logging broken by time logic changes - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pgbench logging broken by time logic changes
Date
Msg-id YNPsPAUoVDCpPOGk@paquier.xyz
Whole thread Raw
In response to Re: pgbench logging broken by time logic changes  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: pgbench logging broken by time logic changes
List pgsql-hackers
On Wed, Jun 23, 2021 at 10:01:28PM +0200, Fabien COELHO wrote:
> Thanks for the hint! Why not, having the ability to collect data is a good
> thing, so attached v10 does that. If something go wrong, the TODO section
> could be extended around all calls.

+check_pgbench_logs($bdir, '001_pgbench_log_1', $nthreads, 1, 3,
+   qr{^\d{10,} \d{1,2} \d+ \d+ \d+ \d+ \d+ \d+ \d+ \d+ \d+$});
FWIW, I am still seeing problems with the regex pattern you are using
here, because this fails to detect the number of fields we should have
generated here, for one.  If you are not convinced, just run your new
test and extend or reduce the amount of data generated by logAgg() in
your patch: the tests will still pass.

So I have investigated this stuff in details.  The regular expressions
are correctly shaped, but the use of grep() in check_pgbench_logs()
seems to be incorrect.

For example, let's take an aggregate report generated by your new
test:
"1624498086 13 27632 60597490 1683 2853 3227 883179 120 386 123"
Here are some extra ones, shorter and longer:
"1624498086 13 27632 60597490 1683 2853 3227 8831";
"1624498086 13 27632 60597490 1683 2853 3227 883179 120 386 123 123";

Using grep() with "$re" results in all the fields matching.  Using on
the contrary "/$re/" in grep(), like list_files(), would only match
the first one, which is correct.  Please see attached a small script
to show my point, called perl_grep.pl.

With this issue fixed, I have bumped into what looks like a different
bug in the tests.  001_pgbench_log_2 uses pgbench with 2 clients, but
expects only patterns in the logs where the first column value uses
only 0.  With two clients, those first values can be either 0 or 1 due
to the client ID set.

It seems to me that we had better fix this issue and back-patch where
this has been introduced so as we have exact match checks with the log
formarts, no?  Please see the attached.

Thoughts?
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Deadlock risk while inserting directly into partition?
Next
From: Amit Kapila
Date:
Subject: Re: [bug?] Missed parallel safety checks, and wrong parallel safety