pgsql: Fix pattern matching logic for logs in TAP tests of pgbench
Fix pattern matching logic for logs in TAP tests of pgbench
The logic checking for the format of per-thread logs used grep() with
directly "$re", which would cause the test to consider all the logs as
a match without caring about their format at all. Using "/$re/" makes
grep() perform a regex test, which is what we want here.
While on it, improve some of the tests to be more picky with the
patterns expected and add more comments to describe the tests.
Issue discovered while digging into a separate patch.
Author: Fabien Coelho, Michael Paquier
Discussion: /messages/by-id/YNPsPAUoVDCpPOGk@paquier.xyz
Backpatch-through: 11
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/c13585fe9e55813cf9feac67fe7b65d3a78fff92
Modified Files
--------------
src/bin/pgbench/t/001_pgbench_with_server.pl | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
On Thu, Jun 24, 2021 at 09:54:51PM +0000, Michael Paquier wrote:
Fix pattern matching logic for logs in TAP tests of pgbench
The logic checking for the format of per-thread logs used grep() with
directly "$re", which would cause the test to consider all the logs as
a match without caring about their format at all. Using "/$re/" makes
grep() perform a regex test, which is what we want here.While on it, improve some of the tests to be more picky with the
patterns expected and add more comments to describe the tests.Issue discovered while digging into a separate patch.
fairywren does not like that:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-06-24%2022%3A36%3A14
not ok 302 - transaction format for 001_pgbench_log_2
not ok 311 - transaction format for 001_pgbench_log_3
I am not sure if this is a pre-existing bug in pgbench itself
regarding the way we generate the logs or an issue with fairywren's
perl installation.
Andrew, what's the format of the per-thread logs generated on this
host when running the commands? I'd bet that the checks for the
client IDs are right, but that some parts of the regex are too picky
when it comes to this host.
--
Michael
(Forgot to add Andrew in CC, now done)
On Fri, Jun 25, 2021 at 08:51:19AM +0900, Michael Paquier wrote:
Andrew, what's the format of the per-thread logs generated on this
host when running the commands? I'd bet that the checks for the
client IDs are right, but that some parts of the regex are too picky
when it comes to this host.
I have tested that on my own Windows host using MSVC and Active Perl,
and those tests pass. I am not sure what's happening here. A simple
way to figure out what's going on would be to make the tests more
talkative and show up the logs that don't match.
--
Michael
On 6/24/21 8:26 PM, Michael Paquier wrote:
(Forgot to add Andrew in CC, now done)
On Fri, Jun 25, 2021 at 08:51:19AM +0900, Michael Paquier wrote:
Andrew, what's the format of the per-thread logs generated on this
host when running the commands? I'd bet that the checks for the
client IDs are right, but that some parts of the regex are too picky
when it comes to this host.I have tested that on my own Windows host using MSVC and Active Perl,
and those tests pass. I am not sure what's happening here. A simple
way to figure out what's going on would be to make the tests more
talkative and show up the logs that don't match.
That's not really an equivalent test. I'm taking a look
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Thu, Jun 24, 2021 at 09:36:53PM -0400, Andrew Dunstan wrote:
That's not really an equivalent test. I'm taking a look
Thanks!
--
Michael
On 6/24/21 9:53 PM, Michael Paquier wrote:
On Thu, Jun 24, 2021 at 09:36:53PM -0400, Andrew Dunstan wrote:
That's not really an equivalent test. I'm taking a look
Thanks!
There's a whole lot wrong with this code. To start with, why is that
unchecked eval there. And why is it reading in log files on its own
instead of using TestLib::slurp_file, which, among other things,
normalizes line endings? There's a very good chance that this latter is
the issue. It only affects msys which is why you didn't see an issue on
MSVC. And also, why does it carefully unlink the log files so that any
trace of what's gone wrong is deleted?
Based on the little I've seen this file needs a serious code review.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 6/24/21 10:12 PM, Andrew Dunstan wrote:
On 6/24/21 9:53 PM, Michael Paquier wrote:
On Thu, Jun 24, 2021 at 09:36:53PM -0400, Andrew Dunstan wrote:
That's not really an equivalent test. I'm taking a look
Thanks!
There's a whole lot wrong with this code. To start with, why is that
unchecked eval there. And why is it reading in log files on its own
instead of using TestLib::slurp_file, which, among other things,
normalizes line endings? There's a very good chance that this latter is
the issue. It only affects msys which is why you didn't see an issue on
MSVC. And also, why does it carefully unlink the log files so that any
trace of what's gone wrong is deleted?Based on the little I've seen this file needs a serious code review.
... and there's the error:
check_pgbench_logs($bdir, '001_pgbench_log_2', 1, 8, 92,
��� qr{^[01] \d{1,2} \d+ \d \d+ \d+$});
and one further down the same.
Since the file isn't read in using slurp_file, that $ won't match
because the lines will end \r\n instead of \n.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Thu, Jun 24, 2021 at 10:26:18PM -0400, Andrew Dunstan wrote:
Since the file isn't read in using slurp_file, that $ won't match
because the lines will end \r\n instead of \n.
I did not remember this one with Msys, thanks. I am not sure that
there is any need for an eval block here actually once you remove
open()? What do you think about something like the attached?
--
Michael
Attachments:
pgbench-tap-fix.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index ff5b31d4df..a136e18b0e 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1193,28 +1193,29 @@ sub check_pgbench_logs
my $log_number = 0;
for my $log (sort @logs)
{
- eval {
- open my $fh, '<', $log or die "$@";
- my @contents = <$fh>;
- my $clen = @contents;
- ok( $min <= $clen && $clen <= $max,
- "transaction count for $log ($clen)");
- my $clen_match = grep(/$re/, @contents);
- ok($clen_match == $clen, "transaction format for $prefix");
- # Show more information if some logs don't match
- # to help with debugging.
- if ($clen_match != $clen)
+ # Check the contents of each log file.
+ my $contents_raw = slurp_file($log);
+
+ # On Msys, filter out any CRLF.
+ $contents_raw =~ s/\r\n/\n/g if $Config{osname} eq 'msys';
+ my @contents = split("\n", $contents_raw);
+ my $clen = @contents;
+ ok( $min <= $clen && $clen <= $max,
+ "transaction count for $log ($clen)");
+ my $clen_match = grep(/$re/, @contents);
+ ok($clen_match == $clen, "transaction format for $prefix");
+
+ # Show more information if some logs don't match
+ # to help with debugging.
+ if ($clen_match != $clen)
+ {
+ foreach my $log (@contents)
{
- foreach my $log (@contents)
- {
- print "# Log entry not matching: $log\n"
- unless $log =~ /$re/;
- }
+ print "# Log entry not matching: $log\n"
+ unless $log =~ /$re/;
}
- close $fh or die "$@";
- };
+ }
}
- ok(unlink(@logs), "remove log files");
return;
}
On 6/24/21 11:08 PM, Michael Paquier wrote:
On Thu, Jun 24, 2021 at 10:26:18PM -0400, Andrew Dunstan wrote:
Since the file isn't read in using slurp_file, that $ won't match
because the lines will end \r\n instead of \n.I did not remember this one with Msys, thanks. I am not sure that
there is any need for an eval block here actually once you remove
open()? What do you think about something like the attached?
+��� ��� # On Msys, filter out any CRLF.
+��� ��� $contents_raw =~ s/\r\n/\n/g if $Config{osname} eq 'msys';
This is completely redundant. The whole point is that slurp_file does
exactly this for you.
+��� ��� my @contents = split("\n", $contents_raw);
Probably more idiomatic to write split(/\n/,$contents_raw), or
split(/^/, $contents_raw) if you want to keep the line feeds.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Fri, Jun 25, 2021 at 06:12:09AM -0400, Andrew Dunstan wrote:
+ # On Msys, filter out any CRLF. + $contents_raw =~ s/\r\n/\n/g if $Config{osname} eq 'msys';This is completely redundant. The whole point is that slurp_file does
exactly this for you.
Thanks. I have managed to duplicate that.
+ my @contents = split("\n", $contents_raw);
Probably more idiomatic to write split(/\n/,$contents_raw), or
split(/^/, $contents_raw) if you want to keep the line feeds.
I have gone with the solution that removes the newlines. This does
not change the pattern checks, and that makes printing entries not
matching a bit cleaner.
Thanks a lot for the investigation!
--
Michael
Hello Andrew & Michaᅵl,
My 0.02ᅵ:
There's a whole lot wrong with this code. To start with, why is that
unchecked eval there.
Yep. The idea was that other tests would go on being collected eg if the
file is not found, but it should have been checked anyway.
And why is it reading in log files on its own instead of using
TestLib::slurp_file, which, among other things, normalizes line endings?
Indeed.
However, if slurp_file fails it raises an exception and aborts the whole
TAP unexpectedly, which is pretty unclean. So I'd suggest to keep the
eval, as attached. I tested it by changing the file name so that the slurp
fails.
There's a very good chance that this latter is the issue. It only
affects msys which is why you didn't see an issue on MSVC. And also, why
does it carefully unlink the log files so that any trace of what's gone
wrong is deleted?
Based on the little I've seen this file needs a serious code review.
Probably: My very old perl expertise is fading away because I'm not using
it much these days. Cannot say I miss it:-)
--
Fabien.
Attachments:
pgbench-tap-eval-1.patchtext/x-diff; NAME=pgbench-tap-eval-1.patchDownload
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 781cc08fb1..6a82a3494d 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1190,29 +1190,32 @@ sub check_pgbench_logs
ok(@logs == $nb, "number of log files");
ok(grep(/\/$prefix\.\d+(\.\d+)?$/, @logs) == $nb, "file name format");
- my $log_number = 0;
for my $log (sort @logs)
{
- # Check the contents of each log file.
- my $contents_raw = slurp_file($log);
+ eval {
+ # Check the contents of each log file.
+ my $contents_raw = slurp_file($log);
- my @contents = split(/\n/, $contents_raw);
- my $clen = @contents;
- ok( $min <= $clen && $clen <= $max,
- "transaction count for $log ($clen)");
- my $clen_match = grep(/$re/, @contents);
- ok($clen_match == $clen, "transaction format for $prefix");
+ my @contents = split(/\n/, $contents_raw);
+ my $clen = @contents;
+ ok( $min <= $clen && $clen <= $max,
+ "transaction count for $log ($clen)");
+ my $clen_match = grep(/$re/, @contents);
+ ok($clen_match == $clen, "transaction format for $prefix");
- # Show more information if some logs don't match
- # to help with debugging.
- if ($clen_match != $clen)
- {
- foreach my $log (@contents)
+ # Show more information if some logs don't match
+ # to help with debugging.
+ if ($clen_match != $clen)
{
- print "# Log entry not matching: $log\n"
- unless $log =~ /$re/;
+ foreach my $log (@contents)
+ {
+ print "# Log entry not matching: $log\n"
+ unless $log =~ /$re/;
+ }
}
- }
+ };
+ # Tell if an exception was raised
+ ok(0, "Error while checking log file \"$log\": $@") if $@;
}
return;
}
On 6/26/21 2:47 AM, Fabien COELHO wrote:
Hello Andrew & Michaᅵl,
My 0.02ᅵ:
There's a whole lot wrong with this code. To start with, why is that
unchecked eval there.Yep. The idea was that other tests would go on being collected eg if
the file is not found, but it should have been checked anyway.And why is it reading in log files on its own instead of using
TestLib::slurp_file, which, among other things, normalizes line endings?Indeed.
However, if slurp_file fails it raises an exception and aborts the
whole TAP unexpectedly, which is pretty unclean. So I'd suggest to
keep the eval, as attached. I tested it by changing the file name so
that the slurp fails.
Seem quite unnecessary. We haven't found that to be an issue elsewhere
in the code where slurp_file is used. And in the present case we know
the file exists because we got its name from list_files().
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
However, if slurp_file fails it raises an exception and aborts the
whole TAP unexpectedly, which is pretty unclean. So I'd suggest to
keep the eval, as attached. I tested it by changing the file name so
that the slurp fails.Seem quite unnecessary. We haven't found that to be an issue elsewhere
in the code where slurp_file is used. And in the present case we know
the file exists because we got its name from list_files().
Fine with me!
--
Fabien.
On Sat, Jun 26, 2021 at 11:01:07AM -0400, Andrew Dunstan wrote:
On 6/26/21 2:47 AM, Fabien COELHO wrote:
However, if slurp_file fails it raises an exception and aborts the
whole TAP unexpectedly, which is pretty unclean. So I'd suggest to
keep the eval, as attached. I tested it by changing the file name so
that the slurp fails.Seem quite unnecessary. We haven't found that to be an issue elsewhere
in the code where slurp_file is used. And in the present case we know
the file exists because we got its name from list_files().
Agreed. That's an exchange between a hard failure mid-test and a
failure while letting the whole test run. Here, we expect the test to
find the log file all the time, so a hard failure does not sound like
a bad thing to me either.
--
Michael
Seem quite unnecessary. We haven't found that to be an issue elsewhere
in the code where slurp_file is used. And in the present case we know
the file exists because we got its name from list_files().Agreed. That's an exchange between a hard failure mid-test and a
failure while letting the whole test run. Here, we expect the test to
find the log file all the time, so a hard failure does not sound like
a bad thing to me either.
Ok, fine with me!
--
Fabien.