[PATCH] pgbench tap tests fail if the path contains a perl special character
Hi,
As mentioned in the subject, if the $PATH where you are building postgres
contains a perl's special character [1], for example a `+`, pgbench's tap
tests will fail.
The outputs looks something like this (full at [2]):
```
# Failed test 'file name format'
# at t/001_pgbench_with_server.pl line 805.
# Failed test 'file name format'
# at t/001_pgbench_with_server.pl line 805.
# Looks like you failed 2 tests of 312.
t/001_pgbench_with_server.pl ..
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/312 subtests
t/002_pgbench_no_server.pl .... ok
```
This error affects both PG11 and master, and a way to reproduce it is by
renaming the source folder from `postgresql` to `postgresql+XXX`.
I'm attaching a patch to fix this by using quotemeta [3]. It's generated
from master and also applies to REL_11_STABLE.
1 - http://jkorpela.fi/perl/regexp.html
2 - https://launchpadlibrarian.net/406634533/buildlog_ubuntu-xenial-amd64.postgresql-11_11.1.1+carto-2_BUILDING.txt.gz
3 - https://perldoc.perl.org/functions/quotemeta.html
Regards,
--
Raúl Marín Rodríguez
carto.com
Attachments:
=?UTF-8?B?UmHDumwgTWFyw61uIFJvZHLDrWd1ZXo=?= <rmrodriguez@carto.com> writes:
As mentioned in the subject, if the $PATH where you are building postgres
contains a perl's special character [1], for example a `+`, pgbench's tap
tests will fail.
Fun. But is that really the only place that fails?
I'm attaching a patch to fix this by using quotemeta [3].
This does not look right to me: glob has a different set of special
characters than regexes, no?
regards, tom lane
Hi,
Fun. But is that really the only place that fails?
Yes, other than this it builds flawlessly.
This does not look right to me: glob has a different set of special
characters than regexes, no?
The issue itself manifests in the grep call, not in glob, but I tried
using `quotemeta` only for grep without success (probably because I didn't
know how to properly call `quotemeta` over @logs.
This is the first time I do anything in perl beyond adding some simple tap
tests, so this could be me not knowing how to deal with it properly. I used
this small script to understand perl's behaviour:
```
$ cat a.pl
#!/usr/bin/perl
use strict;
use warnings;
my @prefix = '~/postgres+XXXX/001_pgbench_log_2.20745';
my @quoted_prefix = quotemeta(@prefix);
my @found1 = grep(/^@prefix\.\d+(\.\d+)?$/, "@prefix.20745");
my @found2 = grep(/^@prefix\.\d+(\.\d+)?$/, "@quoted_prefix.20745");
my @found3 = grep(/^@quoted_prefix\.\d+(\.\d+)?$/, "@prefix.20745");
my @found4 = grep(/^@quoted_prefix\.\d+(\.\d+)?$/, "@quoted_prefix.20745");
print "1: @found1\n";
print "2: @found2\n";
print "3: @found3\n";
print "4: @found4\n";
$ perl a.pl
1:
2:
3:
4: 1.20745
```
Regards,
--
Raúl Marín Rodríguez
carto.com
On 2019-Jan-18, Raúl MarÃn RodrÃguez wrote:
Hi,
Fun. But is that really the only place that fails?
Yes, other than this it builds flawlessly.
This does not look right to me: glob has a different set of special
characters than regexes, no?The issue itself manifests in the grep call, not in glob, but I tried
using `quotemeta` only for grep without success (probably because I didn't
know how to properly call `quotemeta` over @logs.
Perhaps you can do
@quoted = map { quotemeta($_) } @logs;
--
Ãlvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
This does not look right to me: glob has a different set of special
characters than regexes, no?
Perhaps you can do
@quoted = map { quotemeta($_) } @logs;
Actually, playing with it here, it seems that quotemeta() quotes
anything that's a special character for either purpose (in fact,
it looks like it quotes anything that's not alphanumeric).
So the patch as given should work.
regards, tom lane
I wrote:
So the patch as given should work.
... well, modulo the use of "@foo" where you should've used "$foo".
I don't know enough Perl to understand why that seemed to work, but
I do know it's not the right thing here.
Pushed with that correction.
regards, tom lane
Pushed with that correction.
Thanks a lot!
--
Raúl Marín Rodríguez
carto.com
=?UTF-8?B?UmHDumwgTWFyw61uIFJvZHLDrWd1ZXo=?= <rmrodriguez@carto.com> writes:
Pushed with that correction.
Thanks a lot!
Hm, so bowerbird (a Windows machine) has been failing the pgbench tests
since this went in, cf
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-20%2004%3A57%3A01
I'm just guessing, but I suspect that bowerbird is using a path spec that
includes a backslash directory separator and that's somehow bollixing
things. If so, we might be able to fix it by converting backslashes to
forward slashes before applying quotemeta().
It's also possible that on Windows, "glob" handles backslashes
differently than it does elsewhere, which would be harder to fix
--- that would bring back my original fear that the appropriate
quoting rules are different for "glob" than for regexes.
Andrew, any insight?
regards, tom lane
Hello Tom,
Hm, so bowerbird (a Windows machine) has been failing the pgbench tests
since this went in, cfhttps://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-20%2004%3A57%3A01
I'm just guessing, but I suspect that bowerbird is using a path spec that
includes a backslash directory separator and that's somehow bollixing
things.
This point is unclear from the log, where plain slashes are used in the
log prefix path, and furthermore no strange characters appear in the
log prefix path:
# Running: pgbench -n -S -t 50 -c 2 --log --log-prefix=G:/prog/bf/root/HEAD/pgsql.build/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_log_2 --sampling-rate=0.5
ok 345 - pgbench logs status (got 0 vs expected 0)
ok 346 - pgbench logs stdout /(?^:select only)/
ok 347 - pgbench logs stdout /(?^:processed: 100/100)/
ok 348 - pgbench logs stderr /(?^:^$)/
not ok 349 - number of log files
If so, we might be able to fix it by converting backslashes to
forward slashes before applying quotemeta().It's also possible that on Windows, "glob" handles backslashes differently than it does elsewhere, which would be harder to fix --- that would bring back my original fear that the appropriate quoting rules are different for "glob" than for regexes.
I'm afraid it could indeed be due to platform-specific behavior, so
testing on the target machine to understand the actual behavior would
help.
I just tested the current version on my laptop within a directory
containing spaces and other misc chars: Pgbench TAP tests are currently
broken in this context on master, and this may be used to a collection of
issues, not just one, eg pgbench function splits parameters on spaces
which breaks if there are spaces in bdir.
I'm going to investigate, possibly over next week-end, so please be
patient.
About windows-specific issues, from File::Glob man page:
""" On DOSISH systems, backslash is a valid directory separator character.
In this case, use of backslash as a quoting character (via GLOB_QUOTE)
interferes with the use of backslash as a directory separator. ... """
It seems to suggest that quotemeta backslash-on-nearly-everything approach
is not appropriate.
--
Fabien.
On 1/21/19 4:50 AM, Fabien COELHO wrote:
Hello Tom,
Hm, so bowerbird (a Windows machine) has been failing the pgbench tests
since this went in, cfhttps://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-20%2004%3A57%3A01
I'm just guessing, but I suspect that bowerbird is using a path spec
that
includes a backslash directory separator and that's somehow bollixing
things.This point is unclear from the log, where plain slashes are used in
the log prefix path, and furthermore no strange characters appear in
the log prefix path:# Running: pgbench -n -S -t 50 -c 2 --log
--log-prefix=G:/prog/bf/root/HEAD/pgsql.build/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_log_2
--sampling-rate=0.5
ok 345 - pgbench logs status (got 0 vs expected 0)
ok 346 - pgbench logs stdout /(?^:select only)/
ok 347 - pgbench logs stdout /(?^:processed: 100/100)/
ok 348 - pgbench logs stderr /(?^:^$)/
not ok 349 - number of log filesIf so, we might be able to fix it by converting backslashes to
forward slashes before applying quotemeta().It's also possible that on Windows, "glob" handles backslashes differently than it does elsewhere, which would be harder to fix --- that would bring back my original fear that the appropriate quoting rules are different for "glob" than for regexes.I'm afraid it could indeed be due to platform-specific behavior, so
testing on the target machine to understand the actual behavior would
help.I just tested the current version on my laptop within a directory
containing spaces and other misc chars: Pgbench TAP tests are
currently broken in this context on master, and this may be used to a
collection of issues, not just one, eg pgbench function splits
parameters on spaces which breaks if there are spaces in bdir.I'm going to investigate, possibly over next week-end, so please be
patient.About windows-specific issues, from File::Glob man page:
""" On DOSISH systems, backslash is a valid directory separator
character. In this case, use of backslash as a quoting character (via
GLOB_QUOTE) interferes with the use of backslash as a directory
separator. ... """It seems to suggest that quotemeta backslash-on-nearly-everything
approach is not appropriate.
File globbing is not the same thing as regex matching. For example, '.'
is a literal character in a glob pattern but a metacharacter in a regex,
while ' ' is a literal character in a regex but a globbing metacharacter
(but see below), and '*' is a metacharacter in both but has different
meanings. quotemeta is intended for regexes but being used here on an
expression used for globbing.
Perhaps it would be OK we changed back the glob line to use $prefix
instead of $qprefix, but kept the use of $qprefix in the later regex.
To deal with the issue of spaces in file names (not an issue eher ob
bowerbird), we should consider adding this:
use File::Glob ':bsd_glob';
This removes the globbing metcharacter nature of the space character,
although it might have other effects that cause pain. See `perldoc
File::Glob`
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
Perhaps it would be OK we changed back the glob line to use $prefix
instead of $qprefix, but kept the use of $qprefix in the later regex.
Well, glob does have some metacharacters, so not doing any quoting
at all still leaves us with failure modes.
I did a little bit of googling on this subject last night, and it
seems that at least some people believe that the answer is to not
use glob, period, but read the directory for yourself. Maybe there's
a better way, but the breakage around backslashes doesn't really
leave me hopeful.
As a short-term move to un-break the buildfarm, I'm just going to
revert that patch altogether. We can reapply it once we've
figured out how to do the glob part correctly.
regards, tom lane
Hello Andrew,
About windows-specific issues, from File::Glob man page:
""" On DOSISH systems, backslash is a valid directory separator
character. In this case, use of backslash as a quoting character (via
GLOB_QUOTE) interferes with the use of backslash as a directory
separator. ... """It seems to suggest that quotemeta backslash-on-nearly-everything
approach is not appropriate.File globbing is not the same thing as regex matching. For example, '.'
is a literal character in a glob pattern but a metacharacter in a regex,
while ' ' is a literal character in a regex but a globbing metacharacter
(but see below), and '*' is a metacharacter in both but has different
meanings. quotemeta is intended for regexes but being used here on an
expression used for globbing.Perhaps it would be OK we changed back the glob line to use $prefix
instead of $qprefix, but kept the use of $qprefix in the later regex.
Yep, possibly. I'll have to test, though.
To deal with the issue of spaces in file names (not an issue eher ob
bowerbird), we should consider adding this:Â Â Â use File::Glob ':bsd_glob';
I was planning that so that the behavior is the same on all systems.
This removes the globbing metcharacter nature of the space character,
although it might have other effects that cause pain. See `perldoc
File::Glob`
Yep. Thanks for doing my homework:-) I'll test starting with these
options.
--
Fabien.
Hello Tom,
Well, glob does have some metacharacters, so not doing any quoting
at all still leaves us with failure modes.I did a little bit of googling on this subject last night, and it
seems that at least some people believe that the answer is to not
use glob, period, but read the directory for yourself.
Yep, would work, it can then be filtered with standard regexpr, although
it will take a few lines (opendir/grep on readdir/closedir).
Maybe there's a better way, but the breakage around backslashes doesn't
really leave me hopeful.
Indeed. I was thinking of defining my own quote function, but without
being convinced that it is such a good idea.
As a short-term move to un-break the buildfarm, I'm just going to
revert that patch altogether.
I was considering to suggest that, so I'm ok with that.
We can reapply it once we've figured out how to do the glob part
correctly.
Yep. No need to spend time much on that, I'll have a look at it, although
not right now, allow me a few days.
--
Fabien.
On 1/21/19 11:18 AM, Fabien COELHO wrote:
Hello Tom,
Well, glob does have some metacharacters, so not doing any quoting
at all still leaves us with failure modes.I did a little bit of googling on this subject last night, and it
seems that at least some people believe that the answer is to not
use glob, period, but read the directory for yourself.Yep, would work, it can then be filtered with standard regexpr,
although it will take a few lines (opendir/grep on readdir/closedir).
Sure, probably the best solution. Given that Perl has
opendir/readdir/closedir, it should be only a handful of lines.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Tom,
I did a little bit of googling on this subject last night, and it
seems that at least some people believe that the answer is to not
use glob, period, but read the directory for yourself.As a short-term move to un-break the buildfarm, I'm just going to
revert that patch altogether.We can reapply it once we've figured out how to do the glob part correctly.
Here is a proposal patch which:
- works around pgbench command splitting on spaces,
if postgres sources are in a strangely named directoryâ¦
I tested within a directory named "pg .* dir".
- works aroung "glob" lack of portability by reading the directory
directly and filtering with a standard re (see list_files).
- adds a few comments
- removes a spurious i option on an empty re
--
Fabien.
Attachments:
Hi,
- works around pgbench command splitting on spaces,
if postgres sources are in a strangely named directory…
I tested within a directory named "pg .* dir".
I've also tested it with the initial case (including a +) and it works.
--
Raúl Marín Rodríguez
carto.com
- works around pgbench command splitting on spaces,
if postgres sources are in a strangely named directoryâ¦
I tested within a directory named "pg .* dir".I've also tested it with the initial case (including a +) and it works.
Good, thanks for checking!
--
Fabien.
Hi,
Any way I can help to get this commited? Should we add this to the commitfest?
Regards,
--
Raúl Marín Rodríguez
carto.com
On 8 Feb 2019, at 13:33, Raúl Marín Rodríguez <rmrodriguez@carto.com> wrote:
Any way I can help to get this commited? Should we add this to the commitfest?
Please do, it’s a good way to ensure that the patch is tracked and not
forgotten about.
cheers ./daniel
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested
Tested `pgbench-tap-glob-fix-1.patch` with a "+" in the path and the tap tests now pass.
On 2/8/19 3:30 PM, Raúl MarÃn wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: not tested
Spec compliant: not tested
Documentation: not testedTested `pgbench-tap-glob-fix-1.patch` with a "+" in the path and the tap tests now pass.
Is this now ready for committer or is there anything else Fabien needs
to do?
Marking the patch as either Ready for Committer or Waiting on Author
will help move things along.
Regards,
--
-David
david@pgmasters.net
Marking the patch as either Ready for Committer or Waiting on Author
will help move things along.
Thanks for the heads-up, I've retested it again and it applies cleanly on
top of master and addresses the issue.
I've moved it to `Ready for Committer`.
--
Raúl Marín Rodríguez
carto.com