pg_regress cleans up tablespace twice.
Hello.
I saw a failure of vcregress check with the following message several
times, on a machine under a heavy load and maybe with realtime virus
scanning.
pg_regress: could not create directory ".../testtablespace": Permission denied.
I found that pg_regress repeats the sequence
rmtree(tablespace)->make_directory(tablespace) twice under
initialize_environment. So it should be THE DELETE_PENDING. It is
because the code is in convert_sourcefiles_in, which is called
succssively twice in convert_sourcefiles.
But in the first place it comes from [1]/messages/by-id/11718.1200684807@sss.pgh.pa.us and the comment says:
* XXX it would be better if pg_regress.c had nothing at all to do with
* testtablespace, and this were handled by a .BAT file or similar on
* Windows. See pgsql-hackers discussion of 2008-01-18.
Is there any reason not to do that in vcregress.pl? I think the
commands other than 'check' don't needs this.
[1]: /messages/by-id/11718.1200684807@sss.pgh.pa.us
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
vcregress_cleanup_testtablespace.patchtext/x-patch; charset=us-asciiDownload+7-22
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
But in the first place it comes from [1] and the comment says:
* XXX it would be better if pg_regress.c had nothing at all to do with
* testtablespace, and this were handled by a .BAT file or similar on
* Windows. See pgsql-hackers discussion of 2008-01-18.
Is there any reason not to do that in vcregress.pl? I think the
commands other than 'check' don't needs this.
I think the existing coding dates from before we had a Perl driver for
this, or else we had it but there were other less-functional ways to
replace "make check" on Windows. +1 for taking the code out of
pg_regress.c --- but I'm not in a position to say whether the other
part of your patch is sufficient.
regards, tom lane
On Wed, Feb 19, 2020 at 04:06:33PM -0500, Tom Lane wrote:
I think the existing coding dates from before we had a Perl driver for
this, or else we had it but there were other less-functional ways to
replace "make check" on Windows. +1 for taking the code out of
pg_regress.c --- but I'm not in a position to say whether the other
part of your patch is sufficient.
Removing this code from pg_regress.c makes also sense to me. Now, the
patch breaks "vcregress installcheck" as this is missing to patch
installcheck_internal() for the tablespace path creation. I would
also recommend using a full path for the directory location to avoid
any potential issues if this code is refactored or moved around, the
patch now relying on the current path used.
--
Michael
At Thu, 20 Feb 2020 14:23:22 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Wed, Feb 19, 2020 at 04:06:33PM -0500, Tom Lane wrote:
I think the existing coding dates from before we had a Perl driver for
this, or else we had it but there were other less-functional ways to
replace "make check" on Windows. +1 for taking the code out of
pg_regress.c --- but I'm not in a position to say whether the other
part of your patch is sufficient.Removing this code from pg_regress.c makes also sense to me. Now, the
patch breaks "vcregress installcheck" as this is missing to patch
installcheck_internal() for the tablespace path creation. I would
also recommend using a full path for the directory location to avoid
any potential issues if this code is refactored or moved around, the
patch now relying on the current path used.
Hmm. Right. I confused database directory and tablespace
directory. Tablespace directory should be provided by the test script,
even though the database directory is preexisting in installcheck. To
avoid useless future failure, I would do that that for all
subcommands, as regress/GNUmakefile does.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Fri, 21 Feb 2020 17:05:07 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Thu, 20 Feb 2020 14:23:22 +0900, Michael Paquier <michael@paquier.xyz> wrote in
Removing this code from pg_regress.c makes also sense to me. Now, the
patch breaks "vcregress installcheck" as this is missing to patch
installcheck_internal() for the tablespace path creation. I would
also recommend using a full path for the directory location to avoid
any potential issues if this code is refactored or moved around, the
patch now relying on the current path used.Hmm. Right. I confused database directory and tablespace
directory. Tablespace directory should be provided by the test script,
even though the database directory is preexisting in installcheck. To
avoid useless future failure, I would do that that for all
subcommands, as regress/GNUmakefile does.
Tablespace directory cleanup is not done for all testing
targets. Actually it is not done for the tools under bin/ except
pg_upgrade.
On the other hand, it was done every by pg_regress run for Windows
build. So I made vcregress.pl do the same with that. Spcecifically to
set up tablespace always before pg_regress is executed.
There is a place where --outputdir is specified for pg_regress,
pg_upgrade/test.sh. It is explained as the follows.
# Send installcheck outputs to a private directory. This avoids conflict when
# check-world runs pg_upgrade check concurrently with src/test/regress check.
# To retrieve interesting files after a run, use pattern tmp_check/*/*.diffs.
outputdir="$temp_root/regress"
EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --outputdir=$outputdir"
Where the $temp_root is $(TOP)/src/bin/pg_upgrade/tmp_check/regress.
Thus the current regress/GNUMakefile does break this consideration and
the current vc_regress (of Windows build) does the right thing in the
light of the comment. Don't we need to avoid cleaning up
"$(TOP)/src/test/regress/tablesapce" in that case? (the second patch
attached)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, May 11, 2020 at 05:13:54PM +0900, Kyotaro Horiguchi wrote:
Tablespace directory cleanup is not done for all testing
targets. Actually it is not done for the tools under bin/ except
pg_upgrade.
Let's first take one problem at a time, as I can see that your patch
0002 is modifying a portion of what you added in 0001, and so let's
try to remove this WIN32 stuff from pg_regress.c.
+sub CleanupTablespaceDirectory
+{
+ my $tablespace = 'testtablespace';
+
+ rmtree($tablespace) if (-e $tablespace);
+ mkdir($tablespace);
+}
This check should use "-d" and not "-e" as it would be true for a file
as well. Also, in pg_regress.c, we remove the existing tablespace
test directory in --outputdir, which is "." by default but it can be a
custom one. Shouldn't you do the same logic in this new routine? So
we should have an optional argument for the output directory that
defaults to `pwd` if not defined, no? This means passing down the
argument only for upgradecheck() in vcregress.pl.
sub isolationcheck
{
chdir "../isolation";
+ CleanupTablespaceDirectory();
copy("../../../$Config/isolationtester/isolationtester.exe",
"../../../$Config/pg_isolation_regress");
my @args = (
[...]
print "============================================================\n";
print "Checking $module\n";
+ CleanupTablespaceDirectory();
my @args = (
"$topdir/$Config/pg_regress/pg_regress",
"--bindir=${topdir}/${Config}/psql",
I would put that just before the system() calls for consistency with
the rest.
--
Michael
On Fri, May 15, 2020 at 11:58:55AM +0900, Michael Paquier wrote:
Let's first take one problem at a time, as I can see that your patch
0002 is modifying a portion of what you added in 0001, and so let's
try to remove this WIN32 stuff from pg_regress.c.
(Please note that this is not v13 material.)
--
Michael
Thank you for looking this!
At Fri, 15 May 2020 11:58:55 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Mon, May 11, 2020 at 05:13:54PM +0900, Kyotaro Horiguchi wrote:
Tablespace directory cleanup is not done for all testing
targets. Actually it is not done for the tools under bin/ except
pg_upgrade.Let's first take one problem at a time, as I can see that your patch
0002 is modifying a portion of what you added in 0001, and so let's
try to remove this WIN32 stuff from pg_regress.c.
Yes, 0001 and 0001+0002 are alternatives. They should be merged if we
are going to fix the pg_upgrade test. I take this as we go on 0001+0002.
+sub CleanupTablespaceDirectory +{ + my $tablespace = 'testtablespace'; + + rmtree($tablespace) if (-e $tablespace); + mkdir($tablespace); +} This check should use "-d" and not "-e" as it would be true for a file as well. Also, in pg_regress.c, we remove the existing tablespace
That was intentional so that a file with the name don't stop
testing. Actually pg_regress is checking only for a directory in other
place and it's not that bad since no-one can create a file with that
name while running test. On the other hand, is there any reason for
refraining from removing if it weren't a directory but a file?
Changed to -d in the attached.
as well. Also, in pg_regress.c, we remove the existing tablespace
test directory in --outputdir, which is "." by default but it can be a
custom one. Shouldn't you do the same logic in this new routine? So
we should have an optional argument for the output directory that
defaults to `pwd` if not defined, no? This means passing down the
argument only for upgradecheck() in vcregress.pl.
I thought of that but didn't in the patch. I refrained from doing
that because the output directory is dedicatedly created at the only
place (pg_upgrade test) where the --outputdir is specified. (I think I
tend to do too-much.)
It is easy in perl scripts, but rather complex for makefiles. The
attached is using a perl one-liner to extract outputdir from
EXTRA_REGRESS_OPTS. I don't like that but I didn't come up with better
alternatives. On the other hand ActivePerl (with default
installation) doesn't seem to know Getopt::Long::GetOptions and
friends. In the attached vcregress.pl parses --outputdir not using
GetOpt::Long...
sub isolationcheck
{
chdir "../isolation";
+ CleanupTablespaceDirectory();
copy("../../../$Config/isolationtester/isolationtester.exe",
"../../../$Config/pg_isolation_regress");
my @args = (
[...]
print "============================================================\n";
print "Checking $module\n";
+ CleanupTablespaceDirectory();
my @args = (
"$topdir/$Config/pg_regress/pg_regress",
"--bindir=${topdir}/${Config}/psql",
I would put that just before the system() calls for consistency with
the rest.
Right. That's just an mistake. Fixed along with subdircheck.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Move-tablespace-cleanup-out-of-pg_regress.patchtext/x-patch; charset=us-asciiDownload+29-25
On Fri, May 15, 2020 at 05:25:08PM +0900, Kyotaro Horiguchi wrote:
I thought of that but didn't in the patch. I refrained from doing
that because the output directory is dedicatedly created at the only
place (pg_upgrade test) where the --outputdir is specified. (I think I
tend to do too-much.)
So, I have reviewed the patch aimed at removing the cleanup of
testtablespace done with WIN32, and finished with the attached to
clean up things. I simplified the logic, to not have to parse
REGRESS_OPTS for --outputdir (no need for a regex, leaving
EXTRA_REGRESS_OPTS alone), and reworked the code so as the tablespace
cleanup only happens only where we need to: check, installcheck and
upgradecheck. No need for that with contribcheck, modulescheck,
plcheck and ecpgcheck.
Note that after I changed my patch, this converged with a portion of
patch 0002 you have posted here:
/messages/by-id/20200511.171354.514381788845037011.horikyota.ntt@gmail.com
Now about 0002, I tend to agree that we should try to do something
about pg_upgrade test creating removing and then creating an extra
testtablespace path that is not necessary as pg_upgrade test uses its
own --outputdir. I have not actually seen this stuff being a problem
in practice as the main regression test suite runs first, largely
before pg_upgrade test even with parallel runs so they have a low
probability of conflict. I'll try to think about a couple of options,
one of them I have in mind now being that we could finally switch the
upgrade tests to TAP and let test.sh go to the void. This is an
independent problem, so let's tackle both issues separately.
--
Michael
Attachments:
win32-testtablespace-v3.patchtext/x-diff; charset=us-asciiDownload+14-23
Thanks for working on this.
At Wed, 17 Jun 2020 16:12:07 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Fri, May 15, 2020 at 05:25:08PM +0900, Kyotaro Horiguchi wrote:
I thought of that but didn't in the patch. I refrained from doing
that because the output directory is dedicatedly created at the only
place (pg_upgrade test) where the --outputdir is specified. (I think I
tend to do too-much.)So, I have reviewed the patch aimed at removing the cleanup of
testtablespace done with WIN32, and finished with the attached to
clean up things. I simplified the logic, to not have to parse
REGRESS_OPTS for --outputdir (no need for a regex, leaving
EXTRA_REGRESS_OPTS alone), and reworked the code so as the tablespace
cleanup only happens only where we need to: check, installcheck and
upgradecheck. No need for that with contribcheck, modulescheck,
plcheck and ecpgcheck.
It look good to me as the Windows part. I agree that vcregress.pl
don't need to parse EXTRA_REGRESS_OPTS by allowing a bit more tight
bond between the caller sites of pg_regress and pg_regress.
Note that after I changed my patch, this converged with a portion of
patch 0002 you have posted here:
/messages/by-id/20200511.171354.514381788845037011.horikyota.ntt@gmail.comNow about 0002, I tend to agree that we should try to do something
about pg_upgrade test creating removing and then creating an extra
testtablespace path that is not necessary as pg_upgrade test uses its
own --outputdir. I have not actually seen this stuff being a problem
in practice as the main regression test suite runs first, largely
before pg_upgrade test even with parallel runs so they have a low
probability of conflict. I'll try to think about a couple of options,
Agreed on probability.
one of them I have in mind now being that we could finally switch the
upgrade tests to TAP and let test.sh go to the void. This is an
independent problem, so let's tackle both issues separately.
Chaning to TAP sounds nice as a goal.
As the next step we need to amend GNUmakefile not to cleanup
tablespace for the four test items. Then somehow treat tablespaces at
non-default place?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Jun 17, 2020 at 05:02:31PM +0900, Kyotaro Horiguchi wrote:
It look good to me as the Windows part. I agree that vcregress.pl
don't need to parse EXTRA_REGRESS_OPTS by allowing a bit more tight
bond between the caller sites of pg_regress and pg_regress.
Thanks, applied this part to HEAD then after more testing.
Chaining to TAP sounds nice as a goal.
I submitted a patch for that, but we had no clear agreements about how
to handle major upgrades, as this involves a somewhat large
refactoring of PostgresNode.pm so as you register a path to the
binaries used by a given node registered.
As the next step we need to amend GNUmakefile not to cleanup
tablespace for the four test items. Then somehow treat tablespaces at
non-default place?
Ah, you mean to not reset testtablespace where that's not necessary in
the tests by reworking the rules? Yeah, perhaps we could do something
like that. Not sure yet how to shape that in term of code but if you
have a clear idea, please feel free to submit it. I think that this
may be better if discussed on a different thread though.
--
Michael
On Thu, Jun 18, 2020 at 1:42 PM Michael Paquier <michael@paquier.xyz> wrote:
Thanks, applied this part to HEAD then after more testing.
Hmm, somehow this (well I guess it's this commit based on timing and
the area it touches, not sure exactly why) made cfbot's Windows build
fail, like this:
--- C:/projects/postgresql/src/test/regress/expected/tablespace.out
2020-06-19 21:26:24.661817000 +0000
+++ C:/projects/postgresql/src/test/regress/results/tablespace.out
2020-06-19 21:26:28.613257500 +0000
@@ -2,83 +2,78 @@
CREATE TABLESPACE regress_tblspacewith LOCATION
'C:/projects/postgresql/src/test/regress/testtablespace' WITH
(some_nonexistent_parameter = true); -- fail
ERROR: unrecognized parameter "some_nonexistent_parameter"
CREATE TABLESPACE regress_tblspacewith LOCATION
'C:/projects/postgresql/src/test/regress/testtablespace' WITH
(random_page_cost = 3.0); -- ok
+ERROR: could not set permissions on directory
"C:/projects/postgresql/src/test/regress/testtablespace": Permission
denied
Any ideas? Here's what it does:
On Sat, Jun 20, 2020 at 09:33:26AM +1200, Thomas Munro wrote:
Hmm, somehow this (well I guess it's this commit based on timing and
the area it touches, not sure exactly why) made cfbot's Windows build
fail, like this:--- C:/projects/postgresql/src/test/regress/expected/tablespace.out 2020-06-19 21:26:24.661817000 +0000 +++ C:/projects/postgresql/src/test/regress/results/tablespace.out 2020-06-19 21:26:28.613257500 +0000 @@ -2,83 +2,78 @@ CREATE TABLESPACE regress_tblspacewith LOCATION 'C:/projects/postgresql/src/test/regress/testtablespace' WITH (some_nonexistent_parameter = true); -- fail ERROR: unrecognized parameter "some_nonexistent_parameter" CREATE TABLESPACE regress_tblspacewith LOCATION 'C:/projects/postgresql/src/test/regress/testtablespace' WITH (random_page_cost = 3.0); -- ok +ERROR: could not set permissions on directory "C:/projects/postgresql/src/test/regress/testtablespace": Permission deniedAny ideas? Here's what it does:
I am not sure, and I am not really familiar with this stuff. Your
code does a simple vcregress check, and that should take care of
automatically cleaning up the testtablespace path. The buildfarm uses
this code for MSVC builds and does not complain, nor do my own VMs
complain. A difference in the processing after 2b2a070d is that the
tablespace cleanup/creation does not happen while holding a restricted
token [1]https://docs.microsoft.com/en-us/windows/win32/secauthz/restricted-tokens -- Michael anymore because it got out of pg_regress.c. Are there any
kind of restrictions applied to the user running appveyor on Windows?
[1]: https://docs.microsoft.com/en-us/windows/win32/secauthz/restricted-tokens -- Michael
--
Michael
On Sat, Jun 20, 2020 at 2:42 PM Michael Paquier <michael@paquier.xyz> wrote:
+ERROR: could not set permissions on directory
"C:/projects/postgresql/src/test/regress/testtablespace": Permission
deniedAny ideas? Here's what it does:
I am not sure, and I am not really familiar with this stuff. Your
code does a simple vcregress check, and that should take care of
automatically cleaning up the testtablespace path. The buildfarm uses
this code for MSVC builds and does not complain, nor do my own VMs
complain. A difference in the processing after 2b2a070d is that the
tablespace cleanup/creation does not happen while holding a restricted
token [1] anymore because it got out of pg_regress.c. Are there any
kind of restrictions applied to the user running appveyor on Windows?
Thanks for the clue. Appveyor runs your build script as a privileged
user (unlike, I assume, the build farm animals), and that has caused a
problem with this test in the past, though I don't know the details.
I might go and teach it to skip that test until a fix can be found.
On Sat, Jun 20, 2020 at 03:01:36PM +1200, Thomas Munro wrote:
Thanks for the clue. Appveyor runs your build script as a privileged
user (unlike, I assume, the build farm animals), and that has caused a
problem with this test in the past, though I don't know the details.
I might go and teach it to skip that test until a fix can be found.
Thanks, I was not aware of that. Is it a fix that involves your code
or something else? How long do you think it would take to address
that? Another strategy that we could do is also a revert of 2b2a070
for now to allow the cfbot to go through and then register this thread
in the CF app to allow the bot to pick it up and test it, so as there
is more room to get a fix. The next CF is in ten days, so it would be
annoying to reduce the automatic test coverage the cfbot provides :/
--
Michael
On Sat, Jun 20, 2020 at 6:46 PM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Jun 20, 2020 at 03:01:36PM +1200, Thomas Munro wrote:
Thanks for the clue. Appveyor runs your build script as a privileged
user (unlike, I assume, the build farm animals), and that has caused a
problem with this test in the past, though I don't know the details.
I might go and teach it to skip that test until a fix can be found.Thanks, I was not aware of that. Is it a fix that involves your code
or something else? How long do you think it would take to address
that? Another strategy that we could do is also a revert of 2b2a070
for now to allow the cfbot to go through and then register this thread
in the CF app to allow the bot to pick it up and test it, so as there
is more room to get a fix. The next CF is in ten days, so it would be
annoying to reduce the automatic test coverage the cfbot provides :/
I'm not sure what needs to change, but in the meantime I told it to
comment out the offending test from the schedule files:
+before_test:
+ - 'perl -p -i.bak -e "s/^test: tablespace/#test: tablespace/"
src/test/regress/serial_schedule'
+ - 'perl -p -i.bak -e "s/^test: tablespace/#test: tablespace/"
src/test/regress/parallel_schedule'
Now the results are slowly turning green again.
On Sun, Jun 21, 2020 at 12:08:37PM +1200, Thomas Munro wrote:
I'm not sure what needs to change, but in the meantime I told it to
comment out the offending test from the schedule files:+before_test: + - 'perl -p -i.bak -e "s/^test: tablespace/#test: tablespace/" src/test/regress/serial_schedule' + - 'perl -p -i.bak -e "s/^test: tablespace/#test: tablespace/" src/test/regress/parallel_schedule'Now the results are slowly turning green again.
Thanks, and sorry for the trouble. What actually happened back in
2018? I can see c2ff3c68 in the git history of the cfbot code, but it
does not give much details.
--
Michael
On Sun, Jun 21, 2020 at 8:42 PM Michael Paquier <michael@paquier.xyz> wrote:
On Sun, Jun 21, 2020 at 12:08:37PM +1200, Thomas Munro wrote:
I'm not sure what needs to change, but in the meantime I told it to
comment out the offending test from the schedule files:+before_test: + - 'perl -p -i.bak -e "s/^test: tablespace/#test: tablespace/" src/test/regress/serial_schedule' + - 'perl -p -i.bak -e "s/^test: tablespace/#test: tablespace/" src/test/regress/parallel_schedule'Now the results are slowly turning green again.
Thanks, and sorry for the trouble. What actually happened back in
2018? I can see c2ff3c68 in the git history of the cfbot code, but it
does not give much details.
commit ce5d3424d6411f7a7228fd4463242cb382af3e0c
Author: Andrew Dunstan <andrew@dunslane.net>
Date: Sat Oct 20 09:02:36 2018 -0400
Lower privilege level of programs calling regression_main
On Windows this mean that the regression tests can now safely and
successfully run as Administrator, which is useful in situations like
Appveyor. Elsewhere it's a no-op.
Backpatch to 9.5 - this is harder in earlier branches and not worth the
trouble.
Discussion:
/messages/by-id/650b0c29-9578-8571-b1d2-550d7f89f307@2ndQuadrant.com
On Sun, Jun 21, 2020 at 10:38:22PM +1200, Thomas Munro wrote:
On Sun, Jun 21, 2020 at 8:42 PM Michael Paquier <michael@paquier.xyz> wrote:
Thanks, and sorry for the trouble. What actually happened back in
2018? I can see c2ff3c68 in the git history of the cfbot code, but it
does not give much details.commit ce5d3424d6411f7a7228fd4463242cb382af3e0c
Author: Andrew Dunstan <andrew@dunslane.net>
Date: Sat Oct 20 09:02:36 2018 -0400Lower privilege level of programs calling regression_main
On Windows this mean that the regression tests can now safely and
successfully run as Administrator, which is useful in situations like
Appveyor. Elsewhere it's a no-op.Backpatch to 9.5 - this is harder in earlier branches and not worth the
trouble.Discussion:
/messages/by-id/650b0c29-9578-8571-b1d2-550d7f89f307@2ndQuadrant.com
Thanks for the reference. This also means that as much as I'd like to
keep the recreation of testtablespace out of pg_regress for
consistency, 2b2a070 has also broken a case we have claimed to support
since ce5d342.
A bit of digging around I have found this case from a guy of Yandex,
visibly running our regression test suite:
https://help.appveyor.com/discussions/questions/1888-running-tests-with-reduced-privileges
And the conclusion seems like it is not really possible to do that
within appveyor, using a trick with openssh to manipulate privileges
as wanted, as referenced here:
https://github.com/yandex-qatools/postgresql-embedded
At the end of the day, it looks more simple to me to just revert
2b2a070 if we just want to keep your stuff running without extra
workload from your side. Extra opinions are welcome.
--
Michael
Thomas Munro <thomas.munro@gmail.com> writes:
On Thu, Jun 18, 2020 at 1:42 PM Michael Paquier <michael@paquier.xyz> wrote:
Thanks, applied this part to HEAD then after more testing.
Hmm, somehow this (well I guess it's this commit based on timing and
the area it touches, not sure exactly why) made cfbot's Windows build
fail, like this:
Should now be possible to undo whatever hack you had to use ...
regards, tom lane