010_pg_basebackup.pl vs multiple filesystems

Started by Andres Freundover 1 year ago9 messages
#1Andres Freund
andres@anarazel.de

Hi,

While working on [1]/messages/by-id/CA+OCxowQhMHFNRLTsXNuJpC96KRtSPHYKJuOS=b-Zrwmy-P4-g@mail.gmail.com I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with TMP/TEMP
located on c:, the tested code on d:. This causes the following code to fail:

# Create a temporary directory in the system location.
my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

# On Windows use the short location to avoid path length issues.
# Elsewhere use $tempdir to avoid file system boundary issues with moving.
my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
or BAIL_OUT "could not move $pgdata/pg_replslot";
dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
or BAIL_OUT "could not symlink to $pgdata/pg_replslot";

It's not possible to move (vs copy) a file between two filesystems, on most
operating systems. Which thus leads to:

[23:02:03.114](0.288s) Bail out! could not move D:\a\winpgbuild\winpgbuild\postgresql-17beta2\build/testrun/pg_basebackup/010_pg_basebackup\data/t_010_pg_basebackup_main_data/pgdata/pg_replslot

This logic was added in

commit e213de8e785aac4e2ebc44282b8dc0fcc74834e8
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2023-07-08 11:21:58 -0400

Use shorter location for pg_replslot in pg_basebackup test

and revised in

commit e9f15bc9db7564a29460d089c0917590bc13fffc
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2023-07-08 12:34:25 -0400

Fix tmpdir issues with commit e213de8e78

The latter deals with precisely this issue, for !windows.

I've temporarily worked around this by setting TMP/TEMP to something else, but
ISTM we need a better solution.

Greetings,

Andres Freund

[1]: /messages/by-id/CA+OCxowQhMHFNRLTsXNuJpC96KRtSPHYKJuOS=b-Zrwmy-P4-g@mail.gmail.com
[2]: /messages/by-id/666ac55b-3400-fb2c-2cea-0281bf36a53c@dunslane.net

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#1)
Re: 010_pg_basebackup.pl vs multiple filesystems

On 2024-07-07 Su 3:02 AM, Andres Freund wrote:

Hi,

While working on [1] I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with TMP/TEMP
located on c:, the tested code on d:. This causes the following code to fail:

# Create a temporary directory in the system location.
my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

# On Windows use the short location to avoid path length issues.
# Elsewhere use $tempdir to avoid file system boundary issues with moving.
my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
or BAIL_OUT "could not move $pgdata/pg_replslot";
dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
or BAIL_OUT "could not symlink to $pgdata/pg_replslot";

It's not possible to move (vs copy) a file between two filesystems, on most
operating systems. Which thus leads to:

[23:02:03.114](0.288s) Bail out! could not move D:\a\winpgbuild\winpgbuild\postgresql-17beta2\build/testrun/pg_basebackup/010_pg_basebackup\data/t_010_pg_basebackup_main_data/pgdata/pg_replslot

This logic was added in

commit e213de8e785aac4e2ebc44282b8dc0fcc74834e8
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2023-07-08 11:21:58 -0400

Use shorter location for pg_replslot in pg_basebackup test

and revised in

commit e9f15bc9db7564a29460d089c0917590bc13fffc
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2023-07-08 12:34:25 -0400

Fix tmpdir issues with commit e213de8e78

The latter deals with precisely this issue, for !windows.

I've temporarily worked around this by setting TMP/TEMP to something else, but
ISTM we need a better solution.

I'll be happy to hear of one. I agree it's a mess.  Maybe we could test
that the temp directory is on the same device on Windows and skip the
test if not? You could still get the test to run by setting TMPDIR
and/or friends.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#2)
Re: 010_pg_basebackup.pl vs multiple filesystems

On 2024-07-07 Su 7:28 AM, Andrew Dunstan wrote:

On 2024-07-07 Su 3:02 AM, Andres Freund wrote:

Hi,

While working on [1] I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with
TMP/TEMP
located on c:, the tested code on d:.  This causes the following code
to fail:

   # Create a temporary directory in the system location.
   my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

   # On Windows use the short location to avoid path length issues.
   # Elsewhere use $tempdir to avoid file system boundary issues with
moving.
   my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

   rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
     or BAIL_OUT "could not move $pgdata/pg_replslot";
   dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
     or BAIL_OUT "could not symlink to $pgdata/pg_replslot";

It's not possible to move (vs copy) a file between two filesystems,
on most
operating systems. Which thus leads to:

[23:02:03.114](0.288s) Bail out!  could not move
D:\a\winpgbuild\winpgbuild\postgresql-17beta2\build/testrun/pg_basebackup/010_pg_basebackup\data/t_010_pg_basebackup_main_data/pgdata/pg_replslot

This logic was added in

commit e213de8e785aac4e2ebc44282b8dc0fcc74834e8
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   2023-07-08 11:21:58 -0400

     Use shorter location for pg_replslot in pg_basebackup test

and revised in

commit e9f15bc9db7564a29460d089c0917590bc13fffc
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   2023-07-08 12:34:25 -0400

     Fix tmpdir issues with commit e213de8e78

The latter deals with precisely this issue, for !windows.

I've temporarily worked around this by setting TMP/TEMP to something
else, but
ISTM we need a better solution.

I'll be happy to hear of one. I agree it's a mess.  Maybe we could
test that the temp directory is on the same device on Windows and skip
the test if not? You could still get the test to run by setting TMPDIR
and/or friends.

Maybe we should just not try to rename the directory. Looking at the
test I'm pretty sure the directory should be empty. Instead of trying to
move it, let's just remove it, and recreate it in the tmp location.

Something like

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl 
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 489dde4adf..c0c334c6fc 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -363,8 +363,8 @@ my $sys_tempdir = 
PostgreSQL::Test::Utils::tempdir_short;
  # Elsewhere use $tempdir to avoid file system boundary issues with moving.
  my $tmploc = $windows_os ? $sys_tempdir : $tempdir;
-rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
-  or BAIL_OUT "could not move $pgdata/pg_replslot";
+rmtree("$pgdata/pg_replslot");
+mkdir ("$tmploc/pg_replslot", 0700);
  dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
    or BAIL_OUT "could not symlink to $pgdata/pg_replslot";

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#4Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#3)
Re: 010_pg_basebackup.pl vs multiple filesystems

Hi,

On 2024-07-07 09:10:48 -0400, Andrew Dunstan wrote:

On 2024-07-07 Su 7:28 AM, Andrew Dunstan wrote:

I'll be happy to hear of one. I agree it's a mess.� Maybe we could test
that the temp directory is on the same device on Windows and skip the
test if not? You could still get the test to run by setting TMPDIR
and/or friends.

Maybe we should just not try to rename the directory. Looking at the test
I'm pretty sure the directory should be empty. Instead of trying to move it,
let's just remove it, and recreate it in the tmp location.

Good catch, yes, that'd be much better!

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 489dde4adf..c0c334c6fc 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -363,8 +363,8 @@ my $sys_tempdir =
PostgreSQL::Test::Utils::tempdir_short;
�# Elsewhere use $tempdir to avoid file system boundary issues with moving.
�my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

The comment would need a bit of editing, I guess. I think we should consider
just getting rid of the os-dependant switch now, it shouldn't be needed
anymore?

-rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
-� or BAIL_OUT "could not move $pgdata/pg_replslot";
+rmtree("$pgdata/pg_replslot");

Should this perhaps be an rmdir, to ensure that we're not removing something
we don't want (e.g. somebody adding an earlier test for slots that then gets
broken by the rmtree)?

Greetings,

Andres Freund

#5Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#1)
Re: 010_pg_basebackup.pl vs multiple filesystems

On Sun, Jul 7, 2024 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:

While working on [1] I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with TMP/TEMP
located on c:, the tested code on d:. This causes the following code to fail:

# Create a temporary directory in the system location.
my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

Whatever we end up doing about this, it would be a good idea to check
the other places that use tempdir_short and see if they also need
adjustment.

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#5)
Re: 010_pg_basebackup.pl vs multiple filesystems

On 2024-07-08 Mo 4:45 PM, Robert Haas wrote:

On Sun, Jul 7, 2024 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:

While working on [1] I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with TMP/TEMP
located on c:, the tested code on d:. This causes the following code to fail:

# Create a temporary directory in the system location.
my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

Whatever we end up doing about this, it would be a good idea to check
the other places that use tempdir_short and see if they also need
adjustment.

I don't think it's a problem. There are lots of systems that have
tempdir on a different device. That's why we previously saw lots of
errors from this code, resulting in the present incomplete workaround.
The fact that we haven't seen such errors from other tests means we
should be ok.

cheers

andrew

--

Andrew Dunstan
EDB: https://www.enterprisedb.com

#7Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#5)
Re: 010_pg_basebackup.pl vs multiple filesystems

Hi,

On 2024-07-08 16:45:42 -0400, Robert Haas wrote:

On Sun, Jul 7, 2024 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:

While working on [1] I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with TMP/TEMP
located on c:, the tested code on d:. This causes the following code to fail:

# Create a temporary directory in the system location.
my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

Whatever we end up doing about this, it would be a good idea to check
the other places that use tempdir_short and see if they also need
adjustment.

FWIW, this was the only test that failed due to TMP being on a separate
partition.

I couldn't quite enable all tests (PG_TEST_EXTRA including libpq_encryption
fails due to some gssapi issue, kerberos fails due to paths being wrong / not
set up for windows, load_balance because I haven't set up hostnames), but I
don't think any of the issues around those tests are related.

I did also grep for other uses of tmpdir_short, they all seem to be
differently motivated.

I also looked for uses of rename() in tests. While
src/bin/pg_verifybackup/t/007_wal.pl does move stuff around, it afaict is
always going to be on the same filesystem.

Greetings,

Andres Freund

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#4)
Re: 010_pg_basebackup.pl vs multiple filesystems

On 2024-07-08 Mo 4:31 PM, Andres Freund wrote:

Hi,

On 2024-07-07 09:10:48 -0400, Andrew Dunstan wrote:

On 2024-07-07 Su 7:28 AM, Andrew Dunstan wrote:

I'll be happy to hear of one. I agree it's a mess.  Maybe we could test
that the temp directory is on the same device on Windows and skip the
test if not? You could still get the test to run by setting TMPDIR
and/or friends.

Maybe we should just not try to rename the directory. Looking at the test
I'm pretty sure the directory should be empty. Instead of trying to move it,
let's just remove it, and recreate it in the tmp location.

Good catch, yes, that'd be much better!

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 489dde4adf..c0c334c6fc 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -363,8 +363,8 @@ my $sys_tempdir =
PostgreSQL::Test::Utils::tempdir_short;
 # Elsewhere use $tempdir to avoid file system boundary issues with moving.
 my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

The comment would need a bit of editing, I guess. I think we should consider
just getting rid of the os-dependant switch now, it shouldn't be needed
anymore?

-rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
-  or BAIL_OUT "could not move $pgdata/pg_replslot";
+rmtree("$pgdata/pg_replslot");

Should this perhaps be an rmdir, to ensure that we're not removing something
we don't want (e.g. somebody adding an earlier test for slots that then gets
broken by the rmtree)?

OK, done like this.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#9Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#8)
Re: 010_pg_basebackup.pl vs multiple filesystems

On 2024-07-08 17:45:16 -0400, Andrew Dunstan wrote:

OK, done like this.

Thanks!