pg_rewind fails with in-place tablespace

Started by 赵锐(惜元)over 2 years ago6 messages
#1赵锐(惜元)
xiyuan.zr@alibaba-inc.com
1 attachment(s)

Hello postgres hackers,
Recently I encountered an issue: pg_rewind fails when dealing with in-place tablespace. The problem seems to be that pg_rewind is treating in-place tablespace as symbolic link, while in fact it should be treated as directory.
Here is the output of pg_rewind:
pg_rewind: error: file "pg_tblspc/16385" is of different type in source and target
To help reproduce the failure, I have attached a tap test. And I am pleased to say that I have also identified a solution for this problem, which I have included in the patch.
Thank you for your attention to this matter.
Best regards,
Rui Zhao

Attachments:

0001-pg_rewind-fails-with-in-place-tablespace.patchapplication/octet-streamDownload
diff --git a/src/bin/pg_rewind/libpq_source.c b/src/bin/pg_rewind/libpq_source.c
index 0d8e9ee2d1..f449b8887e 100644
--- a/src/bin/pg_rewind/libpq_source.c
+++ b/src/bin/pg_rewind/libpq_source.c
@@ -298,7 +298,16 @@ libpq_traverse_files(rewind_source *source, process_file_callback_t callback)
 		link_target = PQgetvalue(res, i, 3);
 
 		if (link_target[0])
-			type = FILE_TYPE_SYMLINK;
+		{
+			/*
+			 * For in-place tablespace, link_target is a directory which starts
+			 * with "pg_tblspc/".
+			 */
+			if (strncmp(link_target, "pg_tblspc/", strlen("pg_tblspc/")) == 0)
+				type = FILE_TYPE_DIRECTORY;
+			else
+				type = FILE_TYPE_SYMLINK;
+		}
 		else if (isdir)
 			type = FILE_TYPE_DIRECTORY;
 		else
diff --git a/src/bin/pg_rewind/t/001_basic.pl b/src/bin/pg_rewind/t/001_basic.pl
index 031594e14e..e1ee7d21e1 100644
--- a/src/bin/pg_rewind/t/001_basic.pl
+++ b/src/bin/pg_rewind/t/001_basic.pl
@@ -18,6 +18,9 @@ sub run_test
 	RewindTest::setup_cluster($test_mode);
 	RewindTest::start_primary();
 
+	# Create an in-place tablespace
+	primary_psql("CREATE TABLESPACE test LOCATION ''");
+
 	# Create a test table and insert a row in primary.
 	primary_psql("CREATE TABLE tbl1 (d text)");
 	primary_psql("INSERT INTO tbl1 VALUES ('in primary')");
diff --git a/src/bin/pg_rewind/t/RewindTest.pm b/src/bin/pg_rewind/t/RewindTest.pm
index 4957791e94..8fbbd521cb 100644
--- a/src/bin/pg_rewind/t/RewindTest.pm
+++ b/src/bin/pg_rewind/t/RewindTest.pm
@@ -131,6 +131,7 @@ sub setup_cluster
 	$node_primary->append_conf(
 		'postgresql.conf', qq(
 wal_keep_size = 320MB
+allow_in_place_tablespaces = on
 ));
 	return;
 }
#2Michael Paquier
michael@paquier.xyz
In reply to: 赵锐(惜元) (#1)
Re: pg_rewind fails with in-place tablespace

On Wed, Jul 19, 2023 at 09:31:35PM +0800, 赵锐(惜元) wrote:

Recently I encountered an issue: pg_rewind fails when dealing with
in-place tablespace. The problem seems to be that pg_rewind is
treating in-place tablespace as symbolic link, while in fact it
should be treated as directory.
Here is the output of pg_rewind:
pg_rewind: error: file "pg_tblspc/16385" is of different type in
source and target
To help reproduce the failure, I have attached a tap test. And I am
pleased to say that I have also identified a solution for this
problem, which I have included in the patch.
Thank you for your attention to this matter.

Issue reproduced here, and agreed that we'd better do something about
that. I am not sure if your patch is right for the job though, but
I'll try to study that a bit more.
--
Michael

#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#2)
1 attachment(s)
Re: pg_rewind fails with in-place tablespace

On Tue, Jul 25, 2023 at 04:36:42PM +0900, Michael Paquier wrote:

On Wed, Jul 19, 2023 at 09:31:35PM +0800, 赵锐(惜元) wrote:

To help reproduce the failure, I have attached a tap test. And I am
pleased to say that I have also identified a solution for this
problem, which I have included in the patch.
Thank you for your attention to this matter.

Issue reproduced here, and agreed that we'd better do something about
that. I am not sure if your patch is right for the job though, but
I'll try to study that a bit more.

It took me some time to remember that for the case of a local source
we'd finish by using recurse_dir() and consider the in-place
tablespace as a regular directory, so a fix located in
libpq_traverse_files() sounds good to me.

+           if (strncmp(link_target, "pg_tblspc/", strlen("pg_tblspc/")) == 0)
+               type = FILE_TYPE_DIRECTORY;
+           else
+               type = FILE_TYPE_SYMLINK;

However this is not consistent with the other places where we detect
if an in-place tablespace is used, like pg_basebackup.c, where we rely
on the fact that the tablespace path is a relative path, using
is_absolute_path() to make the difference between a normal and
in-place tablespace. I would choose consistency and do the same here,
checking if we have an absolute or relative path, depending on the
result of pg_tablespace_location().

Testing only for the creation of the tablespace is fine for the sake
of the report, but I would slightly more here and create a table on
this tablespace with some data, and a check_query() once pg_rewind is
done.

I am finishing with the attached. Thoughts?
--
Michael

Attachments:

0001-Fix-pg_rewind-with-in-place-tablespaces-when-source-.patchtext/x-diff; charset=us-asciiDownload
From 728bd093959f0d51c234f88729798165176fc0f6 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 28 Jul 2023 16:53:21 +0900
Subject: [PATCH] Fix pg_rewind with in-place tablespaces when source is remote

---
 src/bin/pg_rewind/libpq_source.c  | 11 ++++++++++-
 src/bin/pg_rewind/t/001_basic.pl  | 20 ++++++++++++++++++++
 src/bin/pg_rewind/t/RewindTest.pm |  1 +
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_rewind/libpq_source.c b/src/bin/pg_rewind/libpq_source.c
index 0d8e9ee2d1..417c74cfef 100644
--- a/src/bin/pg_rewind/libpq_source.c
+++ b/src/bin/pg_rewind/libpq_source.c
@@ -298,7 +298,16 @@ libpq_traverse_files(rewind_source *source, process_file_callback_t callback)
 		link_target = PQgetvalue(res, i, 3);
 
 		if (link_target[0])
-			type = FILE_TYPE_SYMLINK;
+		{
+			/*
+			 * In-place tablespaces are directories located in pg_tblspc/ with
+			 * relative paths.
+			 */
+			if (is_absolute_path(link_target))
+				type = FILE_TYPE_SYMLINK;
+			else
+				type = FILE_TYPE_DIRECTORY;
+		}
 		else if (isdir)
 			type = FILE_TYPE_DIRECTORY;
 		else
diff --git a/src/bin/pg_rewind/t/001_basic.pl b/src/bin/pg_rewind/t/001_basic.pl
index 031594e14e..c7b48255a7 100644
--- a/src/bin/pg_rewind/t/001_basic.pl
+++ b/src/bin/pg_rewind/t/001_basic.pl
@@ -18,6 +18,12 @@ sub run_test
 	RewindTest::setup_cluster($test_mode);
 	RewindTest::start_primary();
 
+	# Create an in-place tablespace with some data on it.
+	primary_psql("CREATE TABLESPACE space_test LOCATION ''");
+	primary_psql("CREATE TABLE space_tbl (d text) TABLESPACE space_test");
+	primary_psql(
+		"INSERT INTO space_tbl VALUES ('in primary, before promotion')");
+
 	# Create a test table and insert a row in primary.
 	primary_psql("CREATE TABLE tbl1 (d text)");
 	primary_psql("INSERT INTO tbl1 VALUES ('in primary')");
@@ -78,6 +84,13 @@ sub run_test
 		"insert into drop_tbl values ('in primary, after promotion')");
 	primary_psql("DROP TABLE drop_tbl");
 
+	# Insert some data in the in-place tablespace for the old primary and
+	# the standby.
+	primary_psql(
+		"INSERT INTO space_tbl VALUES ('in primary, after promotion')");
+	standby_psql(
+		"INSERT INTO space_tbl VALUES ('in standby, after promotion')");
+
 	# Before running pg_rewind, do a couple of extra tests with several
 	# option combinations.  As the code paths taken by those tests
 	# do not change for the "local" and "remote" modes, just run them
@@ -145,6 +158,13 @@ sub run_test
 
 	RewindTest::run_pg_rewind($test_mode);
 
+	check_query(
+		'SELECT * FROM space_tbl ORDER BY d',
+		qq(in primary, before promotion
+in standby, after promotion
+),
+		'table content');
+
 	check_query(
 		'SELECT * FROM tbl1',
 		qq(in primary
diff --git a/src/bin/pg_rewind/t/RewindTest.pm b/src/bin/pg_rewind/t/RewindTest.pm
index 4957791e94..8fbbd521cb 100644
--- a/src/bin/pg_rewind/t/RewindTest.pm
+++ b/src/bin/pg_rewind/t/RewindTest.pm
@@ -131,6 +131,7 @@ sub setup_cluster
 	$node_primary->append_conf(
 		'postgresql.conf', qq(
 wal_keep_size = 320MB
+allow_in_place_tablespaces = on
 ));
 	return;
 }
-- 
2.40.1

#4Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#3)
Re: pg_rewind fails with in-place tablespace

On Fri, Jul 28, 2023 at 04:54:56PM +0900, Michael Paquier wrote:

I am finishing with the attached. Thoughts?

Applied this one as bf22792 on HEAD, without a backpatch as in-place
tablespaces are around for developers. If there are opinions in favor
of a backpatch, feel free of course.
--
Michael

#5Rui Zhao
xiyuan.zr@alibaba-inc.com
In reply to: Michael Paquier (#4)
回复:pg_rewind fails with in-place tablespace

Sorry for the delay in responding to this matter as I have been waiting for another similar subject to approved by a moderator.
Upon review, I am satisfied with the proposed solution and believe that checking absolute path is better than hard coding with "pg_tblspc/". I think we have successfully resolved this issue in the pg_rewind case.
However, I would like to bring your attention to another issue: pg_upgrade fails with in-place tablespace. Another issue is still waiting for approved. I have tested all the tools in src/bin with in-place tablespace, and I believe this is the final issue.
Thank you for your understanding and assistance.
Best regard,
Rui Zhao
------------------------------------------------------------------
发件人:Michael Paquier <michael@paquier.xyz>
发送时间:2023年7月31日(星期一) 06:49
收件人:赵锐(惜元) <xiyuan.zr@alibaba-inc.com>
抄 送:pgsql-hackers <pgsql-hackers@lists.postgresql.org>; Thomas Munro <thomas.munro@gmail.com>
主 题:Re: pg_rewind fails with in-place tablespace
On Fri, Jul 28, 2023 at 04:54:56PM +0900, Michael Paquier wrote:

I am finishing with the attached. Thoughts?

Applied this one as bf22792 on HEAD, without a backpatch as in-place
tablespaces are around for developers. If there are opinions in favor
of a backpatch, feel free of course.
--
Michael

#6Michael Paquier
michael@paquier.xyz
In reply to: Rui Zhao (#5)
Re: 回复:pg_rewind fails with in-place tablespace

On Mon, Jul 31, 2023 at 10:07:44AM +0800, Rui Zhao wrote:

However, I would like to bring your attention to another issue:
pg_upgrade fails with in-place tablespace. Another issue is still
waiting for approved. I have tested all the tools in src/bin with
in-place tablespace, and I believe this is the final issue.

No problem. Please feel free to start a new thread about that, I'm
okay to look at what you would like to propose. Adding a test in
002_pg_upgrade.pl where the pg_upgrade runs happen would be a good
thing to have, I guess.
--
Michael