pgsql: Map basebackup tablespaces using a tablespace_map file

Started by Andrew Dunstanalmost 11 years ago74 messageshackers
Jump to latest
#1Andrew Dunstan
andrew@dunslane.net

Map basebackup tablespaces using a tablespace_map file

Windows can't reliably restore symbolic links from a tar format, so
instead during backup start we create a tablespace_map file, which is
used by the restoring postgres to create the correct links in pg_tblspc.
The backup protocol also now has an option to request this file to be
included in the backup stream, and this is used by pg_basebackup when
operating in tar mode.

This is done on all platforms, not just Windows.

This means that pg_basebackup will not not work in tar mode against 9.4
and older servers, as this protocol option isn't implemented there.

Amit Kapila, reviewed by Dilip Kumar, with a little editing from me.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/72d422a5227ef6f76f412486a395aba9f53bf3f0

Modified Files
--------------
doc/src/sgml/backup.sgml | 32 +--
doc/src/sgml/func.sgml | 14 +-
doc/src/sgml/protocol.sgml | 15 +-
doc/src/sgml/ref/pg_basebackup.sgml | 14 +-
src/backend/access/transam/xlog.c | 387 ++++++++++++++++++++++++++++++--
src/backend/access/transam/xlogfuncs.c | 12 +-
src/backend/replication/basebackup.c | 138 +++++-------
src/backend/replication/repl_gram.y | 16 +-
src/backend/replication/repl_scanner.l | 1 +
src/bin/pg_basebackup/pg_basebackup.c | 5 +-
src/include/access/xlog.h | 9 +-
src/include/replication/basebackup.h | 10 +
12 files changed, 519 insertions(+), 134 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Andrew Dunstan (#1)
Re: pgsql: Map basebackup tablespaces using a tablespace_map file

On 05/12/2015 04:42 PM, Andrew Dunstan wrote:

+
+               /*
+                * Remove the existing symlink if any and Create the symlink
+                * under PGDATA.  We need to use rmtree instead of rmdir as
+                * the link location might contain directories or files
+                * corresponding to the actual path. Some tar utilities do
+                * things that way while extracting symlinks.
+                */
+               if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
+               {
+                   if (!rmtree(linkloc,true))
+                       ereport(ERROR,
+                               (errcode_for_file_access(),
+                                errmsg("could not remove directory \"%s\": %m",
+                                       linkloc)));
+               }
+               else
+               {
+                   if (unlink(linkloc) < 0 && errno != ENOENT)
+                       ereport(ERROR,
+                               (errcode_for_file_access(),
+                                errmsg("could not remove symbolic link \"%s\": %m",
+                                       linkloc)));
+               }
+

That's scary. What tar utilitiy replaces the symlink with a non-empty
directory?

What if you call pg_start_backup() and take the backup with a utility
that follows symlinks? I wouldn't recommend using such a tool, but with
this patch, it will zap all the tablespaces. Before this, you at least
got a working database and could read out all the data or fix the
symlinks afterwards.

Why is the RelationCacheInitFileRemove() call moved?

XLogRecPtr
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
-                  char **labelfile)
+                  char **labelfile, DIR *tblspcdir, List **tablespaces,
+                  char **tblspcmapfile, bool infotbssize,
+                  bool needtblspcmapfile)

Why does this need to take tblspcdir as argument? Wouldn't it make more
sense to open the directory inside do_pg_start_backup?

In the BASE_BACKUP command, is there any reason to not always include
the tablespace map? IOW, do we really need the TABLESPACE_MAP option?

- Heikki

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Heikki Linnakangas (#2)
Re: pgsql: Map basebackup tablespaces using a tablespace_map file

On 05/12/2015 10:33 AM, Heikki Linnakangas wrote:

On 05/12/2015 04:42 PM, Andrew Dunstan wrote:

+
+               /*
+                * Remove the existing symlink if any and Create the 
symlink
+                * under PGDATA.  We need to use rmtree instead of 
rmdir as
+                * the link location might contain directories or files
+                * corresponding to the actual path. Some tar 
utilities do
+                * things that way while extracting symlinks.
+                */
+               if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
+               {
+                   if (!rmtree(linkloc,true))
+                       ereport(ERROR,
+                               (errcode_for_file_access(),
+                                errmsg("could not remove directory 
\"%s\": %m",
+                                       linkloc)));
+               }
+               else
+               {
+                   if (unlink(linkloc) < 0 && errno != ENOENT)
+                       ereport(ERROR,
+                               (errcode_for_file_access(),
+                                errmsg("could not remove symbolic 
link \"%s\": %m",
+                                       linkloc)));
+               }
+

That's scary. What tar utilitiy replaces the symlink with a non-empty
directory?

What if you call pg_start_backup() and take the backup with a utility
that follows symlinks? I wouldn't recommend using such a tool, but
with this patch, it will zap all the tablespaces. Before this, you at
least got a working database and could read out all the data or fix
the symlinks afterwards.

Why is the RelationCacheInitFileRemove() call moved?

I will let Amit comment on this.

XLogRecPtr
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID 
*starttli_p,
-                  char **labelfile)
+                  char **labelfile, DIR *tblspcdir, List **tablespaces,
+                  char **tblspcmapfile, bool infotbssize,
+                  bool needtblspcmapfile)

Why does this need to take tblspcdir as argument? Wouldn't it make
more sense to open the directory inside do_pg_start_backup?

Good point. It does look cleaner to do that.

In the BASE_BACKUP command, is there any reason to not always include
the tablespace map? IOW, do we really need the TABLESPACE_MAP option?

I assume it was done for minimal disturbance, i.e. that's where we need
it. Amit, comments?

cheers

andrew

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#3)
Re: pgsql: Map basebackup tablespaces using a tablespace_map file

On Tue, May 12, 2015 at 9:02 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 05/12/2015 10:33 AM, Heikki Linnakangas wrote:

On 05/12/2015 04:42 PM, Andrew Dunstan wrote:

+
+               /*
+                * Remove the existing symlink if any and Create the
symlink
+                * under PGDATA.  We need to use rmtree instead of rmdir
as
+                * the link location might contain directories or files
+                * corresponding to the actual path. Some tar utilities
do
+                * things that way while extracting symlinks.
+                */
+               if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
+               {
+                   if (!rmtree(linkloc,true))
+                       ereport(ERROR,
+                               (errcode_for_file_access(),
+                                errmsg("could not remove directory
\"%s\": %m",
+                                       linkloc)));
+               }
+               else
+               {
+                   if (unlink(linkloc) < 0 && errno != ENOENT)
+                       ereport(ERROR,
+                               (errcode_for_file_access(),
+                                errmsg("could not remove symbolic link
\"%s\": %m",
+                                       linkloc)));
+               }
+

That's scary. What tar utilitiy replaces the symlink with a non-empty
directory?

IIRC, it was star utility by using --copysymlinks options.
It will actually copy the symlink data at appropriate location,
but will not maintain symlink after extraction.
I don't have a link handly for it, but can again search for
it and send you if you want to have a look.

What if you call pg_start_backup() and take the backup with a utility that

follows symlinks? I wouldn't recommend using such a tool, but with this
patch, it will zap all the tablespaces. Before this, you at least got a
working database and could read out all the data or fix the symlinks
afterwards.

Yeah, but I don't think user will do such a thing without
being aware of the same and if he is aware, he will either
restore the symlinks before starting the server or would
atleast keep a copy of such a backup until he is able to
restore the database completely.

Do you think adding a note in docs makes sense?

Why is the RelationCacheInitFileRemove() call moved?

I will let Amit comment on this.

Because it assumes that tablespace directory pg_tblspc is all in
place and it tries to remove the files by reading pg_tblspc
directory as well. Now as we setup the symlinks in pg_tblspc
after reading tablespace_map file, so we should remove relcache init
file once the symlinks are setup in pg_tblspc directory.

XLogRecPtr

do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID
*starttli_p,
-                  char **labelfile)
+                  char **labelfile, DIR *tblspcdir, List **tablespaces,
+                  char **tblspcmapfile, bool infotbssize,
+                  bool needtblspcmapfile)

Why does this need to take tblspcdir as argument?

It needs to be called from perform_base_backup() which already
has access to tblspcdir.

Wouldn't it make more sense to open the directory inside

do_pg_start_backup?

Good point. It does look cleaner to do that.

If you prefer that way, I can look into rearranging the
code.

In the BASE_BACKUP command, is there any reason to not always include the
tablespace map?

The main reason is that it is not required to restore symlinks
for plain format. Also I think it can cause a problem if user
has used --tablespace-mapping option in plain format as the
tablespace_map file doesn't contain information for same and recovery
will try to restore symlinks for wrong path based on tablespace_map
file.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#5Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#4)
Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

[redirecting to -hackers]

On 05/12/2015 01:30 PM, Amit Kapila wrote:

On Tue, May 12, 2015 at 9:02 PM, Andrew Dunstan <andrew@dunslane.net
<mailto:andrew@dunslane.net>> wrote:

On 05/12/2015 10:33 AM, Heikki Linnakangas wrote:

On 05/12/2015 04:42 PM, Andrew Dunstan wrote:

+
+               /*
+                * Remove the existing symlink if any and
Create the symlink
+                * under PGDATA.  We need to use rmtree
instead of rmdir as
+                * the link location might contain
directories or files
+                * corresponding to the actual path. Some
tar utilities do
+                * things that way while extracting symlinks.
+                */
+               if (lstat(linkloc, &st) == 0 &&
S_ISDIR(st.st_mode))
+               {
+                   if (!rmtree(linkloc,true))
+                       ereport(ERROR,
+  (errcode_for_file_access(),
+                                errmsg("could not remove
directory \"%s\": %m",
+                                       linkloc)));
+               }
+               else
+               {
+                   if (unlink(linkloc) < 0 && errno !=
ENOENT)
+                       ereport(ERROR,
+  (errcode_for_file_access(),
+                                errmsg("could not remove
symbolic link \"%s\": %m",
+                                       linkloc)));
+               }
+

That's scary. What tar utilitiy replaces the symlink with a
non-empty directory?

IIRC, it was star utility by using --copysymlinks options.
It will actually copy the symlink data at appropriate location,
but will not maintain symlink after extraction.
I don't have a link handly for it, but can again search for
it and send you if you want to have a look.

What if you call pg_start_backup() and take the backup with a
utility that follows symlinks? I wouldn't recommend using such
a tool, but with this patch, it will zap all the tablespaces.
Before this, you at least got a working database and could
read out all the data or fix the symlinks afterwards.

Yeah, but I don't think user will do such a thing without
being aware of the same and if he is aware, he will either
restore the symlinks before starting the server or would
atleast keep a copy of such a backup until he is able to
restore the database completely.

Do you think adding a note in docs makes sense?

How about if we simply abort if we find a non-symlink where we want the
symlink to be, and only remove something that is actually a symlink (or
a junction point, which is more or less the same thing)? Then it would
be up to the user to recover the situation, by moving or removing the
offending file/directory, and trying again.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#5)
Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

How about if we simply abort if we find a non-symlink where we want the

symlink to be, and only remove something that is actually a symlink (or a
junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

Then it would be up to the user to recover the situation, by moving or

removing the offending file/directory, and trying again.

Yes, I think as we only create/maintain symlinks in pg_tblspc
for tablespaces, so it seems okay even if we error out when we
find directories instead of symlinks in that path.

I can write a patch for this if you, Heikki and or others think that
is the better way to deal with this case.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#7Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#6)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, May 14, 2015 at 12:12 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

How about if we simply abort if we find a non-symlink where we want the
symlink to be, and only remove something that is actually a symlink (or a
junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#7)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On 05/14/2015 10:52 AM, Robert Haas wrote:

On Thu, May 14, 2015 at 12:12 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

How about if we simply abort if we find a non-symlink where we want the
symlink to be, and only remove something that is actually a symlink (or a
junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

The suggestion is to get rid of using rmtree. Instead, if we find a
non-symlink in pg_tblspc we'll make the user clean it up before we can
continue. So your instinct is in tune with my suggestion.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#8)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, May 14, 2015 at 12:59 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

The suggestion is to get rid of using rmtree. Instead, if we find a
non-symlink in pg_tblspc we'll make the user clean it up before we can
continue. So your instinct is in tune with my suggestion.

Right. Maybe I should have just said "+1".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#8)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, May 14, 2015 at 10:29 PM, Andrew Dunstan <andrew@dunslane.net>
wrote:

On 05/14/2015 10:52 AM, Robert Haas wrote:

On Thu, May 14, 2015 at 12:12 AM, Amit Kapila <amit.kapila16@gmail.com>

wrote:

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan <andrew@dunslane.net>

wrote:

How about if we simply abort if we find a non-symlink where we want the
symlink to be, and only remove something that is actually a symlink

(or a

junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

The suggestion is to get rid of using rmtree. Instead, if we find a

non-symlink in pg_tblspc we'll make the user clean it up before we can
continue. So your instinct is in tune with my suggestion.

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of extending the
same API for using it from destroy_tablespace_directories() as well,
but due to special handling (especially for ENOENT) in that function,
I left it as of now.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

remove_only_symlinks_during_recovery_v1.patchapplication/octet-stream; name=remove_only_symlinks_during_recovery_v1.patchDownload+37-44
#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#10)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Fri, May 15, 2015 at 11:51 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Thu, May 14, 2015 at 10:29 PM, Andrew Dunstan <andrew@dunslane.net>

wrote:

On 05/14/2015 10:52 AM, Robert Haas wrote:

On Thu, May 14, 2015 at 12:12 AM, Amit Kapila <amit.kapila16@gmail.com>

wrote:

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan <andrew@dunslane.net>

wrote:

How about if we simply abort if we find a non-symlink where we want

the

symlink to be, and only remove something that is actually a symlink

(or a

junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

The suggestion is to get rid of using rmtree. Instead, if we find a

non-symlink in pg_tblspc we'll make the user clean it up before we can
continue. So your instinct is in tune with my suggestion.

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of extending the
same API for using it from destroy_tablespace_directories() as well,
but due to special handling (especially for ENOENT) in that function,
I left it as of now.

Does it make sense to track this in 9.5 Open Items [1]https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items?

[1]: https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#11)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On 05/23/2015 01:29 AM, Amit Kapila wrote:

On Fri, May 15, 2015 at 11:51 AM, Amit Kapila <amit.kapila16@gmail.com
<mailto:amit.kapila16@gmail.com>> wrote:

On Thu, May 14, 2015 at 10:29 PM, Andrew Dunstan

<andrew@dunslane.net <mailto:andrew@dunslane.net>> wrote:

On 05/14/2015 10:52 AM, Robert Haas wrote:

On Thu, May 14, 2015 at 12:12 AM, Amit Kapila

<amit.kapila16@gmail.com <mailto:amit.kapila16@gmail.com>> wrote:

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan

<andrew@dunslane.net <mailto:andrew@dunslane.net>> wrote:

How about if we simply abort if we find a non-symlink where we

want the

symlink to be, and only remove something that is actually a

symlink (or a

junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

The suggestion is to get rid of using rmtree. Instead, if we find

a non-symlink in pg_tblspc we'll make the user clean it up before we
can continue. So your instinct is in tune with my suggestion.

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of extending the
same API for using it from destroy_tablespace_directories() as well,
but due to special handling (especially for ENOENT) in that function,
I left it as of now.

Does it make sense to track this in 9.5 Open Items [1]?

[1] - https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

Sure. It's on my list of things to get to, but by all means put it there.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#12)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Sat, May 23, 2015 at 9:28 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 05/23/2015 01:29 AM, Amit Kapila wrote:

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of extending the
same API for using it from destroy_tablespace_directories() as well,
but due to special handling (especially for ENOENT) in that function,
I left it as of now.

Does it make sense to track this in 9.5 Open Items [1]?

[1] - https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

Sure. It's on my list of things to get to, but by all means put it there.

Thanks for tracking. I have added to open items as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#14Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#10)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On 05/15/2015 02:21 AM, Amit Kapila wrote:

On Thu, May 14, 2015 at 10:29 PM, Andrew Dunstan <andrew@dunslane.net
<mailto:andrew@dunslane.net>> wrote:

On 05/14/2015 10:52 AM, Robert Haas wrote:

On Thu, May 14, 2015 at 12:12 AM, Amit Kapila

<amit.kapila16@gmail.com <mailto:amit.kapila16@gmail.com>> wrote:

On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan

<andrew@dunslane.net <mailto:andrew@dunslane.net>> wrote:

How about if we simply abort if we find a non-symlink where we

want the

symlink to be, and only remove something that is actually a

symlink (or a

junction point, which is more or less the same thing)?

We can do that way and for that I think we need to use rmdir
instead of rmtree in the code being discussed (recovery path),
OTOH we should try to minimize the errors raised during
recovery.

I'm not sure I understand this issue in detail, but why would using
rmtree() on something you expect to be a symlink ever be a good idea?
It seems like if things are the way you expect them to be, it has no
benefit, but if they are different from what you expect, you might
blow away a ton of important data.

Maybe I am just confused.

The suggestion is to get rid of using rmtree. Instead, if we find a

non-symlink in pg_tblspc we'll make the user clean it up before we can
continue. So your instinct is in tune with my suggestion.

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of extending the
same API for using it from destroy_tablespace_directories() as well,
but due to special handling (especially for ENOENT) in that function,
I left it as of now.

Well, it seems to me the new function is being altogether way too
trusting about the nature of what it's being asked to remove. In the
first place, the S_ISDIR/rmdir branch should only be for Windows, and
secondly in the other branch we should be checking that S_ISLNK is true.
It would actually be nice if we could test for a junction point on
Windows, but that seems to be a bit difficult. The function should
probably return a boolean to indicate any error, rather than void, so
that the caller can take action accordingly. In the present case we
should probably be stopping recovery dead in its tracks, but I certainly
don't think we should just be ignoring it.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#14)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Tue, Jun 2, 2015 at 10:26 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 05/15/2015 02:21 AM, Amit Kapila wrote:

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of extending the
same API for using it from destroy_tablespace_directories() as well,
but due to special handling (especially for ENOENT) in that function,
I left it as of now.

Well, it seems to me the new function is being altogether way too trusting
about the nature of what it's being asked to remove. In the first place,
the S_ISDIR/rmdir branch should only be for Windows, and secondly in the
other branch we should be checking that S_ISLNK is true. It would actually
be nice if we could test for a junction point on Windows, but that seems to
be a bit difficult.

I think during recovery for tablespace related operations, it is
quite possible to have a directory instead of symlink in some
special cases (see function TablespaceCreateDbspace() and comments
in destroy_tablespace_directories() { ..Try to remove the symlink..}).
Also this new function is being called from create_tablespace_directories()
which uses the code as written in new function, so it doesn't make much
sense to change it Windows and non-Windows specific code.

The function should probably return a boolean to indicate any error,
rather than void, so that the caller can take action accordingly.

I think returning boolean is good if the function has Windows
and non-Windows specific code, else we might need short, int16
or something like that as there will be three return values
(0 - success, 1 - fail to remove directory, 2 - fail to remove
symlink).

On the whole this function can be mainly used for tablespace
related paths during recovery, it isn't generic enough to be
use from all paths. I think as proposed name of the new
function (rmsymlink) looks quite generic, so either we can
change it to (rm_tablespace_symlinks) or may be I can just
duplicate the code in read_tablespace_map() related check and
then we don't need this new function.

In the present case we should probably be stopping recovery dead in its
tracks, but I certainly don't think we should just be ignoring it.

Do you think we should do anything before returning error?
This operation haapens in the beginning of recovery before
replay of any records, so throwing ERROR from here shouldn't
have any impact unless we have done any operation which
can create problem when user next time starts the recovery.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#15)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On 06/02/2015 11:55 PM, Amit Kapila wrote:

On Tue, Jun 2, 2015 at 10:26 PM, Andrew Dunstan <andrew@dunslane.net
<mailto:andrew@dunslane.net>> wrote:

On 05/15/2015 02:21 AM, Amit Kapila wrote:

Find the patch which gets rid of rmtree usage. I have made it as
a separate function because the same code is used from
create_tablespace_directories() as well. I thought of
extending the
same API for using it from destroy_tablespace_directories() as
well,
but due to special handling (especially for ENOENT) in that
function,
I left it as of now.

Well, it seems to me the new function is being altogether way too
trusting about the nature of what it's being asked to remove. In
the first place, the S_ISDIR/rmdir branch should only be for
Windows, and secondly in the other branch we should be checking
that S_ISLNK is true. It would actually be nice if we could test
for a junction point on Windows, but that seems to be a bit
difficult.

I think during recovery for tablespace related operations, it is
quite possible to have a directory instead of symlink in some
special cases (see function TablespaceCreateDbspace() and comments
in destroy_tablespace_directories() { ..Try to remove the symlink..}).
Also this new function is being called from
create_tablespace_directories()
which uses the code as written in new function, so it doesn't make much
sense to change it Windows and non-Windows specific code.

Looking at it again, this might be not as bad as I thought, but I do
think we should probably call the function something other than
rmsymlink(). That seems too generic, since it also tries to remove
directories - albeit that this will fail if the directory isn't empty.
And I still think we should add a test for S_ISLNK in the second branch.
As it stands the function could try to unlink anything that's not a
directory. That might be safe-ish in the context it's used in for the
tablespace code, but it's far from safe enough for a function that's in
src/common

Given that the function raises an error on failure, I think it will
otherwise be OK as is.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#16)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, Jun 4, 2015 at 1:52 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 06/02/2015 11:55 PM, Amit Kapila wrote:

On Tue, Jun 2, 2015 at 10:26 PM, Andrew Dunstan <andrew@dunslane.net

<mailto:andrew@dunslane.net>> wrote:

Well, it seems to me the new function is being altogether way too
trusting about the nature of what it's being asked to remove. In
the first place, the S_ISDIR/rmdir branch should only be for
Windows, and secondly in the other branch we should be checking
that S_ISLNK is true. It would actually be nice if we could test
for a junction point on Windows, but that seems to be a bit
difficult.

I think during recovery for tablespace related operations, it is
quite possible to have a directory instead of symlink in some
special cases (see function TablespaceCreateDbspace() and comments
in destroy_tablespace_directories() { ..Try to remove the symlink..}).
Also this new function is being called from
create_tablespace_directories()
which uses the code as written in new function, so it doesn't make much
sense to change it Windows and non-Windows specific code.

Looking at it again, this might be not as bad as I thought, but I do think
we should probably call the function something other than rmsymlink(). That
seems too generic, since it also tries to remove directories - albeit that
this will fail if the directory isn't empty. And I still think we should
add a test for S_ISLNK in the second branch. As it stands the function
could try to unlink anything that's not a directory. That might be safe-ish
in the context it's used in for the tablespace code, but it's far from safe
enough for a function that's in src/common

Okay, as we both seem to agree that it can be mostly used in
tablespace symlinks context, so I have changed the name to
remove_tablespace_symlink() and moved the function to
tablespace.c. S_ISLINK check is used for non-windows code,
so not sure adding it here makes any real difference now that
we have made it specific to tablespace and we might need to
write small port specific code if we want to add S_ISLINK check.

Given that the function raises an error on failure, I think it will
otherwise be OK as is.

Please find an updated patch attached with this mail.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#17)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, Jun 4, 2015 at 10:14 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Thu, Jun 4, 2015 at 1:52 AM, Andrew Dunstan <andrew@dunslane.net>
wrote:

On 06/02/2015 11:55 PM, Amit Kapila wrote:

On Tue, Jun 2, 2015 at 10:26 PM, Andrew Dunstan <andrew@dunslane.net

<mailto:andrew@dunslane.net>> wrote:

Well, it seems to me the new function is being altogether way too
trusting about the nature of what it's being asked to remove. In
the first place, the S_ISDIR/rmdir branch should only be for
Windows, and secondly in the other branch we should be checking
that S_ISLNK is true. It would actually be nice if we could test
for a junction point on Windows, but that seems to be a bit
difficult.

I think during recovery for tablespace related operations, it is
quite possible to have a directory instead of symlink in some
special cases (see function TablespaceCreateDbspace() and comments
in destroy_tablespace_directories() { ..Try to remove the symlink..}).
Also this new function is being called from
create_tablespace_directories()
which uses the code as written in new function, so it doesn't make much
sense to change it Windows and non-Windows specific code.

Looking at it again, this might be not as bad as I thought, but I do
think we should probably call the function something other than
rmsymlink(). That seems too generic, since it also tries to remove
directories - albeit that this will fail if the directory isn't empty. And
I still think we should add a test for S_ISLNK in the second branch. As it
stands the function could try to unlink anything that's not a directory.
That might be safe-ish in the context it's used in for the tablespace code,
but it's far from safe enough for a function that's in src/common

Okay, as we both seem to agree that it can be mostly used in
tablespace symlinks context, so I have changed the name to
remove_tablespace_symlink() and moved the function to
tablespace.c. S_ISLINK check is used for non-windows code,
so not sure adding it here makes any real difference now that
we have made it specific to tablespace and we might need to
write small port specific code if we want to add S_ISLINK check.

Given that the function raises an error on failure, I think it will
otherwise be OK as is.

Please find an updated patch attached with this mail.

Sorry, forgot to attach the patch in previous mail, now attaching.

Thanks Bruce for reminding me offlist.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

remove_only_symlinks_during_recovery_v2.patchapplication/octet-stream; name=remove_only_symlinks_during_recovery_v2.patchDownload+35-44
#19Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#17)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On 06/04/2015 12:44 AM, Amit Kapila wrote:

Given that the function raises an error on failure, I think it
will otherwise be OK as is.

Please find an updated patch attached with this mail.

No attachment.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#19)
Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

On Thu, Jun 4, 2015 at 8:43 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 06/04/2015 12:44 AM, Amit Kapila wrote:

Given that the function raises an error on failure, I think it
will otherwise be OK as is.

Please find an updated patch attached with this mail.

No attachment.

cheers

I have sent it in the next mail, but anyway sending it again
in case you have missed it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

remove_only_symlinks_during_recovery_v2.patchapplication/octet-stream; name=remove_only_symlinks_during_recovery_v2.patchDownload+35-44
#21Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#18)
#22Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#21)
#23Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#22)
#24Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#23)
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#24)
#26Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#25)
#27Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#26)
#28Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#27)
#29Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#29)
#31Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#30)
#32Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#29)
#33Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#32)
#34Fujii Masao
masao.fujii@gmail.com
In reply to: Andrew Dunstan (#1)
#35Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#34)
#36Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#35)
#37Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#36)
#38Andrew Dunstan
andrew@dunslane.net
In reply to: Amit Kapila (#33)
#39Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#37)
#40Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#38)
#41Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#39)
#42Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#41)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#42)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#40)
#45Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#44)
#46Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#43)
#47Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Kapila (#46)
#48Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#47)
#49Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#48)
#50Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Kapila (#49)
#51Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#48)
#52Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#51)
#53Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#52)
#54Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#52)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#54)
#56Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#50)
#57Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#52)
#58Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#57)
#59Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#58)
#60Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#59)
#61Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#59)
#62Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#60)
#63Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#61)
#64Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#63)
#65Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#64)
#66Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#65)
#67Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#66)
#68Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#67)
#69Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#65)
#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#69)
#71Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#70)
#72Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#71)
#73Amit Kapila
amit.kapila16@gmail.com
In reply to: Fujii Masao (#72)
#74Fujii Masao
masao.fujii@gmail.com
In reply to: Amit Kapila (#73)