Can we avoid chdir'ing in resolve_symlinks() ?

Started by Tom Laneover 3 years ago24 messages
#1Tom Lane
Tom Lane
tgl@sss.pgh.pa.us

find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons. However, chasing down symlinks is left to its
subroutine resolve_symlinks(), which does this:

* To resolve a symlink properly, we have to chdir into its directory and
* then chdir to where the symlink points; otherwise we may fail to
* resolve relative links correctly (consider cases involving mount
* points, for example). After following the final symlink, we use
* getcwd() to figure out where the heck we're at.

and then afterwards it has to chdir back to the original cwd.
That last step is a bit of a sore spot, because sometimes
(especially in sudo situations) we may not have the privileges
necessary to do that; I think this is the cause of the complaint
at [1]/messages/by-id/CAH8yC8kOj0pmHF1RbK2Gb2t4YCcNG-5h0TwZ7yxk3Hzw6C0Otg@mail.gmail.com. Anyway the whole thing seems a bit excessively Rube
Goldbergian. I'm wondering why we couldn't just read the
symlink(s), concatenate them together, and use canonicalize_path()
to clean up any mess.

This code was mine originally (336969e49), but I sure don't
remember why I wrote it like that. I know we didn't have a
robust version of canonicalize_path() then, and that may have
been the main issue, but that offhand comment about mount
points bothers me. But I can't reconstruct precisely what
I was worried about there. The only contemporaneous discussion
thread I can find is [2]/messages/by-id/4973.1099605411@sss.pgh.pa.us, which doesn't go into coding details.

Thoughts?

regards, tom lane

[1]: /messages/by-id/CAH8yC8kOj0pmHF1RbK2Gb2t4YCcNG-5h0TwZ7yxk3Hzw6C0Otg@mail.gmail.com
[2]: /messages/by-id/4973.1099605411@sss.pgh.pa.us

#2Isaac Morland
Isaac Morland
isaac.morland@gmail.com
In reply to: Tom Lane (#1)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On Thu, 1 Sept 2022 at 19:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This code was mine originally (336969e49), but I sure don't

remember why I wrote it like that. I know we didn't have a
robust version of canonicalize_path() then, and that may have
been the main issue, but that offhand comment about mount
points bothers me. But I can't reconstruct precisely what
I was worried about there. The only contemporaneous discussion
thread I can find is [2], which doesn't go into coding details.

Does this happen in a context where we need to worried about the directory
structure changing under us, either accidentally or maliciously?

I'm wondering because I understand cd'ing through the structure can avoid
some of the related problems and might be the reason for doing it that way
originally. My impression is that the modern equivalent would be to use
openat() with O_PATH to step through the hierarchy. But then I'm not clear
on how to get back to the absolute path, given a file descriptor for the
final directory.

#3Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Isaac Morland (#2)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Isaac Morland <isaac.morland@gmail.com> writes:

On Thu, 1 Sept 2022 at 19:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This code was mine originally (336969e49), but I sure don't
remember why I wrote it like that.

Does this happen in a context where we need to worried about the directory
structure changing under us, either accidentally or maliciously?

Well, one of the reasons it'd be a good idea to not change cwd is
that then you don't have to worry about that moving while you're
messing around. But everything else that we're considering here is
either a component of PATH or a directory/symlink associated with
the PG installation. If $badguy has control of any of that,
you've already lost, so I'm not excited about worrying about it.

I'm wondering because I understand cd'ing through the structure can avoid
some of the related problems and might be the reason for doing it that way
originally.

Pretty sure I was not thinking about that. I might have been
thinking about AFS installations, which IIRC often have two nominal
paths associated with them. But I don't recall any details about how
that works, and anyway the comment says nothing about AFS.

My impression is that the modern equivalent would be to use
openat() with O_PATH to step through the hierarchy. But then I'm not clear
on how to get back to the absolute path, given a file descriptor for the
final directory.

Yeah. The point here is not to open a particular file, but to derive
a pathname string for where the file is.

What I'm thinking right at the moment is that we don't necessarily
have to have the exact path that getcwd() would report. We need
*some* path-in-absolute-form that works. This leads me to think
that both the AFS case and the mount-point case are red herrings.
But I can't shake the feeling that I'm missing something.

regards, tom lane

#4Andrew Dunstan
Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#1)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 2022-09-01 Th 19:39, Tom Lane wrote:

find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons. However, chasing down symlinks is left to its
subroutine resolve_symlinks(), which does this:

* To resolve a symlink properly, we have to chdir into its directory and
* then chdir to where the symlink points; otherwise we may fail to
* resolve relative links correctly (consider cases involving mount
* points, for example). After following the final symlink, we use
* getcwd() to figure out where the heck we're at.

and then afterwards it has to chdir back to the original cwd.
That last step is a bit of a sore spot, because sometimes
(especially in sudo situations) we may not have the privileges
necessary to do that; I think this is the cause of the complaint
at [1]. Anyway the whole thing seems a bit excessively Rube
Goldbergian. I'm wondering why we couldn't just read the
symlink(s), concatenate them together, and use canonicalize_path()
to clean up any mess.

This code was mine originally (336969e49), but I sure don't
remember why I wrote it like that. I know we didn't have a
robust version of canonicalize_path() then, and that may have
been the main issue, but that offhand comment about mount
points bothers me. But I can't reconstruct precisely what
I was worried about there. The only contemporaneous discussion
thread I can find is [2], which doesn't go into coding details.

Thoughts?

regards, tom lane

[1] /messages/by-id/CAH8yC8kOj0pmHF1RbK2Gb2t4YCcNG-5h0TwZ7yxk3Hzw6C0Otg@mail.gmail.com
[2] /messages/by-id/4973.1099605411@sss.pgh.pa.us

These days there seem to be library functions that do this, realpath(3)
and canonicalize_file_name(3). The latter is what seems to be called by
readlink(1). Should we be using one of those? I don't know how portable
they are. I don't see them here :-(
<https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/crt-alphabetical-function-reference?view=msvc-170&gt;

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#5Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Andrew Dunstan <andrew@dunslane.net> writes:

On 2022-09-01 Th 19:39, Tom Lane wrote:

find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons. However, chasing down symlinks is left to its
subroutine resolve_symlinks(), which does this:

These days there seem to be library functions that do this, realpath(3)
and canonicalize_file_name(3). The latter is what seems to be called by
readlink(1). Should we be using one of those?

Oh! I see realpath() in POSIX, but not canonicalize_file_name().
It does look like realpath() would be helpful here, although if
it's not present on Windows that's a problem.

Quick googling suggests that _fullpath() could be used as a substitute.

regards, tom lane

#6Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#5)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

I wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

These days there seem to be library functions that do this, realpath(3)
and canonicalize_file_name(3). The latter is what seems to be called by
readlink(1). Should we be using one of those?

Oh! I see realpath() in POSIX, but not canonicalize_file_name().
It does look like realpath() would be helpful here, although if
it's not present on Windows that's a problem.

After some surveying of man pages, I conclude that

(1) realpath() exists on all platforms of interest except Windows,
where it looks like we can use _fullpath() instead.

(2) AIX and Solaris 10 only implement the SUSv2 semantics,
where the caller must supply a buffer that it has no good way
to determine a safe size for. Annoying.

(3) The Solaris 10 man page has this interesting disclaimer:

The realpath() function might fail to return to the current
directory if an error occurs.

which implies that on that platform it's basically implemented
in the same way as our current code. Sigh.

I think we can ignore (3) though. Solaris 11 seems to have an
up-to-speed implementation of realpath(), and 10 will be EOL
in January 2024 according to Wikipedia.

As for (2), both systems promise to report EINVAL for a null
pointer, which is also what SUSv2 says. So I think what we
can do is approximately

ptr = realpath(fname, NULL);
if (ptr == NULL && errno == EINVAL)
{
ptr = pg_malloc(MAXPGPATH);
ptr = realpath(fname, ptr);
}

and just take it on faith that MAXPGPATH is enough on those
platforms.

regards, tom lane

#7Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
1 attachment(s)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Here's a draft patch for this. It seems to work on Linux,
but the Windows code is just speculation. In particular,
I did

path = _fullpath(NULL, fname, 0);
if (path == NULL)
_dosmaperr(GetLastError());

but I'm not really sure that the _dosmaperr bit is needed,
because the _fullpath man page I found makes reference to
setting "errno" [1]https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fullpath-wfullpath?view=msvc-170. It's likely to be hard to test, because
most of the possible error cases should be nigh unreachable
in our usage; we already know the input is a valid reference
to an executable file.

BTW, I noticed what seems a flat-out bug in validate_exec:

        /* Win32 requires a .exe suffix for stat() */
-       if (strlen(path) >= strlen(".exe") &&
+       if (strlen(path) < strlen(".exe") ||
                pg_strcasecmp(path + strlen(path) - strlen(".exe"), ".exe") != 0)

Nobody's noticed because none of our executables have base names
shorter than 4 characters, but it's still a bug.

regards, tom lane

[1]: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fullpath-wfullpath?view=msvc-170

Attachments:

use-realpath-in-find_my_exec-1.patchtext/x-diff; charset=us-ascii; name=use-realpath-in-find_my_exec-1.patch
#8Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#1)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 02.09.22 01:39, Tom Lane wrote:

find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons.

I still think they are bad reasons, and we should kill all that code.
Just sayin' ...

#9Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#8)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 02.09.22 01:39, Tom Lane wrote:

find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons.

I still think they are bad reasons, and we should kill all that code.
Just sayin' ...

Are you proposing we give up the support for relocatable installations?
I'm not here to defend that feature, but I bet somebody will. (And
doesn't "make check" depend on it?)

regards, tom lane

#10Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#9)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 12.09.22 17:33, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 02.09.22 01:39, Tom Lane wrote:

find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons.

I still think they are bad reasons, and we should kill all that code.
Just sayin' ...

Are you proposing we give up the support for relocatable installations?
I'm not here to defend that feature, but I bet somebody will. (And
doesn't "make check" depend on it?)

I'm complaining specifically about the resolving of symlinks. Why does

$ /usr/local/opt/postgresql@13/bin/pg_config --bindir

print

/usr/local/Cellar/postgresql@13/13.8/bin

when it clearly should print

/usr/local/opt/postgresql@13/bin

This is unrelated to the support for relocatable installations, AFAICT.

#11Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#10)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 12.09.22 17:33, Tom Lane wrote:

Are you proposing we give up the support for relocatable installations?
I'm not here to defend that feature, but I bet somebody will. (And
doesn't "make check" depend on it?)

I'm complaining specifically about the resolving of symlinks. Why does

$ /usr/local/opt/postgresql@13/bin/pg_config --bindir
print
/usr/local/Cellar/postgresql@13/13.8/bin
when it clearly should print
/usr/local/opt/postgresql@13/bin

I'm not sure about your setup there, but if you mean that
/usr/local/opt/postgresql@13/bin is a symlink reading more or less
"./13.8/bin", I doubt that failing to canonicalize that is a good idea.
The point of finding the bindir is mainly to be able to navigate to its
sibling directories such as lib/, etc/, share/. There's no certainty
that a symlink leading to the bin directory will have sibling symlinks
to those other directories.

regards, tom lane

#12Andrew Dunstan
Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#11)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 2022-09-12 Mo 16:07, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 12.09.22 17:33, Tom Lane wrote:

Are you proposing we give up the support for relocatable installations?
I'm not here to defend that feature, but I bet somebody will. (And
doesn't "make check" depend on it?)

I'm complaining specifically about the resolving of symlinks. Why does
$ /usr/local/opt/postgresql@13/bin/pg_config --bindir
print
/usr/local/Cellar/postgresql@13/13.8/bin
when it clearly should print
/usr/local/opt/postgresql@13/bin

I'm not sure about your setup there, but if you mean that
/usr/local/opt/postgresql@13/bin is a symlink reading more or less
"./13.8/bin", I doubt that failing to canonicalize that is a good idea.
The point of finding the bindir is mainly to be able to navigate to its
sibling directories such as lib/, etc/, share/. There's no certainty
that a symlink leading to the bin directory will have sibling symlinks
to those other directories.

I think the discussion here is a bit tangential to the original topic.

The point you make is reasonable, but it seems a bit more likely that in
the case Peter cites the symlink is one level higher in the tree, in
which case there's probably little value in resolving the symlink. Maybe
we could compromise and check if a path exists and only resolve symlinks
if it does not?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#13Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#12)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Andrew Dunstan <andrew@dunslane.net> writes:

I think the discussion here is a bit tangential to the original topic.

Indeed, because I just wanted to reimplement *how* we resolve the
executable path to absolute, not question whether we should do it at all.

The point you make is reasonable, but it seems a bit more likely that in
the case Peter cites the symlink is one level higher in the tree, in
which case there's probably little value in resolving the symlink. Maybe
we could compromise and check if a path exists and only resolve symlinks
if it does not?

It's non-negotiable that we apply realpath() or a handmade equivalent
if the path we find to the executable turns out to be relative, ie
you did "../../postgres/bin/psql" or the equivalent. In the case of
the server, we *will* chdir to someplace else, rendering the original
path useless. psql might chdir in response to a user command, so it
likewise had better resolve the installation location while it can.

We could maybe skip realpath() if we find what appears to be an
absolute path to the executable. However, I think that fails in
too many situations. As an example, if I do
ln -s /path/to/psql ~/bin
and then invoke psql using that symlink, we're not going to be
able to find any of the installation's supporting files unless
we resolve the symlink. The executable path we'd deduce after
examining PATH is /home/tgl/bin/psql, which is plenty absolute,
but it doesn't help us find the rest of the PG installation.
That case works today, and I think a lot of people will be
sad if we break it.

I'm not familiar with how homebrew sets up the installation
layout, but I'm suspicious that the situation Peter refers to
has a similar problem, only with a symlink for the bin directory
not the individual executable.

I think the only potentially-workable alternative design is
to forget about relocatable installations and insist that the
supporting files be found at the installation path designated
at configure time. But, again, that seems likely to break a
lot of setups that work today. And I've still not heard a
positive reason why we should change it.

regards, tom lane

#14Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#13)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 13.09.22 01:26, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I think the discussion here is a bit tangential to the original topic.

Indeed, because I just wanted to reimplement *how* we resolve the
executable path to absolute, not question whether we should do it at all.

Well, if we decided not to do it, then we could just delete the code and
not have to think about how to change it.

I'm not familiar with how homebrew sets up the installation
layout, but I'm suspicious that the situation Peter refers to
has a similar problem, only with a symlink for the bin directory
not the individual executable.

I think the two contradicting use cases are:

1) You configure and install with prefix=/usr/local/pgsql, and then
symlink ~/bin/pg_ctl -> /usr/local/pgsql/bin/pg_ctl; hoping that that
will allow pg_ctl to find the other programs it needs in
/usr/local/pgsql/bin. This is what we currently support.

2) You configure and install with prefix=/usr/local/pgsql-14, and then
symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can
then use /usr/local/pgsql as if that's where it actually is. We don't
currently support that. (Note that it would work if you made a copy of
the tree instead of using the symlink.)

I don't know if anyone uses #1 or what the details of such use are.

#2 is how Homebrew and some other packaging systems work.

#15Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#14)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

2) You configure and install with prefix=/usr/local/pgsql-14, and then
symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can
then use /usr/local/pgsql as if that's where it actually is. We don't
currently support that. (Note that it would work if you made a copy of
the tree instead of using the symlink.)

What about it does not work?

regards, tom lane

#16Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#15)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 13.09.22 17:16, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

2) You configure and install with prefix=/usr/local/pgsql-14, and then
symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can
then use /usr/local/pgsql as if that's where it actually is. We don't
currently support that. (Note that it would work if you made a copy of
the tree instead of using the symlink.)

What about it does not work?

The problem is if another package or extension uses pg_config to find,
say, libdir, includedir, or bindir and integrates it into its own build
system or its own build products. If those directories point to
/usr/local/pgsql/{bin,include,lib}, then there is no problem. But if
they point to /usr/local/pgsql-14.5/{bin,include,lib}, then the next
minor update will break those other packages.

#17Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#16)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 13.09.22 17:16, Tom Lane wrote:

What about it does not work?

The problem is if another package or extension uses pg_config to find,
say, libdir, includedir, or bindir and integrates it into its own build
system or its own build products. If those directories point to
/usr/local/pgsql/{bin,include,lib}, then there is no problem. But if
they point to /usr/local/pgsql-14.5/{bin,include,lib}, then the next
minor update will break those other packages.

That seems ... a tad far-fetched, and even more to the point,
it'd be the other package's fault not ours. We have never promised
that those directories point to anyplace that's not PG-specific.
I certainly do not buy that that's a good argument for breaking
Postgres installation setups that work today.

Also, there is nothing in that scenario that is in any way dependent
on the use of symlinks, or even absolute paths, so I don't quite
see the relevance to the current discussion.

regards, tom lane

#18Thomas Munro
Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#7)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On Sun, Sep 4, 2022 at 2:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Here's a draft patch for this. It seems to work on Linux,
but the Windows code is just speculation. In particular,
I did

path = _fullpath(NULL, fname, 0);
if (path == NULL)
_dosmaperr(GetLastError());

but I'm not really sure that the _dosmaperr bit is needed,
because the _fullpath man page I found makes reference to
setting "errno" [1]. It's likely to be hard to test, because
most of the possible error cases should be nigh unreachable
in our usage; we already know the input is a valid reference
to an executable file.

I tried lots of crazy stuff[1]https://cirrus-ci.com/task/4935917730267136?logs=main to try to get an error out of this
thing, but came up empty handed. Unlike realpath(), _fullpath()
doesn't resolve symlinks (or junctions), so I guess there's less to go
wrong. It still needs the present working directory, which is a
per-drive concept on this OS, but even bogus drives don't seem to
produce an error (despite what the manual says).

I'd still lean towards assuming errno is set, given that the manual
references errno and not GetLastError(). Typical manual pages
explicitly tell you when GetLastError() has the error (example:
GetFullPathName(), for which this might be intended as a more Unix-y
wrapper, but even if so there's nothing to say that _fullpath() can't
set errno directly itself, in which case you might clobber it that
way).

[1]: https://cirrus-ci.com/task/4935917730267136?logs=main

#19Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#18)
1 attachment(s)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Thomas Munro <thomas.munro@gmail.com> writes:

I tried lots of crazy stuff[1] to try to get an error out of this
thing, but came up empty handed. Unlike realpath(), _fullpath()
doesn't resolve symlinks (or junctions), so I guess there's less to go
wrong. It still needs the present working directory, which is a
per-drive concept on this OS, but even bogus drives don't seem to
produce an error (despite what the manual says).

Interesting.

I'd still lean towards assuming errno is set, given that the manual
references errno and not GetLastError().

Agreed. In the attached, I drop the _dosmaperr() step and instead
just do "errno = 0" before the call. That way, if we ever do manage
to hit a _fullpath() failure, we can at least tell whether the errno
that's reported is real or not.

In this version I've attempted to resolve Peter's complaint by only
applying realpath() when the executable path we've obtained is relative
or has a symlink as the last component. Things will definitely not
work right if either of those is true and we make no effort to get
a more trustworthy path. I concede that things will usually work okay
without resolving a symlink that's two or more levels up the path,
but I wonder how much we want to trust that. Suppose somebody changes
such a symlink while the server is running --- nothing very good is
likely to happen if it suddenly starts consulting some other libdir
or sharedir. Maybe we need to add a flag telling whether we want
this behavior? TBH I think that pg_config is the only place I'd
be comfortable with doing it like this. Peter, would your concerns
be satisfied if we just made pg_config do it?

regards, tom lane

Attachments:

use-realpath-in-find_my_exec-2.patchtext/x-diff; charset=us-ascii; name=use-realpath-in-find_my_exec-2.patch
#20Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#17)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 15.09.22 16:43, Tom Lane wrote:

That seems ... a tad far-fetched, and even more to the point,
it'd be the other package's fault not ours. We have never promised
that those directories point to anyplace that's not PG-specific.
I certainly do not buy that that's a good argument for breaking
Postgres installation setups that work today.

Also, there is nothing in that scenario that is in any way dependent
on the use of symlinks, or even absolute paths, so I don't quite
see the relevance to the current discussion.

Here is another variant of the same problem:

I have

$ which meson
/usr/local/bin/meson

Meson records its own path (somewhere under meson-info/ AFAICT), so it
can re-run itself when any of the meson.build files change. But since
the above is a symlink, it records its own location as
"/usr/local/Cellar/meson/0.63.1/bin/meson". So now, whenever the meson
package updates (even if it's just 0.63.0 -> 0.63.1), my build tree is
broken.

To clarify, this instance is not at all the fault of any code in
PostgreSQL. But it's another instance where resolving symlinks just
because we can causing problems.

#21Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#20)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

To clarify, this instance is not at all the fault of any code in
PostgreSQL. But it's another instance where resolving symlinks just
because we can causing problems.

[ shrug... ] *Not* resolving symlinks when we can causes its
own set of problems, which maybe we don't see very clearly
because we have been doing it like that for a couple of decades.
I remain pretty hesitant to change this behavior.

What did you think of the compromise proposal to change only
the paths that pg_config outputs? I've not tried to code that,
but I think it should be feasible.

regards, tom lane

#22Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#21)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

On 05.10.22 15:59, Tom Lane wrote:

What did you think of the compromise proposal to change only
the paths that pg_config outputs? I've not tried to code that,
but I think it should be feasible.

I don't think I understand what this proposal actually means. What
would be the behavior of pg_config and how would it be different from
before?

#23Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#22)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 05.10.22 15:59, Tom Lane wrote:

What did you think of the compromise proposal to change only
the paths that pg_config outputs? I've not tried to code that,
but I think it should be feasible.

I don't think I understand what this proposal actually means. What
would be the behavior of pg_config and how would it be different from
before?

What I had in mind was:

* server and most frontend programs keep the same behavior, that is
fully resolve their executable's path to an absolute path (and then
navigate to the rest of the installation from there); but now they'll
use realpath() to avoid chdir'ing while they do that.

* pg_config applies realpath() if its initial PATH search produces a
relative path to the executable, or if the last component of that path
is a symlink. Otherwise leave it alone, which would have the effect of
not expanding directory-level symlinks.

I think that changing pg_config's behavior would be enough to resolve
the complaints you listed, but perhaps I'm missing some fine points.

regards, tom lane

#24Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#23)
Re: Can we avoid chdir'ing in resolve_symlinks() ?

I wrote:

I think that changing pg_config's behavior would be enough to resolve
the complaints you listed, but perhaps I'm missing some fine points.

Meanwhile I've gone ahead and pushed my v1 patch (plus Munro's
recommendation about _fullpath error handling), so we can see if
the buildfarm blows up. The question of whether we can sometimes
skip replacement of symlinks seems like material for a second patch
in any case.

regards, tom lane