Let's start using setenv()

Started by Tom Laneover 5 years ago6 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Back in 2001 we decided to prefer putenv() over setenv() because the
latter wasn't in POSIX (SUSv2) at the time. That decision has been
overtaken by events: more recent editions of POSIX not only provide
setenv() but recommend it over putenv(). So I think it's time to
change our policy and prefer setenv(). We've had previous discussions
about that but nobody did the legwork yet.

The attached patch provides the infrastructure to allow using setenv().
I added a src/port/ implementation of setenv() for any machines that
haven't caught up with POSIX lately. (I've tested that code on my old
dinosaur gaur, and I would not be surprised if that is the only machine
anywhere that ever runs it. But I could be wrong.) I also extended
win32env.c to support setenv().

I haven't made any serious effort to expunge all uses of putenv() in this
version of the patch; I just wanted to exercise setenv() in both backend
and frontend. Seeing that POSIX considers putenv() to be semi-deprecated,
maybe we should try to eliminate all calls outside the (un)setenv
implementations, but first it'd be good to see if win32env.c works.

I also changed our unsetenv() emulations to make them return an int
error indicator, as per POSIX. I have no desire to change the call
sites to check for errors, but it seemed like our compatibility stubs
should be compatible with the spec. (Note: I think that unsetenv()
did return void in old BSD versions, before POSIX standardized it.
So that might be a real reason not to change the callers.)

regards, tom lane

Attachments:

support-using-setenv-1.patchtext/x-diff; charset=us-ascii; name=support-using-setenv-1.patchDownload+158-66
#2Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#1)
Re: Let's start using setenv()

On Tue, Dec 29, 2020 at 4:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Back in 2001 we decided to prefer putenv() over setenv() because the
latter wasn't in POSIX (SUSv2) at the time. That decision has been
overtaken by events: more recent editions of POSIX not only provide
setenv() but recommend it over putenv(). So I think it's time to
change our policy and prefer setenv(). We've had previous discussions
about that but nobody did the legwork yet.

The attached patch provides the infrastructure to allow using setenv().
I added a src/port/ implementation of setenv() for any machines that
haven't caught up with POSIX lately. (I've tested that code on my old
dinosaur gaur, and I would not be surprised if that is the only machine
anywhere that ever runs it. But I could be wrong.) I also extended
win32env.c to support setenv().

+1, nice modernisation.

I also changed our unsetenv() emulations to make them return an int
error indicator, as per POSIX. I have no desire to change the call
sites to check for errors, but it seemed like our compatibility stubs
should be compatible with the spec. (Note: I think that unsetenv()
did return void in old BSD versions, before POSIX standardized it.
So that might be a real reason not to change the callers.)

+1 for conformance. I suppose it's out of spec that unsetenv() can
return -1 with errno = ENOMEM (from malloc()), but that hardly
matters. Hmm... could a static buffer be used for that clobbering
trick?

For the note in parens: it looks like the 3 main BSDs all switched
from the historical void function to the POSIX one 12-18 years
ago[1]https://github.com/NetBSD/src/commit/13dee93fb3a93189a74a3705a5f7cd1c6b290125[2]https://github.com/openbsd/src/commit/471b62eeaa7f3a18c0aa98b5d605e5cec1625b62[3]https://github.com/freebsd/freebsd/commit/196b6346ba4e13a3f7679e2de3317b6aa65983df, so I wouldn't worry about that. Glibc made a similar
change 19 years ago.

[1]: https://github.com/NetBSD/src/commit/13dee93fb3a93189a74a3705a5f7cd1c6b290125
[2]: https://github.com/openbsd/src/commit/471b62eeaa7f3a18c0aa98b5d605e5cec1625b62
[3]: https://github.com/freebsd/freebsd/commit/196b6346ba4e13a3f7679e2de3317b6aa65983df

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#2)
Re: Let's start using setenv()

Thomas Munro <thomas.munro@gmail.com> writes:

On Tue, Dec 29, 2020 at 4:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I also changed our unsetenv() emulations to make them return an int
error indicator, as per POSIX. I have no desire to change the call
sites to check for errors, but it seemed like our compatibility stubs
should be compatible with the spec. (Note: I think that unsetenv()
did return void in old BSD versions, before POSIX standardized it.
So that might be a real reason not to change the callers.)

+1 for conformance. I suppose it's out of spec that unsetenv() can
return -1 with errno = ENOMEM (from malloc()), but that hardly
matters. Hmm... could a static buffer be used for that clobbering
trick?

Hm, hadn't thought about that particularly. I did think about
hacking setenv.c to try to reduce memory leakage from repeated
sets of the same variable (basically, try to overwrite the existing
environ member whenever the new value is no longer than the old).
But on the whole I think it'd all be wasted effort. Neither setenv.c
nor unsetenv.c will ever be used on any machine that anyone would care
about performance of. If I were willing to retire gaur I'd vote for
just nuking both of them.

For the note in parens: it looks like the 3 main BSDs all switched
from the historical void function to the POSIX one 12-18 years
ago[1][2][3], so I wouldn't worry about that. Glibc made a similar
change 19 years ago.

Ah, thanks for the research. I'd found the glibc change from references
in current Linux man pages, but I didn't run down the BSD history.

(My other pet dinosaur prairiedog still has void unsetenv(), btw,
presumably because macOS is based on turn-of-the-century NetBSD.
Modern macOS does follow POSIX here; not sure when they changed.)

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Let's start using setenv()

Since the cfbot seems happy with v1, here's a v2 that runs around
and converts all putenv() uses to setenv(). In some places there's
no notational savings, but it seems to me that standardizing
on just one way to do it is beneficial.

I'm not sure if there would be any value in revising win32env.c to
treat setenv, rather than putenv, as the primary use-case (and
switching to some corresponding Windows call instead of _putenv).
Perhaps some Windows hacker would be interested in considering that.
For myself, I'm happy with this version as it stands.

regards, tom lane

Attachments:

support-using-setenv-2.patchtext/x-diff; charset=us-ascii; name=support-using-setenv-2.patchDownload+242-215
#5Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#4)
Re: Let's start using setenv()

On Wed, Dec 30, 2020 at 5:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Since the cfbot seems happy with v1, here's a v2 that runs around
and converts all putenv() uses to setenv(). In some places there's
no notational savings, but it seems to me that standardizing
on just one way to do it is beneficial.

+        if (setenv("KRB5_KTNAME", pg_krb_server_keyfile, 0) != 0)
         {
-            size_t        kt_len = strlen(pg_krb_server_keyfile) + 14;
-            char       *kt_path = malloc(kt_len);
-
-            if (!kt_path ||
-                snprintf(kt_path, kt_len, "KRB5_KTNAME=%s",
-                         pg_krb_server_keyfile) != kt_len - 2 ||
-                putenv(kt_path) != 0)
-            {
-                ereport(LOG,
-                        (errcode(ERRCODE_OUT_OF_MEMORY),
-                         errmsg("out of memory")));
-                return STATUS_ERROR;
-            }
+            /* We assume the error must be "out of memory" */
+            ereport(LOG,
+                    (errcode(ERRCODE_OUT_OF_MEMORY),
+                     errmsg("out of memory")));
+            return STATUS_ERROR;
         }

Wouldn't it be better to do "cannot set environment: %m" or similar,
and let ENOMEM do its thing if that be the cause? It's true that
POSIX only allows EINVAL or ENOMEM and we can see no reason for EINVAL
here, but still it seems unnecessary to assume.

-    if (getenv("PGLOCALEDIR") == NULL)
-    {
-        /* set for libpq to use */
-        snprintf(env_path, sizeof(env_path), "PGLOCALEDIR=%s", path);
-        canonicalize_path(env_path + 12);
-        dup_path = strdup(env_path);
-        if (dup_path)
-            putenv(dup_path);
-    }
+    /* set for libpq to use, but don't override existing setting */
+    setenv("PGLOCALEDIR", path, 0);

Did you want to drop the canonicalize_path() logic here and nearby?

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#5)
Re: Let's start using setenv()

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Dec 30, 2020 at 5:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
+            /* We assume the error must be "out of memory" */
+            ereport(LOG,
+                    (errcode(ERRCODE_OUT_OF_MEMORY),
+                     errmsg("out of memory")));

Wouldn't it be better to do "cannot set environment: %m" or similar,
and let ENOMEM do its thing if that be the cause?

That's no problem as far as the error text goes, but we lack a way to
choose a relevant errcode(). I guess I could fix it for ENOMEM
specifically, along the lines of

errcode(errno == ENOMEM ? ERRCODE_OUT_OF_MEMORY :
ERRCODE_INTERNAL_ERROR)

This is a bit trickier than it looks though, because code within an
ereport macro isn't really supposed to rely on errno still being
the same as at entry. Is it worth inventing an errcode_for_errno()
helper routine, which could look at the stashed errno? Or maybe
extending/abusing errcode_for_file_access() so it recognizes ENOMEM?

Or we could just use ERRCODE_OUT_OF_MEMORY, without being too picky
about whether that's accurate.

Did you want to drop the canonicalize_path() logic here and nearby?

Yeah, because the results of get_locale_path et al are already
canonicalized, so those canonicalize_path calls are redundant.

Thanks for looking at the patch!

regards, tom lane