clean up obsolete initdb locale handling

Started by Peter Eisentrautover 6 years ago6 messages
#1Peter Eisentraut
peter.eisentraut@2ndquadrant.com
1 attachment(s)

A long time ago, we changed LC_COLLATE and LC_CTYPE from cluster-wide to
per-database (61d967498802ab86d8897cb3c61740d7e9d712f6). There is some
leftover code from that in initdb.c and backend/main/main.c to pass
these environment variables around in the expectations that the backend
will write them to pg_control during bootstrap, which is of course all a
lie now.

The attached patch cleans that up. (Not totally sure about the WIN32
block, but the change seems good to me.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-initdb-Remove-obsolete-locale-handling.patchtext/plain; charset=UTF-8; name=0001-initdb-Remove-obsolete-locale-handling.patch; x-mac-creator=0; x-mac-type=0Download
From 51568d2923ce26ecf43f028dd4075376cfb864fb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 8 Aug 2019 12:33:07 +0200
Subject: [PATCH] initdb: Remove obsolete locale handling

The method of passing LC_COLLATE and LC_CTYPE to the backend during
initdb is obsolete as of 61d967498802ab86d8897cb3c61740d7e9d712f6.
This can all be removed.
---
 src/backend/main/main.c | 39 ++++++++++-----------------------------
 src/bin/initdb/initdb.c | 14 --------------
 2 files changed, 10 insertions(+), 43 deletions(-)

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 7b18f8c758..1bd849fb51 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -99,42 +99,23 @@ main(int argc, char *argv[])
 	MemoryContextInit();
 
 	/*
-	 * Set up locale information from environment.  Note that LC_CTYPE and
-	 * LC_COLLATE will be overridden later from pg_control if we are in an
-	 * already-initialized database.  We set them here so that they will be
-	 * available to fill pg_control during initdb.  LC_MESSAGES will get set
-	 * later during GUC option processing, but we set it here to allow startup
-	 * error messages to be localized.
+	 * Set up locale information
 	 */
-
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("postgres"));
 
-#ifdef WIN32
-
 	/*
-	 * Windows uses codepages rather than the environment, so we work around
-	 * that by querying the environment explicitly first for LC_COLLATE and
-	 * LC_CTYPE. We have to do this because initdb passes those values in the
-	 * environment. If there is nothing there we fall back on the codepage.
+	 * LC_COLLATE and LC_CTYPE are set at backend start to values from
+	 * pg_database.  We start out LC_COLLATE as "C" (nothing in the postmaster
+	 * should care).  But LC_CTYPE should be set to something reasonable so
+	 * that gettext can work.
 	 */
-	{
-		char	   *env_locale;
-
-		if ((env_locale = getenv("LC_COLLATE")) != NULL)
-			init_locale("LC_COLLATE", LC_COLLATE, env_locale);
-		else
-			init_locale("LC_COLLATE", LC_COLLATE, "");
-
-		if ((env_locale = getenv("LC_CTYPE")) != NULL)
-			init_locale("LC_CTYPE", LC_CTYPE, env_locale);
-		else
-			init_locale("LC_CTYPE", LC_CTYPE, "");
-	}
-#else
-	init_locale("LC_COLLATE", LC_COLLATE, "");
+	init_locale("LC_COLLATE", LC_COLLATE, "C");
 	init_locale("LC_CTYPE", LC_CTYPE, "");
-#endif
 
+	/*
+	 * LC_MESSAGES will get set later during GUC option processing, but we set
+	 * it here to allow startup error messages to be localized.
+	 */
 #ifdef LC_MESSAGES
 	init_locale("LC_MESSAGES", LC_MESSAGES, "");
 #endif
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 551d379d85..88a261d9bd 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1417,20 +1417,6 @@ bootstrap_template1(void)
 	bki_lines = replace_token(bki_lines, "LC_CTYPE",
 							  escape_quotes_bki(lc_ctype));
 
-	/*
-	 * Pass correct LC_xxx environment to bootstrap.
-	 *
-	 * The shell script arranged to restore the LC settings afterwards, but
-	 * there doesn't seem to be any compelling reason to do that.
-	 */
-	snprintf(cmd, sizeof(cmd), "LC_COLLATE=%s", lc_collate);
-	putenv(pg_strdup(cmd));
-
-	snprintf(cmd, sizeof(cmd), "LC_CTYPE=%s", lc_ctype);
-	putenv(pg_strdup(cmd));
-
-	unsetenv("LC_ALL");
-
 	/* Also ensure backend isn't confused by this environment var: */
 	unsetenv("PGCLIENTENCODING");
 
-- 
2.22.0

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: clean up obsolete initdb locale handling

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

A long time ago, we changed LC_COLLATE and LC_CTYPE from cluster-wide to
per-database (61d967498802ab86d8897cb3c61740d7e9d712f6). There is some
leftover code from that in initdb.c and backend/main/main.c to pass
these environment variables around in the expectations that the backend
will write them to pg_control during bootstrap, which is of course all a
lie now.

Well, the comments' references to pg_control are indeed obsolete, but why
wouldn't we just replace that with references to "the appropriate entry in
pg_database"? I don't see why that movement changed anything about what
should happen here. In particular, I'm concerned that this patch will
result in subtle changes in what settings get chosen during initdb.

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: clean up obsolete initdb locale handling

I wrote:

... In particular, I'm concerned that this patch will
result in subtle changes in what settings get chosen during initdb.

OK, after reviewing the code a bit more I take that back --- initdb's
choices are entirely made within initdb.

However, I don't much like the choice to set LC_COLLATE and LC_CTYPE
differently. That seems to be risking weird behavior, and for what?
I'd be inclined to just remove the WIN32 stanza, initialize all
three of these variables with "", and explain it along the lines of

* In the postmaster, absorb the environment values for LC_COLLATE
* and LC_CTYPE. Individual backends will change these later to
* settings taken from pg_database, but the postmaster cannot do
* that. If we leave these set to "C" then message localization
* might not work well in the postmaster.

That ends up being no code change in main.c, except on Windows.
I concur that we can drop the transmission of LC_COLLATE and
LC_CTYPE via environment variables.

regards, tom lane

#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#3)
1 attachment(s)
Re: clean up obsolete initdb locale handling

On 2019-08-08 17:51, Tom Lane wrote:

However, I don't much like the choice to set LC_COLLATE and LC_CTYPE
differently. That seems to be risking weird behavior, and for what?
I'd be inclined to just remove the WIN32 stanza, initialize all
three of these variables with "", and explain it along the lines of

* In the postmaster, absorb the environment values for LC_COLLATE
* and LC_CTYPE. Individual backends will change these later to
* settings taken from pg_database, but the postmaster cannot do
* that. If we leave these set to "C" then message localization
* might not work well in the postmaster.

OK, let's do it like that. Updated patch attached.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-initdb-Remove-obsolete-locale-handling.patchtext/plain; charset=UTF-8; name=v2-0001-initdb-Remove-obsolete-locale-handling.patch; x-mac-creator=0; x-mac-type=0Download
From e3deca5b7236d1f83a286f347338f320ca952147 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 12 Aug 2019 20:09:40 +0200
Subject: [PATCH v2] initdb: Remove obsolete locale handling

The method of passing LC_COLLATE and LC_CTYPE to the backend during
initdb is obsolete as of 61d967498802ab86d8897cb3c61740d7e9d712f6.
This can all be removed.

Discussion: https://www.postgresql.org/message-id/flat/eeaf2f99-a1a6-8aca-3f43-9ab0b2fb112a%402ndquadrant.com
---
 src/backend/main/main.c | 38 ++++++++++----------------------------
 src/bin/initdb/initdb.c | 14 --------------
 2 files changed, 10 insertions(+), 42 deletions(-)

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 7b18f8c758..a9edbfd4a4 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -99,42 +99,24 @@ main(int argc, char *argv[])
 	MemoryContextInit();
 
 	/*
-	 * Set up locale information from environment.  Note that LC_CTYPE and
-	 * LC_COLLATE will be overridden later from pg_control if we are in an
-	 * already-initialized database.  We set them here so that they will be
-	 * available to fill pg_control during initdb.  LC_MESSAGES will get set
-	 * later during GUC option processing, but we set it here to allow startup
-	 * error messages to be localized.
+	 * Set up locale information
 	 */
-
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("postgres"));
 
-#ifdef WIN32
-
 	/*
-	 * Windows uses codepages rather than the environment, so we work around
-	 * that by querying the environment explicitly first for LC_COLLATE and
-	 * LC_CTYPE. We have to do this because initdb passes those values in the
-	 * environment. If there is nothing there we fall back on the codepage.
+	 * In the postmaster, absorb the environment values for LC_COLLATE and
+	 * LC_CTYPE.  Individual backends will change these later to settings
+	 * taken from pg_database, but the postmaster cannot do that.  If we leave
+	 * these set to "C" then message localization might not work well in the
+	 * postmaster.
 	 */
-	{
-		char	   *env_locale;
-
-		if ((env_locale = getenv("LC_COLLATE")) != NULL)
-			init_locale("LC_COLLATE", LC_COLLATE, env_locale);
-		else
-			init_locale("LC_COLLATE", LC_COLLATE, "");
-
-		if ((env_locale = getenv("LC_CTYPE")) != NULL)
-			init_locale("LC_CTYPE", LC_CTYPE, env_locale);
-		else
-			init_locale("LC_CTYPE", LC_CTYPE, "");
-	}
-#else
 	init_locale("LC_COLLATE", LC_COLLATE, "");
 	init_locale("LC_CTYPE", LC_CTYPE, "");
-#endif
 
+	/*
+	 * LC_MESSAGES will get set later during GUC option processing, but we set
+	 * it here to allow startup error messages to be localized.
+	 */
 #ifdef LC_MESSAGES
 	init_locale("LC_MESSAGES", LC_MESSAGES, "");
 #endif
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 551d379d85..88a261d9bd 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1417,20 +1417,6 @@ bootstrap_template1(void)
 	bki_lines = replace_token(bki_lines, "LC_CTYPE",
 							  escape_quotes_bki(lc_ctype));
 
-	/*
-	 * Pass correct LC_xxx environment to bootstrap.
-	 *
-	 * The shell script arranged to restore the LC settings afterwards, but
-	 * there doesn't seem to be any compelling reason to do that.
-	 */
-	snprintf(cmd, sizeof(cmd), "LC_COLLATE=%s", lc_collate);
-	putenv(pg_strdup(cmd));
-
-	snprintf(cmd, sizeof(cmd), "LC_CTYPE=%s", lc_ctype);
-	putenv(pg_strdup(cmd));
-
-	unsetenv("LC_ALL");
-
 	/* Also ensure backend isn't confused by this environment var: */
 	unsetenv("PGCLIENTENCODING");
 
-- 
2.22.0

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: clean up obsolete initdb locale handling

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

OK, let's do it like that. Updated patch attached.

LGTM, but I don't have the ability to test it on Windows.

regards, tom lane

#6Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: clean up obsolete initdb locale handling

On 2019-08-12 20:17, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

OK, let's do it like that. Updated patch attached.

LGTM, but I don't have the ability to test it on Windows.

Committed, after some testing on Windows.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services