orangutan seizes up during isolation-check
Buildfarm member orangutan has failed chronically on both of the branches for
which it still reports, HEAD and REL9_1_STABLE, for over two years. The
postmaster appears to jam during isolation-check. Dave, orangutan currently
has one such jammed postmaster for each branch. Could you gather some
information about the running processes? Specifically, it would be helpful to
see the output of "ps -el" and a stack trace for each running PostgreSQL
process. (If there are enough PostgreSQL processes to make stack traces
tedious to acquire, it will be almost as good to have traces for each
postmaster and one autovacuum worker per postmaster.) Thanks. Best not to
kill the processes yet, in case we need more information.
The rest of this message is just a dump my observations from the data already
available. The jammed postmasters fail to complete fast shutdown requests.
Beyond that, the symptoms are different on HEAD versus 9.1. The 2014-07-09
run is representative for HEAD. multiple-row-versions.spec failed like this
after having run for almost 21 hours:
--- 1,2 ----
Parsed test spec with 4 sessions
! Connection 2 to database failed:
\ No newline at end of file
I don't know what would cause PQconnectdb() to hang for 21 hours before
failing with a blank error message. Note that the hang duration and the spec
in which the hang falls varies from failure to failure. All subsequent specs
then fail like this:
--- 1,4 ----
Parsed test spec with 2 sessions
! Connection 0 to database failed: could not connect to server: Connection refused
! Is the server running locally and accepting
! connections on Unix domain socket "/tmp/.s.PGSQL.5678"?
One can get ECONNREFUSED from a Unix-domain socket when the listen() backlog
is full. At this point, we've made only two connection attempts since the
last successful one and only about 40 attempts since last postmaster startup.
I have no good theories remaining at the moment. The postmaster log ends in
1211 copies of this message:
WARNING: worker took too long to start; canceled.
At the default autovacuum_naptime=1min, that represents 20:11:00 of autovacuum
launch failures. The postmaster had been running about 20:55:42 by the time
we collected that log, suggesting that autovacuum was healthy until 40-45
minutes into the doomed PQconnectdb() call. I'm hypothesizing that the
postmaster ceased serving autovacuum launcher requests. A jammed postmaster
tends to explain both the ECONNREFUSED symptom and the autovacuum symptom.
In REL9_1_STABLE, isolation-check completes, but the StopDb-C:2 step that
follows isolation-check fails to stop the server. (If you go back far enough
in the history, suites other than isolation-check occasionally jam the
server.) The server log ends like this:
LOG: received fast shutdown request
LOG: aborting any active transactions
LOG: autovacuum launcher shutting down
That suggests a postmaster stuck in PM_WAIT_BACKENDS. The process data should
illuminate this situation.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Noah Misch <noah@leadboat.com> writes:
Buildfarm member orangutan has failed chronically on both of the branches for
which it still reports, HEAD and REL9_1_STABLE, for over two years. The
postmaster appears to jam during isolation-check. Dave, orangutan currently
has one such jammed postmaster for each branch. Could you gather some
information about the running processes?
What's particularly odd is that orangutan seems to be running an only
slightly out-of-date OS X release, which is hardly an unusual
configuration. My own laptop gets through isolation-check just fine.
Seems like there must be something nonstandard about orangutan's
software ... but what?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Sep 02, 2014 at 12:25:39AM -0400, Tom Lane wrote:
Noah Misch <noah@leadboat.com> writes:
Buildfarm member orangutan has failed chronically on both of the branches for
which it still reports, HEAD and REL9_1_STABLE, for over two years. The
postmaster appears to jam during isolation-check. Dave, orangutan currently
has one such jammed postmaster for each branch. Could you gather some
information about the running processes?What's particularly odd is that orangutan seems to be running an only
slightly out-of-date OS X release, which is hardly an unusual
configuration. My own laptop gets through isolation-check just fine.
Seems like there must be something nonstandard about orangutan's
software ... but what?
Agreed. The difference is durable across OS X releases, because orangutan
showed like symptoms under 10.7.3. Dave assisted me off-list with data
collection and experimentation. Ultimately, --enable-nls was the key
distinction, the absence of which spares the other OS X buildfarm animals.
The explanation for ECONNREFUSED was more pedestrian than the reasons I had
guessed. There were no jammed postmasters running as of the above writing.
Rather, the postmasters were gone, but the socket directory entries remained.
That happens when the postmaster suffers a "kill -9", a SIGSEGV, an assertion
failure, or a similar abrupt exit.
When I reproduced the problem, CountChildren() was attempting to walk a
corrupt BackendList. Sometimes, the list had an entry such that e->next == e;
these send CountChildren() into an infinite loop. Other times, testing "if
(bp->dead_end)" prompted a segfault. That explains orangutan sometimes
failing quickly and other times hanging for hours. Every crash showed at
least two threads running in the postmaster. Multiple threads bring trouble
in the form of undefined behavior for fork() w/o exec() and for sigprocmask().
The postmaster uses sigprocmask() to block most signals when doing something
nontrivial; this allows it to do such nontrivial work in signal handlers. A
sequence of 74 buildfarm runs caught 27 cases of a secondary thread running a
signal handler, 14 cases of two signal handlers running at once, and one
user-visible postmaster failure.
libintl replaces setlocale(). Its setlocale(LC_x, "") uses OS-specific APIs
to determine the default locale when $LANG and similar environment variables
are empty, as they are during "make check NO_LOCALE=1". On OS X, it calls[1]http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/localename.c;h=78dc344bba191417855670fb751210d3608db6e6;hb=HEAD#l2883
CFLocaleCopyCurrent(), which in turn spins up a thread. See the end of this
message for the postmaster thread stacks active upon hitting a breakpoint set
at _dispatch_mgr_thread.
I see two options for fixing this in pg_perm_setlocale(LC_x, ""):
1. Fork, call setlocale(LC_x, "") in the child, pass back the effective locale
name through a pipe, and pass that name to setlocale() in the original
process. The short-lived child will get the extra threads, and the
postmaster will remain clean.
2. On OS X, check for relevant environment variables. Finding none, set
LC_x=C before calling setlocale(LC_x, ""). A variation is to raise
ereport(FATAL) if sufficient environment variables aren't in place. Either
way ensures the libintl setlocale() will never call CFLocaleCopyCurrent().
This is simpler than (1), but it entails a behavior change: "LANG= initdb"
will use LANG=C or fail rather than use the OS X user account locale.
I'm skeptical of the value of looking up locale information using other OS X
facilities when the usual environment variables are inconclusive, but I see no
clear cause to reverse that decision now. I lean toward (1).
Thanks,
nm
thread #1: tid = 0xeccea9, 0x00007fff9066b372 libsystem_notify.dylib`notify_register_check + 30, queue = 'com.apple.main-thread'
frame #0: 0x00007fff9066b372 libsystem_notify.dylib`notify_register_check + 30
frame #1: 0x00007fff987cf261 libsystem_info.dylib`__si_module_static_ds_block_invoke + 109
frame #2: 0x00007fff944d628d libdispatch.dylib`_dispatch_client_callout + 8
frame #3: 0x00007fff944d61fc libdispatch.dylib`dispatch_once_f + 79
frame #4: 0x00007fff987cf1f2 libsystem_info.dylib`si_module_static_ds + 42
frame #5: 0x00007fff987cec65 libsystem_info.dylib`si_module_with_name + 60
frame #6: 0x00007fff987cf0e7 libsystem_info.dylib`si_module_config_modules_for_category + 168
frame #7: 0x00007fff987cedbd libsystem_info.dylib`__si_module_static_search_block_invoke + 87
frame #8: 0x00007fff944d628d libdispatch.dylib`_dispatch_client_callout + 8
frame #9: 0x00007fff944d61fc libdispatch.dylib`dispatch_once_f + 79
frame #10: 0x00007fff987ced64 libsystem_info.dylib`si_module_static_search + 42
frame #11: 0x00007fff987cec65 libsystem_info.dylib`si_module_with_name + 60
frame #12: 0x00007fff987d0cf2 libsystem_info.dylib`getpwuid + 32
frame #13: 0x00007fff8dce629c CoreFoundation`CFCopyHomeDirectoryURLForUser + 124
frame #14: 0x00007fff8dce5a84 CoreFoundation`+[CFPrefsSource withSourceForIdentifier:user:byHost:container:perform:] + 372
frame #15: 0x00007fff8dce58fb CoreFoundation`-[CFPrefsSearchListSource addSourceForIdentifier:user:byHost:] + 123
frame #16: 0x00007fff8dcec1bb CoreFoundation`+[CFPrefsSearchListSource withSnapshotSearchList:] + 331
frame #17: 0x00007fff8dcec037 CoreFoundation`__CFXPreferencesCopyCurrentApplicationState + 151
frame #18: 0x00007fff8dcebb8c CoreFoundation`_CFLocaleCopyCurrentGuts + 524
frame #19: 0x00000001006e6b87 libintl.8.dylib`_nl_locale_name_default + 47
frame #20: 0x00000001006e707b libintl.8.dylib`libintl_setlocale + 367
frame #21: 0x00000001002fdb52 postgres`pg_perm_setlocale(category=1, locale=<unavailable>) + 18 at pg_locale.c:153
frame #22: 0x00000001001a6ef7 postgres`main(argc=1, argv=0x0000000100907340) + 103 at main.c:127
thread #3: tid = 0xeccec2, 0x00007fff93349e6a libsystem_kernel.dylib`__workq_kernreturn + 10
frame #0: 0x00007fff93349e6a libsystem_kernel.dylib`__workq_kernreturn + 10
frame #1: 0x00007fff8ea13f08 libsystem_pthread.dylib`_pthread_wqthread + 330
* thread #2: tid = 0xeccec3, 0x00007fff944d8102 libdispatch.dylib`_dispatch_mgr_thread, queue = 'com.apple.root.libdispatch-manager', stop reason = breakpoint 1.2
* frame #0: 0x00007fff944d8102 libdispatch.dylib`_dispatch_mgr_thread
frame #1: 0x00007fff944d7f87 libdispatch.dylib`_dispatch_root_queue_drain + 75
frame #2: 0x00007fff944d7ed2 libdispatch.dylib`_dispatch_worker_thread + 119
frame #3: 0x00007fff8ea12899 libsystem_pthread.dylib`_pthread_body + 138
frame #4: 0x00007fff8ea1272a libsystem_pthread.dylib`_pthread_start + 137
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 09/15/2014 07:51 AM, Noah Misch wrote:
libintl replaces setlocale(). Its setlocale(LC_x, "") uses OS-specific APIs
to determine the default locale when $LANG and similar environment variables
are empty, as they are during "make check NO_LOCALE=1". On OS X, it calls[1]
CFLocaleCopyCurrent(), which in turn spins up a thread. See the end of this
message for the postmaster thread stacks active upon hitting a breakpoint set
at _dispatch_mgr_thread.
Ugh. I'd call that a bug in libintl. setlocale() has no business to make
the process multi-threaded.
Do we have the same problem in backends? At a quick glance, aside from
postmaster we only use PG_SETMASK(&BlockSig) in signal handlers, to
prevent another signal handler from running concurrently.
I see two options for fixing this in pg_perm_setlocale(LC_x, ""):
1. Fork, call setlocale(LC_x, "") in the child, pass back the effective locale
name through a pipe, and pass that name to setlocale() in the original
process. The short-lived child will get the extra threads, and the
postmaster will remain clean.2. On OS X, check for relevant environment variables. Finding none, set
LC_x=C before calling setlocale(LC_x, ""). A variation is to raise
ereport(FATAL) if sufficient environment variables aren't in place. Either
way ensures the libintl setlocale() will never call CFLocaleCopyCurrent().
This is simpler than (1), but it entails a behavior change: "LANG= initdb"
will use LANG=C or fail rather than use the OS X user account locale.I'm skeptical of the value of looking up locale information using other OS X
facilities when the usual environment variables are inconclusive, but I see no
clear cause to reverse that decision now. I lean toward (1).
Both of those are horrible hacks. And who's to say that calling
setlocale(LC_x, "foo") won't also call some function that makes the
process multi-threaded. If not in any current OS X release, it might
still happen in a future one.
One idea would be to use an extra pthread mutex or similar, in addition
to PG_SETMASK(). Whenever you do PG_SETMASK(&BlockSig), also grab the
mutex, and release it when you do PG_SETMASK(&UnBlockSig).
It would be nice to stop doing non-trivial things in the signal handler
in the first place. It's pretty scary, even though it works when the
process is single-threaded. I believe the reason it's currently
implemented like that are the same problems that the latch code solves
with the self-pipe trick: select() is not interrupted by a signal on all
platforms, and even if it was, you would need pselect() with is not
available (or does not work correctly even if it exists) on all
platforms. I think we could use a latch in postmaster too.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Sep 15, 2014 at 10:11:57AM +0300, Heikki Linnakangas wrote:
On 09/15/2014 07:51 AM, Noah Misch wrote:
libintl replaces setlocale(). Its setlocale(LC_x, "") uses OS-specific APIs
to determine the default locale when $LANG and similar environment variables
are empty, as they are during "make check NO_LOCALE=1". On OS X, it calls[1]
CFLocaleCopyCurrent(), which in turn spins up a thread. See the end of this
message for the postmaster thread stacks active upon hitting a breakpoint set
at _dispatch_mgr_thread.Ugh. I'd call that a bug in libintl. setlocale() has no business to
make the process multi-threaded.
Fair interpretation. libintl's options for mitigating the problem are even
more constrained, unfortunately.
Do we have the same problem in backends? At a quick glance, aside
from postmaster we only use PG_SETMASK(&BlockSig) in signal
handlers, to prevent another signal handler from running
concurrently.
In backends, we pass a specific value, not "", to pg_perm_setlocale(). (I
expect the problem in EXEC_BACKEND builds, but I doubt anyone uses that as a
production configuration on OS X.) Every frontend program does call
setlocale(LC_ALL, "") via set_pglocale_pgservice(). None use sigprocmask(),
but a few do use fork().
1. Fork, call setlocale(LC_x, "") in the child, pass back the effective locale
name through a pipe, and pass that name to setlocale() in the original
process. The short-lived child will get the extra threads, and the
postmaster will remain clean.
I'm skeptical of the value of looking up locale information using other OS X
facilities when the usual environment variables are inconclusive, but I see no
clear cause to reverse that decision now. I lean toward (1).Both of those are horrible hacks. And who's to say that calling
setlocale(LC_x, "foo") won't also call some function that makes the
process multi-threaded. If not in any current OS X release, it might
still happen in a future one.
Right. Not just setlocale(); a large body of APIs could change that way. The
CAVEATS section[1]https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fork.2.html of the OS X fork() manual page suggests Apple had that in
mind at some point. To be 100% safe, we would write the postmaster as though
it's always multithreaded, including making[2]/messages/by-id/1349280184-sup-326@alvh.no-ip.org EXEC_BACKEND the only supported
configuration on OS X.
Assuming we don't do that anytime soon, I did have in mind to make the
postmaster raise an error if it becomes multithreaded. (Apple's
pthread_is_threaded_np() is usable for this check.) Then, at least, we'll
more-easily hear about new problems of this nature.
One idea would be to use an extra pthread mutex or similar, in
addition to PG_SETMASK(). Whenever you do PG_SETMASK(&BlockSig),
also grab the mutex, and release it when you do
PG_SETMASK(&UnBlockSig).
I had considered and tested such a thing, and it sufficed to clear orangutan's
present troubles. You still face undefined behavior after fork(). It's okay
in practice if libdispatch threads are careful to register pthread_atfork()
handlers for any relevant resources. It's a fair option, but running
setlocale(LC_x, "") in a short-lived child assumes less about the system
library implementation, seems no more of a hack to me, and isolates the
overhead at postmaster start.
[1]: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fork.2.html
[2]: /messages/by-id/1349280184-sup-326@alvh.no-ip.org
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Sep 15, 2014 at 12:51:14AM -0400, Noah Misch wrote:
1. Fork, call setlocale(LC_x, "") in the child, pass back the effective locale
name through a pipe, and pass that name to setlocale() in the original
process. The short-lived child will get the extra threads, and the
postmaster will remain clean.
Here's an implementation thereof covering both backend and frontend use of
setlocale(). A setlocale() wrapper, pg_setlocale(), injects use of the child
process where necessary. I placed the new function in src/common/exec.c,
because every executable already needed that file's code and translatable
strings for set_pglocale_pgservice() and callees. (Some executables do miss
exec.c translations; I did not update them.) src/port/chklocale.c was closer
in subject matter, but libpq imports it; this function does not belong in
libpq. Also, this function would benefit from a frontend ereport()
implementation, which is likely to land in libpgcommon if it lands at all.
The runtime changes are conditional on __darwin__ but not on --enable-nls.
NLS builds are the production norm; I'd like non-NLS builds to consistently
exercise this code rather than shave its bytes and cycles.
pg_setlocale() relies on main() setting all six LC_<category> environment
variables. The second attached patch seals the cracks in main()'s
longstanding attempt to do so. This improves preexisting user-visible
behavior in weird cases: "LANG=pt_BR.utf8 LC_ALL=invalid postgres -D nosuch"
will now print untranslated text, not pt_BR text.
Documentation for each of lc_monetary, lc_numeric and lc_time says "If this
variable is set to the empty string (which is the default) then the value is
inherited from the execution environment of the server in a system-dependent
way." Not so; by design, setlocale(LC_x, "") just re-selects the last value L
passed to pg_perm_setlocale(LC_x, L). I have not touched this; I mention it
for the sake of the archives.
Attachments:
setlocale-darwin-fork-v1.patchtext/plain; charset=us-asciiDownload
commit cd27ff7 (HEAD)
Author: Noah Misch <noah@leadboat.com>
AuthorDate: Fri Oct 10 04:04:03 2014 -0400
Commit: Noah Misch <noah@leadboat.com>
CommitDate: Fri Oct 10 04:04:03 2014 -0400
When setlocale(LC_x, "") will start a thread, run it in a child process.
Only Darwin, --enable-nls builds use a setlocale() that can start a
thread. Buildfarm member orangutan experienced BackendList corruption
on account of different postmaster threads executing signal handlers
simultaneously. Furthermore, a multithreaded postmaster risks undefined
behavior from sigprocmask() and fork().
Introduce pg_setlocale(), a wrapper around setlocale(). When the
corresponding raw setlocale() call could start a thread, it forks,
retrieves the setlocale() return value from the child, and uses that
non-"" locale value to make an equivalent setlocale() call in the
original process. The short-lived child does become multithreaded, but
it uses no feature for which that poses a problem. Use of
pg_setlocale() in the frontend protects forking executables, such as
pg_dump, from undefined behavior. As the usage guideline comment
implies, whether to call setlocale() or pg_setlocale() is a mere style
question for most code sites. Opt for pg_setlocale() at indifferent
code sites, leaving raw setlocale() calls in src/interfaces, in
pg_setlocale() itself, and in callees of pg_setlocale(). Back-patch to
9.0 (all supported versions).
diff --git a/configure b/configure
index f0580ce..ee08963 100755
--- a/configure
+++ b/configure
@@ -11298,7 +11298,7 @@ fi
LIBS_including_readline="$LIBS"
LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
-for ac_func in cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l
+for ac_func in cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l
do :
as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index 527b076..2f8adfc 100644
--- a/configure.in
+++ b/configure.in
@@ -1257,7 +1257,7 @@ PGAC_FUNC_GETTIMEOFDAY_1ARG
LIBS_including_readline="$LIBS"
LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
-AC_CHECK_FUNCS([cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l])
+AC_CHECK_FUNCS([cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l])
AC_REPLACE_FUNCS(fseeko)
case $host_os in
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
index bbfcab7..524d95d 100644
--- a/contrib/pg_upgrade/check.c
+++ b/contrib/pg_upgrade/check.c
@@ -1051,7 +1051,7 @@ get_canonical_locale_name(int category, const char *locale)
char *res;
/* get the current setting, so we can restore it. */
- save = setlocale(category, NULL);
+ save = pg_setlocale(category, NULL);
if (!save)
pg_fatal("failed to get the current locale\n");
@@ -1059,7 +1059,7 @@ get_canonical_locale_name(int category, const char *locale)
save = pg_strdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+ res = pg_setlocale(category, locale);
if (!res)
pg_fatal("failed to get system locale name for \"%s\"\n", locale);
@@ -1067,7 +1067,7 @@ get_canonical_locale_name(int category, const char *locale)
res = pg_strdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+ if (!pg_setlocale(category, save))
pg_fatal("failed to restore old locale \"%s\"\n", save);
pg_free(save);
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6220a8e..0017b5f 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -87,6 +87,10 @@
#include <dns_sd.h>
#endif
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+#include <pthread.h>
+#endif
+
#include "access/transam.h"
#include "access/xlog.h"
#include "bootstrap/bootstrap.h"
@@ -1656,6 +1660,15 @@ ServerLoop(void)
last_touch_time = now;
}
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+
+ /*
+ * With assertions enabled, check regularly for appearance of
+ * additional threads. All builds check softly in ExitPostmaster().
+ */
+ Assert(pthread_is_threaded_np() == 0);
+#endif
+
/*
* If we already sent SIGQUIT to children and they are slow to shut
* down, it's time to send them SIGKILL. This doesn't happen
@@ -4733,6 +4746,22 @@ SubPostmasterMain(int argc, char *argv[])
static void
ExitPostmaster(int status)
{
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+
+ /*
+ * The postmaster calls sigprocmask() and calls fork() without an
+ * immediate exec(). Both practices have undefined behavior in a
+ * multithreaded program. Diagnose this on platforms providing a
+ * convenient means to do so. However, a multithreaded postmaster is the
+ * normal case on Windows, which offers neither fork() nor sigprocmask().
+ */
+ if (pthread_is_threaded_np() != 0)
+ ereport(LOG,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("postmaster became multithreaded"),
+ errdetail("Please report this to <pgsql-bugs@postgresql.org>.")));
+#endif
+
/* should cleanup shared memory and kill all backends */
/*
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 94bb5a4..870e7bc 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -34,13 +34,13 @@
* mind in future: on some platforms, the locale functions return pointers
* to static data that will be overwritten by any later locale function.
* Thus, for example, the obvious-looking sequence
- * save = setlocale(category, NULL);
- * if (!setlocale(category, value))
+ * save = pg_setlocale(category, NULL);
+ * if (!pg_setlocale(category, value))
* fail = true;
- * setlocale(category, save);
- * DOES NOT WORK RELIABLY: on some platforms the second setlocale() call
+ * pg_setlocale(category, save);
+ * DOES NOT WORK RELIABLY: on some platforms the second pg_setlocale() call
* will change the memory save is pointing at. To do this sort of thing
- * safely, you *must* pstrdup what setlocale returns the first time.
+ * safely, you *must* pstrdup what pg_setlocale returns the first time.
*
* FYI, The Open Group locale standard is defined here:
*
@@ -131,16 +131,16 @@ static char *IsoLocaleName(const char *); /* MSVC specific */
/*
* pg_perm_setlocale
*
- * This wraps the libc function setlocale(), with two additions. First, when
- * changing LC_CTYPE, update gettext's encoding for the current message
- * domain. GNU gettext automatically tracks LC_CTYPE on most platforms, but
- * not on Windows. Second, if the operation is successful, the corresponding
- * LC_XXX environment variable is set to match. By setting the environment
- * variable, we ensure that any subsequent use of setlocale(..., "") will
- * preserve the settings made through this routine. Of course, LC_ALL must
- * also be unset to fully ensure that, but that has to be done elsewhere after
- * all the individual LC_XXX variables have been set correctly. (Thank you
- * Perl for making this kluge necessary.)
+ * This wraps pg_setlocale(), with two additions. First, when changing
+ * LC_CTYPE, update gettext's encoding for the current message domain. GNU
+ * gettext automatically tracks LC_CTYPE on most platforms, but not on
+ * Windows. Second, if the operation is successful, the corresponding LC_XXX
+ * environment variable is set to match. By setting the environment variable,
+ * we ensure that any subsequent use of setlocale(..., "") will preserve the
+ * settings made through this routine. Of course, LC_ALL must also be unset
+ * to fully ensure that, but that has to be done elsewhere after all the
+ * individual LC_XXX variables have been set correctly. (Thank you Perl for
+ * making this kluge necessary.)
*/
char *
pg_perm_setlocale(int category, const char *locale)
@@ -150,7 +150,7 @@ pg_perm_setlocale(int category, const char *locale)
char *envbuf;
#ifndef WIN32
- result = setlocale(category, locale);
+ result = pg_setlocale(category, locale);
#else
/*
@@ -168,7 +168,7 @@ pg_perm_setlocale(int category, const char *locale)
}
else
#endif
- result = setlocale(category, locale);
+ result = pg_setlocale(category, locale);
#endif /* WIN32 */
if (result == NULL)
@@ -258,7 +258,7 @@ check_locale(int category, const char *locale, char **canonname)
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
+ save = pg_setlocale(category, NULL);
if (!save)
return false; /* won't happen, we hope */
@@ -266,14 +266,14 @@ check_locale(int category, const char *locale, char **canonname)
save = pstrdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+ res = pg_setlocale(category, locale);
/* save canonical name if requested. */
if (res && canonname)
*canonname = pstrdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+ if (!pg_setlocale(category, save))
elog(WARNING, "failed to restore old locale \"%s\"", save);
pfree(save);
@@ -454,11 +454,11 @@ PGLC_localeconv(void)
free_struct_lconv(&CurrentLocaleConv);
/* Save user's values of monetary and numeric locales */
- save_lc_monetary = setlocale(LC_MONETARY, NULL);
+ save_lc_monetary = pg_setlocale(LC_MONETARY, NULL);
if (save_lc_monetary)
save_lc_monetary = pstrdup(save_lc_monetary);
- save_lc_numeric = setlocale(LC_NUMERIC, NULL);
+ save_lc_numeric = pg_setlocale(LC_NUMERIC, NULL);
if (save_lc_numeric)
save_lc_numeric = pstrdup(save_lc_numeric);
@@ -486,16 +486,16 @@ PGLC_localeconv(void)
*/
/* save user's value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
+ save_lc_ctype = pg_setlocale(LC_CTYPE, NULL);
if (save_lc_ctype)
save_lc_ctype = pstrdup(save_lc_ctype);
/* use numeric to set the ctype */
- setlocale(LC_CTYPE, locale_numeric);
+ pg_setlocale(LC_CTYPE, locale_numeric);
#endif
/* Get formatting information for numeric */
- setlocale(LC_NUMERIC, locale_numeric);
+ pg_setlocale(LC_NUMERIC, locale_numeric);
extlconv = localeconv();
encoding = pg_get_encoding_from_locale(locale_numeric, true);
@@ -505,11 +505,11 @@ PGLC_localeconv(void)
#ifdef WIN32
/* use monetary to set the ctype */
- setlocale(LC_CTYPE, locale_monetary);
+ pg_setlocale(LC_CTYPE, locale_monetary);
#endif
/* Get formatting information for monetary */
- setlocale(LC_MONETARY, locale_monetary);
+ pg_setlocale(LC_MONETARY, locale_monetary);
extlconv = localeconv();
encoding = pg_get_encoding_from_locale(locale_monetary, true);
@@ -532,14 +532,14 @@ PGLC_localeconv(void)
/* Try to restore internal settings */
if (save_lc_monetary)
{
- if (!setlocale(LC_MONETARY, save_lc_monetary))
+ if (!pg_setlocale(LC_MONETARY, save_lc_monetary))
elog(WARNING, "failed to restore old locale");
pfree(save_lc_monetary);
}
if (save_lc_numeric)
{
- if (!setlocale(LC_NUMERIC, save_lc_numeric))
+ if (!pg_setlocale(LC_NUMERIC, save_lc_numeric))
elog(WARNING, "failed to restore old locale");
pfree(save_lc_numeric);
}
@@ -548,7 +548,7 @@ PGLC_localeconv(void)
/* Try to restore internal ctype settings */
if (save_lc_ctype)
{
- if (!setlocale(LC_CTYPE, save_lc_ctype))
+ if (!pg_setlocale(LC_CTYPE, save_lc_ctype))
elog(WARNING, "failed to restore old locale");
pfree(save_lc_ctype);
}
@@ -640,7 +640,7 @@ cache_locale_time(void)
elog(DEBUG3, "cache_locale_time() executed; locale: \"%s\"", locale_time);
/* save user's value of time locale */
- save_lc_time = setlocale(LC_TIME, NULL);
+ save_lc_time = pg_setlocale(LC_TIME, NULL);
if (save_lc_time)
save_lc_time = pstrdup(save_lc_time);
@@ -656,15 +656,15 @@ cache_locale_time(void)
*/
/* save user's value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
+ save_lc_ctype = pg_setlocale(LC_CTYPE, NULL);
if (save_lc_ctype)
save_lc_ctype = pstrdup(save_lc_ctype);
/* use lc_time to set the ctype */
- setlocale(LC_CTYPE, locale_time);
+ pg_setlocale(LC_CTYPE, locale_time);
#endif
- setlocale(LC_TIME, locale_time);
+ pg_setlocale(LC_TIME, locale_time);
timenow = time(NULL);
timeinfo = localtime(&timenow);
@@ -707,7 +707,7 @@ cache_locale_time(void)
/* try to restore internal settings */
if (save_lc_time)
{
- if (!setlocale(LC_TIME, save_lc_time))
+ if (!pg_setlocale(LC_TIME, save_lc_time))
elog(WARNING, "failed to restore old locale");
pfree(save_lc_time);
}
@@ -716,7 +716,7 @@ cache_locale_time(void)
/* try to restore internal ctype settings */
if (save_lc_ctype)
{
- if (!setlocale(LC_CTYPE, save_lc_ctype))
+ if (!pg_setlocale(LC_CTYPE, save_lc_ctype))
elog(WARNING, "failed to restore old locale");
pfree(save_lc_ctype);
}
@@ -945,7 +945,7 @@ lc_collate_is_c(Oid collation)
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
+ localeptr = pg_setlocale(LC_COLLATE, NULL);
if (!localeptr)
elog(ERROR, "invalid LC_COLLATE setting");
@@ -995,7 +995,7 @@ lc_ctype_is_c(Oid collation)
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
+ localeptr = pg_setlocale(LC_CTYPE, NULL);
if (!localeptr)
elog(ERROR, "invalid LC_CTYPE setting");
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 665ac10..11f46f3 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -984,7 +984,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- const char *ctype = setlocale(LC_CTYPE, NULL);
+ const char *ctype = pg_setlocale(LC_CTYPE, NULL);
if (pg_strcasecmp(ctype, "C") == 0 || pg_strcasecmp(ctype, "POSIX") == 0)
#endif
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index c8ff2cb..2744a75 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2487,12 +2487,12 @@ locale_date_order(const char *locale)
result = DATEORDER_MDY; /* default */
- save = setlocale(LC_TIME, NULL);
+ save = pg_setlocale(LC_TIME, NULL);
if (!save)
return result;
save = pg_strdup(save);
- setlocale(LC_TIME, locale);
+ pg_setlocale(LC_TIME, locale);
memset(&testtime, 0, sizeof(testtime));
testtime.tm_mday = 22;
@@ -2501,7 +2501,7 @@ locale_date_order(const char *locale)
res = my_strftime(buf, sizeof(buf), "%x", &testtime);
- setlocale(LC_TIME, save);
+ pg_setlocale(LC_TIME, save);
free(save);
if (res == 0)
@@ -2545,7 +2545,7 @@ check_locale_name(int category, const char *locale, char **canonname)
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
+ save = pg_setlocale(category, NULL);
if (!save)
{
fprintf(stderr, _("%s: setlocale() failed\n"),
@@ -2557,14 +2557,14 @@ check_locale_name(int category, const char *locale, char **canonname)
save = pg_strdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+ res = pg_setlocale(category, locale);
/* save canonical name if requested. */
if (res && canonname)
*canonname = pg_strdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+ if (!pg_setlocale(category, save))
{
fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
progname, save);
diff --git a/src/common/exec.c b/src/common/exec.c
index 037bef2..4f364d8 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -20,12 +20,17 @@
#include "postgres_fe.h"
#endif
+#include <limits.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
#ifndef FRONTEND
+#include "miscadmin.h"
+#endif
+
+#ifndef FRONTEND
/* We use only 3- and 4-parameter elog calls in this file, for simplicity */
/* NOTE: caller must provide gettext call around str! */
#define log_error(str, param) elog(LOG, str, param)
@@ -556,7 +561,7 @@ set_pglocale_pgservice(const char *argv0, const char *app)
/* don't set LC_ALL in the backend */
if (strcmp(app, PG_TEXTDOMAIN("postgres")) != 0)
- setlocale(LC_ALL, "");
+ pg_setlocale(LC_ALL, "");
if (find_my_exec(argv0, my_exec_path) < 0)
return;
@@ -586,6 +591,223 @@ set_pglocale_pgservice(const char *argv0, const char *app)
}
}
+#ifndef pg_setlocale
+static char *setlocale_native_forked(int category);
+
+/*
+ * pg_setlocale
+ *
+ * Usage Guidelines
+ * ----------------
+ *
+ * In the backend, it is always safe to replace a plain setlocale() call with
+ * a pg_setlocale() call. Use of pg_setlocale() is mandatory in code that can
+ * run before main() finishes initializing the process locale. After that,
+ * pg_setlocale() degenerates to plain setlocale(), and either is acceptable.
+ *
+ * In the frontend, pg_setlocale() has three noteworthy side effects:
+ *
+ * 1. Calls exit(EXIT_FAILURE) on internal error, much like pg_malloc().
+ * 2. Forks a child that calls a non-async-signal-safe function. If the
+ * original process was multithreaded, behavior is undefined.
+ * 3. The process will receive SIGCHLD for one already-waited child, much as
+ * it does during system().
+ *
+ * src/bin tolerates those effects. src/interfaces libraries do not; they
+ * must use plain setlocale().
+ *
+ * Background
+ * ----------
+ *
+ * On Darwin, libintl's replacement setlocale() calls CFLocaleCopyCurrent()
+ * when its second argument is "" and every relevant environment variable is
+ * unset or empty. CFLocaleCopyCurrent() makes the process multithreaded,
+ * which raises several programming hazards we'd just as soon not face. This
+ * function addresses the problem by retrieving the locale string in a
+ * short-lived child process, where the extra thread is acceptable.
+ *
+ * libintl's replacement newlocale() behaves likewise. All our newlocale()
+ * calls happen in the backend, after main() has initialized the process
+ * locale. So long as that is true, they require no wrapper.
+ */
+char *
+pg_setlocale(int category, const char *locale)
+{
+ const char *env_lang;
+
+ /* Plain setlocale() is fine for non-"" locale arguments. */
+ if (locale == NULL || locale[0] != '\0')
+ goto degenerate;
+
+#ifndef FRONTEND
+
+ /*
+ * After main()'s series of pg_perm_setlocale() calls, every locale
+ * category environment variable has a non-empty value. Recognizing this
+ * case is no mere optimization, because the slow path's use of
+ * ereport(FATAL) is unacceptable during normal operation.
+ */
+ if (MyProcPid != 0)
+ goto degenerate;
+#endif
+
+ /*
+ * With LANG available, libintl_setlocale(category, "") never calls
+ * CFLocaleCopyCurrent(). Optimize that common case.
+ */
+ env_lang = getenv("LANG");
+ if (env_lang != NULL && env_lang[0] != '\0')
+ goto degenerate;
+
+ return setlocale_native_forked(category);
+degenerate:
+ return setlocale(category, locale);
+}
+
+static char *
+setlocale_native_forked(int category)
+{
+#ifdef FRONTEND
+
+ /*
+ * In place of an ugly #ifdef FRONTEND variation of each ereport(), squash
+ * most of them to a generic message.
+ */
+#define ereport_f(elevel, rest) \
+ do { \
+ fprintf(stderr, _("could not determine native locale\n")); \
+ exit(EXIT_FAILURE); \
+ } while (0)
+#else
+#define ereport_f(elevel, rest) ereport(elevel, rest)
+#endif
+
+ int pipefd[2];
+ sigset_t block_chld;
+ sigset_t old_sigs;
+ pid_t pid;
+ pid_t waited_pid;
+ int exit_status;
+
+ /*
+ * The longest locale strings come from setlocale(LC_ALL, ""), reaching
+ * 101 bytes on Darwin under environment variable settings such as
+ * "LANG=fr_BE.ISO8859-15 LC_NUMERIC=it_CH.ISO8859-1". That's comfortably
+ * below PIPE_BUF of 512.
+ */
+ char locbuf[PIPE_BUF];
+ int rlen;
+
+ if (pipe(pipefd) != 0)
+ ereport_f(FATAL,
+ (errcode_for_file_access(),
+ errmsg("could not create locale retrieval pipe: %m")));
+
+ /* Block SIGCHLD so we can get an exit status. */
+ sigemptyset(&block_chld);
+ sigaddset(&block_chld, SIGCHLD);
+ sigprocmask(SIG_BLOCK, &block_chld, &old_sigs);
+
+ /*
+ * Run the child.
+ */
+ pid = fork();
+ if (pid < 0)
+ {
+ /*
+ * This is the most likely non-bug error, so furnish a precise message
+ * in both backend and frontend.
+ */
+#ifdef FRONTEND
+ fprintf(stderr, _("could not fork for locale retrieval: %s\n"),
+ strerror(errno));
+ exit(EXIT_FAILURE);
+#else
+ ereport(FATAL,
+ (errmsg("could not fork for locale retrieval: %m")));
+#endif
+ }
+ else if (pid == 0)
+ {
+ char *result;
+ int status = 0;
+
+ /*
+ * Limit the child's actions to setlocale() and to system calls. This
+ * shields other PostgreSQL facilities from the need to contemplate
+ * implications of executing within this special-case child process.
+ * It makes unnecessary the additional actions of fork_process() and
+ * the post-fork actions of BackendStartup(). These restrictions do
+ * have us miss precise logging for a failed system call, but these
+ * calls almost cannot fail.
+ *
+ * When we would otherwise write() enough bytes to possibly block,
+ * truncate instead. The parent process notices this truncation and
+ * raises an error. Since we ensure that write() won't block, don't
+ * bother closing the read end of the pipe. (This also prevents
+ * SIGPIPE, which typically still has SIG_DFL.) Close the write end
+ * to detect EIO, though EIO is likely impossible when closing a pipe.
+ */
+
+ result = setlocale(category, "");
+
+ if (result != NULL)
+ {
+ int wlen = Min(strlen(result), sizeof(locbuf));
+
+ if (write(pipefd[1], result, wlen) != wlen)
+ status = 1;
+ }
+ if (close(pipefd[1]) != 0)
+ status = 1;
+
+ _exit(status);
+ }
+
+ /* Wait for the child to exit. */
+ waited_pid = waitpid(pid, &exit_status, 0);
+ sigprocmask(SIG_SETMASK, &old_sigs, NULL);
+ if (waited_pid != pid)
+ ereport_f(FATAL,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("waitpid() failed: %m")));
+ if (exit_status != 0)
+ ereport_f(FATAL,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("could not determine native locale: %s",
+ wait_result_to_str(exit_status))));
+
+ /*
+ * Retrieve the child's work. Child writes nothing if setlocale() returns
+ * NULL; otherwise, it writes the returned string, unterminated. (This
+ * assumes setlocale() never returns "".)
+ */
+ if (close(pipefd[1]) != 0)
+ ereport_f(FATAL,
+ (errcode_for_file_access(),
+ errmsg("could not close locale retrieval pipe: %m")));
+ rlen = read(pipefd[0], locbuf, sizeof(locbuf));
+ if (rlen < 0)
+ ereport_f(FATAL,
+ (errcode_for_file_access(),
+ errmsg("could not read from locale retrieval pipe: %m")));
+ else if (rlen == sizeof(locbuf))
+ ereport_f(FATAL,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("native locale setting is too long (maximum %d bytes)",
+ (int) sizeof(locbuf))));
+ if (close(pipefd[0]) != 0)
+ ereport_f(FATAL,
+ (errcode_for_file_access(),
+ errmsg("could not close locale retrieval pipe: %m")));
+
+ if (rlen == 0)
+ return NULL;
+ locbuf[rlen] = '\0';
+ return setlocale(category, locbuf);
+}
+#endif
+
#ifdef WIN32
/*
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index ddcf4b0..933736f 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -391,6 +391,9 @@
/* Define to 1 if the PS_STRINGS thing exists. */
#undef HAVE_PS_STRINGS
+/* Define to 1 if you have the `pthread_is_threaded_np' function. */
+#undef HAVE_PTHREAD_IS_THREADED_NP
+
/* Define to 1 if you have the <pwd.h> header file. */
#undef HAVE_PWD_H
diff --git a/src/include/port.h b/src/include/port.h
index 9f8465e..824a072 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -213,6 +213,12 @@ extern char *pgwin32_setlocale(int category, const char *locale);
#define setlocale(a,b) pgwin32_setlocale(a,b)
#endif /* WIN32 */
+#ifdef __darwin__
+extern char *pg_setlocale(int category, const char *locale);
+#else
+#define pg_setlocale(category, locale) setlocale(category, locale)
+#endif
+
/* Portable prompt handling */
extern char *simple_prompt(const char *prompt, int maxlen, bool echo);
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 1d025d4..36853d1 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -734,15 +734,15 @@ plperl_init_interp(void)
*save_numeric,
*save_time;
- loc = setlocale(LC_COLLATE, NULL);
+ loc = pg_setlocale(LC_COLLATE, NULL);
save_collate = loc ? pstrdup(loc) : NULL;
- loc = setlocale(LC_CTYPE, NULL);
+ loc = pg_setlocale(LC_CTYPE, NULL);
save_ctype = loc ? pstrdup(loc) : NULL;
- loc = setlocale(LC_MONETARY, NULL);
+ loc = pg_setlocale(LC_MONETARY, NULL);
save_monetary = loc ? pstrdup(loc) : NULL;
- loc = setlocale(LC_NUMERIC, NULL);
+ loc = pg_setlocale(LC_NUMERIC, NULL);
save_numeric = loc ? pstrdup(loc) : NULL;
- loc = setlocale(LC_TIME, NULL);
+ loc = pg_setlocale(LC_TIME, NULL);
save_time = loc ? pstrdup(loc) : NULL;
#define PLPERL_RESTORE_LOCALE(name, saved) \
@@ -3902,7 +3902,7 @@ plperl_inline_callback(void *arg)
static char *
setlocale_perl(int category, char *locale)
{
- char *RETVAL = setlocale(category, locale);
+ char *RETVAL = pg_setlocale(category, locale);
if (RETVAL)
{
@@ -3917,7 +3917,7 @@ setlocale_perl(int category, char *locale)
#ifdef LC_ALL
if (category == LC_ALL)
- newctype = setlocale(LC_CTYPE, NULL);
+ newctype = pg_setlocale(LC_CTYPE, NULL);
else
#endif
newctype = RETVAL;
@@ -3935,7 +3935,7 @@ setlocale_perl(int category, char *locale)
#ifdef LC_ALL
if (category == LC_ALL)
- newcoll = setlocale(LC_COLLATE, NULL);
+ newcoll = pg_setlocale(LC_COLLATE, NULL);
else
#endif
newcoll = RETVAL;
@@ -3954,7 +3954,7 @@ setlocale_perl(int category, char *locale)
#ifdef LC_ALL
if (category == LC_ALL)
- newnum = setlocale(LC_NUMERIC, NULL);
+ newnum = pg_setlocale(LC_NUMERIC, NULL);
else
#endif
newnum = RETVAL;
diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index 588dfd9..a58c39a 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -278,6 +278,13 @@ pg_codepage_to_encoding(UINT cp)
*
* If running in the backend and write_message is false, this function must
* cope with the possibility that elog() and palloc() are not yet usable.
+ *
+ * This departs from the pg_setlocale() usage guidelines. It should use
+ * pg_setlocale() when pg_bind_textdomain_codeset() calls it before the
+ * postmaster locale is fully initialized. However, it should use plain
+ * setlocale() when libpq calls it. The difference doesn't matter for a NULL
+ * locale argument. Those callers pass NULL ctype, making the question
+ * academic. Use plain setlocale() to reduce libpq code footprint.
*/
int
pg_get_encoding_from_locale(const char *ctype, bool write_message)
setlocale-main-harden-v1.patchtext/plain; charset=us-asciiDownload
commit 6525617
Author: Noah Misch <noah@leadboat.com>
AuthorDate: Fri Oct 10 04:03:41 2014 -0400
Commit: Noah Misch <noah@leadboat.com>
CommitDate: Fri Oct 10 04:03:41 2014 -0400
Always set the six locale category environment variables in main().
Typical server invocations already achieved that. Invalid locale
settings in the initial postmaster environment interfered, as could
malloc() failure. Setting "LC_MESSAGES=pt_BR.utf8 LC_ALL=invalid" in
the postmaster environment will now choose C-locale messages, not
Brazilian Portuguese messages. Most localized programs, including all
PostgreSQL frontend executables, do likewise. Users are unlikely to
observe changes involving locale categories other than LC_MESSAGES.
CheckMyDatabase() ensures that we successfully set LC_COLLATE and
LC_CTYPE; main() sets the remaining three categories to locale "C",
which almost cannot fail. Back-patch to 9.0 (all supported versions).
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index c51b391..7e6cd0e 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -43,6 +43,7 @@ const char *progname;
static void startup_hacks(const char *progname);
+static void init_locale(int category, const char *locale);
static void help(const char *progname);
static void check_root(const char *progname);
@@ -115,31 +116,31 @@ main(int argc, char *argv[])
char *env_locale;
if ((env_locale = getenv("LC_COLLATE")) != NULL)
- pg_perm_setlocale(LC_COLLATE, env_locale);
+ init_locale(LC_COLLATE, env_locale);
else
- pg_perm_setlocale(LC_COLLATE, "");
+ init_locale(LC_COLLATE, "");
if ((env_locale = getenv("LC_CTYPE")) != NULL)
- pg_perm_setlocale(LC_CTYPE, env_locale);
+ init_locale(LC_CTYPE, env_locale);
else
- pg_perm_setlocale(LC_CTYPE, "");
+ init_locale(LC_CTYPE, "");
}
#else
- pg_perm_setlocale(LC_COLLATE, "");
- pg_perm_setlocale(LC_CTYPE, "");
+ init_locale(LC_COLLATE, "");
+ init_locale(LC_CTYPE, "");
#endif
#ifdef LC_MESSAGES
- pg_perm_setlocale(LC_MESSAGES, "");
+ init_locale(LC_MESSAGES, "");
#endif
/*
* We keep these set to "C" always, except transiently in pg_locale.c; see
* that file for explanations.
*/
- pg_perm_setlocale(LC_MONETARY, "C");
- pg_perm_setlocale(LC_NUMERIC, "C");
- pg_perm_setlocale(LC_TIME, "C");
+ init_locale(LC_MONETARY, "C");
+ init_locale(LC_NUMERIC, "C");
+ init_locale(LC_TIME, "C");
/*
* Now that we have absorbed as much as we wish to from the locale
@@ -272,6 +273,23 @@ startup_hacks(const char *progname)
/*
+ * Make the initial permanent setting for a locale category. If that fails,
+ * perhaps due to LC_foo=invalid in the environment, use locale C. If even
+ * that fails, perhaps due to out-of-memory, the entire startup fails with it.
+ * When this returns, we are guaranteed to have a setting for the given
+ * category's environment variable.
+ */
+static void
+init_locale(int category, const char *locale)
+{
+ if (pg_perm_setlocale(category, locale) == NULL &&
+ pg_perm_setlocale(category, "C") == NULL)
+ elog(FATAL, "could not adopt C locale");
+}
+
+
+
+/*
* Help display should match the options accepted by PostmasterMain()
* and PostgresMain().
*
On 10/10/14 8:24 PM, Noah Misch wrote:
Here's an implementation thereof covering both backend and frontend use of
setlocale(). A setlocale() wrapper, pg_setlocale(), injects use of the child
process where necessary.
Would such a change not be better in gnulib itself?
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Oct 10, 2014 at 09:14:38PM -0400, Peter Eisentraut wrote:
On 10/10/14 8:24 PM, Noah Misch wrote:
Here's an implementation thereof covering both backend and frontend use of
setlocale(). A setlocale() wrapper, pg_setlocale(), injects use of the child
process where necessary.Would such a change not be better in gnulib itself?
Good question. It would be nice to make the change there, for the benefit of
other consumers. The patch's setlocale_native_forked() assumes it never runs
in a multithreaded process, but libintl_setlocale() must not assume that. I
see a few ways libintl/gnulib might proceed:
1) exec() a helper program to run CFLocaleCopyCurrent(). Distributing that
executable alongside libintl and locating it at runtime is daunting.
2) Audit the CFLocaleCopyCurrent() code to see if it will, in practice, run
reliably in a fork of a multithreaded process. (That conclusion could
change without notice.) Use the setlocale_native_forked() technique.
3) Don't change libintl_setlocale(), but add a libintl_setlocale_nothread()
for single-threaded consumers to adopt. Each consumer will need to modify
code. libintl has a lean API; I expect adding this would be controversial.
Among those, I would probably adopt (2). We know much about the process
conditions in which pg_setlocale() will run, so placing the workaround in
PostgreSQL erases the problem of (2). I'm inclined to stick with placing the
workaround in PostgreSQL, but there are fair arguments on both sides.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/11/14 1:41 AM, Noah Misch wrote:
Good question. It would be nice to make the change there, for the benefit of
other consumers. The patch's setlocale_native_forked() assumes it never runs
in a multithreaded process, but libintl_setlocale() must not assume that. I
see a few ways libintl/gnulib might proceed:
Yeah, it's difficult to see how they might proceed if they keep calling
into Core Foundation, which might do anything, now or in the future.
I went ahead and submitted a bug report to gettext:
https://savannah.gnu.org/bugs/index.php?43404
(They way I understand it is that the files concerned originate in
gettext and are copied to gnulib.)
Let's see what they say.
I like the idea of calling pthread_is_threaded_np() as a verification.
This appears to be a OS X-specific function at the moment. If other
platforms start adding it, then we'll run into the usual problems of how
to link binaries that use pthread functions. Maybe that's not a
realistic concern.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Oct 11, 2014 at 09:07:46AM -0400, Peter Eisentraut wrote:
On 10/11/14 1:41 AM, Noah Misch wrote:
Good question. It would be nice to make the change there, for the benefit of
other consumers. The patch's setlocale_native_forked() assumes it never runs
in a multithreaded process, but libintl_setlocale() must not assume that. I
see a few ways libintl/gnulib might proceed:Yeah, it's difficult to see how they might proceed if they keep calling
into Core Foundation, which might do anything, now or in the future.I went ahead and submitted a bug report to gettext:
https://savannah.gnu.org/bugs/index.php?43404(They way I understand it is that the files concerned originate in
gettext and are copied to gnulib.)Let's see what they say.
The gettext maintainer was open to implementing the setlocale_native_forked()
technique in gettext, though the last visible progress was in October. In any
event, PostgreSQL builds will see older gettext for several years. If
setlocale-darwin-fork-v1.patch is not wanted, I suggest making the postmaster
check during startup whether it has become multithreaded. If multithreaded:
FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.
I wondered whether to downgrade FATAL to LOG in back branches. Introducing a
new reason to block startup is disruptive for a minor release, but having the
postmaster deadlock at an unpredictable later time is even more disruptive. I
am inclined to halt startup that way in all branches.
I like the idea of calling pthread_is_threaded_np() as a verification.
This appears to be a OS X-specific function at the moment. If other
platforms start adding it, then we'll run into the usual problems of how
to link binaries that use pthread functions. Maybe that's not a
realistic concern.
True. As written, "configure" will report the function unavailable if it
requires threading libraries. For a measure that's just a backstop against
other bugs, that may be just right.
I would like to go ahead and commit setlocale-main-harden-v1.patch, which is a
good thing to have regardless of what happens with gettext.
Thanks,
nm
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/28/2014 04:58 PM, Noah Misch wrote:
On Sat, Oct 11, 2014 at 09:07:46AM -0400, Peter Eisentraut wrote:
On 10/11/14 1:41 AM, Noah Misch wrote:
Good question. It would be nice to make the change there, for the benefit of
other consumers. The patch's setlocale_native_forked() assumes it never runs
in a multithreaded process, but libintl_setlocale() must not assume that. I
see a few ways libintl/gnulib might proceed:Yeah, it's difficult to see how they might proceed if they keep calling
into Core Foundation, which might do anything, now or in the future.I went ahead and submitted a bug report to gettext:
https://savannah.gnu.org/bugs/index.php?43404(They way I understand it is that the files concerned originate in
gettext and are copied to gnulib.)Let's see what they say.
The gettext maintainer was open to implementing the setlocale_native_forked()
technique in gettext, though the last visible progress was in October. In any
event, PostgreSQL builds will see older gettext for several years. If
setlocale-darwin-fork-v1.patch is not wanted, I suggest making the postmaster
check during startup whether it has become multithreaded. If multithreaded:FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.I wondered whether to downgrade FATAL to LOG in back branches. Introducing a
new reason to block startup is disruptive for a minor release, but having the
postmaster deadlock at an unpredictable later time is even more disruptive. I
am inclined to halt startup that way in all branches.
Yeah. It should be easily fixable, AIUI, and startup is surely a good
and obvious time to to that.
I like the idea of calling pthread_is_threaded_np() as a verification.
This appears to be a OS X-specific function at the moment. If other
platforms start adding it, then we'll run into the usual problems of how
to link binaries that use pthread functions. Maybe that's not a
realistic concern.True. As written, "configure" will report the function unavailable if it
requires threading libraries. For a measure that's just a backstop against
other bugs, that may be just right.I would like to go ahead and commit setlocale-main-harden-v1.patch, which is a
good thing to have regardless of what happens with gettext.
I'm OK with this, but on its own it won't fix orangutan's problems, will it?
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Dec 28, 2014 at 4:58 PM, Noah Misch <noah@leadboat.com> wrote:
I wondered whether to downgrade FATAL to LOG in back branches. Introducing a
new reason to block startup is disruptive for a minor release, but having the
postmaster deadlock at an unpredictable later time is even more disruptive. I
am inclined to halt startup that way in all branches.
Jeepers. I'd rather not do that. From your report, this problem has
been around for years. Yet, as far as I know, it's bothering very few
real users, some of whom might be far more bothered by the postmaster
suddenly failing to start. I'm fine with a FATAL in master, but I'd
vote against doing anything that might prevent startup in the
back-branches without more compelling justification.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Dec 28, 2014 at 07:20:04PM -0500, Andrew Dunstan wrote:
On 12/28/2014 04:58 PM, Noah Misch wrote:
The gettext maintainer was open to implementing the setlocale_native_forked()
technique in gettext, though the last visible progress was in October. In any
event, PostgreSQL builds will see older gettext for several years. If
setlocale-darwin-fork-v1.patch is not wanted, I suggest making the postmaster
check during startup whether it has become multithreaded. If multithreaded:FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.
I would like to go ahead and commit setlocale-main-harden-v1.patch, which is a
good thing to have regardless of what happens with gettext.I'm OK with this, but on its own it won't fix orangutan's problems, will it?
Right; setlocale-main-harden-v1.patch fixes a bug not affecting orangutan at
all. None of the above will make orangutan turn green. Checking
multithreading during startup would merely let it fail cleanly.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
So at this point removing the --enable-nls from my config will solve the
build problem.
Everyone knows there is an issue so there is no point in continuing to have
it fail.
On 31 December 2014 at 13:52, Noah Misch <noah@leadboat.com> wrote:
Show quoted text
On Sun, Dec 28, 2014 at 07:20:04PM -0500, Andrew Dunstan wrote:
On 12/28/2014 04:58 PM, Noah Misch wrote:
The gettext maintainer was open to implementing the
setlocale_native_forked()
technique in gettext, though the last visible progress was in October.
In any
event, PostgreSQL builds will see older gettext for several years. If
setlocale-darwin-fork-v1.patch is not wanted, I suggest making thepostmaster
check during startup whether it has become multithreaded. If
multithreaded:
FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.I would like to go ahead and commit setlocale-main-harden-v1.patch,
which is a
good thing to have regardless of what happens with gettext.
I'm OK with this, but on its own it won't fix orangutan's problems, will
it?
Right; setlocale-main-harden-v1.patch fixes a bug not affecting orangutan
at
all. None of the above will make orangutan turn green. Checking
multithreading during startup would merely let it fail cleanly.
On Wed, Dec 31, 2014 at 12:32:37AM -0500, Robert Haas wrote:
On Sun, Dec 28, 2014 at 4:58 PM, Noah Misch <noah@leadboat.com> wrote:
I wondered whether to downgrade FATAL to LOG in back branches. Introducing a
new reason to block startup is disruptive for a minor release, but having the
postmaster deadlock at an unpredictable later time is even more disruptive. I
am inclined to halt startup that way in all branches.Jeepers. I'd rather not do that. From your report, this problem has
been around for years. Yet, as far as I know, it's bothering very few
real users, some of whom might be far more bothered by the postmaster
suddenly failing to start. I'm fine with a FATAL in master, but I'd
vote against doing anything that might prevent startup in the
back-branches without more compelling justification.
Clusters hosted on OS X fall into these categories:
1) Unaffected configuration. This includes everyone setting a valid messages
locale via LANG, LC_ALL or LC_MESSAGES.
2) Affected configuration. Through luck and light use, the cluster would not
experience the crashes/hangs.
3) Cluster would experience the crashes/hangs.
DBAs in (3) want the FATAL at startup, but those in (2) want a LOG message
instead. DBAs in (1) don't care. Since intermittent postmaster hangs are far
worse than startup failure, if (2) and (3) have similar population, FATAL is
the better bet. If (2) is sufficiently more populous than (3), then the many
small pricks from startup failure do add up to hurt more than the occasional
postmaster hang. Who knows how that calculation plays out.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 31, 2014 at 01:55:23PM -0500, Dave Cramer wrote:
So at this point removing the --enable-nls from my config will solve the
build problem.Everyone knows there is an issue so there is no point in continuing to have
it fail.
We hope all packagers will build with --enable-nls, so OS X buildfarm coverage
thereof will be valuable. Let me see what I can pull together over the next
several weeks toward getting it green.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 31, 2014 at 01:56:08PM -0500, Noah Misch wrote:
On Wed, Dec 31, 2014 at 12:32:37AM -0500, Robert Haas wrote:
On Sun, Dec 28, 2014 at 4:58 PM, Noah Misch <noah@leadboat.com> wrote:
I wondered whether to downgrade FATAL to LOG in back branches. Introducing a
new reason to block startup is disruptive for a minor release, but having the
postmaster deadlock at an unpredictable later time is even more disruptive. I
am inclined to halt startup that way in all branches.Jeepers. I'd rather not do that. From your report, this problem has
been around for years. Yet, as far as I know, it's bothering very few
real users, some of whom might be far more bothered by the postmaster
suddenly failing to start. I'm fine with a FATAL in master, but I'd
vote against doing anything that might prevent startup in the
back-branches without more compelling justification.Clusters hosted on OS X fall into these categories:
1) Unaffected configuration. This includes everyone setting a valid messages
locale via LANG, LC_ALL or LC_MESSAGES.
2) Affected configuration. Through luck and light use, the cluster would not
experience the crashes/hangs.
3) Cluster would experience the crashes/hangs.DBAs in (3) want the FATAL at startup, but those in (2) want a LOG message
instead. DBAs in (1) don't care. Since intermittent postmaster hangs are far
worse than startup failure, if (2) and (3) have similar population, FATAL is
the better bet. If (2) is sufficiently more populous than (3), then the many
small pricks from startup failure do add up to hurt more than the occasional
postmaster hang. Who knows how that calculation plays out.
The first attached patch, for all branches, adds LOG-level messages and an
assertion. So cassert builds will fail hard, while others won't. The second
patch, for master only, changes the startup-time message to FATAL. If we
decide to use FATAL in all branches, I would just squash them into one.
Attachments:
darwin-check-multithreaded-v1.patchtext/plain; charset=us-asciiDownload
diff --git a/configure b/configure
index 7594401..f7dedc0 100755
--- a/configure
+++ b/configure
@@ -11366,7 +11366,7 @@ fi
LIBS_including_readline="$LIBS"
LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
-for ac_func in cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l
+for ac_func in cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l
do :
as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index 0dc3f18..76c6405 100644
--- a/configure.in
+++ b/configure.in
@@ -1265,7 +1265,7 @@ PGAC_FUNC_GETTIMEOFDAY_1ARG
LIBS_including_readline="$LIBS"
LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
-AC_CHECK_FUNCS([cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l])
+AC_CHECK_FUNCS([cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat pthread_is_threaded_np readlink setproctitle setsid shm_open sigprocmask symlink sync_file_range towlower utime utimes wcstombs wcstombs_l])
AC_REPLACE_FUNCS(fseeko)
case $host_os in
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 5106f52..6b0190c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -87,6 +87,10 @@
#include <dns_sd.h>
#endif
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+#include <pthread.h>
+#endif
+
#include "access/transam.h"
#include "access/xlog.h"
#include "bootstrap/bootstrap.h"
@@ -1200,6 +1204,24 @@ PostmasterMain(int argc, char *argv[])
*/
RemovePgTempFiles();
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+
+ /*
+ * On Darwin, libintl replaces setlocale() with a version that calls
+ * CFLocaleCopyCurrent() when its second argument is "" and every relevant
+ * environment variable is unset or empty. CFLocaleCopyCurrent() makes
+ * the process multithreaded. The postmaster calls sigprocmask() and
+ * calls fork() without an immediate exec(), both of which have undefined
+ * behavior in a multithreaded program. A multithreaded postmaster is the
+ * normal case on Windows, which offers neither fork() nor sigprocmask().
+ */
+ if (pthread_is_threaded_np() != 0)
+ ereport(LOG,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("postmaster became multithreaded during startup"),
+ errhint("Set the LC_ALL environment variable to a valid locale.")));
+#endif
+
/*
* Remember postmaster startup time
*/
@@ -1657,6 +1679,15 @@ ServerLoop(void)
last_touch_time = now;
}
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+
+ /*
+ * With assertions enabled, check regularly for appearance of
+ * additional threads. All builds check at start and exit.
+ */
+ Assert(pthread_is_threaded_np() == 0);
+#endif
+
/*
* If we already sent SIGQUIT to children and they are slow to shut
* down, it's time to send them SIGKILL. This doesn't happen
@@ -4745,6 +4776,18 @@ SubPostmasterMain(int argc, char *argv[])
static void
ExitPostmaster(int status)
{
+#ifdef HAVE_PTHREAD_IS_THREADED_NP
+
+ /*
+ * There is no known cause for a postmaster to become multithreaded after
+ * startup. Recheck to account for the possibility of unknown causes.
+ */
+ if (pthread_is_threaded_np() != 0)
+ ereport(LOG,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("postmaster became multithreaded")));
+#endif
+
/* should cleanup shared memory and kill all backends */
/*
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 465281c..83f04e9 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -394,6 +394,9 @@
/* Define to 1 if the PS_STRINGS thing exists. */
#undef HAVE_PS_STRINGS
+/* Define to 1 if you have the `pthread_is_threaded_np' function. */
+#undef HAVE_PTHREAD_IS_THREADED_NP
+
/* Define to 1 if you have the <pwd.h> header file. */
#undef HAVE_PWD_H
darwin-multithreaded-fatal-v1.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6b0190c..96d54e8 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1216,7 +1216,7 @@ PostmasterMain(int argc, char *argv[])
* normal case on Windows, which offers neither fork() nor sigprocmask().
*/
if (pthread_is_threaded_np() != 0)
- ereport(LOG,
+ ereport(FATAL,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("postmaster became multithreaded during startup"),
errhint("Set the LC_ALL environment variable to a valid locale.")));
@@ -4781,11 +4781,14 @@ ExitPostmaster(int status)
/*
* There is no known cause for a postmaster to become multithreaded after
* startup. Recheck to account for the possibility of unknown causes.
+ * This message uses LOG level, because an unclean shutdown at this point
+ * would usually not look much different from a clean shutdown.
*/
if (pthread_is_threaded_np() != 0)
ereport(LOG,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("postmaster became multithreaded")));
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("postmaster became multithreaded"),
+ errdetail("Please report this to <pgsql-bugs@postgresql.org>.")));
#endif
/* should cleanup shared memory and kill all backends */
On Fri, Jan 2, 2015 at 1:04 PM, Noah Misch <noah@leadboat.com> wrote:
The first attached patch, for all branches, adds LOG-level messages and an
assertion. So cassert builds will fail hard, while others won't. The second
patch, for master only, changes the startup-time message to FATAL. If we
decide to use FATAL in all branches, I would just squash them into one.
+ errdetail("Please report this to <pgsql-bugs@postgresql.org>.")));
Er, is mentioning a mailing list in an error message really necessary?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jan 05, 2015 at 02:25:09PM +0900, Michael Paquier wrote:
On Fri, Jan 2, 2015 at 1:04 PM, Noah Misch <noah@leadboat.com> wrote:
The first attached patch, for all branches, adds LOG-level messages and an
assertion. So cassert builds will fail hard, while others won't. The second
patch, for master only, changes the startup-time message to FATAL. If we
decide to use FATAL in all branches, I would just squash them into one.+ errdetail("Please report this to <pgsql-bugs@postgresql.org>.")));
Er, is mentioning a mailing list in an error message really necessary?
Necessary? No.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 31, 2014 at 01:52:49PM -0500, Noah Misch wrote:
On Sun, Dec 28, 2014 at 07:20:04PM -0500, Andrew Dunstan wrote:
On 12/28/2014 04:58 PM, Noah Misch wrote:
The gettext maintainer was open to implementing the setlocale_native_forked()
technique in gettext, though the last visible progress was in October. In any
event, PostgreSQL builds will see older gettext for several years. If
setlocale-darwin-fork-v1.patch is not wanted, I suggest making the postmaster
check during startup whether it has become multithreaded. If multithreaded:FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.I would like to go ahead and commit setlocale-main-harden-v1.patch, which is a
good thing to have regardless of what happens with gettext.I'm OK with this, but on its own it won't fix orangutan's problems, will it?
Right; setlocale-main-harden-v1.patch fixes a bug not affecting orangutan at
all. None of the above will make orangutan turn green. Checking
multithreading during startup would merely let it fail cleanly.
OS X --enable-nls buildfarm members should run tests under LANG=C instead of
with locale environment variables unset (make check NO_LOCALE=1). I see two
ways to arrange that: (1) add a build-farm.conf option, or (2) have
pg_regress.c:initialize_environment() treat OS X like Windows. I mildly favor
(2); see attached, untested patch. Windows and OS X --enable-nls share the
characteristic that setlocale(LC_x, "") consults sources other than
environment variables. (I do wonder why commit 4a6fd46 used LANG=en instead
of LANG=C.) On the other hand, LANG=en has been inessential on Windows ever
since "pg_regress --no-locale" started to use "initdb --no-locale". While I
prefer to see the LANG= hack go away rather than proliferate, I can't cite a
practical reason to care.
Thanks,
nm
Attachments:
darwin-nolocale-v1.patchtext/plain; charset=us-asciiDownload
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index e8c644b..e55835e 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -790,9 +790,16 @@ initialize_environment(void)
unsetenv("LC_NUMERIC");
unsetenv("LC_TIME");
unsetenv("LANG");
- /* On Windows the default locale cannot be English, so force it */
-#if defined(WIN32) || defined(__CYGWIN__)
- putenv("LANG=en");
+ /*
+ * Most platforms have adopted the POSIX locale as their
+ * implementation-defined default locale. Exceptions include native
+ * Windows, Darwin with --enable-nls, and Cygwin with --enable-nls.
+ * (Use of --enable-nls matters because libintl replaces setlocale().)
+ * Also, PostgreSQL does not support Darwin with locale environment
+ * variables unset; see PostmasterMain().
+ */
+#if defined(WIN32) || defined(__CYGWIN__) || defined(__darwin__)
+ putenv("LANG=C");
#endif
}
On 1/1/15 11:04 PM, Noah Misch wrote:
Clusters hosted on OS X fall into these categories:
1) Unaffected configuration. This includes everyone setting a valid messages
locale via LANG, LC_ALL or LC_MESSAGES.
2) Affected configuration. Through luck and light use, the cluster would not
experience the crashes/hangs.
3) Cluster would experience the crashes/hangs.DBAs in (3) want the FATAL at startup, but those in (2) want a LOG message
instead. DBAs in (1) don't care. Since intermittent postmaster hangs are far
worse than startup failure, if (2) and (3) have similar population, FATAL is
the better bet. If (2) is sufficiently more populous than (3), then the many
small pricks from startup failure do add up to hurt more than the occasional
postmaster hang. Who knows how that calculation plays out.The first attached patch, for all branches, adds LOG-level messages and an
assertion. So cassert builds will fail hard, while others won't. The second
patch, for master only, changes the startup-time message to FATAL. If we
decide to use FATAL in all branches, I would just squash them into one.
What I'm seeing now is that the unaccent regression tests when run under
make check-world abort with
FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.
My locale settings for this purpose are
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
The environment variable LANG is set, all the other ones are not set.
I could presumably set LC_ALL, but then I lose the ability to set
different locales for different categories.
Running
LC_ALL=$LANG make -C contrib/unaccent check
doesn't fix it; I get the same error.
The behavior is also a bit strange. The step
============== starting postmaster ==============
hangs for one minute before failing. This probably has nothing to do
with your change, but maybe pg_regress could detect postmaster startup
failures better.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 14, 2015 at 04:48:53PM -0500, Peter Eisentraut wrote:
What I'm seeing now is that the unaccent regression tests when run under
make check-world abort withFATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.
contrib/unaccent/Makefile sets NO_LOCALE=1, so that makes sense. I expect the
patch over here will fix it:
/messages/by-id/20150109063015.GA2491320@tornado.leadboat.com
The behavior is also a bit strange. The step
============== starting postmaster ==============
hangs for one minute before failing. This probably has nothing to do
with your change, but maybe pg_regress could detect postmaster startup
failures better.
Yeah, I think any postmaster startup failure has that effect. Not ideal.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jan 15, 2015 at 1:04 AM, Noah Misch <noah@leadboat.com> wrote:
On Wed, Jan 14, 2015 at 04:48:53PM -0500, Peter Eisentraut wrote:
What I'm seeing now is that the unaccent regression tests when run under
make check-world abort withFATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.contrib/unaccent/Makefile sets NO_LOCALE=1, so that makes sense. I expect the
patch over here will fix it:
/messages/by-id/20150109063015.GA2491320@tornado.leadboat.com
I just hit this same problem; are you going to commit that patch soon?
It's rather annoying to have make check-world fail.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jan 15, 2015 at 09:24:01AM -0500, Robert Haas wrote:
On Thu, Jan 15, 2015 at 1:04 AM, Noah Misch <noah@leadboat.com> wrote:
On Wed, Jan 14, 2015 at 04:48:53PM -0500, Peter Eisentraut wrote:
What I'm seeing now is that the unaccent regression tests when run under
make check-world abort withFATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.contrib/unaccent/Makefile sets NO_LOCALE=1, so that makes sense. I expect the
patch over here will fix it:
/messages/by-id/20150109063015.GA2491320@tornado.leadboat.comI just hit this same problem; are you going to commit that patch soon?
It's rather annoying to have make check-world fail.
Sure, done. Dave, orangutan should now be able to pass with --enable-nls.
Would you restore that option?
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 16 January 2015 at 01:33, Noah Misch <noah@leadboat.com> wrote:
On Thu, Jan 15, 2015 at 09:24:01AM -0500, Robert Haas wrote:
On Thu, Jan 15, 2015 at 1:04 AM, Noah Misch <noah@leadboat.com> wrote:
On Wed, Jan 14, 2015 at 04:48:53PM -0500, Peter Eisentraut wrote:
What I'm seeing now is that the unaccent regression tests when run
under
make check-world abort with
FATAL: postmaster became multithreaded during startup
HINT: Set the LC_ALL environment variable to a valid locale.contrib/unaccent/Makefile sets NO_LOCALE=1, so that makes sense. I
expect the
patch over here will fix it:
/messages/by-id/20150109063015.GA2491320@tornado.leadboat.com
I just hit this same problem; are you going to commit that patch soon?
It's rather annoying to have make check-world fail.Sure, done. Dave, orangutan should now be able to pass with --enable-nls.
Would you restore that option?
I can, but is this for HEAD or all versions ?
On Fri, Jan 16, 2015 at 05:43:44AM -0500, Dave Cramer wrote:
On 16 January 2015 at 01:33, Noah Misch <noah@leadboat.com> wrote:
Sure, done. Dave, orangutan should now be able to pass with --enable-nls.
Would you restore that option?I can, but is this for HEAD or all versions ?
All versions.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers