BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG
The following bug has been logged on the website:
Bug reference: 17946
Logged by: Guido Brugnara
Email address: gdo@leader.it
PostgreSQL version: 12.15
Operating system: Ubuntu 20.04
Description:
After upgrading an application using Postgresql from version 10 to 12,
fields of type "money" are no longer generated with the € symbol but with
$.
I identified the problem that occurs when making use of functions with
"LANGUAGE plperl," see with the following queries to be executed in order:
# from shell ...
sudo su -c psql\ postgres postgres <<'__SQL__';
SET lc_monetary TO 'C';
SELECT 12.34::money AS price;
SET lc_monetary TO 'it_IT.UTF-8';
SELECT 12.34::money AS price;
SET lc_monetary TO 'en_GB.UTF-8';
SELECT 12.34::money AS price;
CREATE EXTENSION plperl;
SET lc_monetary TO 'C';
SELECT 12.34::money AS price;
DO LANGUAGE 'plperl' $$ my $rv = spi_exec_query(q{SELECT 12.34::money AS
price;}, 1);elog(NOTICE, $rv->{rows}[0]->{price});$$;
SET lc_monetary TO 'it_IT.UTF-8';
SELECT 12.34::money AS price;
DO LANGUAGE 'plperl' $$ my $rv = spi_exec_query(q{SELECT 12.34::money AS
price;}, 1);elog(NOTICE, $rv->{rows}[0]->{price});$$;
SET lc_monetary TO 'en_GB.UTF-8';
SELECT 12.34::money AS price;
DO LANGUAGE 'plperl' $$ my $rv = spi_exec_query(q{SELECT 12.34::money AS
price;}, 1);elog(NOTICE, $rv->{rows}[0]->{price});$$;
__SQL__
#end.
The first three SELECTs generate content with the currencies Dollar, Euro &
Pound, as expected, while the last three only with Dollar.
It would appear that after first DO LANGUAGE 'plper' call, LC_MONETARY even
if it is varied, has no effect in subsequent queries.
Any suggestions?
PG Bug reporting form <noreply@postgresql.org> writes:
After upgrading an application using Postgresql from version 10 to 12,
fields of type "money" are no longer generated with the € symbol but with
$.
Hmm, seems to work for me:
$ psql
psql (12.15)
Type "help" for help.
postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)
postgres=# DO LANGUAGE 'plperl' $$ my $rv = spi_exec_query(q{SELECT 12.34::money AS
price;}, 1);elog(NOTICE, $rv->{rows}[0]->{price});$$;
NOTICE: £12.34
DO
postgres=# SET lc_monetary TO 'it_IT.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
---------
€ 12,34
(1 row)
postgres=# DO LANGUAGE 'plperl' $$ my $rv = spi_exec_query(q{SELECT 12.34::money AS
price;}, 1);elog(NOTICE, $rv->{rows}[0]->{price});$$;
NOTICE: € 12,34
DO
IIRC, we've seen trouble in the past with some versions of libperl
clobbering the host application's locale settings. Maybe you
have a plperl.on_init or plperl.on_plperl_init action that is
causing that to happen? In any case, I'd call it a Perl bug not
a Postgres bug.
regards, tom lane
On 25/05/2023 15:33, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
After upgrading an application using Postgresql from version 10 to 12,
fields of type "money" are no longer generated with the € symbol but with
$.Hmm, seems to work for me:
I can reproduce this:
psql (16beta1)
Type "help" for help.
postgres=# DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo') $$;
NOTICE: foo
DO
postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
$12.34
(1 row)
If I don't call the plperl function, it works as expected:
sql (16beta1)
Type "help" for help.
postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)
I should note that 'en_GB.UTF-8' is the default locale in my system, and
that's what I used in initdb. I don't know if it makes a difference.
IIRC, we've seen trouble in the past with some versions of libperl
clobbering the host application's locale settings. Maybe you
have a plperl.on_init or plperl.on_plperl_init action that is
causing that to happen? In any case, I'd call it a Perl bug not
a Postgres bug
I did some debugging, initializing the perl interpreter calls uselocale():
#0 __GI___uselocale (newloc=0x7f9f47ff0940 <_nl_C_locobj>) at
./locale/uselocale.c:31
#1 0x00007f9f373bd069 in ?? () from
/usr/lib/x86_64-linux-gnu/libperl.so.5.36
#2 0x00007f9f373bce74 in ?? () from
/usr/lib/x86_64-linux-gnu/libperl.so.5.36
#3 0x00007f9f373bfc15 in Perl_init_i18nl10n () from
/usr/lib/x86_64-linux-gnu/libperl.so.5.36
#4 0x00007f9f48b74cfb in plperl_init_interp () at plperl.c:809
#5 0x00007f9f48b78adc in _PG_init () at plperl.c:483
#6 0x000055c98b8e9b63 in internal_load_library (libname=0x55c98bebaf90
"/home/heikki/pgsql.fsmfork/lib/plperl.so") at dfmgr.c:289
#7 0x000055c98b8ea1c2 in load_external_function
(filename=filename@entry=0x55c98bebb1c0 "$libdir/plperl",
funcname=funcname@entry=0x55c98beba378 "plperl_inline_handler",
signalNotFound=signalNotFound@entry=true,
filehandle=filehandle@entry=0x7ffd20942b48) at dfmgr.c:116
#8 0x000055c98b8ea864 in fmgr_info_C_lang (functionId=129304,
procedureTuple=0x7f9f4778ccb8, finfo=0x7ffd20942bf0) at fmgr.c:386
#9 fmgr_info_cxt_security (functionId=129304, finfo=0x7ffd20942bf0,
mcxt=<optimized out>, ignore_security=<optimized out>) at fmgr.c:246
#10 0x000055c98b8eba72 in fmgr_info (finfo=0x7ffd20942bf0,
functionId=<optimized out>) at fmgr.c:129
#11 OidFunctionCall1Coll (functionId=<optimized out>,
collation=collation@entry=0, arg1=94324124262840) at fmgr.c:1386
#12 0x000055c98b5e1385 in ExecuteDoStmt
(pstate=pstate@entry=0x55c98beba0b0, stmt=stmt@entry=0x55c98be90858,
atomic=atomic@entry=false) at functioncmds.c:2144
#13 0x000055c98b7c24ce in standard_ProcessUtility (pstmt=0x55c98be908e0,
queryString=0x55c98be8fd50 "DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo')
$$;", readOnlyTree=<optimized out>,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x55c98be90b80, qc=0x7ffd20942f30) at utility.c:714
#14 0x000055c98b7c0d9f in PortalRunUtility
(portal=portal@entry=0x55c98bf0b710, pstmt=pstmt@entry=0x55c98be908e0,
isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false, dest=0x55c98be90b80,
qc=0x7ffd20942f30) at pquery.c:1158
#15 0x000055c98b7c0ecb in PortalRunMulti
(portal=portal@entry=0x55c98bf0b710, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x55c98be90b80,
altdest=altdest@entry=0x55c98be90b80, qc=qc@entry=0x7ffd20942f30)
at pquery.c:1322
#16 0x000055c98b7c139d in PortalRun (portal=portal@entry=0x55c98bf0b710,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true,
dest=dest@entry=0x55c98be90b80,
altdest=altdest@entry=0x55c98be90b80, qc=0x7ffd20942f30) at pquery.c:791
#17 0x000055c98b7bd85d in exec_simple_query (query_string=0x55c98be8fd50
"DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo') $$;") at postgres.c:1274
#18 0x000055c98b7bf978 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4632
#19 0x000055c98b73f743 in BackendRun (port=<optimized out>,
port=<optimized out>) at postmaster.c:4461
#20 BackendStartup (port=<optimized out>) at postmaster.c:4189
#21 ServerLoop () at postmaster.c:1779
#22 0x000055c98b74077a in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x55c98be88fc0) at postmaster.c:1463
#23 0x000055c98b4a96be in main (argc=3, argv=0x55c98be88fc0) at main.c:198
I think the uselocale() call renders ineffective the setlocale() calls
that we make later. Maybe we should replace our setlocale() calls with
uselocale(), too.
--
Heikki Linnakangas
Neon (https://neon.tech)
On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
On 25/05/2023 15:33, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
After upgrading an application using Postgresql from version 10 to 12,
fields of type "money" are no longer generated with the € symbol but with
$.Hmm, seems to work for me:
I can reproduce this:
psql (16beta1)
Type "help" for help.postgres=# DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo') $$;
NOTICE: foo
DO
postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
$12.34
(1 row)If I don't call the plperl function, it works as expected:
sql (16beta1)
Type "help" for help.postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)I should note that 'en_GB.UTF-8' is the default locale in my system, and
that's what I used in initdb. I don't know if it makes a difference.
I am looking into this bug. I have also reproduced it.
--
Tristan Partin
Neon (https://neon.tech)
On 6/9/23 11:31, Tristan Partin wrote:
On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
On 25/05/2023 15:33, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
After upgrading an application using Postgresql from version 10 to 12,
fields of type "money" are no longer generated with the € symbol but with
$.Hmm, seems to work for me:
I can reproduce this:
psql (16beta1)
Type "help" for help.postgres=# DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo') $$;
NOTICE: foo
DO
postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
$12.34
(1 row)If I don't call the plperl function, it works as expected:
sql (16beta1)
Type "help" for help.postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)I should note that 'en_GB.UTF-8' is the default locale in my system, and
that's what I used in initdb. I don't know if it makes a difference.I am looking into this bug. I have also reproduced it.
It reproduces for me on both pg16beta1 and pg10. I wonder if it isn't a
behavior change in libperl itself. It seems that merely doing "load
'plperl';" is enough to cause the issue as long as it is done prior to
doing "SET lc_monetary TO 'en_GB.UTF-8'; SELECT 12.34::money AS price;".
When done in the opposite order the problem does not occur.
8<------------------------------
# On pg10 with perl v5.34.0
# note that on my system
# LC_NUMERIC=""
# LC_ALL=""
# LANG="en_US.UTF-8"
#
# this works correctly
psql nmx << EOF
SET lc_monetary TO 'en_GB.UTF-8';
SELECT 12.34::money AS price;
load 'plperl';
SELECT 12.34::money AS price;
EOF
SET
price
--------
£12.34
(1 row)
LOAD
price
--------
£12.34
(1 row)
# this does not
psql nmx << EOF
SET lc_monetary TO 'en_GB.UTF-8';
load 'plperl';
SELECT 12.34::money AS price;
EOF
SET
LOAD
price
--------
$12.34
(1 row)
8<------------------------------
Since I am also seeing this on pg10, I wonder if it is a change in
perl.I found this[1]"locale changes in 5.19.1 break LC_NUMERIC handling" https://github.com/Perl/perl5/issues/13089 -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com:
"What did change is that perl space code no
longer pays attention to the LC_NUMERIC
category outside 'use locale'. This is the way
it has always worked, AFAIK, for LC_COLLATE
and, mostly, LC_CTYPE, and for some uses of
LC_NUMERIC."
[1]: "locale changes in 5.19.1 break LC_NUMERIC handling" https://github.com/Perl/perl5/issues/13089 -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
handling"
https://github.com/Perl/perl5/issues/13089
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 6/9/23 15:05, Joe Conway wrote:
I wonder if it isn't a behavior change in libperl itself. It seems
that merely doing "load 'plperl';" is enough to cause the issue
I can reproduce with a simple test program by linking libperl:
8<-------- test.c ----------------
#include <locale.h>
#include <stdio.h>
#define off64_t __off64_t
#include <EXTERN.h>
#include <perl.h>
int
main(int argc, char *argv[])
{
struct lconv *extlconv;
#ifdef WITH_PERL
PerlInterpreter *plperl;
plperl = perl_alloc();
perl_construct(plperl);
#endif
setlocale(LC_MONETARY, "en_GB.UTF-8");
extlconv = localeconv();
printf("currency symbol = \"%s\"\n",
extlconv->currency_symbol);
return 0;
}
8<-------- test.c ----------------
Adjust the perl paths to suit:
8<------------------------
gcc -O0 -ggdb3 -o test \
-I /usr/lib64/perl5/CORE \
-lperl \
test.c
./test
currency symbol = "£"
gcc -O0 -ggdb3 -o test \
-I /usr/lib64/perl5/CORE \
-lperl -DWITH_PERL \
test.c
./test
currency symbol = "$"
8<------------------------
It happens because somehow loading libperl prevents localeconv() from
returning the correct values, even though libperl only seems to call
"setlocale(LC_ALL, NULL)" which ought not change anything.
8<------------------------
gdb ./test
Reading symbols from ./test...
(gdb) b setlocale
Breakpoint 1 at 0x10f0
(gdb) r
Starting program: /opt/src/pgsql-
Breakpoint 1, __GI_setlocale (category=6, locale=0x0) at
./locale/setlocale.c:218
218 ./locale/setlocale.c: No such file or directory.
(gdb) bt
#0 __GI_setlocale (category=6, locale=0x0) at ./locale/setlocale.c:218
#1 0x00007ffff7d96b97 in Perl_init_i18nl10n () from
/lib/x86_64-linux-gnu/libperl.so.5.34
#2 0x0000555555555225 in main (argc=1, argv=0x7fffffffe1d8) at test.c:18
(gdb) c
Continuing.
Breakpoint 1, __GI_setlocale (category=4, locale=0x55555555602e
"en_GB.UTF-8") at ./locale/setlocale.c:218
218 in ./locale/setlocale.c
(gdb) bt
#0 __GI_setlocale (category=4, locale=0x55555555602e "en_GB.UTF-8") at
./locale/setlocale.c:218
#1 0x0000555555555239 in main (argc=1, argv=0x7fffffffe1d8) at test.c:20
main (argc=1, argv=0x7fffffffe1d8) at test.c:21
21 extlconv = localeconv();
(gdb)
22 printf("currency symbol = \"%s\"\n",
(gdb)
currency symbol = "$"
24 return 0;
(gdb)
8<------------------------
Will continue to dig in the morning.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 6/9/23 22:10, Joe Conway wrote:
On 6/9/23 15:05, Joe Conway wrote:
I wonder if it isn't a behavior change in libperl itself. It seems
that merely doing "load 'plperl';" is enough to cause the issueI can reproduce with a simple test program by linking libperl:
8<-------- test.c ----------------
A bit more spelunking leads me to the following observations and
conclusions:
1/ On RHEL7 with perl v5.16.3 the problem does not occur
2/ On RHEL9 with perl v5.32.1 the problem does occur
3/ The difference in behavior is triggered by the newer perl doing a
bunch of newlocale/uselocale calls not done by the older perl, combined
with a glibc behavior which seems surprising at best.
From localeinfo.h in glibc source tree:
8<------------------------
/* This fetches the thread-local locale_t pointer, either one set with
uselocale or &_nl_global_locale. */
#define _NL_CURRENT_LOCALE (__libc_tsd_get (locale_t, LOCALE))
8<------------------------
4/ I successfully tested a fix in the simplified reproducer program sent
earlier. It amounts to adding:
8<------------------------
uselocale(LC_GLOBAL_LOCALE);
8<------------------------
prior to calling
8<------------------------
extlconv = localeconv();
8<------------------------
5/ The attached fixes the issue for me on pg10 and passes check-world.
Comments?
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
20230610-plperl_locale_issue-000.patchtext/x-patch; charset=UTF-8; name=20230610-plperl_locale_issue-000.patchDownload
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index bbf100e..560a324 100644
*** a/src/backend/utils/adt/pg_locale.c
--- b/src/backend/utils/adt/pg_locale.c
*************** PGLC_localeconv(void)
*** 562,567 ****
--- 562,574 ----
setlocale(LC_CTYPE, locale_numeric);
#endif
+ /*
+ * Ensure the global locale is being used. This is
+ * necessary, for example, if another loaded library
+ * such as libperl has done uselocale() underneath us.
+ * */
+ uselocale(LC_GLOBAL_LOCALE);
+
/* Get formatting information for numeric */
setlocale(LC_NUMERIC, locale_numeric);
extlconv = localeconv();
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index e7f31f2..03d6623 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*************** setDecimalLocale(void)
*** 3402,3407 ****
--- 3402,3413 ----
{
struct lconv *extlconv;
+ /*
+ * Ensure the global locale is being used. This is
+ * necessary, for example, if another loaded library
+ * such as libperl has done uselocale() underneath us.
+ * */
+ uselocale(LC_GLOBAL_LOCALE);
extlconv = localeconv();
/* Don't accept an empty decimal_point string */
Joe Conway <mail@joeconway.com> writes:
5/ The attached fixes the issue for me on pg10 and passes check-world.
Comments?
The call in PGLC_localeconv seems *very* oddly placed. Why not
do that before it does any other locale calls? Otherwise you don't
really have reason to believe you're saving the appropriate
values to restore later.
regards, tom lane
On 6/10/23 14:42, Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
5/ The attached fixes the issue for me on pg10 and passes check-world.
Comments?The call in PGLC_localeconv seems *very* oddly placed. Why not
do that before it does any other locale calls? Otherwise you don't
really have reason to believe you're saving the appropriate
values to restore later.
As far as I can tell it really only affects localeconv(), so I tried to
place it close to those. But I am fine with moving it up.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 6/10/23 15:07, Joe Conway wrote:
On 6/10/23 14:42, Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
5/ The attached fixes the issue for me on pg10 and passes check-world.
Comments?The call in PGLC_localeconv seems *very* oddly placed. Why not
do that before it does any other locale calls? Otherwise you don't
really have reason to believe you're saving the appropriate
values to restore later.As far as I can tell it really only affects localeconv(), so I tried to
place it close to those. But I am fine with moving it up.
This version is against pg16 (rather than pg10), moves up that hunk,
mentions localeconv() in the comment as the reason for the call, and
fixes some whitespace sloppiness. I will plan to apply to all supported
branches.
Better?
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
20230610-plperl_locale_issue-001.patchtext/x-patch; charset=UTF-8; name=20230610-plperl_locale_issue-001.patchDownload
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 31e3b16..0cd3237 100644
*** a/src/backend/utils/adt/pg_locale.c
--- b/src/backend/utils/adt/pg_locale.c
*************** PGLC_localeconv(void)
*** 505,510 ****
--- 505,517 ----
}
/*
+ * Ensure the global locale is being used by localeconv().
+ * This is necessary, for example, if another loaded library
+ * such as libperl has done uselocale() underneath us.
+ */
+ uselocale(LC_GLOBAL_LOCALE);
+
+ /*
* This is tricky because we really don't want to risk throwing error
* while the locale is set to other than our usual settings. Therefore,
* the process is: collect the usual settings, set locale to special
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 7af1ccb..4ef4d9e 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*************** setDecimalLocale(void)
*** 3628,3633 ****
--- 3628,3639 ----
{
struct lconv *extlconv;
+ /*
+ * Ensure the global locale is being used by localeconv().
+ * This is necessary, for example, if another loaded library
+ * has done uselocale() underneath us.
+ */
+ uselocale(LC_GLOBAL_LOCALE);
extlconv = localeconv();
/* Don't accept an empty decimal_point string */
Joe Conway <mail@joeconway.com> writes:
This version is against pg16 (rather than pg10), moves up that hunk,
mentions localeconv() in the comment as the reason for the call, and
fixes some whitespace sloppiness. I will plan to apply to all supported
branches.
Better?
WFM.
regards, tom lane
On 10/06/2023 22:28, Joe Conway wrote:
On 6/10/23 15:07, Joe Conway wrote:
On 6/10/23 14:42, Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
5/ The attached fixes the issue for me on pg10 and passes check-world.
Comments?The call in PGLC_localeconv seems *very* oddly placed. Why not
do that before it does any other locale calls? Otherwise you don't
really have reason to believe you're saving the appropriate
values to restore later.As far as I can tell it really only affects localeconv(), so I tried to
place it close to those. But I am fine with moving it up.This version is against pg16 (rather than pg10), moves up that hunk,
mentions localeconv() in the comment as the reason for the call, and
fixes some whitespace sloppiness. I will plan to apply to all supported
branches.Better?
The man page for uselocale(LC_GLOBAL_LOCALE) says: "The calling thread's
current locale is set to the global locale determined by setlocale(3)."
Does that undo the effect of calling uselocale() previously, so if you
later call setlocale(), the new locale takes effect in the thread too?
Or is it equivalent to "uselocale(LC_ALL, setlocale(NULL))", so that it
sets the thread's locale to the current global locale, but later
setlocale() calls have no effect on it?
In any case, this still doesn't feel like the right place. We have many
more setlocale() calls. Shouldn't we sprinkle them all with
uselocale(LC_GLOBAL_LOCALE)? cache_locale_time() for example. Or rather,
all the places where we use any functions that depend on the current locale.
How about we replace all setlocale() calls with uselocale()?
Shouldn't we restore the old thread-specific locale after the calls? I'm
not sure why libperl calls uselocale(), but we are now overwriting the
locale that it sets. We have a few other uselocale() calls in
pg_locale.c, and we take care to restore the old locale in those.
There are a few uselocale() calls in ecpg, and they are protected by
HAVE_USELOCALE. Interestingly, the calls in pg_locale.c are not, but
they are protected by HAVE_LOCALE_T. Seems a little inconsistent.
--
Heikki Linnakangas
Neon (https://neon.tech)
(moving to hackers)
On 6/12/23 05:13, Heikki Linnakangas wrote:
On 10/06/2023 22:28, Joe Conway wrote:
On 6/10/23 15:07, Joe Conway wrote:
On 6/10/23 14:42, Tom Lane wrote:
Joe Conway <mail@joeconway.com> writes:
5/ The attached fixes the issue for me on pg10 and passes check-world.
Comments?The call in PGLC_localeconv seems *very* oddly placed. Why not
do that before it does any other locale calls? Otherwise you don't
really have reason to believe you're saving the appropriate
values to restore later.As far as I can tell it really only affects localeconv(), so I tried to
place it close to those. But I am fine with moving it up.This version is against pg16 (rather than pg10), moves up that hunk,
mentions localeconv() in the comment as the reason for the call, and
fixes some whitespace sloppiness. I will plan to apply to all supported
branches.Better?
The man page for uselocale(LC_GLOBAL_LOCALE) says: "The calling thread's
current locale is set to the global locale determined by setlocale(3)."
Does that undo the effect of calling uselocale() previously, so if you
later call setlocale(), the new locale takes effect in the thread too?
Or is it equivalent to "uselocale(LC_ALL, setlocale(NULL))", so that it
sets the thread's locale to the current global locale, but later
setlocale() calls have no effect on it?
setlocale() changes the global locale, but uselocale() changes the
locale that is currently active, as I understand it.
Also note that uselocale man page says "Unlike setlocale(3), uselocale()
does not allow selective replacement of individual locale categories.
To employ a locale that differs in only a few categories from the
current locale, use calls to duplocale(3) and newlocale(3) to obtain a
locale object equivalent to the current locale and modify the desired
categories in that object."
In any case, this still doesn't feel like the right place. We have many
more setlocale() calls. Shouldn't we sprinkle them all with
uselocale(LC_GLOBAL_LOCALE)? cache_locale_time() for example. Or rather,
all the places where we use any functions that depend on the current locale.How about we replace all setlocale() calls with uselocale()?
I don't see us backpatching something that invasive. It might be the
right thing to do for pg17, or even pg16, but I think that is a
different discussion
Shouldn't we restore the old thread-specific locale after the calls? I'm
not sure why libperl calls uselocale(), but we are now overwriting the
locale that it sets.
That is a good question. Though arguably perl is doing the wrong thing
by not resetting the global locale when it is being used embedded.
We have a few other uselocale() calls in pg_locale.c, and we take
care to restore the old locale in those.
I think as long as we are relying on setlocale rather than uselocale in
general (see above), the global locale is where we want things left.
There are a few uselocale() calls in ecpg, and they are protected by
HAVE_USELOCALE. Interestingly, the calls in pg_locale.c are not, but
they are protected by HAVE_LOCALE_T. Seems a little inconsistent.
Possibly something we should clean up, but I think that is separate from
this fix.
In general I think we have 2 or possibly three distinct things here:
1/ how do we fix the misbehavior reported due to libperl in existing
stable branches
2/ what makes most sense going forward (and does that mean pg16 or pg17)
3/ misc code cleanups
I was mostly trying to concentrate on #1, but 2 & 3 are worthy of
discussion.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 6/12/23 10:44, Joe Conway wrote:
1/ how do we fix the misbehavior reported due to libperl in existing
stable branches
<snip>
I was mostly trying to concentrate on #1, but 2 & 3 are worthy of
discussion.
Hmm, browsing through the perl source I came across a reference to this
(from https://perldoc.perl.org/perllocale):
---------------
PERL_SKIP_LOCALE_INIT
This environment variable, available starting in Perl v5.20, if set
(to any value), tells Perl to not use the rest of the environment
variables to initialize with. Instead, Perl uses whatever the current
locale settings are. This is particularly useful in embedded
environments, see "Using embedded Perl with POSIX locales" in perlembed.
---------------
Seems we ought to be using that.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon Jun 12, 2023 at 4:13 AM CDT, Heikki Linnakangas wrote:
There are a few uselocale() calls in ecpg, and they are protected by
HAVE_USELOCALE. Interestingly, the calls in pg_locale.c are not, but
they are protected by HAVE_LOCALE_T. Seems a little inconsistent.
Patch is attached. CC-ing hackers.
--
Tristan Partin
Neon (https://neon.tech)
Attachments:
v1-0001-Make-uselocale-protection-more-consistent.patchtext/x-patch; charset=utf-8; name=v1-0001-Make-uselocale-protection-more-consistent.patchDownload
From 02a2cdb83405b5aecaec2af02e379a81161f8372 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Wed, 14 Jun 2023 11:36:00 -0500
Subject: [PATCH v1] Make uselocale protection more consistent
In ecpg, uselocale uses are protected by checking if HAVE_USELOCALE is
defined. Use the same check in pg_locale.c. Since HAVE_USELOCALE implies
HAVE_LOCALE_T, the code should be the same on _all_ platforms that
Postgres supports. Otherwise, I am sure there would have been a bug
report with pg_locale.c failing to build due to the system having
locale_t, but not uselocale.
---
src/backend/utils/adt/pg_locale.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 31e3b16ae0..3585afb298 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -2972,7 +2972,7 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
}
else
{
-#ifdef HAVE_LOCALE_T
+#ifdef HAVE_USELOCALE
#ifdef HAVE_WCSTOMBS_L
/* Use wcstombs_l for nondefault locales */
result = wcstombs_l(to, from, tolen, locale->info.lt);
@@ -2984,11 +2984,11 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
uselocale(save_locale);
#endif /* HAVE_WCSTOMBS_L */
-#else /* !HAVE_LOCALE_T */
- /* Can't have locale != 0 without HAVE_LOCALE_T */
+#else /* !HAVE_USELOCALE */
+ /* Can't have locale != 0 without HAVE_LOCALE_T, which HAVE_USELOCALE implies */
elog(ERROR, "wcstombs_l is not available");
result = 0; /* keep compiler quiet */
-#endif /* HAVE_LOCALE_T */
+#endif /* HAVE_USELOCALE */
}
return result;
@@ -3049,7 +3049,7 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
}
else
{
-#ifdef HAVE_LOCALE_T
+#ifdef HAVE_USELOCALE
#ifdef HAVE_MBSTOWCS_L
/* Use mbstowcs_l for nondefault locales */
result = mbstowcs_l(to, str, tolen, locale->info.lt);
@@ -3061,11 +3061,11 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
uselocale(save_locale);
#endif /* HAVE_MBSTOWCS_L */
-#else /* !HAVE_LOCALE_T */
- /* Can't have locale != 0 without HAVE_LOCALE_T */
+#else /* !HAVE_USELOCALE */
+ /* Can't have locale != 0 without HAVE_LOCALE_T, which HAVE_USELOCALE implies */
elog(ERROR, "mbstowcs_l is not available");
result = 0; /* keep compiler quiet */
-#endif /* HAVE_LOCALE_T */
+#endif /* HAVE_USELOCALE */
}
pfree(str);
--
Tristan Partin
Neon (https://neon.tech)
On 6/12/23 17:28, Joe Conway wrote:
On 6/12/23 10:44, Joe Conway wrote:
1/ how do we fix the misbehavior reported due to libperl in existing
stable branches<snip>
I was mostly trying to concentrate on #1, but 2 & 3 are worthy of
discussion.Hmm, browsing through the perl source I came across a reference to this
(from https://perldoc.perl.org/perllocale):---------------
PERL_SKIP_LOCALE_INITThis environment variable, available starting in Perl v5.20, if set
(to any value), tells Perl to not use the rest of the environment
variables to initialize with. Instead, Perl uses whatever the current
locale settings are. This is particularly useful in embedded
environments, see "Using embedded Perl with POSIX locales" in perlembed.
---------------Seems we ought to be using that.
Turns out that that does nothing useful as far as I can tell.
So I am back to proposing the attached against pg16beta1, to be
backpatched to pg11.
Since much of the discussion happened on pgsql-bugs, the background
summary for hackers is this:
When plperl is first loaded, the init function eventually works its way
to calling Perl_init_i18nl10n(). In versions of perl >= 5.20, that ends
up at S_emulate_setlocale() which does a series of uselocale() calls.
For reference, RHEL 7 is perl 5.16.3 while RHEL 9 is perl 5.32.1. Older
versions of perl do not have this behavior.
The problem with uselocale() is that it changes the active locale away
from the default global locale. Subsequent uses of setlocale() affect
the global locale, but if that is not the active locale, it does not
control the results of locale dependent functions such as localeconv(),
which is what we depend on in PGLC_localeconv().
The result is illustrated in this example:
8<------------
psql test
psql (16beta1)
Type "help" for help.
test=# show lc_monetary;
lc_monetary
-------------
en_GB.UTF-8
(1 row)
test=# SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)
test=# \q
8<------------
psql test
psql (16beta1)
Type "help" for help.
test=# load 'plperl';
LOAD
test=# show lc_monetary;
lc_monetary
-------------
en_GB.UTF-8
(1 row)
test=# SELECT 12.34::money AS price;
price
--------
$12.34
(1 row)
8<------------
Notice that merely loading plperl makes the currency symbol wrong.
I have proposed a targeted fix that I believe is safe to backpatch --
attached.
IIUC, Tom was +1, but Heikki was looking for a more general solution.
My issue with the more general solution is that it will likely be too
invasive to backpatch, and at the moment at least, there are no other
confirmed bugs related to all of this (even if the current code is more
fragile than we would prefer).
I would like to commit this to all supported branches in the next few
days, unless there are other suggestions or objections.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
20230610-plperl_locale_issue-002.patchtext/x-patch; charset=UTF-8; name=20230610-plperl_locale_issue-002.patchDownload
Ensure the global locale gets used by localeconv()
A loaded library, such as libperl, may call uselocale() underneath us.
This can result in localeconv() grabbing the wrong locale for
numeric and monetary symbols and formatting. Fix that by resetting
to the global locale determined by setlocale(). Backpatch to all
supported versions.
Author: Joe Conway
Reviewed-By: Tom Lane
Reported by: Guido Brugnara
Discussion: https://postgr.es/m/flat/17946-3e84cb577e9551c3%40postgresql.org
Backpatch-through: 11
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 31e3b16..9dba161 100644
*** a/src/backend/utils/adt/pg_locale.c
--- b/src/backend/utils/adt/pg_locale.c
*************** PGLC_localeconv(void)
*** 505,510 ****
--- 505,517 ----
}
/*
+ * Ensure the global locale will be used by localeconv().
+ * This is necessary, for example, if another loaded library
+ * such as libperl has done uselocale() underneath us.
+ */
+ uselocale(LC_GLOBAL_LOCALE);
+
+ /*
* This is tricky because we really don't want to risk throwing error
* while the locale is set to other than our usual settings. Therefore,
* the process is: collect the usual settings, set locale to special
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 7af1ccb..5d80e77 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*************** setDecimalLocale(void)
*** 3628,3633 ****
--- 3628,3639 ----
{
struct lconv *extlconv;
+ /*
+ * Ensure the global locale will be used by localeconv().
+ * This is necessary, for example, if another loaded library
+ * has done uselocale() underneath us.
+ */
+ uselocale(LC_GLOBAL_LOCALE);
extlconv = localeconv();
/* Don't accept an empty decimal_point string */
On 18/06/2023 21:27, Joe Conway wrote:
I have proposed a targeted fix that I believe is safe to backpatch --
attached.IIUC, Tom was +1, but Heikki was looking for a more general solution.
My issue with the more general solution is that it will likely be too
invasive to backpatch, and at the moment at least, there are no other
confirmed bugs related to all of this (even if the current code is more
fragile than we would prefer).
Ok, I agree switching to uselocale() everywhere is too much to
backpatch. We should consider it for master though.
With the patch you're proposing, do we now have a coding rule that you
must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to
setlocale()? If so, you missed a few spots: pg_perm_setlocale,
pg_bind_textdomain_codeset, and cache_locale_time.
The current locale affects a lot of other things than localeconv()
calls. For example, LC_MESSAGES affects all strerror() calls. Do we need
to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror()
calls too?
I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after
returning from the perl interpreter, instead of before setlocale()
calls, if we want all Postgres code to run with the global locale. Not
sure how much performance overhead that would have.
I just found out about perl's "switch_to_global_locale" function
(https://perldoc.perl.org/perlapi#switch_to_global_locale). Should we
use that?
Testing the patch, I bumped into this:
postgres=# create or replace function finnish_to_number() returns
numeric as $$ select to_number('1,23', '9D99'); $$ language sql set
lc_numeric to 'fi_FI.utf8';
CREATE FUNCTION
postgres=# DO LANGUAGE 'plperlu' $$
use POSIX qw(setlocale LC_NUMERIC);
use locale;
setlocale LC_NUMERIC, "fi_FI.utf8";
$n = 5/2; # Assign numeric 2.5 to $n
spi_exec_query('SELECT finnish_to_number()');
$a = " $n"; # Locale-dependent conversion to string
elog(NOTICE, "half five is $n"); # Locale-dependent output
$$;
NOTICE: half five is 2,5
DO
postgres=# select to_char(now(), 'Day');
WARNING: could not determine encoding for locale "en_GB.UTF-8": codeset
is "ANSI_X3.4-1968"
to_char
-----------
Tuesday
(1 row)
--
Heikki Linnakangas
Neon (https://neon.tech)
On 6/19/23 19:30, Heikki Linnakangas wrote:
On 18/06/2023 21:27, Joe Conway wrote:
I have proposed a targeted fix that I believe is safe to backpatch --
attached.IIUC, Tom was +1, but Heikki was looking for a more general solution.
My issue with the more general solution is that it will likely be too
invasive to backpatch, and at the moment at least, there are no other
confirmed bugs related to all of this (even if the current code is more
fragile than we would prefer).Ok, I agree switching to uselocale() everywhere is too much to
backpatch. We should consider it for master though.
Makes sense
With the patch you're proposing, do we now have a coding rule that you
must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to
setlocale()? If so, you missed a few spots: pg_perm_setlocale,
pg_bind_textdomain_codeset, and cache_locale_time.
Well I was not proposing such a rule (trying to stay narrowly focused on
the demonstrated issue) but I suppose it might make sense. Anywhere we
use setlocale() we are depending on subsequent locale operations to use
the global locale. And uselocale(LC_GLOBAL_LOCALE) itself looks like it
ought to be pretty cheap.
The current locale affects a lot of other things than localeconv()
calls. For example, LC_MESSAGES affects all strerror() calls. Do we need
to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror()
calls too?
That seems heavy handed
I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after
returning from the perl interpreter, instead of before setlocale()
calls, if we want all Postgres code to run with the global locale. Not
sure how much performance overhead that would have.
I don't see how that is practical, or at least it does not really
address the issue. I think any loaded shared library could cause the
same problem by running newlocale() + uselocale() on init. Perhaps I
should go test that theory though.
I just found out about perl's "switch_to_global_locale" function
(https://perldoc.perl.org/perlapi#switch_to_global_locale). Should we
use that?
Maybe, although it does not seem to exist on the older perl version on
RHEL7. And same comment as above -- while it might solve the problem
with libperl, it doesn't address similar problems with other loaded
shared libraries.
Testing the patch, I bumped into this:
postgres=# create or replace function finnish_to_number() returns
numeric as $$ select to_number('1,23', '9D99'); $$ language sql set
lc_numeric to 'fi_FI.utf8';
CREATE FUNCTION
postgres=# DO LANGUAGE 'plperlu' $$
use POSIX qw(setlocale LC_NUMERIC);
use locale;setlocale LC_NUMERIC, "fi_FI.utf8";
$n = 5/2; # Assign numeric 2.5 to $n
spi_exec_query('SELECT finnish_to_number()');
$a = " $n"; # Locale-dependent conversion to string
elog(NOTICE, "half five is $n"); # Locale-dependent output
$$;
NOTICE: half five is 2,5
DO
postgres=# select to_char(now(), 'Day');
WARNING: could not determine encoding for locale "en_GB.UTF-8": codeset
is "ANSI_X3.4-1968"
to_char
-----------
Tuesday
(1 row)
Do you think that is because uselocale(LC_GLOBAL_LOCALE) pulls out the
rug from under perl?
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 21/06/2023 01:02, Joe Conway wrote:
On 6/19/23 19:30, Heikki Linnakangas wrote:
On 18/06/2023 21:27, Joe Conway wrote:
With the patch you're proposing, do we now have a coding rule that you
must call "uselocale(LC_GLOBAL_LOCALE)" before every and any call to
setlocale()? If so, you missed a few spots: pg_perm_setlocale,
pg_bind_textdomain_codeset, and cache_locale_time.Well I was not proposing such a rule (trying to stay narrowly focused on
the demonstrated issue) but I suppose it might make sense. Anywhere we
use setlocale() we are depending on subsequent locale operations to use
the global locale. And uselocale(LC_GLOBAL_LOCALE) itself looks like it
ought to be pretty cheap.The current locale affects a lot of other things than localeconv()
calls. For example, LC_MESSAGES affects all strerror() calls. Do we need
to call "uselocale(LC_GLOBAL_LOCALE)" before all possible strerror()
calls too?That seems heavy handed
Yet I think that's exactly where this is heading. See this case (for
gettext() rather than strerror()):
postgres=# set lc_messages ='sv_SE.UTF8';
SET
postgres=# this prints syntax error in Swedish;
FEL: syntaxfel vid eller nära "this"
LINE 1: this prints syntax error in Swedish;
^
postgres=# load 'plperl';
LOAD
postgres=# set lc_messages ='en_GB.utf8';
SET
postgres=# this *should* print syntax error in English;
FEL: syntaxfel vid eller nära "this"
LINE 1: this *should* print syntax error in English;
^
I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after
returning from the perl interpreter, instead of before setlocale()
calls, if we want all Postgres code to run with the global locale. Not
sure how much performance overhead that would have.I don't see how that is practical, or at least it does not really
address the issue. I think any loaded shared library could cause the
same problem by running newlocale() + uselocale() on init. Perhaps I
should go test that theory though.
Any shared library could do that, that's true. Any shared library could
also call 'chdir'. But most shared libraries don't. I think it's the
responsibility of the extension that loads the shared library, plperl in
this case, to make sure it doesn't mess up the environment for the
postgres backend.
Testing the patch, I bumped into this:
postgres=# create or replace function finnish_to_number() returns
numeric as $$ select to_number('1,23', '9D99'); $$ language sql set
lc_numeric to 'fi_FI.utf8';
CREATE FUNCTION
postgres=# DO LANGUAGE 'plperlu' $$
use POSIX qw(setlocale LC_NUMERIC);
use locale;setlocale LC_NUMERIC, "fi_FI.utf8";
$n = 5/2; # Assign numeric 2.5 to $n
spi_exec_query('SELECT finnish_to_number()');
$a = " $n"; # Locale-dependent conversion to string
elog(NOTICE, "half five is $n"); # Locale-dependent output
$$;
NOTICE: half five is 2,5
DO
postgres=# select to_char(now(), 'Day');
WARNING: could not determine encoding for locale "en_GB.UTF-8": codeset
is "ANSI_X3.4-1968"
to_char
-----------
Tuesday
(1 row)Do you think that is because uselocale(LC_GLOBAL_LOCALE) pulls out the
rug from under perl?
libperl is fine in this case. But cache_locale_time() also calls
setlocale(), and your patch didn't add the "uselocale(LC_GLOBAL_LOCALE)"
there.
It's a valid concern that "uselocale(LC_GLOBAL_LOCALE)" could pull the
rug from under perl. I tried to find issues like that, by calling
locale-dependent functions in plperl, with SQL functions that call
"uselocale(LC_GLOBAL_LOCALE)" via PGLC_localeconv() in between. But I
couldn't find any case where the perl code would misbehave. I guess
libperl calls uselocale() before any locale-dependent function, but I
didn't look very closely.
--
Heikki Linnakangas
Neon (https://neon.tech)
On 6/22/23 03:26, Heikki Linnakangas wrote:
On 21/06/2023 01:02, Joe Conway wrote:
On 6/19/23 19:30, Heikki Linnakangas wrote:
I think we should call "uselocale(LC_GLOBAL_LOCALE)" immediately after
returning from the perl interpreter, instead of before setlocale()
calls, if we want all Postgres code to run with the global locale. Not
sure how much performance overhead that would have.I don't see how that is practical, or at least it does not really
address the issue. I think any loaded shared library could cause the
same problem by running newlocale() + uselocale() on init. Perhaps I
should go test that theory though.Any shared library could do that, that's true. Any shared library could
also call 'chdir'. But most shared libraries don't. I think it's the
responsibility of the extension that loads the shared library, plperl in
this case, to make sure it doesn't mess up the environment for the
postgres backend.
Ok, fair enough.
The attached fixes all of the issues raised on this thread by
specifically patching plperl.
8<------------
create or replace function finnish_to_number()
returns numeric as
$$
select to_number('1,23', '9D99')
$$ language sql set lc_numeric to 'fi_FI.utf8';
pl_regression=# show lc_monetary;
lc_monetary
-------------
C
(1 row)
DO LANGUAGE 'plperlu'
$$
use POSIX qw(setlocale LC_NUMERIC);
use locale;
setlocale LC_NUMERIC, "fi_FI.utf8";
$n = 5/2; # Assign numeric 2.5 to $n
spi_exec_query('SELECT finnish_to_number()');
# Locale-dependent conversion to string
$a = " $n";
# Locale-dependent output
elog(NOTICE, "half five is $n");
$$;
NOTICE: half five is 2,5
DO
set lc_messages ='sv_SE.UTF8';
this prints syntax error in Swedish;
FEL: syntaxfel vid eller nära "this"
LINE 1: this prints syntax error in Swedish;
^
set lc_messages ='en_GB.utf8';
this *should* print syntax error in English;
ERROR: syntax error at or near "this"
LINE 1: this *should* print syntax error in English;
^
set lc_monetary ='sv_SE.UTF8';
SELECT 12.34::money AS price;
price
----------
12,34 kr
(1 row)
set lc_monetary ='en_GB.UTF8';
SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)
set lc_monetary ='en_US.UTF8';
SELECT 12.34::money AS price;
price
--------
$12.34
(1 row)
8<------------
This works correctly from what I can see -- tested against pg16beta1 on
Linux Mint with perl v5.34.0 as well as against pg15.2 on RHEL 7 with
perl v5.16.3.
Although I have not looked yet, presumably we could have similar
problems with plpython. I would like to get agreement on this approach
against plperl before diving into that though.
Thoughts?
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
plperl_locale_issue-004.patchtext/x-patch; charset=UTF-8; name=plperl_locale_issue-004.patchDownload
Ensure correct locale is used when executing plperl
Newer versions of libperl, via plperl, call uselocale() which
has the effect of changing the current locale away from the
global locale underneath postgres. This can result in, among other
infelicities, localeconv() grabbing the wrong locale for
numeric and monetary symbols and formatting. Fix that by arranging
to capture the perl locale and swapping with the global locale
as appropriate when entering and exiting libperl. Importantly,
this dance is also needed when exiting perl via SPI calls made
while executing perl.
Backpatch to all supported versions.
Author: Joe Conway
Reviewed-By: Tom Lane and Heikki Linnakangas
Reported by: Guido Brugnara
Discussion: https://postgr.es/m/flat/17946-3e84cb577e9551c3%40postgresql.org
Backpatch-through: 11
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 8638642..9831361 100644
*** a/src/pl/plperl/plperl.c
--- b/src/pl/plperl/plperl.c
*************** typedef struct plperl_array_info
*** 223,228 ****
--- 223,233 ----
static HTAB *plperl_interp_hash = NULL;
static HTAB *plperl_proc_hash = NULL;
static plperl_interp_desc *plperl_active_interp = NULL;
+ /*
+ * Newer versions of perl call uselocale() to switch away from
+ * the global locale used by the backend. Store that here.
+ */
+ static locale_t perl_locale_obj = LC_GLOBAL_LOCALE;
/* If we have an unassigned "held" interpreter, it's stored here */
static PerlInterpreter *plperl_held_interp = NULL;
*************** static char *setlocale_perl(int category
*** 302,307 ****
--- 307,314 ----
#define setlocale_perl(a,b) Perl_setlocale(a,b)
#endif /* defined(WIN32) && PERL_VERSION_LT(5, 28, 0) */
+ static void plperl_xact_callback(XactEvent event, void *arg);
+
/*
* Decrement the refcount of the given SV within the active Perl interpreter
*
*************** _PG_init(void)
*** 482,487 ****
--- 489,508 ----
*/
plperl_held_interp = plperl_init_interp();
+ /*
+ * Grab a copy of perl locale in use, and switch back
+ * to the global one. We will need to switch back and
+ * forth, such that the current locale is perl's whenever
+ * we are about to evaluate perl code, and the global
+ * locale whenever we return to Postgres. Note that using
+ * SPI to execute SQL counts as returning to Postgres,
+ * albeit recursively.
+ */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
+ /* Arrange to restore the global locale in case of ERROR */
+ RegisterXactCallback(plperl_xact_callback, NULL);
+
inited = true;
}
*************** plperl_trusted_init(void)
*** 962,967 ****
--- 983,991 ----
char *key;
I32 klen;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* use original require while we set up */
PL_ppaddr[OP_REQUIRE] = pp_require_orig;
PL_ppaddr[OP_DOFILE] = pp_require_orig;
*************** plperl_trusted_init(void)
*** 1028,1033 ****
--- 1052,1060 ----
errmsg("%s", strip_trailing_ws(sv2cstr(ERRSV))),
errcontext("while executing plperl.on_plperl_init")));
}
+
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}
*************** plperl_untrusted_init(void)
*** 1039,1044 ****
--- 1066,1074 ----
{
dTHX;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/*
* Nothing to do except execute plperl.on_plperlu_init
*/
*************** plperl_untrusted_init(void)
*** 1051,1056 ****
--- 1081,1089 ----
errmsg("%s", strip_trailing_ws(sv2cstr(ERRSV))),
errcontext("while executing plperl.on_plperlu_init")));
}
+
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}
*************** plperl_call_handler(PG_FUNCTION_ARGS)
*** 1856,1861 ****
--- 1889,1897 ----
plperl_interp_desc *volatile oldinterp = plperl_active_interp;
plperl_call_data this_call_data;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Initialize current-call status record */
MemSet(&this_call_data, 0, sizeof(this_call_data));
this_call_data.fcinfo = fcinfo;
*************** plperl_call_handler(PG_FUNCTION_ARGS)
*** 1882,1887 ****
--- 1918,1926 ----
}
PG_END_TRY();
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
return retval;
}
*************** plperl_inline_handler(PG_FUNCTION_ARGS)
*** 1902,1907 ****
--- 1941,1949 ----
plperl_call_data this_call_data;
ErrorContextCallback pl_error_context;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Initialize current-call status record */
MemSet(&this_call_data, 0, sizeof(this_call_data));
*************** plperl_inline_handler(PG_FUNCTION_ARGS)
*** 1975,1980 ****
--- 2017,2025 ----
error_context_stack = pl_error_context.previous;
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
PG_RETURN_VOID();
}
*************** plperl_validator(PG_FUNCTION_ARGS)
*** 2045,2051 ****
--- 2090,2102 ----
/* Postpone body checks if !check_function_bodies */
if (check_function_bodies)
{
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
(void) compile_plperl_function(funcoid, is_trigger, is_event_trigger);
+
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}
/* the result of a validator is ignored */
*************** plperl_spi_exec(char *query, int limit)
*** 3153,3160 ****
--- 3204,3218 ----
pg_verifymbstr(query, strlen(query), false);
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
spi_rv = SPI_execute(query, current_call_data->prodesc->fn_readonly,
limit);
+
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
ret_hv = plperl_spi_execute_fetch_result(SPI_tuptable, SPI_processed,
spi_rv);
*************** plperl_spi_exec(char *query, int limit)
*** 3177,3182 ****
--- 3235,3243 ----
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Punt the error to Perl */
croak_cstr(edata->message);
*************** plperl_spi_query(char *query)
*** 3426,3431 ****
--- 3487,3495 ----
/* Make sure the query is validly encoded */
pg_verifymbstr(query, strlen(query), false);
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
/* Create a cursor for the query */
plan = SPI_prepare(query, 0, NULL);
if (plan == NULL)
*************** plperl_spi_query(char *query)
*** 3441,3446 ****
--- 3505,3513 ----
PinPortal(portal);
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Commit the inner transaction, return to outer xact context */
ReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
*************** plperl_spi_query(char *query)
*** 3460,3465 ****
--- 3527,3535 ----
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Punt the error to Perl */
croak_cstr(edata->message);
*************** plperl_spi_prepare(char *query, int argc
*** 3640,3645 ****
--- 3710,3718 ----
/* Make sure the query is validly encoded */
pg_verifymbstr(query, strlen(query), false);
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
/************************************************************
* Prepare the plan and check for errors
************************************************************/
*************** plperl_spi_prepare(char *query, int argc
*** 3649,3654 ****
--- 3722,3730 ----
elog(ERROR, "SPI_prepare() failed:%s",
SPI_result_code_string(SPI_result));
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/************************************************************
* Save the plan into permanent memory (right now it's in the
* SPI procCxt, which will go away at function end).
*************** plperl_spi_prepare(char *query, int argc
*** 3697,3702 ****
--- 3773,3781 ----
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Punt the error to Perl */
croak_cstr(edata->message);
*************** plperl_spi_exec_prepared(char *query, HV
*** 3798,3807 ****
--- 3877,3893 ----
/************************************************************
* go
************************************************************/
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
spi_rv = SPI_execute_plan(qdesc->plan, argvalues, nulls,
current_call_data->prodesc->fn_readonly, limit);
ret_hv = plperl_spi_execute_fetch_result(SPI_tuptable, SPI_processed,
spi_rv);
+
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
if (argc > 0)
{
pfree(argvalues);
*************** plperl_spi_exec_prepared(char *query, HV
*** 3827,3832 ****
--- 3913,3921 ----
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Punt the error to Perl */
croak_cstr(edata->message);
*************** plperl_spi_query_prepared(char *query, i
*** 3911,3918 ****
--- 4000,4014 ----
/************************************************************
* go
************************************************************/
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
portal = SPI_cursor_open(NULL, qdesc->plan, argvalues, nulls,
current_call_data->prodesc->fn_readonly);
+
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
if (argc > 0)
{
pfree(argvalues);
*************** plperl_spi_query_prepared(char *query, i
*** 3945,3950 ****
--- 4041,4049 ----
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Punt the error to Perl */
croak_cstr(edata->message);
*************** plperl_util_elog(int level, SV *msg)
*** 4064,4070 ****
--- 4163,4177 ----
PG_TRY();
{
cmsg = sv2cstr(msg);
+
+ /* switch back to the global locale */
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+
elog(level, "%s", cmsg);
+
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
pfree(cmsg);
}
PG_CATCH();
*************** plperl_util_elog(int level, SV *msg)
*** 4079,4084 ****
--- 4186,4194 ----
if (cmsg)
pfree(cmsg);
+ /* ensure the perl locale is in use */
+ uselocale(perl_locale_obj);
+
/* Punt the error to Perl */
croak_cstr(edata->message);
}
*************** setlocale_perl(int category, char *local
*** 4245,4247 ****
--- 4355,4368 ----
return RETVAL;
}
#endif /* defined(WIN32) && PERL_VERSION_LT(5, 28, 0) */
+
+ /*
+ * plperl_xact_callback --- cleanup at main-transaction end.
+ */
+ static void
+ plperl_xact_callback(XactEvent event, void *arg)
+ {
+ /* ensure global locale is the current locale */
+ if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE)
+ perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
+ }
On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
On 25/05/2023 15:33, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
After upgrading an application using Postgresql from version 10 to 12,
fields of type "money" are no longer generated with the € symbol but with
$.Hmm, seems to work for me:
I can reproduce this:
psql (16beta1)
Type "help" for help.postgres=# DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo') $$;
NOTICE: foo
DO
postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
$12.34
(1 row)If I don't call the plperl function, it works as expected:
sql (16beta1)
Type "help" for help.postgres=# SET lc_monetary TO 'en_GB.UTF-8';
SET
postgres=# SELECT 12.34::money AS price;
price
--------
£12.34
(1 row)I should note that 'en_GB.UTF-8' is the default locale in my system, and
that's what I used in initdb. I don't know if it makes a difference.IIRC, we've seen trouble in the past with some versions of libperl
clobbering the host application's locale settings. Maybe you
have a plperl.on_init or plperl.on_plperl_init action that is
causing that to happen? In any case, I'd call it a Perl bug not
a Postgres bugI did some debugging, initializing the perl interpreter calls uselocale():
#0 __GI___uselocale (newloc=0x7f9f47ff0940 <_nl_C_locobj>) at
./locale/uselocale.c:31
#1 0x00007f9f373bd069 in ?? () from
/usr/lib/x86_64-linux-gnu/libperl.so.5.36
#2 0x00007f9f373bce74 in ?? () from
/usr/lib/x86_64-linux-gnu/libperl.so.5.36
#3 0x00007f9f373bfc15 in Perl_init_i18nl10n () from
/usr/lib/x86_64-linux-gnu/libperl.so.5.36
#4 0x00007f9f48b74cfb in plperl_init_interp () at plperl.c:809
#5 0x00007f9f48b78adc in _PG_init () at plperl.c:483
#6 0x000055c98b8e9b63 in internal_load_library (libname=0x55c98bebaf90
"/home/heikki/pgsql.fsmfork/lib/plperl.so") at dfmgr.c:289
#7 0x000055c98b8ea1c2 in load_external_function
(filename=filename@entry=0x55c98bebb1c0 "$libdir/plperl",
funcname=funcname@entry=0x55c98beba378 "plperl_inline_handler",
signalNotFound=signalNotFound@entry=true,
filehandle=filehandle@entry=0x7ffd20942b48) at dfmgr.c:116
#8 0x000055c98b8ea864 in fmgr_info_C_lang (functionId=129304,
procedureTuple=0x7f9f4778ccb8, finfo=0x7ffd20942bf0) at fmgr.c:386
#9 fmgr_info_cxt_security (functionId=129304, finfo=0x7ffd20942bf0,
mcxt=<optimized out>, ignore_security=<optimized out>) at fmgr.c:246
#10 0x000055c98b8eba72 in fmgr_info (finfo=0x7ffd20942bf0,
functionId=<optimized out>) at fmgr.c:129
#11 OidFunctionCall1Coll (functionId=<optimized out>,
collation=collation@entry=0, arg1=94324124262840) at fmgr.c:1386
#12 0x000055c98b5e1385 in ExecuteDoStmt
(pstate=pstate@entry=0x55c98beba0b0, stmt=stmt@entry=0x55c98be90858,
atomic=atomic@entry=false) at functioncmds.c:2144
#13 0x000055c98b7c24ce in standard_ProcessUtility (pstmt=0x55c98be908e0,
queryString=0x55c98be8fd50 "DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo')
$$;", readOnlyTree=<optimized out>,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x55c98be90b80, qc=0x7ffd20942f30) at utility.c:714
#14 0x000055c98b7c0d9f in PortalRunUtility
(portal=portal@entry=0x55c98bf0b710, pstmt=pstmt@entry=0x55c98be908e0,
isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false, dest=0x55c98be90b80,
qc=0x7ffd20942f30) at pquery.c:1158
#15 0x000055c98b7c0ecb in PortalRunMulti
(portal=portal@entry=0x55c98bf0b710, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x55c98be90b80,
altdest=altdest@entry=0x55c98be90b80, qc=qc@entry=0x7ffd20942f30)
at pquery.c:1322
#16 0x000055c98b7c139d in PortalRun (portal=portal@entry=0x55c98bf0b710,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true,
dest=dest@entry=0x55c98be90b80,
altdest=altdest@entry=0x55c98be90b80, qc=0x7ffd20942f30) at pquery.c:791
#17 0x000055c98b7bd85d in exec_simple_query (query_string=0x55c98be8fd50
"DO LANGUAGE 'plperl' $$ elog(NOTICE, 'foo') $$;") at postgres.c:1274
#18 0x000055c98b7bf978 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4632
#19 0x000055c98b73f743 in BackendRun (port=<optimized out>,
port=<optimized out>) at postmaster.c:4461
#20 BackendStartup (port=<optimized out>) at postmaster.c:4189
#21 ServerLoop () at postmaster.c:1779
#22 0x000055c98b74077a in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x55c98be88fc0) at postmaster.c:1463
#23 0x000055c98b4a96be in main (argc=3, argv=0x55c98be88fc0) at main.c:198I think the uselocale() call renders ineffective the setlocale() calls
that we make later. Maybe we should replace our setlocale() calls with
uselocale(), too.
For what it's worth to everyone else in the thread (especially Joe), I
have a patch locally that fixes the mentioned bug using uselocale(). I
am not sure that it is worth committing for v16 given how _large_ (the
patch is actually quite small, +216 -235) of a change it is. I am going
to spend tomorrow combing over it a bit more and evaluating other
setlocale uses in the codebase.
--
Tristan Partin
Neon (https://neon.tech)
On 6/29/23 22:13, Tristan Partin wrote:
On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
I think the uselocale() call renders ineffective the setlocale() calls
that we make later. Maybe we should replace our setlocale() calls with
uselocale(), too.For what it's worth to everyone else in the thread (especially Joe), I
have a patch locally that fixes the mentioned bug using uselocale(). I
am not sure that it is worth committing for v16 given how _large_ (the
patch is actually quite small, +216 -235) of a change it is. I am going
to spend tomorrow combing over it a bit more and evaluating other
setlocale uses in the codebase.
(moving thread to hackers)
I don't see a patch attached -- how is it different than what I posted a
week ago and added to the commitfest here?
https://commitfest.postgresql.org/43/4413/
FWIW, if you are proposing replacing all uses of setlocale() with
uselocale() as Heikki suggested:
1/ I don't think that is pg16 material, and almost certainly not
back-patchable to earlier.
2/ It probably does not solve all of the identified issues caused by the
newer perl libraries by itself, i.e. I believe the patch posted to the
CF is still needed.
3/ I believe it is probably the right way to go for pg17+, but I would
love to hear opinions from Jeff Davis, Peter Eisentraut, and/or Thomas
Munroe (the locale code "usual suspects" ;-)), and others, about that.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri Jun 30, 2023 at 7:13 AM CDT, Joe Conway wrote:
On 6/29/23 22:13, Tristan Partin wrote:
On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
I think the uselocale() call renders ineffective the setlocale() calls
that we make later. Maybe we should replace our setlocale() calls with
uselocale(), too.For what it's worth to everyone else in the thread (especially Joe), I
have a patch locally that fixes the mentioned bug using uselocale(). I
am not sure that it is worth committing for v16 given how _large_ (the
patch is actually quite small, +216 -235) of a change it is. I am going
to spend tomorrow combing over it a bit more and evaluating other
setlocale uses in the codebase.(moving thread to hackers)
I don't see a patch attached -- how is it different than what I posted a
week ago and added to the commitfest here?https://commitfest.postgresql.org/43/4413/
FWIW, if you are proposing replacing all uses of setlocale() with
uselocale() as Heikki suggested:1/ I don't think that is pg16 material, and almost certainly not
back-patchable to earlier.
I am in agreement.
2/ It probably does not solve all of the identified issues caused by the
newer perl libraries by itself, i.e. I believe the patch posted to the
CF is still needed.
Perhaps. I do think your patch is still valuable regardless. Works for
backpatching and is just good defensive programming. I have added myself
as a reviewer.
3/ I believe it is probably the right way to go for pg17+, but I would
love to hear opinions from Jeff Davis, Peter Eisentraut, and/or Thomas
Munroe (the locale code "usual suspects" ;-)), and others, about that.
Thanks for your patience. Attached is a patch that should cover all the
problematic use cases of setlocale(). There are some setlocale() calls in
tests, initdb, and ecpg left. I plan to get to ecpglib before the final
version of this patch after I abstract over Windows not having
uselocale(). I think leaving initdb and tests as is would be fine, but I
am also happy to just permanently purge setlocale() from the codebase
if people see value in that. We could also poison[0]https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html setlocale() at that
point.
[0]: https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html
--
Tristan Partin
Neon (https://neon.tech)
Attachments:
v1-0001-Skip-checking-for-uselocale-on-Windows.patchtext/x-patch; charset=utf-8; name=v1-0001-Skip-checking-for-uselocale-on-Windows.patchDownload
From 5688bc2b2c6331f437a72b6a429199c5416ccd76 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Fri, 30 Jun 2023 09:31:04 -0500
Subject: [PATCH v1 1/3] Skip checking for uselocale on Windows
Windows doesn't have uselocale, so skip it.
---
meson.build | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/meson.build b/meson.build
index fbec997947..df76f10822 100644
--- a/meson.build
+++ b/meson.build
@@ -2439,7 +2439,7 @@ func_checks = [
['strsignal'],
['sync_file_range'],
['syncfs'],
- ['uselocale'],
+ ['uselocale', {'skip': host_system == 'windows'}],
['wcstombs_l'],
]
--
Tristan Partin
Neon (https://neon.tech)
v1-0002-Add-locale_is_c-function.patchtext/x-patch; charset=utf-8; name=v1-0002-Add-locale_is_c-function.patchDownload
From d52ab7851e9638e36cd99859014d7aa31154e600 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Fri, 30 Jun 2023 11:13:29 -0500
Subject: [PATCH v1 2/3] Add locale_is_c function
In some places throughout the codebase, there are string comparisons for
locales matching C or POSIX. Encapsulate this logic into a single
function and use it.
---
src/backend/utils/adt/pg_locale.c | 37 ++++++++++++++++++-------------
src/backend/utils/init/postinit.c | 4 +---
src/backend/utils/mb/mbutils.c | 5 ++---
src/include/utils/pg_locale.h | 1 +
4 files changed, 25 insertions(+), 22 deletions(-)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 514d4a9eeb..b455ef3aff 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -154,6 +154,21 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc,
UErrorCode *status);
#endif
+/*
+ * Check if a locale is the C locale
+ *
+ * POSIX is an alias for C. Passing ingore_case as true will use
+ * pg_strcasecmp() instead of strcmp().
+ */
+bool
+locale_is_c(const char *locale, bool ignore_case)
+{
+ if (!ignore_case)
+ return strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
+
+ return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
+}
+
/*
* pg_perm_setlocale
*
@@ -1239,10 +1254,8 @@ lookup_collation_cache(Oid collation, bool set_flags)
datum = SysCacheGetAttrNotNull(COLLOID, tp, Anum_pg_collation_collctype);
collctype = TextDatumGetCString(datum);
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(collcollate, false);
+ cache_entry->ctype_is_c = locale_is_c(collctype, false);
}
else
{
@@ -1290,12 +1303,8 @@ lc_collate_is_c(Oid collation)
if (!localeptr)
elog(ERROR, "invalid LC_COLLATE setting");
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
- result = false;
+ result = locale_is_c(localeptr, false);
+
return (bool) result;
}
@@ -1343,12 +1352,8 @@ lc_ctype_is_c(Oid collation)
if (!localeptr)
elog(ERROR, "invalid LC_CTYPE setting");
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
- result = false;
+ result = locale_is_c(localeptr, false);
+
return (bool) result;
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0f9b92b32e..a92a0c438f 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -419,9 +419,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (strcmp(ctype, "C") == 0 ||
- strcmp(ctype, "POSIX") == 0)
- database_ctype_is_c = true;
+ database_ctype_is_c = locale_is_c(ctype, false);
if (dbform->datlocprovider == COLLPROVIDER_ICU)
{
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 67a1ab2ab2..997156515c 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -39,6 +39,7 @@
#include "mb/pg_wchar.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
+#include "utils/pg_locale.h"
#include "utils/syscache.h"
#include "varatt.h"
@@ -1237,9 +1238,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- const char *ctype = setlocale(LC_CTYPE, NULL);
-
- if (pg_strcasecmp(ctype, "C") == 0 || pg_strcasecmp(ctype, "POSIX") == 0)
+ if (locale_is_c(locale_ctype, true))
#endif
if (encoding != PG_SQL_ASCII &&
raw_pg_bind_textdomain_codeset(domainname, encoding))
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index e2a7243542..872bef798a 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -54,6 +54,7 @@ extern PGDLLIMPORT bool database_ctype_is_c;
extern bool check_locale(int category, const char *locale, char **canonname);
extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool locale_is_c(const char *locale, bool ignore_case);
extern bool lc_collate_is_c(Oid collation);
extern bool lc_ctype_is_c(Oid collation);
--
Tristan Partin
Neon (https://neon.tech)
v1-0003-Use-uselocale-instead-of-setlocale.patchtext/x-patch; charset=utf-8; name=v1-0003-Use-uselocale-instead-of-setlocale.patchDownload
From 75b5f7a4e4ecaf494302721b35c919d5e193b6d9 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 3 Jul 2023 08:51:59 -0500
Subject: [PATCH v1 3/3] Use uselocale() instead of setlocale()
setlocale() has been a thorn in Postgres's side for a while. Most
recently this has manifested in bug #17946 where loading libperl would
cause the localization to change out from under Postgres.
---
src/backend/main/main.c | 27 +-
src/backend/utils/adt/cash.c | 29 +++
src/backend/utils/adt/formatting.c | 11 +-
src/backend/utils/adt/pg_locale.c | 404 ++++++++++++++++-------------
src/backend/utils/init/postinit.c | 4 +-
src/backend/utils/mb/mbutils.c | 2 +-
src/common/exec.c | 16 --
src/include/utils/pg_locale.h | 2 +-
8 files changed, 275 insertions(+), 220 deletions(-)
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index ed11e8be7f..5b0e9fccda 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -130,13 +130,6 @@ main(int argc, char *argv[])
init_locale("LC_NUMERIC", LC_NUMERIC, "C");
init_locale("LC_TIME", LC_TIME, "C");
- /*
- * Now that we have absorbed as much as we wish to from the locale
- * environment, remove any LC_ALL setting, so that the environment
- * variables installed by pg_perm_setlocale have force.
- */
- unsetenv("LC_ALL");
-
/*
* Catch standard options before doing much else, in particular before we
* insist on not being root.
@@ -307,8 +300,24 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ /* Since we are initializing global locales, NULL is an invalid argument. */
+ Assert(locale);
+
+ /*
+ * uselocale() and friends don't have an equivalent to
+ * setlocale(category, NULL), so we have to stash the actual locale string
+ * in case it is needed. To do that, we canonicalize the empty string
+ * argument here. pg_setlocale() asserts that it doesn't receive the empty
+ * string.
+ */
+ if (locale[0] == '\0')
+ locale = setlocale(category, locale);
+
+ if (locale == NULL)
+ elog(FATAL, "%s locale was not properly canonicalized", categoryname);
+
+ if (pg_setlocale(category, locale) == NULL &&
+ pg_setlocale(category, "C") == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 32fbad2f57..5c254d912f 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -111,6 +111,11 @@ cash_in(PG_FUNCTION_ARGS)
*csymbol;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(escontext, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/*
* frac_digits will be CHAR_MAX in some locales, notably C. However, just
* testing for == CHAR_MAX is risky, because of compilers like gcc that
@@ -325,6 +330,11 @@ cash_out(PG_FUNCTION_ARGS)
sep_by_space;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
points = lconvert->frac_digits;
if (points < 0 || points > 10)
@@ -1036,6 +1046,11 @@ cash_numeric(PG_FUNCTION_ARGS)
int fpoint;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1095,6 +1110,11 @@ numeric_cash(PG_FUNCTION_ARGS)
Datum numeric_scale;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1128,6 +1148,11 @@ int4_cash(PG_FUNCTION_ARGS)
int i;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1158,6 +1183,10 @@ int8_cash(PG_FUNCTION_ARGS)
int i;
struct lconv *lconvert = PGLC_localeconv();
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index e6246dc44b..b4229d075e 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -5067,12 +5067,12 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Positive / Negative number sign
*/
- if (lconv->negative_sign && *lconv->negative_sign)
+ if (lconv && lconv->negative_sign && *lconv->negative_sign)
Np->L_negative_sign = lconv->negative_sign;
else
Np->L_negative_sign = "-";
- if (lconv->positive_sign && *lconv->positive_sign)
+ if (lconv && lconv->positive_sign && *lconv->positive_sign)
Np->L_positive_sign = lconv->positive_sign;
else
Np->L_positive_sign = "+";
@@ -5080,9 +5080,8 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Number decimal point
*/
- if (lconv->decimal_point && *lconv->decimal_point)
+ if (lconv && lconv->decimal_point && *lconv->decimal_point)
Np->decimal = lconv->decimal_point;
-
else
Np->decimal = ".";
@@ -5096,7 +5095,7 @@ NUM_prepare_locale(NUMProc *Np)
* but "" for thousands_sep, so we set the thousands_sep too.
* http://archives.postgresql.org/pgsql-hackers/2007-11/msg00772.php
*/
- if (lconv->thousands_sep && *lconv->thousands_sep)
+ if (lconv && lconv->thousands_sep && *lconv->thousands_sep)
Np->L_thousands_sep = lconv->thousands_sep;
/* Make sure thousands separator doesn't match decimal point symbol. */
else if (strcmp(Np->decimal, ",") != 0)
@@ -5107,7 +5106,7 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Currency symbol
*/
- if (lconv->currency_symbol && *lconv->currency_symbol)
+ if (lconv && lconv->currency_symbol && *lconv->currency_symbol)
Np->L_currency_symbol = lconv->currency_symbol;
else
Np->L_currency_symbol = " ";
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index b455ef3aff..9b5d4384c6 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -98,6 +98,21 @@ char *locale_time;
int icu_validation_level = WARNING;
+/*
+ * Current locale settings
+ *
+ * uselocale() and friends don't have an equivalent to setlocale(cat, NULL), so
+ * as we set locales for various categories, stash them for retrieval as needed.
+ */
+static struct {
+ char *collate;
+ char *ctype;
+ char *messages;
+ char *monetary;
+ char *numeric;
+ char *time;
+} curr_locales = { 0 };
+
/*
* lc_time localization cache.
*
@@ -169,50 +184,143 @@ locale_is_c(const char *locale, bool ignore_case)
return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
}
+
/*
- * pg_perm_setlocale
- *
- * This wraps the libc function setlocale(), with two additions. First, when
- * changing LC_CTYPE, update gettext's encoding for the current message
- * domain. GNU gettext automatically tracks LC_CTYPE on most platforms, but
- * not on Windows. Second, if the operation is successful, the corresponding
- * LC_XXX environment variable is set to match. By setting the environment
- * variable, we ensure that any subsequent use of setlocale(..., "") will
- * preserve the settings made through this routine. Of course, LC_ALL must
- * also be unset to fully ensure that, but that has to be done elsewhere after
- * all the individual LC_XXX variables have been set correctly. (Thank you
- * Perl for making this kluge necessary.)
+ * Turns locale categories into their mask equivalents for use in newlocale(3)
+ *
+ * GCC has the masks defined as (1 << category), but this isn't guaranteed to be
+ * the same in other libcs.
+ */
+static int
+category_to_mask(int category)
+{
+ switch (category) {
+ case LC_ALL:
+ return LC_ALL_MASK;
+ case LC_COLLATE:
+ return LC_COLLATE_MASK;
+ case LC_CTYPE:
+ return LC_CTYPE_MASK;
+#ifdef LC_MESSAGES
+ case LC_MESSAGES:
+ return LC_MESSAGES_MASK;
+#endif
+ case LC_MONETARY:
+ return LC_MONETARY_MASK;
+ case LC_NUMERIC:
+ return LC_NUMERIC_MASK;
+ case LC_TIME:
+ return LC_TIME_MASK;
+ }
+
+ pg_unreachable();
+}
+
+
+static char *
+pg_getlocale(int category)
+{
+ switch (category) {
+ case LC_COLLATE:
+ return curr_locales.collate;
+ case LC_CTYPE:
+ return curr_locales.ctype;
+ case LC_MESSAGES:
+ return curr_locales.messages;
+ case LC_MONETARY:
+ return curr_locales.monetary;
+ case LC_NUMERIC:
+ return curr_locales.numeric;
+ case LC_TIME:
+ return curr_locales.time;
+ default:
+ pg_unreachable();
+ }
+}
+
+
+/*
+ * pg_setlocale
+ *
+ * This wraps the libc function uselocale(), with a single addition. When
+ * changing LC_CTYPE, update gettext's encoding for the current message domain.
+ * GNU gettext automatically tracks LC_CTYPE on most platforms, but not on
+ * Windows. This function does will abort if locale is the empty string. A
+ * workaround is to pass the output of setlocale(category, "") to the locale
+ * argument.
*/
char *
-pg_perm_setlocale(int category, const char *locale)
+pg_setlocale(int category, const char *locale)
{
- char *result;
- const char *envvar;
+ char *dup;
+ char **save;
+ int category_mask;
+ locale_t prev_locale;
+ locale_t temp_locale;
+ locale_t work_locale;
-#ifndef WIN32
- result = setlocale(category, locale);
-#else
+ if (locale == NULL)
+ return pg_getlocale(category);
+
+ Assert(locale[0] != '\0');
+
+ category_mask = category_to_mask(category);
+
+ prev_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
+
+ dup = strdup(locale);
+ work_locale = duplocale(prev_locale);
+ if (!dup || work_locale == (locale_t) 0) {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ return NULL;
+ }
+
+ temp_locale = newlocale(category_mask, locale, work_locale);
+ if (temp_locale == (locale_t) 0) {
+ freelocale(work_locale);
+
+ return NULL; /* fall out immediately on failure */
+ }
+ work_locale = temp_locale;
+
+ uselocale(work_locale);
+ if (prev_locale != LC_GLOBAL_LOCALE)
+ freelocale(prev_locale);
/*
- * On Windows, setlocale(LC_MESSAGES) does not work, so just assume that
- * the given value is good and set it in the environment variables. We
- * must ignore attempts to set to "", which means "keep using the old
- * environment value".
+ * Copying the locale must occur before we potentially bind the text domain.
*/
-#ifdef LC_MESSAGES
- if (category == LC_MESSAGES)
- {
- result = (char *) locale;
- if (locale == NULL || locale[0] == '\0')
- return result;
+ switch (category) {
+ case LC_COLLATE:
+ save = &curr_locales.collate;
+ break;
+ case LC_CTYPE:
+ save = &curr_locales.ctype;
+ break;
+ case LC_MESSAGES:
+ save = &curr_locales.messages;
+ break;
+ case LC_MONETARY:
+ save = &curr_locales.monetary;
+ break;
+ case LC_NUMERIC:
+ save = &curr_locales.numeric;
+ break;
+ case LC_TIME:
+ save = &curr_locales.time;
+ break;
+ default:
+ pg_unreachable();
}
- else
-#endif
- result = setlocale(category, locale);
-#endif /* WIN32 */
- if (result == NULL)
- return result; /* fall out immediately on failure */
+ /* Free old locale string, and stash the new one */
+ free(*save);
+ *save = dup;
+ dup = NULL;
/*
* Use the right encoding in translated messages. Under ENABLE_NLS, let
@@ -223,12 +331,6 @@ pg_perm_setlocale(int category, const char *locale)
*/
if (category == LC_CTYPE)
{
- static char save_lc_ctype[LOCALE_NAME_BUFLEN];
-
- /* copy setlocale() return value before callee invokes it again */
- strlcpy(save_lc_ctype, result, sizeof(save_lc_ctype));
- result = save_lc_ctype;
-
#ifdef ENABLE_NLS
SetMessageEncoding(pg_bind_textdomain_codeset(textdomain(NULL)));
#else
@@ -236,43 +338,7 @@ pg_perm_setlocale(int category, const char *locale)
#endif
}
- switch (category)
- {
- case LC_COLLATE:
- envvar = "LC_COLLATE";
- break;
- case LC_CTYPE:
- envvar = "LC_CTYPE";
- break;
-#ifdef LC_MESSAGES
- case LC_MESSAGES:
- envvar = "LC_MESSAGES";
-#ifdef WIN32
- result = IsoLocaleName(locale);
- if (result == NULL)
- result = (char *) locale;
- elog(DEBUG3, "IsoLocaleName() executed; locale: \"%s\"", result);
-#endif /* WIN32 */
- break;
-#endif /* LC_MESSAGES */
- case LC_MONETARY:
- envvar = "LC_MONETARY";
- break;
- case LC_NUMERIC:
- envvar = "LC_NUMERIC";
- break;
- case LC_TIME:
- envvar = "LC_TIME";
- break;
- default:
- elog(FATAL, "unrecognized LC category: %d", category);
- return NULL; /* keep compiler quiet */
- }
-
- if (setenv(envvar, result, 1) != 0)
- return NULL;
-
- return result;
+ return *save;
}
@@ -289,32 +355,43 @@ pg_perm_setlocale(int category, const char *locale)
bool
check_locale(int category, const char *locale, char **canonname)
{
- char *save;
- char *res;
+ locale_t saved_locale;
+ locale_t temp_locale;
+ locale_t work_locale;
+ int category_mask;
+
+ category_mask = category_to_mask(category);
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
- if (!save)
- return false; /* won't happen, we hope */
+ saved_locale = uselocale((locale_t) 0);
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0) {
+ if (errno == ENOMEM)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ return false;
+ }
- /* save may be pointing at a modifiable scratch variable, see above. */
- save = pstrdup(save);
+ temp_locale = newlocale(category_mask, locale, work_locale);
+ if (temp_locale == (locale_t) 0) {
+ freelocale(work_locale);
- /* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+ return false;
+ }
+ work_locale = temp_locale;
- /* save canonical name if requested. */
- if (res && canonname)
- *canonname = pstrdup(res);
+ uselocale(work_locale);
+ if (canonname)
+ *canonname = pstrdup(locale); /* TODO: This does not handle "" as the locale */
- /* restore old value. */
- if (!setlocale(category, save))
- elog(WARNING, "failed to restore old locale \"%s\"", save);
- pfree(save);
+ uselocale(saved_locale);
+ freelocale(work_locale);
- return (res != NULL);
+ return true;
}
@@ -327,7 +404,7 @@ check_locale(int category, const char *locale, char **canonname)
*
* Note: we accept value = "" as selecting the postmaster's environment
* value, whatever it was (so long as the environment setting is legal).
- * This will have been locked down by an earlier call to pg_perm_setlocale.
+ * This will have been locked down by an earlier call to pg_setlocale.
*/
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
@@ -406,7 +483,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_setlocale(LC_MESSAGES, newval);
#endif
}
@@ -502,11 +579,8 @@ PGLC_localeconv(void)
static bool CurrentLocaleConvAllocated = false;
struct lconv *extlconv;
struct lconv worklconv;
- char *save_lc_monetary;
- char *save_lc_numeric;
-#ifdef WIN32
- char *save_lc_ctype;
-#endif
+ locale_t saved_locale;
+ locale_t work_locale;
/* Did we do it already? */
if (CurrentLocaleConvValid)
@@ -533,16 +607,15 @@ PGLC_localeconv(void)
*/
memset(&worklconv, 0, sizeof(worklconv));
- /* Save prevailing values of monetary and numeric locales */
- save_lc_monetary = setlocale(LC_MONETARY, NULL);
- if (!save_lc_monetary)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_monetary = pstrdup(save_lc_monetary);
+ /* Save prevailing locale */
+ saved_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
- save_lc_numeric = setlocale(LC_NUMERIC, NULL);
- if (!save_lc_numeric)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_numeric = pstrdup(save_lc_numeric);
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
#ifdef WIN32
@@ -560,37 +633,37 @@ PGLC_localeconv(void)
* results. Hence, we must temporarily set that category as well.
*/
- /* Save prevailing value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
- if (!save_lc_ctype)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_ctype = pstrdup(save_lc_ctype);
-
/* Here begins the critical section where we must not throw error */
/* use numeric to set the ctype */
- setlocale(LC_CTYPE, locale_numeric);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_numeric, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
/* Get formatting information for numeric */
- setlocale(LC_NUMERIC, locale_numeric);
+ work_locale = newlocale(LC_NUMERIC_MASK, locale_numeric, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
extlconv = localeconv();
- /* Must copy data now in case setlocale() overwrites it */
+ /* Must copy data now in case updating the locale overwrites it */
worklconv.decimal_point = strdup(extlconv->decimal_point);
worklconv.thousands_sep = strdup(extlconv->thousands_sep);
worklconv.grouping = strdup(extlconv->grouping);
#ifdef WIN32
/* use monetary to set the ctype */
- setlocale(LC_CTYPE, locale_monetary);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_monetary, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
/* Get formatting information for monetary */
- setlocale(LC_MONETARY, locale_monetary);
+ work_locale = newlocale(LC_MONETARY_MASK, locale_monetary, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
extlconv = localeconv();
- /* Must copy data now in case setlocale() overwrites it */
+ /* Must copy data now in case updating the locale overwrites it */
worklconv.int_curr_symbol = strdup(extlconv->int_curr_symbol);
worklconv.currency_symbol = strdup(extlconv->currency_symbol);
worklconv.mon_decimal_point = strdup(extlconv->mon_decimal_point);
@@ -608,22 +681,9 @@ PGLC_localeconv(void)
worklconv.p_sign_posn = extlconv->p_sign_posn;
worklconv.n_sign_posn = extlconv->n_sign_posn;
- /*
- * Restore the prevailing locale settings; failure to do so is fatal.
- * Possibly we could limp along with nondefault LC_MONETARY or LC_NUMERIC,
- * but proceeding with the wrong value of LC_CTYPE would certainly be bad
- * news; and considering that the prevailing LC_MONETARY and LC_NUMERIC
- * are almost certainly "C", there's really no reason that restoring those
- * should fail.
- */
-#ifdef WIN32
- if (!setlocale(LC_CTYPE, save_lc_ctype))
- elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
-#endif
- if (!setlocale(LC_MONETARY, save_lc_monetary))
- elog(FATAL, "failed to restore LC_MONETARY to \"%s\"", save_lc_monetary);
- if (!setlocale(LC_NUMERIC, save_lc_numeric))
- elog(FATAL, "failed to restore LC_NUMERIC to \"%s\"", save_lc_numeric);
+ /* Restore the prevailing locale settings; failure to do so is fatal. */
+ uselocale(saved_locale);
+ freelocale(work_locale);
/*
* At this point we've done our best to clean up, and can call functions
@@ -634,13 +694,6 @@ PGLC_localeconv(void)
{
int encoding;
- /* Release the pstrdup'd locale names */
- pfree(save_lc_monetary);
- pfree(save_lc_numeric);
-#ifdef WIN32
- pfree(save_lc_ctype);
-#endif
-
/* If any of the preceding strdup calls failed, complain now. */
if (!struct_lconv_is_valid(&worklconv))
ereport(ERROR,
@@ -787,10 +840,8 @@ cache_locale_time(void)
bool strftimefail = false;
int encoding;
int i;
- char *save_lc_time;
-#ifdef WIN32
- char *save_lc_ctype;
-#endif
+ locale_t saved_locale;
+ locale_t work_locale;
/* did we do this already? */
if (CurrentLCTimeValid)
@@ -805,11 +856,15 @@ cache_locale_time(void)
* results afterwards.
*/
- /* Save prevailing value of time locale */
- save_lc_time = setlocale(LC_TIME, NULL);
- if (!save_lc_time)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_time = pstrdup(save_lc_time);
+ /* Save prevailing locale */
+ saved_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
+
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
#ifdef WIN32
@@ -820,17 +875,14 @@ cache_locale_time(void)
* the encoding we'll get back from strftime_win32().
*/
- /* Save prevailing value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
- if (!save_lc_ctype)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_ctype = pstrdup(save_lc_ctype);
-
/* use lc_time to set the ctype */
- setlocale(LC_CTYPE, locale_time);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_time, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
- setlocale(LC_TIME, locale_time);
+ work_locale = newlocale(LC_TIME_MASK, locale_time, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
/* We use times close to current time as data for strftime(). */
timenow = time(NULL);
@@ -878,12 +930,8 @@ cache_locale_time(void)
* Restore the prevailing locale settings; as in PGLC_localeconv(),
* failure to do so is fatal.
*/
-#ifdef WIN32
- if (!setlocale(LC_CTYPE, save_lc_ctype))
- elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
-#endif
- if (!setlocale(LC_TIME, save_lc_time))
- elog(FATAL, "failed to restore LC_TIME to \"%s\"", save_lc_time);
+ uselocale(saved_locale);
+ freelocale(work_locale);
/*
* At this point we've done our best to clean up, and can throw errors, or
@@ -892,12 +940,6 @@ cache_locale_time(void)
if (strftimefail)
elog(ERROR, "strftime() failed: %m");
- /* Release the pstrdup'd locale names */
- pfree(save_lc_time);
-#ifdef WIN32
- pfree(save_lc_ctype);
-#endif
-
#ifndef WIN32
/*
@@ -1292,18 +1334,14 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
if (default_locale.provider == COLLPROVIDER_ICU)
return false;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
- result = locale_is_c(localeptr, false);
+ result = locale_is_c(curr_locales.collate, false);
return (bool) result;
}
@@ -1336,23 +1374,19 @@ lc_ctype_is_c(Oid collation)
/*
* If we're asked about the default collation, we have to inquire of the C
- * library. Cache the result so we only have to compute it once.
+ * library.
*/
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
if (default_locale.provider == COLLPROVIDER_ICU)
return false;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
- result = locale_is_c(localeptr, false);
+ result = locale_is_c(curr_locales.ctype, false);
return (bool) result;
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a92a0c438f..0741f5666b 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -405,14 +405,14 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
datum = SysCacheGetAttrNotNull(DATABASEOID, tup, Anum_pg_database_datctype);
ctype = TextDatumGetCString(datum);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (pg_setlocale(LC_COLLATE, collate) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_COLLATE \"%s\", "
" which is not recognized by setlocale().", collate),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ if (pg_setlocale(LC_CTYPE, ctype) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 997156515c..8dc7da47bd 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -1238,7 +1238,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- if (locale_is_c(locale_ctype, true))
+ if (locale_is_c(pg_setlocale(LC_CTYPE, NULL), true))
#endif
if (encoding != PG_SQL_ASCII &&
raw_pg_bind_textdomain_codeset(domainname, encoding))
diff --git a/src/common/exec.c b/src/common/exec.c
index f209b934df..f7a9643be4 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -438,22 +438,6 @@ set_pglocale_pgservice(const char *argv0, const char *app)
char path[MAXPGPATH];
char my_exec_path[MAXPGPATH];
- /* don't set LC_ALL in the backend */
- if (strcmp(app, PG_TEXTDOMAIN("postgres")) != 0)
- {
- setlocale(LC_ALL, "");
-
- /*
- * One could make a case for reproducing here PostmasterMain()'s test
- * for whether the process is multithreaded. Unlike the postmaster,
- * no frontend program calls sigprocmask() or otherwise provides for
- * mutual exclusion between signal handlers. While frontends using
- * fork(), if multithreaded, are formally exposed to undefined
- * behavior, we have not witnessed a concrete bug. Therefore,
- * complaining about multithreading here may be mere pedantry.
- */
- }
-
if (find_my_exec(argv0, my_exec_path) < 0)
return;
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 872bef798a..60682051cb 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -52,7 +52,7 @@ extern PGDLLIMPORT char *localized_full_months[];
extern PGDLLIMPORT bool database_ctype_is_c;
extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern char *pg_setlocale(int category, const char *locale);
extern bool locale_is_c(const char *locale, bool ignore_case);
extern bool lc_collate_is_c(Oid collation);
--
Tristan Partin
Neon (https://neon.tech)
Joe,
The Reply-To header in your email is pointing at joe@cd, fyi. Pretty
strange.
--
Tristan Partin
Neon (https://neon.tech)
On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote:
Although I have not looked yet, presumably we could have similar
problems with plpython. I would like to get agreement on this approach
against plperl before diving into that though.Thoughts?
I don't see anything immediately wrong with this. I think doing a
similar thing for plpython would make sense. Might make sense to CC any
other pl* maintainers too.
--
Tristan Partin
Neon (https://neon.tech)
On 7/3/23 12:17, Tristan Partin wrote:
The Reply-To header in your email is pointing at joe@cd, fyi. Pretty
strange.
I noticed that -- it happened only the one time, and I am not sure why.
Seems fine now though.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On 7/3/23 12:25, Tristan Partin wrote:
On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote:
Although I have not looked yet, presumably we could have similar
problems with plpython. I would like to get agreement on this approach
against plperl before diving into that though.Thoughts?
I don't see anything immediately wrong with this. I think doing a
similar thing for plpython would make sense. Might make sense to CC any
other pl* maintainers too.
In our tree there are only plperl and plpython to worry about.
"other pl* maintainers" is a fuzzy concept since other pl's are
scattered far and wide.
I think it is reasonable to expect such maintainers to be paying
attention to hackers and pick up on it themselves (I say that as a pl
maintainer myself -- plr)
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon Jun 5, 2023 at 11:00 AM CDT, Heikki Linnakangas wrote:
I think the uselocale() call renders ineffective the setlocale() calls
that we make later. Maybe we should replace our setlocale() calls with
uselocale(), too.
Should we just stop supporting systems without uselocale() that aren't
Windows, where you can get thread-safe localization using another
method? I am not aware of other systems that might have their own
non-POSIX APIs.
--
Tristan Partin
Neon (https://neon.tech)
On Mon Jul 3, 2023 at 9:42 AM CDT, Tristan Partin wrote:
Thanks for your patience. Attached is a patch that should cover all the
problematic use cases of setlocale(). There are some setlocale() calls in
tests, initdb, and ecpg left. I plan to get to ecpglib before the final
version of this patch after I abstract over Windows not having
uselocale(). I think leaving initdb and tests as is would be fine, but I
am also happy to just permanently purge setlocale() from the codebase
if people see value in that. We could also poison[0] setlocale() at that
point.
Here is a v2 with best effort Windows support. My patch currently
assumes that you either have uselocale() or are Windows. I dropped the
environment variable hacks, but could bring them back if we didn't like
this requirement.
I tried to add an email[0]/messages/by-id/CTUJ604ZWHI1.3PFZK152XCWLX@gonk to discuss this with hackers, but failed to add
the CC. Let's discuss here instead given my complete inability to manage
mailing lists :).
[0]: /messages/by-id/CTUJ604ZWHI1.3PFZK152XCWLX@gonk
--
Tristan Partin
Neon (https://neon.tech)
Attachments:
v2-0001-Skip-checking-for-uselocale-on-Windows.patchtext/x-patch; charset=utf-8; name=v2-0001-Skip-checking-for-uselocale-on-Windows.patchDownload
From 5688bc2b2c6331f437a72b6a429199c5416ccd76 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Fri, 30 Jun 2023 09:31:04 -0500
Subject: [PATCH v2 1/3] Skip checking for uselocale on Windows
Windows doesn't have uselocale, so skip it.
---
meson.build | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/meson.build b/meson.build
index fbec997947..df76f10822 100644
--- a/meson.build
+++ b/meson.build
@@ -2439,7 +2439,7 @@ func_checks = [
['strsignal'],
['sync_file_range'],
['syncfs'],
- ['uselocale'],
+ ['uselocale', {'skip': host_system == 'windows'}],
['wcstombs_l'],
]
--
Tristan Partin
Neon (https://neon.tech)
v2-0002-Add-locale_is_c-function.patchtext/x-patch; charset=utf-8; name=v2-0002-Add-locale_is_c-function.patchDownload
From d52ab7851e9638e36cd99859014d7aa31154e600 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Fri, 30 Jun 2023 11:13:29 -0500
Subject: [PATCH v2 2/3] Add locale_is_c function
In some places throughout the codebase, there are string comparisons for
locales matching C or POSIX. Encapsulate this logic into a single
function and use it.
---
src/backend/utils/adt/pg_locale.c | 37 ++++++++++++++++++-------------
src/backend/utils/init/postinit.c | 4 +---
src/backend/utils/mb/mbutils.c | 5 ++---
src/include/utils/pg_locale.h | 1 +
4 files changed, 25 insertions(+), 22 deletions(-)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 514d4a9eeb..b455ef3aff 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -154,6 +154,21 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc,
UErrorCode *status);
#endif
+/*
+ * Check if a locale is the C locale
+ *
+ * POSIX is an alias for C. Passing ingore_case as true will use
+ * pg_strcasecmp() instead of strcmp().
+ */
+bool
+locale_is_c(const char *locale, bool ignore_case)
+{
+ if (!ignore_case)
+ return strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
+
+ return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
+}
+
/*
* pg_perm_setlocale
*
@@ -1239,10 +1254,8 @@ lookup_collation_cache(Oid collation, bool set_flags)
datum = SysCacheGetAttrNotNull(COLLOID, tp, Anum_pg_collation_collctype);
collctype = TextDatumGetCString(datum);
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(collcollate, false);
+ cache_entry->ctype_is_c = locale_is_c(collctype, false);
}
else
{
@@ -1290,12 +1303,8 @@ lc_collate_is_c(Oid collation)
if (!localeptr)
elog(ERROR, "invalid LC_COLLATE setting");
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
- result = false;
+ result = locale_is_c(localeptr, false);
+
return (bool) result;
}
@@ -1343,12 +1352,8 @@ lc_ctype_is_c(Oid collation)
if (!localeptr)
elog(ERROR, "invalid LC_CTYPE setting");
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
- result = false;
+ result = locale_is_c(localeptr, false);
+
return (bool) result;
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0f9b92b32e..a92a0c438f 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -419,9 +419,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (strcmp(ctype, "C") == 0 ||
- strcmp(ctype, "POSIX") == 0)
- database_ctype_is_c = true;
+ database_ctype_is_c = locale_is_c(ctype, false);
if (dbform->datlocprovider == COLLPROVIDER_ICU)
{
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 67a1ab2ab2..997156515c 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -39,6 +39,7 @@
#include "mb/pg_wchar.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
+#include "utils/pg_locale.h"
#include "utils/syscache.h"
#include "varatt.h"
@@ -1237,9 +1238,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- const char *ctype = setlocale(LC_CTYPE, NULL);
-
- if (pg_strcasecmp(ctype, "C") == 0 || pg_strcasecmp(ctype, "POSIX") == 0)
+ if (locale_is_c(locale_ctype, true))
#endif
if (encoding != PG_SQL_ASCII &&
raw_pg_bind_textdomain_codeset(domainname, encoding))
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index e2a7243542..872bef798a 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -54,6 +54,7 @@ extern PGDLLIMPORT bool database_ctype_is_c;
extern bool check_locale(int category, const char *locale, char **canonname);
extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool locale_is_c(const char *locale, bool ignore_case);
extern bool lc_collate_is_c(Oid collation);
extern bool lc_ctype_is_c(Oid collation);
--
Tristan Partin
Neon (https://neon.tech)
v2-0003-Use-thread-safe-locale-APIs.patchtext/x-patch; charset=utf-8; name=v2-0003-Use-thread-safe-locale-APIs.patchDownload
From 777a30e5c5f04fa433ea7e52cce8a0516223ace3 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 3 Jul 2023 08:51:59 -0500
Subject: [PATCH v2 3/3] Use thread-safe locale APIs
setlocale() has been a thorn in Postgres's side for a while. Most
recently this has manifested in bug #17946 where loading libperl would
cause the localization to change out from under Postgres. On Windows,
use _ENABLE_PER_THREAD_LOCALE.
---
src/backend/main/main.c | 55 +++-
src/backend/utils/adt/cash.c | 29 +++
src/backend/utils/adt/formatting.c | 11 +-
src/backend/utils/adt/pg_locale.c | 389 ++++++++++++++++++-----------
src/backend/utils/init/postinit.c | 4 +-
src/backend/utils/mb/mbutils.c | 2 +-
src/common/exec.c | 16 --
src/include/utils/pg_locale.h | 2 +-
8 files changed, 326 insertions(+), 182 deletions(-)
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index ed11e8be7f..093df88f14 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -59,6 +59,11 @@ int
main(int argc, char *argv[])
{
bool do_check_root = true;
+ char *collate_locale;
+ char *ctype_locale;
+#ifdef LC_MESSAGES
+ char *messages_locale;
+#endif
reached_main = true;
@@ -111,15 +116,39 @@ main(int argc, char *argv[])
* these set to "C" then message localization might not work well in the
* postmaster.
*/
- init_locale("LC_COLLATE", LC_COLLATE, "");
- init_locale("LC_CTYPE", LC_CTYPE, "");
+
+ /*
+ * Save off the original locale values prior to the first call of
+ * uselocale(). This ensures that we don't call pg_setlocale() with an empty
+ * string. See the function comment for pg_setlocale() for further
+ * explanation. Note that this restriction doesn't exist on Windows, but we
+ * can use the same code path anyway.
+ */
+ collate_locale = setlocale(LC_COLLATE, "");
+ ctype_locale = setlocale(LC_CTYPE, "");
+
+#ifdef LC_MESSAGES
+ messages_locale = setlocale(LC_MESSAGES, "");
+#endif
+
+
+#ifdef HAVE__CONFIGTHREADLOCALE
+ /*
+ * This call could most likely happen sooner, but just to err on the side of
+ * caution, call it after absorbing locales from the environment.
+ */
+ _configthreadlocale(_ENABLE_PER_THREAD_LOCALE);
+#endif
+
+ init_locale("LC_COLLATE", LC_COLLATE, collate_locale);
+ init_locale("LC_CTYPE", LC_CTYPE, ctype_locale);
/*
* LC_MESSAGES will get set later during GUC option processing, but we set
* it here to allow startup error messages to be localized.
*/
#ifdef LC_MESSAGES
- init_locale("LC_MESSAGES", LC_MESSAGES, "");
+ init_locale("LC_MESSAGES", LC_MESSAGES, messages_locale);
#endif
/*
@@ -130,13 +159,6 @@ main(int argc, char *argv[])
init_locale("LC_NUMERIC", LC_NUMERIC, "C");
init_locale("LC_TIME", LC_TIME, "C");
- /*
- * Now that we have absorbed as much as we wish to from the locale
- * environment, remove any LC_ALL setting, so that the environment
- * variables installed by pg_perm_setlocale have force.
- */
- unsetenv("LC_ALL");
-
/*
* Catch standard options before doing much else, in particular before we
* insist on not being root.
@@ -307,8 +329,17 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ /*
+ * Since we are initializing global locales, NULL and empty string are
+ * invalid arguments. See the pg_setlocale() function description for more
+ * explanation on the empty string. NULL just doesn't make sense in this
+ * context of initializing locales.
+ */
+ Assert(locale);
+ Assert(locale[0] != '\0');
+
+ if (pg_setlocale(category, locale) == NULL &&
+ pg_setlocale(category, "C") == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 32fbad2f57..5c254d912f 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -111,6 +111,11 @@ cash_in(PG_FUNCTION_ARGS)
*csymbol;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(escontext, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/*
* frac_digits will be CHAR_MAX in some locales, notably C. However, just
* testing for == CHAR_MAX is risky, because of compilers like gcc that
@@ -325,6 +330,11 @@ cash_out(PG_FUNCTION_ARGS)
sep_by_space;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
points = lconvert->frac_digits;
if (points < 0 || points > 10)
@@ -1036,6 +1046,11 @@ cash_numeric(PG_FUNCTION_ARGS)
int fpoint;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1095,6 +1110,11 @@ numeric_cash(PG_FUNCTION_ARGS)
Datum numeric_scale;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1128,6 +1148,11 @@ int4_cash(PG_FUNCTION_ARGS)
int i;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1158,6 +1183,10 @@ int8_cash(PG_FUNCTION_ARGS)
int i;
struct lconv *lconvert = PGLC_localeconv();
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index e6246dc44b..b4229d075e 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -5067,12 +5067,12 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Positive / Negative number sign
*/
- if (lconv->negative_sign && *lconv->negative_sign)
+ if (lconv && lconv->negative_sign && *lconv->negative_sign)
Np->L_negative_sign = lconv->negative_sign;
else
Np->L_negative_sign = "-";
- if (lconv->positive_sign && *lconv->positive_sign)
+ if (lconv && lconv->positive_sign && *lconv->positive_sign)
Np->L_positive_sign = lconv->positive_sign;
else
Np->L_positive_sign = "+";
@@ -5080,9 +5080,8 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Number decimal point
*/
- if (lconv->decimal_point && *lconv->decimal_point)
+ if (lconv && lconv->decimal_point && *lconv->decimal_point)
Np->decimal = lconv->decimal_point;
-
else
Np->decimal = ".";
@@ -5096,7 +5095,7 @@ NUM_prepare_locale(NUMProc *Np)
* but "" for thousands_sep, so we set the thousands_sep too.
* http://archives.postgresql.org/pgsql-hackers/2007-11/msg00772.php
*/
- if (lconv->thousands_sep && *lconv->thousands_sep)
+ if (lconv && lconv->thousands_sep && *lconv->thousands_sep)
Np->L_thousands_sep = lconv->thousands_sep;
/* Make sure thousands separator doesn't match decimal point symbol. */
else if (strcmp(Np->decimal, ",") != 0)
@@ -5107,7 +5106,7 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Currency symbol
*/
- if (lconv->currency_symbol && *lconv->currency_symbol)
+ if (lconv && lconv->currency_symbol && *lconv->currency_symbol)
Np->L_currency_symbol = lconv->currency_symbol;
else
Np->L_currency_symbol = " ";
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index b455ef3aff..d3ca8c4b2d 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -98,6 +98,23 @@ char *locale_time;
int icu_validation_level = WARNING;
+#ifdef HAVE_USELOCALE
+/*
+ * Current locale settings
+ *
+ * uselocale() and friends don't have an equivalent to setlocale(cat, NULL), so
+ * as we set locales for various categories, stash them for retrieval as needed.
+ */
+static struct {
+ char *collate;
+ char *ctype;
+ char *messages;
+ char *monetary;
+ char *numeric;
+ char *time;
+} curr_locales = { 0 };
+#endif /* HAVE_USELOCALE */
+
/*
* lc_time localization cache.
*
@@ -169,28 +186,146 @@ locale_is_c(const char *locale, bool ignore_case)
return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
}
+
+#ifdef HAVE_USELOCALE
/*
- * pg_perm_setlocale
- *
- * This wraps the libc function setlocale(), with two additions. First, when
- * changing LC_CTYPE, update gettext's encoding for the current message
- * domain. GNU gettext automatically tracks LC_CTYPE on most platforms, but
- * not on Windows. Second, if the operation is successful, the corresponding
- * LC_XXX environment variable is set to match. By setting the environment
- * variable, we ensure that any subsequent use of setlocale(..., "") will
- * preserve the settings made through this routine. Of course, LC_ALL must
- * also be unset to fully ensure that, but that has to be done elsewhere after
- * all the individual LC_XXX variables have been set correctly. (Thank you
- * Perl for making this kluge necessary.)
+ * Turns locale categories into their mask equivalents for use in newlocale(3)
+ *
+ * GCC has the masks defined as (1 << category), but this isn't guaranteed to be
+ * the same in other libcs.
+ */
+static int
+category_to_mask(int category)
+{
+ switch (category) {
+ case LC_ALL:
+ return LC_ALL_MASK;
+ case LC_COLLATE:
+ return LC_COLLATE_MASK;
+ case LC_CTYPE:
+ return LC_CTYPE_MASK;
+#ifdef LC_MESSAGES
+ case LC_MESSAGES:
+ return LC_MESSAGES_MASK;
+#endif
+ case LC_MONETARY:
+ return LC_MONETARY_MASK;
+ case LC_NUMERIC:
+ return LC_NUMERIC_MASK;
+ case LC_TIME:
+ return LC_TIME_MASK;
+ }
+
+ pg_unreachable();
+}
+
+
+static char *
+pg_getlocale(int category)
+{
+ switch (category) {
+ case LC_COLLATE:
+ return curr_locales.collate;
+ case LC_CTYPE:
+ return curr_locales.ctype;
+ case LC_MESSAGES:
+ return curr_locales.messages;
+ case LC_MONETARY:
+ return curr_locales.monetary;
+ case LC_NUMERIC:
+ return curr_locales.numeric;
+ case LC_TIME:
+ return curr_locales.time;
+ default:
+ pg_unreachable();
+ }
+}
+#endif /* HAVE_USELOCALE */
+
+
+/*
+ * pg_setlocale
+ *
+ * This wraps the libc functions uselocale() or setlocale() (depending on the
+ * platform), with a single addition. When changing LC_CTYPE, update gettext's
+ * encoding for the current message domain. GNU gettext automatically tracks
+ * LC_CTYPE on most platforms, but not on Windows. This function will abort if
+ * locale is the empty string. A workaround is to pass the output of
+ * setlocale(category, "") to the locale argument.
*/
char *
-pg_perm_setlocale(int category, const char *locale)
+pg_setlocale(int category, const char *locale)
{
char *result;
- const char *envvar;
+#ifdef HAVE_USELOCALE
+ char **save;
+ int category_mask;
+ locale_t prev_locale;
+ locale_t temp_locale;
+ locale_t work_locale;
+
+ if (locale == NULL)
+ return pg_getlocale(category);
+
+ Assert(locale[0] != '\0');
+
+ category_mask = category_to_mask(category);
+
+ prev_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
+
+ result = strdup(locale);
+ work_locale = duplocale(prev_locale);
+ if (!result || work_locale == (locale_t) 0) {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ return NULL;
+ }
+
+ temp_locale = newlocale(category_mask, locale, work_locale);
+ if (temp_locale == (locale_t) 0) {
+ freelocale(work_locale);
+
+ return NULL; /* fall out immediately on failure */
+ }
+ work_locale = temp_locale;
+
+ uselocale(work_locale);
+ if (prev_locale != LC_GLOBAL_LOCALE)
+ freelocale(prev_locale);
+
+ /*
+ * Copying the locale must occur before we potentially bind the text domain.
+ */
+ switch (category) {
+ case LC_COLLATE:
+ save = &curr_locales.collate;
+ break;
+ case LC_CTYPE:
+ save = &curr_locales.ctype;
+ break;
+ case LC_MESSAGES:
+ save = &curr_locales.messages;
+ break;
+ case LC_MONETARY:
+ save = &curr_locales.monetary;
+ break;
+ case LC_NUMERIC:
+ save = &curr_locales.numeric;
+ break;
+ case LC_TIME:
+ save = &curr_locales.time;
+ break;
+ default:
+ pg_unreachable();
+ }
+
+ /* Free old locale string, and stash the new one */
+ free(*save);
+ *save = result;
-#ifndef WIN32
- result = setlocale(category, locale);
#else
/*
@@ -209,11 +344,12 @@ pg_perm_setlocale(int category, const char *locale)
else
#endif
result = setlocale(category, locale);
-#endif /* WIN32 */
if (result == NULL)
return result; /* fall out immediately on failure */
+#endif /* HAVE_USELOCALE */
+
/*
* Use the right encoding in translated messages. Under ENABLE_NLS, let
* pg_bind_textdomain_codeset() figure it out. Under !ENABLE_NLS, message
@@ -223,11 +359,13 @@ pg_perm_setlocale(int category, const char *locale)
*/
if (category == LC_CTYPE)
{
+#ifndef HAVE_USELOCALE
static char save_lc_ctype[LOCALE_NAME_BUFLEN];
/* copy setlocale() return value before callee invokes it again */
strlcpy(save_lc_ctype, result, sizeof(save_lc_ctype));
result = save_lc_ctype;
+#endif
#ifdef ENABLE_NLS
SetMessageEncoding(pg_bind_textdomain_codeset(textdomain(NULL)));
@@ -236,42 +374,6 @@ pg_perm_setlocale(int category, const char *locale)
#endif
}
- switch (category)
- {
- case LC_COLLATE:
- envvar = "LC_COLLATE";
- break;
- case LC_CTYPE:
- envvar = "LC_CTYPE";
- break;
-#ifdef LC_MESSAGES
- case LC_MESSAGES:
- envvar = "LC_MESSAGES";
-#ifdef WIN32
- result = IsoLocaleName(locale);
- if (result == NULL)
- result = (char *) locale;
- elog(DEBUG3, "IsoLocaleName() executed; locale: \"%s\"", result);
-#endif /* WIN32 */
- break;
-#endif /* LC_MESSAGES */
- case LC_MONETARY:
- envvar = "LC_MONETARY";
- break;
- case LC_NUMERIC:
- envvar = "LC_NUMERIC";
- break;
- case LC_TIME:
- envvar = "LC_TIME";
- break;
- default:
- elog(FATAL, "unrecognized LC category: %d", category);
- return NULL; /* keep compiler quiet */
- }
-
- if (setenv(envvar, result, 1) != 0)
- return NULL;
-
return result;
}
@@ -289,6 +391,47 @@ pg_perm_setlocale(int category, const char *locale)
bool
check_locale(int category, const char *locale, char **canonname)
{
+#ifdef HAVE_USELOCALE
+ locale_t saved_locale;
+ locale_t temp_locale;
+ locale_t work_locale;
+ int category_mask;
+
+ category_mask = category_to_mask(category);
+
+ if (canonname)
+ *canonname = NULL; /* in case of failure */
+
+ saved_locale = uselocale((locale_t) 0);
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0) {
+ if (errno == ENOMEM)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ return false;
+ }
+
+ temp_locale = newlocale(category_mask, locale, work_locale);
+ if (temp_locale == (locale_t) 0) {
+ freelocale(work_locale);
+
+ return false;
+ }
+ work_locale = temp_locale;
+
+ uselocale(work_locale);
+ if (canonname)
+ *canonname = pstrdup(locale); /* TODO: This does not handle "" as the locale */
+
+ uselocale(saved_locale);
+ freelocale(work_locale);
+
+ return true;
+
+#else
+
char *save;
char *res;
@@ -315,6 +458,7 @@ check_locale(int category, const char *locale, char **canonname)
pfree(save);
return (res != NULL);
+#endif /* HAVE_USELOCALE */
}
@@ -327,7 +471,7 @@ check_locale(int category, const char *locale, char **canonname)
*
* Note: we accept value = "" as selecting the postmaster's environment
* value, whatever it was (so long as the environment setting is legal).
- * This will have been locked down by an earlier call to pg_perm_setlocale.
+ * This will have been locked down by an earlier call to pg_setlocale.
*/
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
@@ -406,7 +550,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_setlocale(LC_MESSAGES, newval);
#endif
}
@@ -502,11 +646,8 @@ PGLC_localeconv(void)
static bool CurrentLocaleConvAllocated = false;
struct lconv *extlconv;
struct lconv worklconv;
- char *save_lc_monetary;
- char *save_lc_numeric;
-#ifdef WIN32
- char *save_lc_ctype;
-#endif
+ locale_t saved_locale;
+ locale_t work_locale;
/* Did we do it already? */
if (CurrentLocaleConvValid)
@@ -533,16 +674,15 @@ PGLC_localeconv(void)
*/
memset(&worklconv, 0, sizeof(worklconv));
- /* Save prevailing values of monetary and numeric locales */
- save_lc_monetary = setlocale(LC_MONETARY, NULL);
- if (!save_lc_monetary)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_monetary = pstrdup(save_lc_monetary);
+ /* Save prevailing locale */
+ saved_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
- save_lc_numeric = setlocale(LC_NUMERIC, NULL);
- if (!save_lc_numeric)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_numeric = pstrdup(save_lc_numeric);
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
#ifdef WIN32
@@ -560,37 +700,37 @@ PGLC_localeconv(void)
* results. Hence, we must temporarily set that category as well.
*/
- /* Save prevailing value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
- if (!save_lc_ctype)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_ctype = pstrdup(save_lc_ctype);
-
/* Here begins the critical section where we must not throw error */
/* use numeric to set the ctype */
- setlocale(LC_CTYPE, locale_numeric);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_numeric, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
/* Get formatting information for numeric */
- setlocale(LC_NUMERIC, locale_numeric);
+ work_locale = newlocale(LC_NUMERIC_MASK, locale_numeric, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
extlconv = localeconv();
- /* Must copy data now in case setlocale() overwrites it */
+ /* Must copy data now in case updating the locale overwrites it */
worklconv.decimal_point = strdup(extlconv->decimal_point);
worklconv.thousands_sep = strdup(extlconv->thousands_sep);
worklconv.grouping = strdup(extlconv->grouping);
#ifdef WIN32
/* use monetary to set the ctype */
- setlocale(LC_CTYPE, locale_monetary);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_monetary, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
/* Get formatting information for monetary */
- setlocale(LC_MONETARY, locale_monetary);
+ work_locale = newlocale(LC_MONETARY_MASK, locale_monetary, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
extlconv = localeconv();
- /* Must copy data now in case setlocale() overwrites it */
+ /* Must copy data now in case updating the locale overwrites it */
worklconv.int_curr_symbol = strdup(extlconv->int_curr_symbol);
worklconv.currency_symbol = strdup(extlconv->currency_symbol);
worklconv.mon_decimal_point = strdup(extlconv->mon_decimal_point);
@@ -608,22 +748,9 @@ PGLC_localeconv(void)
worklconv.p_sign_posn = extlconv->p_sign_posn;
worklconv.n_sign_posn = extlconv->n_sign_posn;
- /*
- * Restore the prevailing locale settings; failure to do so is fatal.
- * Possibly we could limp along with nondefault LC_MONETARY or LC_NUMERIC,
- * but proceeding with the wrong value of LC_CTYPE would certainly be bad
- * news; and considering that the prevailing LC_MONETARY and LC_NUMERIC
- * are almost certainly "C", there's really no reason that restoring those
- * should fail.
- */
-#ifdef WIN32
- if (!setlocale(LC_CTYPE, save_lc_ctype))
- elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
-#endif
- if (!setlocale(LC_MONETARY, save_lc_monetary))
- elog(FATAL, "failed to restore LC_MONETARY to \"%s\"", save_lc_monetary);
- if (!setlocale(LC_NUMERIC, save_lc_numeric))
- elog(FATAL, "failed to restore LC_NUMERIC to \"%s\"", save_lc_numeric);
+ /* Restore the prevailing locale settings; failure to do so is fatal. */
+ uselocale(saved_locale);
+ freelocale(work_locale);
/*
* At this point we've done our best to clean up, and can call functions
@@ -634,13 +761,6 @@ PGLC_localeconv(void)
{
int encoding;
- /* Release the pstrdup'd locale names */
- pfree(save_lc_monetary);
- pfree(save_lc_numeric);
-#ifdef WIN32
- pfree(save_lc_ctype);
-#endif
-
/* If any of the preceding strdup calls failed, complain now. */
if (!struct_lconv_is_valid(&worklconv))
ereport(ERROR,
@@ -787,10 +907,8 @@ cache_locale_time(void)
bool strftimefail = false;
int encoding;
int i;
- char *save_lc_time;
-#ifdef WIN32
- char *save_lc_ctype;
-#endif
+ locale_t saved_locale;
+ locale_t work_locale;
/* did we do this already? */
if (CurrentLCTimeValid)
@@ -805,11 +923,15 @@ cache_locale_time(void)
* results afterwards.
*/
- /* Save prevailing value of time locale */
- save_lc_time = setlocale(LC_TIME, NULL);
- if (!save_lc_time)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_time = pstrdup(save_lc_time);
+ /* Save prevailing locale */
+ saved_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
+
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
#ifdef WIN32
@@ -820,17 +942,14 @@ cache_locale_time(void)
* the encoding we'll get back from strftime_win32().
*/
- /* Save prevailing value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
- if (!save_lc_ctype)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_ctype = pstrdup(save_lc_ctype);
-
/* use lc_time to set the ctype */
- setlocale(LC_CTYPE, locale_time);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_time, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
- setlocale(LC_TIME, locale_time);
+ work_locale = newlocale(LC_TIME_MASK, locale_time, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
/* We use times close to current time as data for strftime(). */
timenow = time(NULL);
@@ -878,12 +997,8 @@ cache_locale_time(void)
* Restore the prevailing locale settings; as in PGLC_localeconv(),
* failure to do so is fatal.
*/
-#ifdef WIN32
- if (!setlocale(LC_CTYPE, save_lc_ctype))
- elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
-#endif
- if (!setlocale(LC_TIME, save_lc_time))
- elog(FATAL, "failed to restore LC_TIME to \"%s\"", save_lc_time);
+ uselocale(saved_locale);
+ freelocale(work_locale);
/*
* At this point we've done our best to clean up, and can throw errors, or
@@ -892,12 +1007,6 @@ cache_locale_time(void)
if (strftimefail)
elog(ERROR, "strftime() failed: %m");
- /* Release the pstrdup'd locale names */
- pfree(save_lc_time);
-#ifdef WIN32
- pfree(save_lc_ctype);
-#endif
-
#ifndef WIN32
/*
@@ -1292,18 +1401,14 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
if (default_locale.provider == COLLPROVIDER_ICU)
return false;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
- result = locale_is_c(localeptr, false);
+ result = locale_is_c(curr_locales.collate, false);
return (bool) result;
}
@@ -1336,23 +1441,19 @@ lc_ctype_is_c(Oid collation)
/*
* If we're asked about the default collation, we have to inquire of the C
- * library. Cache the result so we only have to compute it once.
+ * library.
*/
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
if (default_locale.provider == COLLPROVIDER_ICU)
return false;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
- result = locale_is_c(localeptr, false);
+ result = locale_is_c(curr_locales.ctype, false);
return (bool) result;
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a92a0c438f..0741f5666b 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -405,14 +405,14 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
datum = SysCacheGetAttrNotNull(DATABASEOID, tup, Anum_pg_database_datctype);
ctype = TextDatumGetCString(datum);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (pg_setlocale(LC_COLLATE, collate) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_COLLATE \"%s\", "
" which is not recognized by setlocale().", collate),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ if (pg_setlocale(LC_CTYPE, ctype) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 997156515c..8dc7da47bd 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -1238,7 +1238,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- if (locale_is_c(locale_ctype, true))
+ if (locale_is_c(pg_setlocale(LC_CTYPE, NULL), true))
#endif
if (encoding != PG_SQL_ASCII &&
raw_pg_bind_textdomain_codeset(domainname, encoding))
diff --git a/src/common/exec.c b/src/common/exec.c
index f209b934df..f7a9643be4 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -438,22 +438,6 @@ set_pglocale_pgservice(const char *argv0, const char *app)
char path[MAXPGPATH];
char my_exec_path[MAXPGPATH];
- /* don't set LC_ALL in the backend */
- if (strcmp(app, PG_TEXTDOMAIN("postgres")) != 0)
- {
- setlocale(LC_ALL, "");
-
- /*
- * One could make a case for reproducing here PostmasterMain()'s test
- * for whether the process is multithreaded. Unlike the postmaster,
- * no frontend program calls sigprocmask() or otherwise provides for
- * mutual exclusion between signal handlers. While frontends using
- * fork(), if multithreaded, are formally exposed to undefined
- * behavior, we have not witnessed a concrete bug. Therefore,
- * complaining about multithreading here may be mere pedantry.
- */
- }
-
if (find_my_exec(argv0, my_exec_path) < 0)
return;
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 872bef798a..60682051cb 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -52,7 +52,7 @@ extern PGDLLIMPORT char *localized_full_months[];
extern PGDLLIMPORT bool database_ctype_is_c;
extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern char *pg_setlocale(int category, const char *locale);
extern bool locale_is_c(const char *locale, bool ignore_case);
extern bool lc_collate_is_c(Oid collation);
--
Tristan Partin
Neon (https://neon.tech)
Here is an up to date patch given some churn on the master branch.
--
Tristan Partin
Neon (https://neon.tech)
Attachments:
v3-0001-Skip-checking-for-uselocale-on-Windows.patchtext/x-patch; charset=utf-8; name=v3-0001-Skip-checking-for-uselocale-on-Windows.patchDownload
From b68cec481768c7c635ec48329b4764eced264572 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Fri, 30 Jun 2023 09:31:04 -0500
Subject: [PATCH v3 1/3] Skip checking for uselocale on Windows
Windows doesn't have uselocale, so skip it.
---
meson.build | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/meson.build b/meson.build
index 0c44f19cb9..3356f72bf0 100644
--- a/meson.build
+++ b/meson.build
@@ -2438,7 +2438,7 @@ func_checks = [
['strsignal'],
['sync_file_range'],
['syncfs'],
- ['uselocale'],
+ ['uselocale', {'skip': host_system == 'windows'}],
['wcstombs_l'],
]
--
Tristan Partin
Neon (https://neon.tech)
v3-0002-Add-locale_is_c-function.patchtext/x-patch; charset=utf-8; name=v3-0002-Add-locale_is_c-function.patchDownload
From 3207694a1d214a7d5b844f3f6dfd8378408172af Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Fri, 30 Jun 2023 11:13:29 -0500
Subject: [PATCH v3 2/3] Add locale_is_c function
In some places throughout the codebase, there are string comparisons for
locales matching C or POSIX. Encapsulate this logic into a single
function and use it.
---
src/backend/utils/adt/pg_locale.c | 39 ++++++++++++++++++-------------
src/backend/utils/init/postinit.c | 4 +---
src/backend/utils/mb/mbutils.c | 5 ++--
src/include/utils/pg_locale.h | 1 +
4 files changed, 27 insertions(+), 22 deletions(-)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index aa9da99308..047c02dbab 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -191,6 +191,23 @@ wcstombs_l(char *dest, const wchar_t *src, size_t n, locale_t loc)
}
#endif
+
+/*
+ * Check if a locale is the C locale
+ *
+ * POSIX is an alias for C. Passing ingore_case as true will use
+ * pg_strcasecmp() instead of strcmp().
+ */
+bool
+locale_is_c(const char *locale, bool ignore_case)
+{
+ if (!ignore_case)
+ return strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
+
+ return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
+}
+
+
/*
* pg_perm_setlocale
*
@@ -1276,10 +1293,8 @@ lookup_collation_cache(Oid collation, bool set_flags)
datum = SysCacheGetAttrNotNull(COLLOID, tp, Anum_pg_collation_collctype);
collctype = TextDatumGetCString(datum);
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(collcollate, false);
+ cache_entry->ctype_is_c = locale_is_c(collctype, false);
}
else
{
@@ -1327,12 +1342,8 @@ lc_collate_is_c(Oid collation)
if (!localeptr)
elog(ERROR, "invalid LC_COLLATE setting");
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
- result = false;
+ result = locale_is_c(localeptr, false);
+
return (bool) result;
}
@@ -1380,12 +1391,8 @@ lc_ctype_is_c(Oid collation)
if (!localeptr)
elog(ERROR, "invalid LC_CTYPE setting");
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
- result = false;
+ result = locale_is_c(localeptr, false);
+
return (bool) result;
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0f9b92b32e..a92a0c438f 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -419,9 +419,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (strcmp(ctype, "C") == 0 ||
- strcmp(ctype, "POSIX") == 0)
- database_ctype_is_c = true;
+ database_ctype_is_c = locale_is_c(ctype, false);
if (dbform->datlocprovider == COLLPROVIDER_ICU)
{
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 67a1ab2ab2..997156515c 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -39,6 +39,7 @@
#include "mb/pg_wchar.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
+#include "utils/pg_locale.h"
#include "utils/syscache.h"
#include "varatt.h"
@@ -1237,9 +1238,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- const char *ctype = setlocale(LC_CTYPE, NULL);
-
- if (pg_strcasecmp(ctype, "C") == 0 || pg_strcasecmp(ctype, "POSIX") == 0)
+ if (locale_is_c(locale_ctype, true))
#endif
if (encoding != PG_SQL_ASCII &&
raw_pg_bind_textdomain_codeset(domainname, encoding))
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 6447bea8e0..e08d96e099 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -54,6 +54,7 @@ extern PGDLLIMPORT bool database_ctype_is_c;
extern bool check_locale(int category, const char *locale, char **canonname);
extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool locale_is_c(const char *locale, bool ignore_case);
extern bool lc_collate_is_c(Oid collation);
extern bool lc_ctype_is_c(Oid collation);
--
Tristan Partin
Neon (https://neon.tech)
v3-0003-Use-thread-safe-locale-APIs.patchtext/x-patch; charset=utf-8; name=v3-0003-Use-thread-safe-locale-APIs.patchDownload
From bc0a79fd0b1cf5f45c93dfd40fc00c054646fb00 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 3 Jul 2023 08:51:59 -0500
Subject: [PATCH v3 3/3] Use thread-safe locale APIs
setlocale() has been a thorn in Postgres's side for a while. Most
recently this has manifested in bug #17946 where loading libperl would
cause the localization to change out from under Postgres. On Windows,
use _ENABLE_PER_THREAD_LOCALE.
---
src/backend/main/main.c | 55 +++-
src/backend/utils/adt/cash.c | 29 +++
src/backend/utils/adt/formatting.c | 11 +-
src/backend/utils/adt/pg_locale.c | 388 ++++++++++++++++++-----------
src/backend/utils/init/postinit.c | 4 +-
src/backend/utils/mb/mbutils.c | 2 +-
src/common/exec.c | 16 --
src/include/utils/pg_locale.h | 2 +-
8 files changed, 325 insertions(+), 182 deletions(-)
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index ed11e8be7f..093df88f14 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -59,6 +59,11 @@ int
main(int argc, char *argv[])
{
bool do_check_root = true;
+ char *collate_locale;
+ char *ctype_locale;
+#ifdef LC_MESSAGES
+ char *messages_locale;
+#endif
reached_main = true;
@@ -111,15 +116,39 @@ main(int argc, char *argv[])
* these set to "C" then message localization might not work well in the
* postmaster.
*/
- init_locale("LC_COLLATE", LC_COLLATE, "");
- init_locale("LC_CTYPE", LC_CTYPE, "");
+
+ /*
+ * Save off the original locale values prior to the first call of
+ * uselocale(). This ensures that we don't call pg_setlocale() with an empty
+ * string. See the function comment for pg_setlocale() for further
+ * explanation. Note that this restriction doesn't exist on Windows, but we
+ * can use the same code path anyway.
+ */
+ collate_locale = setlocale(LC_COLLATE, "");
+ ctype_locale = setlocale(LC_CTYPE, "");
+
+#ifdef LC_MESSAGES
+ messages_locale = setlocale(LC_MESSAGES, "");
+#endif
+
+
+#ifdef HAVE__CONFIGTHREADLOCALE
+ /*
+ * This call could most likely happen sooner, but just to err on the side of
+ * caution, call it after absorbing locales from the environment.
+ */
+ _configthreadlocale(_ENABLE_PER_THREAD_LOCALE);
+#endif
+
+ init_locale("LC_COLLATE", LC_COLLATE, collate_locale);
+ init_locale("LC_CTYPE", LC_CTYPE, ctype_locale);
/*
* LC_MESSAGES will get set later during GUC option processing, but we set
* it here to allow startup error messages to be localized.
*/
#ifdef LC_MESSAGES
- init_locale("LC_MESSAGES", LC_MESSAGES, "");
+ init_locale("LC_MESSAGES", LC_MESSAGES, messages_locale);
#endif
/*
@@ -130,13 +159,6 @@ main(int argc, char *argv[])
init_locale("LC_NUMERIC", LC_NUMERIC, "C");
init_locale("LC_TIME", LC_TIME, "C");
- /*
- * Now that we have absorbed as much as we wish to from the locale
- * environment, remove any LC_ALL setting, so that the environment
- * variables installed by pg_perm_setlocale have force.
- */
- unsetenv("LC_ALL");
-
/*
* Catch standard options before doing much else, in particular before we
* insist on not being root.
@@ -307,8 +329,17 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ /*
+ * Since we are initializing global locales, NULL and empty string are
+ * invalid arguments. See the pg_setlocale() function description for more
+ * explanation on the empty string. NULL just doesn't make sense in this
+ * context of initializing locales.
+ */
+ Assert(locale);
+ Assert(locale[0] != '\0');
+
+ if (pg_setlocale(category, locale) == NULL &&
+ pg_setlocale(category, "C") == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 32fbad2f57..5c254d912f 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -111,6 +111,11 @@ cash_in(PG_FUNCTION_ARGS)
*csymbol;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(escontext, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/*
* frac_digits will be CHAR_MAX in some locales, notably C. However, just
* testing for == CHAR_MAX is risky, because of compilers like gcc that
@@ -325,6 +330,11 @@ cash_out(PG_FUNCTION_ARGS)
sep_by_space;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
points = lconvert->frac_digits;
if (points < 0 || points > 10)
@@ -1036,6 +1046,11 @@ cash_numeric(PG_FUNCTION_ARGS)
int fpoint;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1095,6 +1110,11 @@ numeric_cash(PG_FUNCTION_ARGS)
Datum numeric_scale;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1128,6 +1148,11 @@ int4_cash(PG_FUNCTION_ARGS)
int i;
struct lconv *lconvert = PGLC_localeconv();
+ if (lconvert == NULL)
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
@@ -1158,6 +1183,10 @@ int8_cash(PG_FUNCTION_ARGS)
int i;
struct lconv *lconvert = PGLC_localeconv();
+ ereturn(fcinfo->context, (Datum) 0,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not allocate memory for locale information")));
+
/* see comments about frac_digits in cash_in() */
fpoint = lconvert->frac_digits;
if (fpoint < 0 || fpoint > 10)
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index e27ea8ef97..d8dbcbd56b 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -5049,12 +5049,12 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Positive / Negative number sign
*/
- if (lconv->negative_sign && *lconv->negative_sign)
+ if (lconv && lconv->negative_sign && *lconv->negative_sign)
Np->L_negative_sign = lconv->negative_sign;
else
Np->L_negative_sign = "-";
- if (lconv->positive_sign && *lconv->positive_sign)
+ if (lconv && lconv->positive_sign && *lconv->positive_sign)
Np->L_positive_sign = lconv->positive_sign;
else
Np->L_positive_sign = "+";
@@ -5062,9 +5062,8 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Number decimal point
*/
- if (lconv->decimal_point && *lconv->decimal_point)
+ if (lconv && lconv->decimal_point && *lconv->decimal_point)
Np->decimal = lconv->decimal_point;
-
else
Np->decimal = ".";
@@ -5078,7 +5077,7 @@ NUM_prepare_locale(NUMProc *Np)
* but "" for thousands_sep, so we set the thousands_sep too.
* http://archives.postgresql.org/pgsql-hackers/2007-11/msg00772.php
*/
- if (lconv->thousands_sep && *lconv->thousands_sep)
+ if (lconv && lconv->thousands_sep && *lconv->thousands_sep)
Np->L_thousands_sep = lconv->thousands_sep;
/* Make sure thousands separator doesn't match decimal point symbol. */
else if (strcmp(Np->decimal, ",") != 0)
@@ -5089,7 +5088,7 @@ NUM_prepare_locale(NUMProc *Np)
/*
* Currency symbol
*/
- if (lconv->currency_symbol && *lconv->currency_symbol)
+ if (lconv && lconv->currency_symbol && *lconv->currency_symbol)
Np->L_currency_symbol = lconv->currency_symbol;
else
Np->L_currency_symbol = " ";
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 047c02dbab..178f387630 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -98,6 +98,23 @@ char *locale_time;
int icu_validation_level = WARNING;
+#ifdef HAVE_USELOCALE
+/*
+ * Current locale settings
+ *
+ * uselocale() and friends don't have an equivalent to setlocale(cat, NULL), so
+ * as we set locales for various categories, stash them for retrieval as needed.
+ */
+static struct {
+ char *collate;
+ char *ctype;
+ char *messages;
+ char *monetary;
+ char *numeric;
+ char *time;
+} curr_locales = { 0 };
+#endif /* HAVE_USELOCALE */
+
/*
* lc_time localization cache.
*
@@ -208,28 +225,145 @@ locale_is_c(const char *locale, bool ignore_case)
}
+#ifdef HAVE_USELOCALE
+/*
+ * Turns locale categories into their mask equivalents for use in newlocale(3)
+ *
+ * GCC has the masks defined as (1 << category), but this isn't guaranteed to be
+ * the same in other libcs.
+ */
+static int
+category_to_mask(int category)
+{
+ switch (category) {
+ case LC_ALL:
+ return LC_ALL_MASK;
+ case LC_COLLATE:
+ return LC_COLLATE_MASK;
+ case LC_CTYPE:
+ return LC_CTYPE_MASK;
+#ifdef LC_MESSAGES
+ case LC_MESSAGES:
+ return LC_MESSAGES_MASK;
+#endif
+ case LC_MONETARY:
+ return LC_MONETARY_MASK;
+ case LC_NUMERIC:
+ return LC_NUMERIC_MASK;
+ case LC_TIME:
+ return LC_TIME_MASK;
+ }
+
+ pg_unreachable();
+}
+
+
+static char *
+pg_getlocale(int category)
+{
+ switch (category) {
+ case LC_COLLATE:
+ return curr_locales.collate;
+ case LC_CTYPE:
+ return curr_locales.ctype;
+ case LC_MESSAGES:
+ return curr_locales.messages;
+ case LC_MONETARY:
+ return curr_locales.monetary;
+ case LC_NUMERIC:
+ return curr_locales.numeric;
+ case LC_TIME:
+ return curr_locales.time;
+ default:
+ pg_unreachable();
+ }
+}
+#endif /* HAVE_USELOCALE */
+
+
/*
- * pg_perm_setlocale
- *
- * This wraps the libc function setlocale(), with two additions. First, when
- * changing LC_CTYPE, update gettext's encoding for the current message
- * domain. GNU gettext automatically tracks LC_CTYPE on most platforms, but
- * not on Windows. Second, if the operation is successful, the corresponding
- * LC_XXX environment variable is set to match. By setting the environment
- * variable, we ensure that any subsequent use of setlocale(..., "") will
- * preserve the settings made through this routine. Of course, LC_ALL must
- * also be unset to fully ensure that, but that has to be done elsewhere after
- * all the individual LC_XXX variables have been set correctly. (Thank you
- * Perl for making this kluge necessary.)
+ * pg_setlocale
+ *
+ * This wraps the libc functions uselocale() or setlocale() (depending on the
+ * platform), with a single addition. When changing LC_CTYPE, update gettext's
+ * encoding for the current message domain. GNU gettext automatically tracks
+ * LC_CTYPE on most platforms, but not on Windows. This function will abort if
+ * locale is the empty string. A workaround is to pass the output of
+ * setlocale(category, "") to the locale argument.
*/
char *
-pg_perm_setlocale(int category, const char *locale)
+pg_setlocale(int category, const char *locale)
{
char *result;
- const char *envvar;
+#ifdef HAVE_USELOCALE
+ char **save;
+ int category_mask;
+ locale_t prev_locale;
+ locale_t temp_locale;
+ locale_t work_locale;
+
+ if (locale == NULL)
+ return pg_getlocale(category);
+
+ Assert(locale[0] != '\0');
+
+ category_mask = category_to_mask(category);
+
+ prev_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
+
+ result = strdup(locale);
+ work_locale = duplocale(prev_locale);
+ if (!result || work_locale == (locale_t) 0) {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ return NULL;
+ }
+
+ temp_locale = newlocale(category_mask, locale, work_locale);
+ if (temp_locale == (locale_t) 0) {
+ freelocale(work_locale);
+
+ return NULL; /* fall out immediately on failure */
+ }
+ work_locale = temp_locale;
+
+ uselocale(work_locale);
+ if (prev_locale != LC_GLOBAL_LOCALE)
+ freelocale(prev_locale);
+
+ /*
+ * Copying the locale must occur before we potentially bind the text domain.
+ */
+ switch (category) {
+ case LC_COLLATE:
+ save = &curr_locales.collate;
+ break;
+ case LC_CTYPE:
+ save = &curr_locales.ctype;
+ break;
+ case LC_MESSAGES:
+ save = &curr_locales.messages;
+ break;
+ case LC_MONETARY:
+ save = &curr_locales.monetary;
+ break;
+ case LC_NUMERIC:
+ save = &curr_locales.numeric;
+ break;
+ case LC_TIME:
+ save = &curr_locales.time;
+ break;
+ default:
+ pg_unreachable();
+ }
+
+ /* Free old locale string, and stash the new one */
+ free(*save);
+ *save = result;
-#ifndef WIN32
- result = setlocale(category, locale);
#else
/*
@@ -248,11 +382,12 @@ pg_perm_setlocale(int category, const char *locale)
else
#endif
result = setlocale(category, locale);
-#endif /* WIN32 */
if (result == NULL)
return result; /* fall out immediately on failure */
+#endif /* HAVE_USELOCALE */
+
/*
* Use the right encoding in translated messages. Under ENABLE_NLS, let
* pg_bind_textdomain_codeset() figure it out. Under !ENABLE_NLS, message
@@ -262,11 +397,13 @@ pg_perm_setlocale(int category, const char *locale)
*/
if (category == LC_CTYPE)
{
+#ifndef HAVE_USELOCALE
static char save_lc_ctype[LOCALE_NAME_BUFLEN];
/* copy setlocale() return value before callee invokes it again */
strlcpy(save_lc_ctype, result, sizeof(save_lc_ctype));
result = save_lc_ctype;
+#endif
#ifdef ENABLE_NLS
SetMessageEncoding(pg_bind_textdomain_codeset(textdomain(NULL)));
@@ -275,42 +412,6 @@ pg_perm_setlocale(int category, const char *locale)
#endif
}
- switch (category)
- {
- case LC_COLLATE:
- envvar = "LC_COLLATE";
- break;
- case LC_CTYPE:
- envvar = "LC_CTYPE";
- break;
-#ifdef LC_MESSAGES
- case LC_MESSAGES:
- envvar = "LC_MESSAGES";
-#ifdef WIN32
- result = IsoLocaleName(locale);
- if (result == NULL)
- result = (char *) locale;
- elog(DEBUG3, "IsoLocaleName() executed; locale: \"%s\"", result);
-#endif /* WIN32 */
- break;
-#endif /* LC_MESSAGES */
- case LC_MONETARY:
- envvar = "LC_MONETARY";
- break;
- case LC_NUMERIC:
- envvar = "LC_NUMERIC";
- break;
- case LC_TIME:
- envvar = "LC_TIME";
- break;
- default:
- elog(FATAL, "unrecognized LC category: %d", category);
- return NULL; /* keep compiler quiet */
- }
-
- if (setenv(envvar, result, 1) != 0)
- return NULL;
-
return result;
}
@@ -328,6 +429,47 @@ pg_perm_setlocale(int category, const char *locale)
bool
check_locale(int category, const char *locale, char **canonname)
{
+#ifdef HAVE_USELOCALE
+ locale_t saved_locale;
+ locale_t temp_locale;
+ locale_t work_locale;
+ int category_mask;
+
+ category_mask = category_to_mask(category);
+
+ if (canonname)
+ *canonname = NULL; /* in case of failure */
+
+ saved_locale = uselocale((locale_t) 0);
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0) {
+ if (errno == ENOMEM)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ return false;
+ }
+
+ temp_locale = newlocale(category_mask, locale, work_locale);
+ if (temp_locale == (locale_t) 0) {
+ freelocale(work_locale);
+
+ return false;
+ }
+ work_locale = temp_locale;
+
+ uselocale(work_locale);
+ if (canonname)
+ *canonname = pstrdup(locale); /* TODO: This does not handle "" as the locale */
+
+ uselocale(saved_locale);
+ freelocale(work_locale);
+
+ return true;
+
+#else
+
char *save;
char *res;
@@ -354,6 +496,7 @@ check_locale(int category, const char *locale, char **canonname)
pfree(save);
return (res != NULL);
+#endif /* HAVE_USELOCALE */
}
@@ -366,7 +509,7 @@ check_locale(int category, const char *locale, char **canonname)
*
* Note: we accept value = "" as selecting the postmaster's environment
* value, whatever it was (so long as the environment setting is legal).
- * This will have been locked down by an earlier call to pg_perm_setlocale.
+ * This will have been locked down by an earlier call to pg_setlocale.
*/
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
@@ -445,7 +588,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_setlocale(LC_MESSAGES, newval);
#endif
}
@@ -541,11 +684,8 @@ PGLC_localeconv(void)
static bool CurrentLocaleConvAllocated = false;
struct lconv *extlconv;
struct lconv worklconv;
- char *save_lc_monetary;
- char *save_lc_numeric;
-#ifdef WIN32
- char *save_lc_ctype;
-#endif
+ locale_t saved_locale;
+ locale_t work_locale;
/* Did we do it already? */
if (CurrentLocaleConvValid)
@@ -572,16 +712,15 @@ PGLC_localeconv(void)
*/
memset(&worklconv, 0, sizeof(worklconv));
- /* Save prevailing values of monetary and numeric locales */
- save_lc_monetary = setlocale(LC_MONETARY, NULL);
- if (!save_lc_monetary)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_monetary = pstrdup(save_lc_monetary);
+ /* Save prevailing locale */
+ saved_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
- save_lc_numeric = setlocale(LC_NUMERIC, NULL);
- if (!save_lc_numeric)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_numeric = pstrdup(save_lc_numeric);
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
#ifdef WIN32
@@ -599,37 +738,37 @@ PGLC_localeconv(void)
* results. Hence, we must temporarily set that category as well.
*/
- /* Save prevailing value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
- if (!save_lc_ctype)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_ctype = pstrdup(save_lc_ctype);
-
/* Here begins the critical section where we must not throw error */
/* use numeric to set the ctype */
- setlocale(LC_CTYPE, locale_numeric);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_numeric, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
/* Get formatting information for numeric */
- setlocale(LC_NUMERIC, locale_numeric);
+ work_locale = newlocale(LC_NUMERIC_MASK, locale_numeric, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
extlconv = localeconv();
- /* Must copy data now in case setlocale() overwrites it */
+ /* Must copy data now in case updating the locale overwrites it */
worklconv.decimal_point = strdup(extlconv->decimal_point);
worklconv.thousands_sep = strdup(extlconv->thousands_sep);
worklconv.grouping = strdup(extlconv->grouping);
#ifdef WIN32
/* use monetary to set the ctype */
- setlocale(LC_CTYPE, locale_monetary);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_monetary, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
/* Get formatting information for monetary */
- setlocale(LC_MONETARY, locale_monetary);
+ work_locale = newlocale(LC_MONETARY_MASK, locale_monetary, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
extlconv = localeconv();
- /* Must copy data now in case setlocale() overwrites it */
+ /* Must copy data now in case updating the locale overwrites it */
worklconv.int_curr_symbol = strdup(extlconv->int_curr_symbol);
worklconv.currency_symbol = strdup(extlconv->currency_symbol);
worklconv.mon_decimal_point = strdup(extlconv->mon_decimal_point);
@@ -647,22 +786,9 @@ PGLC_localeconv(void)
worklconv.p_sign_posn = extlconv->p_sign_posn;
worklconv.n_sign_posn = extlconv->n_sign_posn;
- /*
- * Restore the prevailing locale settings; failure to do so is fatal.
- * Possibly we could limp along with nondefault LC_MONETARY or LC_NUMERIC,
- * but proceeding with the wrong value of LC_CTYPE would certainly be bad
- * news; and considering that the prevailing LC_MONETARY and LC_NUMERIC
- * are almost certainly "C", there's really no reason that restoring those
- * should fail.
- */
-#ifdef WIN32
- if (!setlocale(LC_CTYPE, save_lc_ctype))
- elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
-#endif
- if (!setlocale(LC_MONETARY, save_lc_monetary))
- elog(FATAL, "failed to restore LC_MONETARY to \"%s\"", save_lc_monetary);
- if (!setlocale(LC_NUMERIC, save_lc_numeric))
- elog(FATAL, "failed to restore LC_NUMERIC to \"%s\"", save_lc_numeric);
+ /* Restore the prevailing locale settings; failure to do so is fatal. */
+ uselocale(saved_locale);
+ freelocale(work_locale);
/*
* At this point we've done our best to clean up, and can call functions
@@ -673,13 +799,6 @@ PGLC_localeconv(void)
{
int encoding;
- /* Release the pstrdup'd locale names */
- pfree(save_lc_monetary);
- pfree(save_lc_numeric);
-#ifdef WIN32
- pfree(save_lc_ctype);
-#endif
-
/* If any of the preceding strdup calls failed, complain now. */
if (!struct_lconv_is_valid(&worklconv))
ereport(ERROR,
@@ -826,10 +945,8 @@ cache_locale_time(void)
bool strftimefail = false;
int encoding;
int i;
- char *save_lc_time;
-#ifdef WIN32
- char *save_lc_ctype;
-#endif
+ locale_t saved_locale;
+ locale_t work_locale;
/* did we do this already? */
if (CurrentLCTimeValid)
@@ -844,11 +961,15 @@ cache_locale_time(void)
* results afterwards.
*/
- /* Save prevailing value of time locale */
- save_lc_time = setlocale(LC_TIME, NULL);
- if (!save_lc_time)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_time = pstrdup(save_lc_time);
+ /* Save prevailing locale */
+ saved_locale = uselocale((locale_t) 0);
+ Assert(saved_locale != (locale_t) 0);
+
+ work_locale = duplocale(saved_locale);
+ if (work_locale == (locale_t) 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
#ifdef WIN32
@@ -859,17 +980,14 @@ cache_locale_time(void)
* the encoding we'll get back from strftime_win32().
*/
- /* Save prevailing value of ctype locale */
- save_lc_ctype = setlocale(LC_CTYPE, NULL);
- if (!save_lc_ctype)
- elog(ERROR, "setlocale(NULL) failed");
- save_lc_ctype = pstrdup(save_lc_ctype);
-
/* use lc_time to set the ctype */
- setlocale(LC_CTYPE, locale_time);
+ work_locale = newlocale(LC_CTYPE_MASK, locale_time, work_locale);
+ Assert(work_locale != (locale_t) 0);
#endif
- setlocale(LC_TIME, locale_time);
+ work_locale = newlocale(LC_TIME_MASK, locale_time, work_locale);
+ Assert(work_locale != (locale_t) 0);
+ uselocale(work_locale);
/* We use times close to current time as data for strftime(). */
timenow = time(NULL);
@@ -917,12 +1035,8 @@ cache_locale_time(void)
* Restore the prevailing locale settings; as in PGLC_localeconv(),
* failure to do so is fatal.
*/
-#ifdef WIN32
- if (!setlocale(LC_CTYPE, save_lc_ctype))
- elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
-#endif
- if (!setlocale(LC_TIME, save_lc_time))
- elog(FATAL, "failed to restore LC_TIME to \"%s\"", save_lc_time);
+ uselocale(saved_locale);
+ freelocale(work_locale);
/*
* At this point we've done our best to clean up, and can throw errors, or
@@ -931,12 +1045,6 @@ cache_locale_time(void)
if (strftimefail)
elog(ERROR, "strftime() failed: %m");
- /* Release the pstrdup'd locale names */
- pfree(save_lc_time);
-#ifdef WIN32
- pfree(save_lc_ctype);
-#endif
-
#ifndef WIN32
/*
@@ -1331,18 +1439,14 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
if (default_locale.provider == COLLPROVIDER_ICU)
return false;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
- result = locale_is_c(localeptr, false);
+ result = locale_is_c(curr_locales.collate, false);
return (bool) result;
}
@@ -1375,23 +1479,19 @@ lc_ctype_is_c(Oid collation)
/*
* If we're asked about the default collation, we have to inquire of the C
- * library. Cache the result so we only have to compute it once.
+ * library.
*/
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
if (default_locale.provider == COLLPROVIDER_ICU)
return false;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
- result = locale_is_c(localeptr, false);
+ result = locale_is_c(curr_locales.ctype, false);
return (bool) result;
}
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a92a0c438f..0741f5666b 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -405,14 +405,14 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
datum = SysCacheGetAttrNotNull(DATABASEOID, tup, Anum_pg_database_datctype);
ctype = TextDatumGetCString(datum);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (pg_setlocale(LC_COLLATE, collate) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_COLLATE \"%s\", "
" which is not recognized by setlocale().", collate),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ if (pg_setlocale(LC_CTYPE, ctype) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 997156515c..8dc7da47bd 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -1238,7 +1238,7 @@ pg_bind_textdomain_codeset(const char *domainname)
int new_msgenc;
#ifndef WIN32
- if (locale_is_c(locale_ctype, true))
+ if (locale_is_c(pg_setlocale(LC_CTYPE, NULL), true))
#endif
if (encoding != PG_SQL_ASCII &&
raw_pg_bind_textdomain_codeset(domainname, encoding))
diff --git a/src/common/exec.c b/src/common/exec.c
index f209b934df..f7a9643be4 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -438,22 +438,6 @@ set_pglocale_pgservice(const char *argv0, const char *app)
char path[MAXPGPATH];
char my_exec_path[MAXPGPATH];
- /* don't set LC_ALL in the backend */
- if (strcmp(app, PG_TEXTDOMAIN("postgres")) != 0)
- {
- setlocale(LC_ALL, "");
-
- /*
- * One could make a case for reproducing here PostmasterMain()'s test
- * for whether the process is multithreaded. Unlike the postmaster,
- * no frontend program calls sigprocmask() or otherwise provides for
- * mutual exclusion between signal handlers. While frontends using
- * fork(), if multithreaded, are formally exposed to undefined
- * behavior, we have not witnessed a concrete bug. Therefore,
- * complaining about multithreading here may be mere pedantry.
- */
- }
-
if (find_my_exec(argv0, my_exec_path) < 0)
return;
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index e08d96e099..585cdd3931 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -52,7 +52,7 @@ extern PGDLLIMPORT char *localized_full_months[];
extern PGDLLIMPORT bool database_ctype_is_c;
extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern char *pg_setlocale(int category, const char *locale);
extern bool locale_is_c(const char *locale, bool ignore_case);
extern bool lc_collate_is_c(Oid collation);
--
Tristan Partin
Neon (https://neon.tech)
On 7/3/23 12:25, Tristan Partin wrote:
On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote:
Although I have not looked yet, presumably we could have similar
problems with plpython. I would like to get agreement on this approach
against plperl before diving into that though.Thoughts?
I don't see anything immediately wrong with this.
Any further comments on the posted patch[1]/messages/by-id/ec6fa20d-e691-198a-4a13-e761771b9dec@joeconway.com? I would like to apply/push
this prior to the beta and minor releases next week.
Joe
[1]: /messages/by-id/ec6fa20d-e691-198a-4a13-e761771b9dec@joeconway.com
/messages/by-id/ec6fa20d-e691-198a-4a13-e761771b9dec@joeconway.com
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue Aug 1, 2023 at 8:48 AM CDT, Joe Conway wrote:
On 7/3/23 12:25, Tristan Partin wrote:
On Sat Jun 24, 2023 at 8:09 AM CDT, Joe Conway wrote:
Although I have not looked yet, presumably we could have similar
problems with plpython. I would like to get agreement on this approach
against plperl before diving into that though.Thoughts?
I don't see anything immediately wrong with this.
Any further comments on the posted patch[1]? I would like to apply/push
this prior to the beta and minor releases next week.Joe
[1]
/messages/by-id/ec6fa20d-e691-198a-4a13-e761771b9dec@joeconway.com
None from my end.
--
Tristan Partin
Neon (https://neon.tech)
On 01/08/2023 16:48, Joe Conway wrote:
Any further comments on the posted patch[1]? I would like to apply/push
this prior to the beta and minor releases next week.
I'm not sure about the placement of the uselocale() calls. In
plperl_spi_exec(), for example, I think we should switch to the global
locale right after the check_spi_usage_allowed() call. Otherwise, if an
error happens in BeginInternalSubTransaction() or in pg_verifymbstr(),
the error would be processed with the perl locale. Maybe that's
harmless, error processing hardly cares about LC_MONETARY, but seems
wrong in principle.
Hmm, come to think of it, if BeginInternalSubTransaction() throws an
error, we just jump out of the perl interpreter? That doesn't seem cool.
But that's not new with this patch.
If I'm reading correctly, compile_plperl_function() calls
select_perl_context(), which calls plperl_trusted_init(), which calls
uselocale(). So it leaves locale set to the perl locale. Who sets it back?
How about adding a small wrapper around eval_pl() that sets and unsets
the locale(), just when we enter the interpreter? It's easier to see
that we are doing the calls in right places, if we make them as close as
possible to entering/exiting the interpreter. Are there other functions
in addition to eval_pl() that need to be called with the perl locale?
/*
* plperl_xact_callback --- cleanup at main-transaction end.
*/
static void
plperl_xact_callback(XactEvent event, void *arg)
{
/* ensure global locale is the current locale */
if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE)
perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}
So the assumption is that the if current locale is not LC_GLOBAL_LOCALE,
then it was the perl locale. Seems true today, but this could confusion
if anything else calls uselocale(). In particular, if another PL
implementation copies this, and you use plperl and the other PL at the
same time, they would get mixed up. I think we need another "bool
perl_locale_obj_in_use" variable to track explicitly whether the perl
locale is currently active.
If we are careful to put the uselocale() calls in the right places so
that we never ereport() while in perl locale, this callback isn't
needed. Maybe it's still a good idea, though, to be extra sure that
things get reset to a sane state if something unexpected happens.
If multiple interpreters are used, is the single perl_locale_obj
variable still enough? Each interpreter can have their own locale I believe.
PS. please pgindent
--
Heikki Linnakangas
Neon (https://neon.tech)
On 8/15/23 10:40, Heikki Linnakangas wrote:
On 01/08/2023 16:48, Joe Conway wrote:
Any further comments on the posted patch[1]? I would like to apply/push
this prior to the beta and minor releases next week.I'm not sure about the placement of the uselocale() calls. In
plperl_spi_exec(), for example, I think we should switch to the global
locale right after the check_spi_usage_allowed() call. Otherwise, if an
error happens in BeginInternalSubTransaction() or in pg_verifymbstr(),
the error would be processed with the perl locale. Maybe that's
harmless, error processing hardly cares about LC_MONETARY, but seems
wrong in principle.
I guess you could probably argue that we should flip this around, and
only enter the perl locale when calling into libperl, and exit the perl
locale every time we reemerge under plperl.c control. That seems pretty
drastic and potentially messy though.
Hmm, come to think of it, if BeginInternalSubTransaction() throws an
error, we just jump out of the perl interpreter? That doesn't seem cool.
But that's not new with this patch.
Hmm, true enough I guess.
If I'm reading correctly, compile_plperl_function() calls
select_perl_context(), which calls plperl_trusted_init(), which calls
uselocale(). So it leaves locale set to the perl locale. Who sets it back?
No one does it seems, at least not currently
How about adding a small wrapper around eval_pl() that sets and unsets
the locale(), just when we enter the interpreter? It's easier to see
that we are doing the calls in right places, if we make them as close as
possible to entering/exiting the interpreter. Are there other functions
in addition to eval_pl() that need to be called with the perl locale?
I can see that as a better strategy, but "other functions in addition to
eval_pv()" (I assume you mean eval_pv rather than eval_pl) is a tricky
one to answer.
I ran the attached script like so (from cwd src/pl/plperl) like so:
```
symbols-used.sh /lib/x86_64-linux-gnu/libperl.so.5.34 plperl.so
```
and get a fairly long list of exported libperl functions that get linked
into plperl.so:
```
Matched symbols:
boot_DynaLoader
perl_alloc
Perl_av_extend
Perl_av_fetch
Perl_av_len
Perl_av_push
*Perl_call_list
*Perl_call_pv
*Perl_call_sv
perl_construct
Perl_croak
Perl_croak_nocontext
Perl_croak_sv
Perl_croak_xs_usage
Perl_die
*Perl_eval_pv
Perl_free_tmps
Perl_get_sv
Perl_gv_add_by_type
Perl_gv_stashpv
Perl_hv_clear
Perl_hv_common
Perl_hv_common_key_len
Perl_hv_iterinit
Perl_hv_iternext
Perl_hv_iternext_flags
Perl_hv_iternextsv
Perl_hv_ksplit
Perl_looks_like_number
Perl_markstack_grow
Perl_mg_get
Perl_newRV
Perl_newRV_noinc
Perl_newSV
Perl_newSViv
Perl_newSVpv
Perl_newSVpvn
Perl_newSVpvn_flags
Perl_newSVsv
Perl_newSVsv_flags
Perl_newSV_type
Perl_newSVuv
Perl_newXS
Perl_newXS_flags
*perl_parse
Perl_pop_scope
Perl_push_scope
*perl_run
Perl_save_item
Perl_savetmps
Perl_stack_grow
Perl_sv_2bool
Perl_sv_2bool_flags
Perl_sv_2iv
Perl_sv_2iv_flags
Perl_sv_2mortal
Perl_sv_2pv
Perl_sv_2pvbyte
Perl_sv_2pvbyte_flags
Perl_sv_2pv_flags
Perl_sv_2pvutf8
Perl_sv_2pvutf8_flags
Perl_sv_bless
Perl_sv_free
Perl_sv_free2
Perl_sv_isa
Perl_sv_newmortal
Perl_sv_setiv
Perl_sv_setiv_mg
Perl_sv_setsv
Perl_sv_setsv_flags
Perl_sys_init
Perl_sys_init3
Perl_xs_boot_epilog
Perl_xs_handshake
```
I marked the ones that look like perhaps we should care about in the
above list with an asterisk:
*Perl_call_list
*Perl_call_pv
*Perl_call_sv
*Perl_eval_pv
*perl_run
but perhaps there are others?
/*
* plperl_xact_callback --- cleanup at main-transaction end.
*/
static void
plperl_xact_callback(XactEvent event, void *arg)
{
/* ensure global locale is the current locale */
if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE)
perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}So the assumption is that the if current locale is not LC_GLOBAL_LOCALE,
then it was the perl locale. Seems true today, but this could confusion
if anything else calls uselocale(). In particular, if another PL
implementation copies this, and you use plperl and the other PL at the
same time, they would get mixed up. I think we need another "bool
perl_locale_obj_in_use" variable to track explicitly whether the perl
locale is currently active.
Or perhaps don't assume that we want the global locale and swap between
pg_locale_obj (whatever it is) and perl_locale_obj?
If we are careful to put the uselocale() calls in the right places so
that we never ereport() while in perl locale, this callback isn't
needed. Maybe it's still a good idea, though, to be extra sure that
things get reset to a sane state if something unexpected happens.
I feel more comfortable that we have a "belt and suspenders" method to
restore the locale that was in use by Postgres before entering perl.
If multiple interpreters are used, is the single perl_locale_obj
variable still enough? Each interpreter can have their own locale I believe.
So in other words plperl and plperlu both used in the same query? I
don't see how we could get from one to the other without going through
the outer "postgres" locale first. Or are you thinking something else?
PS. please pgindent
ok
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
On 27/08/2023 16:41, Joe Conway wrote:
On 8/15/23 10:40, Heikki Linnakangas wrote:
If multiple interpreters are used, is the single perl_locale_obj
variable still enough? Each interpreter can have their own locale I believe.So in other words plperl and plperlu both used in the same query? I
don't see how we could get from one to the other without going through
the outer "postgres" locale first. Or are you thinking something else?
I think you got that it backwards. 'perl_locale_obj' is set to the perl
interpreter's locale, whenever we are *outside* the interpreter.
This crashes with the patch:
postgres=# DO LANGUAGE plperlu
$function$
use POSIX qw(setlocale LC_NUMERIC);
use locale;
setlocale LC_NUMERIC, "sv_SE.utf8";
$function$;
DO
postgres=# do language plperl $$ $$;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
I was going to test using plperl and plperl in the same session and
expected the interpreters to mix up the locales they use. Maybe the
crash is because of something like that, although I didn't expect a
crash, just weird confusion on which locale is used.
--
Heikki Linnakangas
Neon (https://neon.tech)
On Sun, Aug 27, 2023 at 4:25 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
I think you got that it backwards. 'perl_locale_obj' is set to the perl
interpreter's locale, whenever we are *outside* the interpreter.
This thread has had no update for more than 4 months, so I'm marking
the CF entry RwF for now.
It can always be reopened, if Joe or Tristan or Heikki or someone else
picks it up again.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 1/5/24 12:56, Robert Haas wrote:
On Sun, Aug 27, 2023 at 4:25 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
I think you got that it backwards. 'perl_locale_obj' is set to the perl
interpreter's locale, whenever we are *outside* the interpreter.This thread has had no update for more than 4 months, so I'm marking
the CF entry RwF for now.It can always be reopened, if Joe or Tristan or Heikki or someone else
picks it up again.
It is definitely a bug, so I do plan to get back to it at some point,
hopefully sooner rather than later...
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com