ICU warnings during make installcheck and text_extensions test

Started by Oleg Tselebrovskiy10 months ago5 messages
#1Oleg Tselebrovskiy
o.tselebrovskiy@postgrespro.ru
2 attachment(s)

Greetings, everyone!

When you try to run installcheck using a cluster that was initialized
with ICU (./initdb -D ../data --locale-provider=icu
--icu-locale='en_US_POSIX') and NO_LOCALE=1 you get a warning:

# +++ regress install-check in src/test/regress +++
# using postmaster on Unix socket, default port
WARNING: could not convert locale name "C" to language tag:
U_ILLEGAL_ARGUMENT_ERROR
WARNING: ICU locale "C" has unknown language "c"
HINT: To disable ICU locale validation, set the parameter
"icu_validation_level" to "disabled".

This happens with PostgreSQL 16, 17 and master

While this case is somewhat superficial (you need ICU and NO_LOCALE),
when you try to run installcheck of test_extensions test module (using a
cluster initialized with ICU) you will get the same warnings due to
NO_LOCALE=1 in Makefile

There could be an argument "if you are using ICU and don't want warnings
just set icu_validation_level=disabled in postgresql.conf", but then
installcheck fails because of collate.icu.utf8 like this:

CREATE COLLATION testx (provider = icu, locale =
'@colStrength=primary;nonsense=yes'); DROP COLLATION testx;
-WARNING: could not convert locale name
"@colStrength=primary;nonsense=yes" to language tag:
U_ILLEGAL_ARGUMENT_ERROR
CREATE COLLATION testx (provider = icu, locale = 'nonsense-nowhere');
DROP COLLATION testx;
-WARNING: ICU locale "nonsense-nowhere" has unknown language
"nonsense"
-HINT: To disable ICU locale validation, set the parameter
"icu_validation_level" to "disabled".
CREATE COLLATION test4 FROM nonsense;

And we definitely want to check warnings there

So for now I propose adding icu_validation_level=disabled to pg_regress
when we were passed NO_LOCALE=1 (patch is attached)

Also since locale 'C' isn't converted to anything with ICU since
f3a01af, maybe we want to somehow handle NO_LOCALE with ICU in a special
way? Maybe only during tests?

On another none, test test_extensions from
src/test/modules/test_extensions fails during installcheck when the
cluster was initialized with ICU locale. It was already reported at [1]/messages/by-id/a2fd691f8d9f1325cec1283b161d6a8e@postgrespro.ru.
This test fails on current REL_17_STABLE (2530367) and master (2a5e709)

One way to fix it is to just skip the test if we find ICU locale
provider (patch attached bellow), but I'm not sure this is the optimal
way. In [1]/messages/by-id/a2fd691f8d9f1325cec1283b161d6a8e@postgrespro.ru there was an attempt to replace \dx+ with function with
collation-independent output

[1]: /messages/by-id/a2fd691f8d9f1325cec1283b161d6a8e@postgrespro.ru
/messages/by-id/a2fd691f8d9f1325cec1283b161d6a8e@postgrespro.ru

Attachments:

icu_validation_level_installcheck.patchtext/x-diff; name=icu_validation_level_installcheck.patchDownload
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 53435c47420..390a10b8239 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -1957,6 +1957,14 @@ create_database(const char *dbname)
 	StringInfo	buf = psql_start_command();
 	_stringlist *sl;
 
+	/*
+	 * Disable warning that ICU locale "C" has unknown language "c". This can
+	 * happen if the cluster was initialized with the icu locale and the new
+	 * database uses the C locale.
+	 */
+	if (nolocale)
+		psql_add_command(buf, "SET icu_validation_level=disabled");
+
 	/*
 	 * We use template0 so that any installation-local cruft in template1 will
 	 * not mess up the tests.
test_extension_with_icu.patchtext/x-diff; name=test_extension_with_icu.patchDownload
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out
index f357cc21aaa..a74da86a1fb 100644
--- a/src/test/modules/test_extensions/expected/test_extensions.out
+++ b/src/test/modules/test_extensions/expected/test_extensions.out
@@ -1,3 +1,7 @@
+SELECT (SELECT datlocprovider='i' FROM pg_database WHERE datname=current_database()) as skip_test \gset
+\if :skip_test
+\quit
+\endif
 CREATE SCHEMA has$dollar;
 -- test some errors
 CREATE EXTENSION test_ext1;
diff --git a/src/test/modules/test_extensions/expected/test_extensions_0.out b/src/test/modules/test_extensions/expected/test_extensions_0.out
new file mode 100644
index 00000000000..aeb23e2ff38
--- /dev/null
+++ b/src/test/modules/test_extensions/expected/test_extensions_0.out
@@ -0,0 +1,3 @@
+SELECT (SELECT datlocprovider='i' FROM pg_database WHERE datname=current_database()) as skip_test \gset
+\if :skip_test
+\quit
diff --git a/src/test/modules/test_extensions/sql/test_extensions.sql b/src/test/modules/test_extensions/sql/test_extensions.sql
index 642c82ff5d3..eb5868d4c8c 100644
--- a/src/test/modules/test_extensions/sql/test_extensions.sql
+++ b/src/test/modules/test_extensions/sql/test_extensions.sql
@@ -1,3 +1,8 @@
+SELECT (SELECT datlocprovider='i' FROM pg_database WHERE datname=current_database()) as skip_test \gset
+\if :skip_test
+\quit
+\endif
+
 CREATE SCHEMA has$dollar;
 
 -- test some errors
#2Alexander Korotkov
aekorotkov@gmail.com
In reply to: Oleg Tselebrovskiy (#1)
1 attachment(s)
Re: ICU warnings during make installcheck and text_extensions test

Hi, Oleg!

On Mon, Mar 31, 2025 at 9:48 AM Oleg Tselebrovskiy
<o.tselebrovskiy@postgrespro.ru> wrote:

When you try to run installcheck using a cluster that was initialized
with ICU (./initdb -D ../data --locale-provider=icu
--icu-locale='en_US_POSIX') and NO_LOCALE=1 you get a warning:

# +++ regress install-check in src/test/regress +++
# using postmaster on Unix socket, default port
WARNING: could not convert locale name "C" to language tag:
U_ILLEGAL_ARGUMENT_ERROR
WARNING: ICU locale "C" has unknown language "c"
HINT: To disable ICU locale validation, set the parameter
"icu_validation_level" to "disabled".

This happens with PostgreSQL 16, 17 and master

While this case is somewhat superficial (you need ICU and NO_LOCALE),
when you try to run installcheck of test_extensions test module (using a
cluster initialized with ICU) you will get the same warnings due to
NO_LOCALE=1 in Makefile

There could be an argument "if you are using ICU and don't want warnings
just set icu_validation_level=disabled in postgresql.conf", but then
installcheck fails because of collate.icu.utf8 like this:

CREATE COLLATION testx (provider = icu, locale =
'@colStrength=primary;nonsense=yes'); DROP COLLATION testx;
-WARNING: could not convert locale name
"@colStrength=primary;nonsense=yes" to language tag:
U_ILLEGAL_ARGUMENT_ERROR
CREATE COLLATION testx (provider = icu, locale = 'nonsense-nowhere');
DROP COLLATION testx;
-WARNING: ICU locale "nonsense-nowhere" has unknown language
"nonsense"
-HINT: To disable ICU locale validation, set the parameter
"icu_validation_level" to "disabled".
CREATE COLLATION test4 FROM nonsense;

And we definitely want to check warnings there

So for now I propose adding icu_validation_level=disabled to pg_regress
when we were passed NO_LOCALE=1 (patch is attached)

Also since locale 'C' isn't converted to anything with ICU since
f3a01af, maybe we want to somehow handle NO_LOCALE with ICU in a special
way? Maybe only during tests?

On another none, test test_extensions from
src/test/modules/test_extensions fails during installcheck when the
cluster was initialized with ICU locale. It was already reported at [1].
This test fails on current REL_17_STABLE (2530367) and master (2a5e709)

One way to fix it is to just skip the test if we find ICU locale
provider (patch attached bellow), but I'm not sure this is the optimal
way. In [1] there was an attempt to replace \dx+ with function with
collation-independent output

Thank you for raising this issue. I don't think ignoring a warning is
an option. The tests contain locale-sensitive orderings. Thus, if we
don't manage to create a C-like locale, tests fail anyway for me.
Ignoring tests is an unfavorable solution.

I see two (better) options to resolve this issue:
1) Specify LOCALE_PROVIDER='builtin' in the CREATE DATABASE command.
2) Specify 'en-US-u-va-posix' as a locale name when template0 has an
ICU locale provider.

The #1 looks simpler. The patch is attached. What do you think?

------
Regards,
Alexander Korotkov
Supabase

Attachments:

pg_regress_locale_provider.patchapplication/octet-stream; name=pg_regress_locale_provider.patchDownload
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 5d85dcc62f0..61c035a3983 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -1968,10 +1968,10 @@ create_database(const char *dbname)
 	 */
 	if (encoding)
 		psql_add_command(buf, "CREATE DATABASE \"%s\" TEMPLATE=template0 ENCODING='%s'%s", dbname, encoding,
-						 (nolocale) ? " LOCALE='C'" : "");
+						 (nolocale) ? " LOCALE='C' LOCALE_PROVIDER='builtin'" : "");
 	else
 		psql_add_command(buf, "CREATE DATABASE \"%s\" TEMPLATE=template0%s", dbname,
-						 (nolocale) ? " LOCALE='C'" : "");
+						 (nolocale) ? " LOCALE='C' LOCALE_PROVIDER='builtin'" : "");
 	psql_add_command(buf,
 					 "ALTER DATABASE \"%s\" SET lc_messages TO 'C';"
 					 "ALTER DATABASE \"%s\" SET lc_monetary TO 'C';"
#3Oleg Tselebrovskiy
o.tselebrovskiy@postgrespro.ru
In reply to: Alexander Korotkov (#2)
1 attachment(s)
Re: ICU warnings during make installcheck and text_extensions test

Alexander Korotkov wrote at 2025-07-28 20:04:

Hi, Oleg!

Thank you for raising this issue. I don't think ignoring a warning is
an option. The tests contain locale-sensitive orderings. Thus, if we
don't manage to create a C-like locale, tests fail anyway for me.
Ignoring tests is an unfavorable solution.

I see two (better) options to resolve this issue:
1) Specify LOCALE_PROVIDER='builtin' in the CREATE DATABASE command.
2) Specify 'en-US-u-va-posix' as a locale name when template0 has an
ICU locale provider.

The #1 looks simpler. The patch is attached. What do you think?

------
Regards,
Alexander Korotkov
Supabase

Thanks for your response!

Your patch works with REL_17 & master, but not with REL_16, since there
is no builtin provider

So if we're going that route, for PostgreSQL 16 and older we could just
use libc provider instead of builtin for the same effect (see attached)

I've run installcheck-world for both 'builtin' and 'libc', both seem to
work fine

Dunno what about tests like collate.icu.utf8.sql that require icu
databases to run. Will those tests be run if we force non-icu locale
providers in pg_regress? We probaly want them to be run sometimes,
right? Except this, LGTM

Regards, Oleg Tselebrovskiy

Attachments:

pg_regress_locale_provider_pg16.patchtext/x-diff; name=pg_regress_locale_provider_pg16.patchDownload
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index b49ff7913a7..32c908ca6d1 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -1973,10 +1973,10 @@ create_database(const char *dbname)
 	 */
 	if (encoding)
 		psql_add_command(buf, "CREATE DATABASE \"%s\" TEMPLATE=template0 ENCODING='%s'%s", dbname, encoding,
-						 (nolocale) ? " LOCALE='C'" : "");
+						 (nolocale) ? " LOCALE='C' LOCALE_PROVIDER='libc'" : "");
 	else
 		psql_add_command(buf, "CREATE DATABASE \"%s\" TEMPLATE=template0%s", dbname,
-						 (nolocale) ? " LOCALE='C'" : "");
+						 (nolocale) ? " LOCALE='C' LOCALE_PROVIDER='libc'" : "");
 	psql_add_command(buf,
 					 "ALTER DATABASE \"%s\" SET lc_messages TO 'C';"
 					 "ALTER DATABASE \"%s\" SET lc_monetary TO 'C';"
#4Alexander Korotkov
aekorotkov@gmail.com
In reply to: Oleg Tselebrovskiy (#3)
Re: ICU warnings during make installcheck and text_extensions test

On Tue, Jul 29, 2025 at 12:45 PM Oleg Tselebrovskiy
<o.tselebrovskiy@postgrespro.ru> wrote:

Thanks for your response!

Your patch works with REL_17 & master, but not with REL_16, since there
is no builtin provider

So if we're going that route, for PostgreSQL 16 and older we could just
use libc provider instead of builtin for the same effect (see attached)

I've run installcheck-world for both 'builtin' and 'libc', both seem to
work fine

Great. I'm not sure if we want this to be backpatched. Yet pushed
this to master.

Dunno what about tests like collate.icu.utf8.sql that require icu
databases to run. Will those tests be run if we force non-icu locale
providers in pg_regress? We probaly want them to be run sometimes,
right? Except this, LGTM

collate.icu.utf8.sql have a protection against running in
inappropriate locale in the top. It's OK to run it with any locale.
If the locale doesn't match then it will be stopped in the beginning
and alternative output collate.icu.utf8_1.out matched.

------
Regards,
Alexander Korotkov
Supabase

#5Oleg Tselebrovskiy
o.tselebrovskiy@postgrespro.ru
In reply to: Alexander Korotkov (#4)
Re: ICU warnings during make installcheck and text_extensions test

Alexander Korotkov wrote at 2025-09-14 00:59:

collate.icu.utf8.sql have a protection against running in
inappropriate locale in the top. It's OK to run it with any locale.
If the locale doesn't match then it will be stopped in the beginning
and alternative output collate.icu.utf8_1.out matched.

Thanks for the info!

Great. I'm not sure if we want this to be backpatched. Yet pushed
this to master.

Thanks for pushing this to master! But why not backpatch this?
Some of my coworkers use make installcheck extensively, and
have complained about the warnings. Those warnings could possibly
break some automated testing systems/scripts, but I haven't seen it.

And since PG16-17 will still be supported for a while, those warnings
will exist for this exact while :)

Also, to me the patch seems low-risk. Are such annoyances backpatched
often?

Regards, Oleg Tselebrovskiy